* [PATCH 0/9] xfs: Delayed Attributes
@ 2019-04-12 22:50 Allison Henderson
  2019-04-12 22:50 ` [PATCH 1/9] xfs: Remove all strlen in all xfs_attr_* functions for attr names Allison Henderson
                   ` (8 more replies)
  0 siblings, 9 replies; 48+ messages in thread
From: Allison Henderson @ 2019-04-12 22:50 UTC (permalink / raw)
  To: linux-xfs

Hi all,

This set is a subset of a larger series for parent pointers (v9). 
Delayed attributes allow attribute operations (set and remove) to be 
logged and committed in the same way that other delayed operations do.
This will help break up more complex operations when we later introduce
parent pointers which can be used in a number of optimizations.  Since
delayed attributes can be implemented as a standalone feature, I've
decided to subdivide the set to help make it more manageable.

Changes since parent pointers v9:
This is mostly to update the set onto more recent code.  Sometime
during v8, concerns were raised about the transaction becoming too
large, and we discussed breaking up the set operation by periodically
returning EAGAIN to cycle out the transaction during the finish routine.
This is done in patches 7 and 8, but I don't recall them getting feedback
as people were quite busy at the time.

Questions, comments, feedback appreciated!

Thanks all!
Allison

Allison Henderson (9):
  xfs: Remove all strlen in all xfs_attr_* functions for attr names.
  xfs: Hold inode locks in xfs_ialloc
  xfs: Add trans toggle to attr routines
  xfs: Set up infastructure for deferred attribute operations
  xfs: Add xfs_attr_set_deferred and xfs_attr_remove_deferred
  xfs: Add xfs_has_attr and subroutines
  xfs: Add attr context to log item
  xfs: Roll delayed attr operations by returning EAGAIN
  xfs: Remove roll_trans boolean

 fs/xfs/Makefile                 |   2 +
 fs/xfs/libxfs/xfs_attr.c        | 316 ++++++++++++++-------
 fs/xfs/libxfs/xfs_attr.h        |  61 ++++-
 fs/xfs/libxfs/xfs_attr_leaf.c   |  48 +++-
 fs/xfs/libxfs/xfs_attr_leaf.h   |   3 +-
 fs/xfs/libxfs/xfs_attr_remote.c |  20 --
 fs/xfs/libxfs/xfs_defer.c       |   1 +
 fs/xfs/libxfs/xfs_defer.h       |   3 +
 fs/xfs/libxfs/xfs_log_format.h  |  44 ++-
 fs/xfs/libxfs/xfs_types.h       |   1 +
 fs/xfs/scrub/common.c           |   2 +
 fs/xfs/xfs_acl.c                |  14 +-
 fs/xfs/xfs_attr_item.c          | 587 ++++++++++++++++++++++++++++++++++++++++
 fs/xfs/xfs_attr_item.h          | 103 +++++++
 fs/xfs/xfs_inode.c              |   6 +-
 fs/xfs/xfs_ioctl.c              |  15 +-
 fs/xfs/xfs_ioctl32.c            |   2 +
 fs/xfs/xfs_iops.c               |   7 +-
 fs/xfs/xfs_log_recover.c        | 172 ++++++++++++
 fs/xfs/xfs_ondisk.h             |   2 +
 fs/xfs/xfs_qm.c                 |   1 +
 fs/xfs/xfs_symlink.c            |   3 +
 fs/xfs/xfs_trans.h              |  12 +
 fs/xfs/xfs_trans_attr.c         | 250 +++++++++++++++++
 fs/xfs/xfs_xattr.c              |  11 +-
 25 files changed, 1535 insertions(+), 151 deletions(-)
 create mode 100644 fs/xfs/xfs_attr_item.c
 create mode 100644 fs/xfs/xfs_attr_item.h
 create mode 100644 fs/xfs/xfs_trans_attr.c

-- 
2.7.4
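
For readers new to the EAGAIN idea mentioned in the cover letter, here is a
minimal userspace sketch of a finish loop that cycles a transaction whenever
a step asks for it. Every name in it (demo_trans, demo_attr_op,
demo_attr_step, demo_trans_roll, demo_finish) is hypothetical and exists only
for illustration; it is not the code in patches 7 and 8, and the "one unit of
work per transaction" policy is an assumption.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are not the XFS structures. */
struct demo_trans {
	int	rolls;		/* how many times the transaction was cycled */
};

struct demo_attr_op {
	size_t	steps_left;	/* units of work still to do */
};

/* Do one bounded unit of work; return -EAGAIN if more work remains. */
int demo_attr_step(struct demo_attr_op *op)
{
	if (op->steps_left == 0)
		return 0;
	op->steps_left--;
	return op->steps_left ? -EAGAIN : 0;
}

/* Stand-in for committing the current transaction and starting the next. */
int demo_trans_roll(struct demo_trans *tp)
{
	tp->rolls++;
	return 0;
}

/* Finish loop: keep cycling the transaction while the step returns -EAGAIN. */
int demo_finish(struct demo_trans *tp, struct demo_attr_op *op)
{
	int	error;

	assert(tp && op);
	while ((error = demo_attr_step(op)) == -EAGAIN) {
		error = demo_trans_roll(tp);
		if (error)
			return error;
	}
	return error;
}
```

In this sketch the step function never touches more than one unit of work per
transaction, so the size of any single transaction stays bounded no matter
how large the whole operation is.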


* [PATCH 1/9] xfs: Remove all strlen in all xfs_attr_* functions for attr names.
  2019-04-12 22:50 [PATCH 0/9] xfs: Delayed Attributes Allison Henderson
@ 2019-04-12 22:50 ` Allison Henderson
  2019-04-14 23:02   ` Dave Chinner
  2019-04-17 15:42   ` Brian Foster
  2019-04-12 22:50 ` [PATCH 2/9] xfs: Hold inode locks in xfs_ialloc Allison Henderson
                   ` (7 subsequent siblings)
  8 siblings, 2 replies; 48+ messages in thread
From: Allison Henderson @ 2019-04-12 22:50 UTC (permalink / raw)
  To: linux-xfs

This simplifies, ahead of time, the extra handling of the null terminator
in delayed operations, which use memcpy rather than strlen.  Later,
when we introduce parent pointers, attribute names will become binary,
so strlen will not work at all.  Removing uses of strlen now will
help reduce complexity later.

Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 fs/xfs/libxfs/xfs_attr.c | 12 ++++++++----
 fs/xfs/libxfs/xfs_attr.h |  9 ++++++---
 fs/xfs/xfs_acl.c         | 12 +++++++-----
 fs/xfs/xfs_ioctl.c       | 13 ++++++++++---
 fs/xfs/xfs_iops.c        |  6 ++++--
 fs/xfs/xfs_xattr.c       | 10 ++++++----
 6 files changed, 41 insertions(+), 21 deletions(-)

diff --git a/fs/xfs/libxfs/xfs_attr.c b/fs/xfs/libxfs/xfs_attr.c
index 2dd9ee2..3da6b0d 100644
--- a/fs/xfs/libxfs/xfs_attr.c
+++ b/fs/xfs/libxfs/xfs_attr.c
@@ -67,6 +67,7 @@ xfs_attr_args_init(
 	struct xfs_da_args	*args,
 	struct xfs_inode	*dp,
 	const unsigned char	*name,
+	size_t			namelen,
 	int			flags)
 {
 
@@ -79,7 +80,7 @@ xfs_attr_args_init(
 	args->dp = dp;
 	args->flags = flags;
 	args->name = name;
-	args->namelen = strlen((const char *)name);
+	args->namelen = namelen;
 	if (args->namelen >= MAXNAMELEN)
 		return -EFAULT;		/* match IRIX behaviour */
 
@@ -125,6 +126,7 @@ int
 xfs_attr_get(
 	struct xfs_inode	*ip,
 	const unsigned char	*name,
+	size_t			namelen,
 	unsigned char		*value,
 	int			*valuelenp,
 	int			flags)
@@ -138,7 +140,7 @@ xfs_attr_get(
 	if (XFS_FORCED_SHUTDOWN(ip->i_mount))
 		return -EIO;
 
-	error = xfs_attr_args_init(&args, ip, name, flags);
+	error = xfs_attr_args_init(&args, ip, name, namelen, flags);
 	if (error)
 		return error;
 
@@ -317,6 +319,7 @@ int
 xfs_attr_set(
 	struct xfs_inode	*dp,
 	const unsigned char	*name,
+	size_t			namelen,
 	unsigned char		*value,
 	int			valuelen,
 	int			flags)
@@ -333,7 +336,7 @@ xfs_attr_set(
 	if (XFS_FORCED_SHUTDOWN(dp->i_mount))
 		return -EIO;
 
-	error = xfs_attr_args_init(&args, dp, name, flags);
+	error = xfs_attr_args_init(&args, dp, name, namelen, flags);
 	if (error)
 		return error;
 
@@ -425,6 +428,7 @@ int
 xfs_attr_remove(
 	struct xfs_inode	*dp,
 	const unsigned char	*name,
+	size_t			namelen,
 	int			flags)
 {
 	struct xfs_mount	*mp = dp->i_mount;
@@ -436,7 +440,7 @@ xfs_attr_remove(
 	if (XFS_FORCED_SHUTDOWN(dp->i_mount))
 		return -EIO;
 
-	error = xfs_attr_args_init(&args, dp, name, flags);
+	error = xfs_attr_args_init(&args, dp, name, namelen, flags);
 	if (error)
 		return error;
 
diff --git a/fs/xfs/libxfs/xfs_attr.h b/fs/xfs/libxfs/xfs_attr.h
index 2297d84..52f63dc 100644
--- a/fs/xfs/libxfs/xfs_attr.h
+++ b/fs/xfs/libxfs/xfs_attr.h
@@ -137,11 +137,14 @@ int xfs_attr_list_int(struct xfs_attr_list_context *);
 int xfs_inode_hasattr(struct xfs_inode *ip);
 int xfs_attr_get_ilocked(struct xfs_inode *ip, struct xfs_da_args *args);
 int xfs_attr_get(struct xfs_inode *ip, const unsigned char *name,
-		 unsigned char *value, int *valuelenp, int flags);
+		 size_t namelen, unsigned char *value, int *valuelenp,
+		 int flags);
 int xfs_attr_set(struct xfs_inode *dp, const unsigned char *name,
-		 unsigned char *value, int valuelen, int flags);
+		 size_t namelen, unsigned char *value, int valuelen,
+		 int flags);
 int xfs_attr_set_args(struct xfs_da_args *args, struct xfs_buf **leaf_bp);
-int xfs_attr_remove(struct xfs_inode *dp, const unsigned char *name, int flags);
+int xfs_attr_remove(struct xfs_inode *dp, const unsigned char *name,
+		    size_t namelen, int flags);
 int xfs_attr_remove_args(struct xfs_da_args *args);
 int xfs_attr_list(struct xfs_inode *dp, char *buffer, int bufsize,
 		  int flags, struct attrlist_cursor_kern *cursor);
diff --git a/fs/xfs/xfs_acl.c b/fs/xfs/xfs_acl.c
index 8039e35..142de8d 100644
--- a/fs/xfs/xfs_acl.c
+++ b/fs/xfs/xfs_acl.c
@@ -141,8 +141,8 @@ xfs_get_acl(struct inode *inode, int type)
 	if (!xfs_acl)
 		return ERR_PTR(-ENOMEM);
 
-	error = xfs_attr_get(ip, ea_name, (unsigned char *)xfs_acl,
-							&len, ATTR_ROOT);
+	error = xfs_attr_get(ip, ea_name, strlen(ea_name),
+			     (unsigned char *)xfs_acl, &len, ATTR_ROOT);
 	if (error) {
 		/*
 		 * If the attribute doesn't exist make sure we have a negative
@@ -192,15 +192,17 @@ __xfs_set_acl(struct inode *inode, struct posix_acl *acl, int type)
 		len -= sizeof(struct xfs_acl_entry) *
 			 (XFS_ACL_MAX_ENTRIES(ip->i_mount) - acl->a_count);
 
-		error = xfs_attr_set(ip, ea_name, (unsigned char *)xfs_acl,
-				len, ATTR_ROOT);
+		error = xfs_attr_set(ip, ea_name, strlen(ea_name),
+				     (unsigned char *)xfs_acl, len, ATTR_ROOT);
 
 		kmem_free(xfs_acl);
 	} else {
 		/*
 		 * A NULL ACL argument means we want to remove the ACL.
 		 */
-		error = xfs_attr_remove(ip, ea_name, ATTR_ROOT);
+		error = xfs_attr_remove(ip, ea_name,
+					strlen(ea_name),
+					ATTR_ROOT);
 
 		/*
 		 * If the attribute didn't exist to start with that's fine.
diff --git a/fs/xfs/xfs_ioctl.c b/fs/xfs/xfs_ioctl.c
index 6ecdbb3..ab341d6 100644
--- a/fs/xfs/xfs_ioctl.c
+++ b/fs/xfs/xfs_ioctl.c
@@ -437,6 +437,7 @@ xfs_attrmulti_attr_get(
 {
 	unsigned char		*kbuf;
 	int			error = -EFAULT;
+	size_t			namelen;
 
 	if (*len > XFS_XATTR_SIZE_MAX)
 		return -EINVAL;
@@ -444,7 +445,9 @@ xfs_attrmulti_attr_get(
 	if (!kbuf)
 		return -ENOMEM;
 
-	error = xfs_attr_get(XFS_I(inode), name, kbuf, (int *)len, flags);
+	namelen = strlen(name);
+	error = xfs_attr_get(XFS_I(inode), name, namelen,
+			     kbuf, (int *)len, flags);
 	if (error)
 		goto out_kfree;
 
@@ -466,6 +469,7 @@ xfs_attrmulti_attr_set(
 {
 	unsigned char		*kbuf;
 	int			error;
+	size_t			namelen;
 
 	if (IS_IMMUTABLE(inode) || IS_APPEND(inode))
 		return -EPERM;
@@ -476,7 +480,8 @@ xfs_attrmulti_attr_set(
 	if (IS_ERR(kbuf))
 		return PTR_ERR(kbuf);
 
-	error = xfs_attr_set(XFS_I(inode), name, kbuf, len, flags);
+	namelen = strlen(name);
+	error = xfs_attr_set(XFS_I(inode), name, namelen, kbuf, len, flags);
 	if (!error)
 		xfs_forget_acl(inode, name, flags);
 	kfree(kbuf);
@@ -490,10 +495,12 @@ xfs_attrmulti_attr_remove(
 	uint32_t		flags)
 {
 	int			error;
+	size_t			namelen;
 
 	if (IS_IMMUTABLE(inode) || IS_APPEND(inode))
 		return -EPERM;
-	error = xfs_attr_remove(XFS_I(inode), name, flags);
+	namelen = strlen(name);
+	error = xfs_attr_remove(XFS_I(inode), name, namelen, flags);
 	if (!error)
 		xfs_forget_acl(inode, name, flags);
 	return error;
diff --git a/fs/xfs/xfs_iops.c b/fs/xfs/xfs_iops.c
index 74047bd..e73c21a 100644
--- a/fs/xfs/xfs_iops.c
+++ b/fs/xfs/xfs_iops.c
@@ -59,8 +59,10 @@ xfs_initxattrs(
 	int			error = 0;
 
 	for (xattr = xattr_array; xattr->name != NULL; xattr++) {
-		error = xfs_attr_set(ip, xattr->name, xattr->value,
-				      xattr->value_len, ATTR_SECURE);
+		error = xfs_attr_set(ip, xattr->name,
+				     strlen(xattr->name),
+				     xattr->value, xattr->value_len,
+				     ATTR_SECURE);
 		if (error < 0)
 			break;
 	}
diff --git a/fs/xfs/xfs_xattr.c b/fs/xfs/xfs_xattr.c
index 9a63016..3013746 100644
--- a/fs/xfs/xfs_xattr.c
+++ b/fs/xfs/xfs_xattr.c
@@ -26,6 +26,7 @@ xfs_xattr_get(const struct xattr_handler *handler, struct dentry *unused,
 	int xflags = handler->flags;
 	struct xfs_inode *ip = XFS_I(inode);
 	int error, asize = size;
+	size_t namelen = strlen(name);
 
 	/* Convert Linux syscall to XFS internal ATTR flags */
 	if (!size) {
@@ -33,7 +34,7 @@ xfs_xattr_get(const struct xattr_handler *handler, struct dentry *unused,
 		value = NULL;
 	}
 
-	error = xfs_attr_get(ip, (unsigned char *)name, value, &asize, xflags);
+	error = xfs_attr_get(ip, name, namelen, value, &asize, xflags);
 	if (error)
 		return error;
 	return asize;
@@ -69,6 +70,7 @@ xfs_xattr_set(const struct xattr_handler *handler, struct dentry *unused,
 	int			xflags = handler->flags;
 	struct xfs_inode	*ip = XFS_I(inode);
 	int			error;
+	size_t			namelen = strlen(name);
 
 	/* Convert Linux syscall to XFS internal ATTR flags */
 	if (flags & XATTR_CREATE)
@@ -77,9 +79,9 @@ xfs_xattr_set(const struct xattr_handler *handler, struct dentry *unused,
 		xflags |= ATTR_REPLACE;
 
 	if (!value)
-		return xfs_attr_remove(ip, (unsigned char *)name, xflags);
-	error = xfs_attr_set(ip, (unsigned char *)name,
-				(void *)value, size, xflags);
+		return xfs_attr_remove(ip, name,
+				       namelen, xflags);
+	error = xfs_attr_set(ip, name, namelen, (void *)value, size, xflags);
 	if (!error)
 		xfs_forget_acl(inode, name, xflags);
 
-- 
2.7.4
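
To illustrate why patch 1 passes an explicit length, the following userspace
sketch shows a name that contains an embedded NUL byte. The types and
functions (demo_args, demo_args_init, demo_name_equal) and the
DEMO_MAXNAMELEN limit are hypothetical stand-ins chosen for this example, not
the XFS definitions; only the -EFAULT return for an over-long name mirrors
the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define DEMO_MAXNAMELEN	256	/* assumed limit for this sketch */

/* Hypothetical, cut-down stand-in for the args structure. */
struct demo_args {
	const unsigned char	*name;
	size_t			namelen;
};

/* The caller supplies the length; the callee never calls strlen(). */
int demo_args_init(struct demo_args *args, const unsigned char *name,
		   size_t namelen)
{
	assert(args && name);
	if (namelen >= DEMO_MAXNAMELEN)
		return -EFAULT;
	args->name = name;
	args->namelen = namelen;
	return 0;
}

/* Compare names as byte strings so embedded NUL bytes are significant. */
int demo_name_equal(const struct demo_args *a, const struct demo_args *b)
{
	return a->namelen == b->namelen &&
	       memcmp(a->name, b->name, a->namelen) == 0;
}
```

With a four-byte name such as { 'a', 0, 'b', 'c' }, strlen would report a
length of 1, so two different binary names sharing the first byte would be
treated as the same name. Carrying the length alongside the pointer avoids
that.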


* [PATCH 2/9] xfs: Hold inode locks in xfs_ialloc
  2019-04-12 22:50 [PATCH 0/9] xfs: Delayed Attributes Allison Henderson
  2019-04-12 22:50 ` [PATCH 1/9] xfs: Remove all strlen in all xfs_attr_* functions for attr names Allison Henderson
@ 2019-04-12 22:50 ` Allison Henderson
  2019-04-17 15:44   ` Brian Foster
  2019-04-12 22:50 ` [PATCH 3/9] xfs: Add trans toggle to attr routines Allison Henderson
                   ` (6 subsequent siblings)
  8 siblings, 1 reply; 48+ messages in thread
From: Allison Henderson @ 2019-04-12 22:50 UTC (permalink / raw)
  To: linux-xfs

Modify xfs_ialloc to hold locks after return.  The caller
is responsible for unlocking manually.  We will need
this later to hold locks across parent pointer operations.

Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
---
 fs/xfs/xfs_inode.c   | 6 +++++-
 fs/xfs/xfs_qm.c      | 1 +
 fs/xfs/xfs_symlink.c | 3 +++
 3 files changed, 9 insertions(+), 1 deletion(-)

diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c
index f643a92..30a3130 100644
--- a/fs/xfs/xfs_inode.c
+++ b/fs/xfs/xfs_inode.c
@@ -744,6 +744,8 @@ xfs_lookup(
  * to attach to or associate with (i.e. pip == NULL) because they
  * are not linked into the directory structure - they are attached
  * directly to the superblock - and so have no parent.
+ *
+ * Caller is responsible for unlocking the inode manually upon return
  */
 static int
 xfs_ialloc(
@@ -942,7 +944,7 @@ xfs_ialloc(
 	/*
 	 * Log the new values stuffed into the inode.
 	 */
-	xfs_trans_ijoin(tp, ip, XFS_ILOCK_EXCL);
+	xfs_trans_ijoin(tp, ip, 0);
 	xfs_trans_log_inode(tp, ip, flags);
 
 	/* now that we have an i_mode we can setup the inode structure */
@@ -1264,6 +1266,7 @@ xfs_create(
 	xfs_qm_dqrele(pdqp);
 
 	*ipp = ip;
+	xfs_iunlock(ip, XFS_ILOCK_EXCL);
 	return 0;
 
  out_trans_cancel:
@@ -1359,6 +1362,7 @@ xfs_create_tmpfile(
 	xfs_qm_dqrele(pdqp);
 
 	*ipp = ip;
+	xfs_iunlock(ip, XFS_ILOCK_EXCL);
 	return 0;
 
  out_trans_cancel:
diff --git a/fs/xfs/xfs_qm.c b/fs/xfs/xfs_qm.c
index 52ed790..69006e5 100644
--- a/fs/xfs/xfs_qm.c
+++ b/fs/xfs/xfs_qm.c
@@ -820,6 +820,7 @@ xfs_qm_qino_alloc(
 	}
 	if (need_alloc)
 		xfs_finish_inode_setup(*ip);
+	xfs_iunlock(*ip, XFS_ILOCK_EXCL);
 	return error;
 }
 
diff --git a/fs/xfs/xfs_symlink.c b/fs/xfs/xfs_symlink.c
index b2c1177..13d31fe 100644
--- a/fs/xfs/xfs_symlink.c
+++ b/fs/xfs/xfs_symlink.c
@@ -353,6 +353,7 @@ xfs_symlink(
 	xfs_qm_dqrele(pdqp);
 
 	*ipp = ip;
+	xfs_iunlock(ip, XFS_ILOCK_EXCL);
 	return 0;
 
 out_trans_cancel:
@@ -374,6 +375,8 @@ xfs_symlink(
 
 	if (unlock_dp_on_error)
 		xfs_iunlock(dp, XFS_ILOCK_EXCL);
+	if (ip)
+		xfs_iunlock(ip, XFS_ILOCK_EXCL);
 	return error;
 }
 
-- 
2.7.4
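
The locking contract that patch 2 describes can be sketched in userspace as
follows: the allocator returns with the lock held, and the caller releases it
on every path after that point. The names (demo_inode, demo_ilock,
demo_iunlock, demo_ialloc, demo_create) and the boolean lock are assumptions
made for this illustration; the real code uses xfs_ilock/xfs_iunlock on an
xfs_inode.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for an inode with an exclusive lock. */
struct demo_inode {
	bool	locked;
	int	mode;
};

void demo_ilock(struct demo_inode *ip)
{
	assert(!ip->locked);
	ip->locked = true;
}

void demo_iunlock(struct demo_inode *ip)
{
	assert(ip->locked);
	ip->locked = false;
}

/* Initialise the inode and return with the lock still held. */
int demo_ialloc(struct demo_inode *ip, int mode)
{
	demo_ilock(ip);
	ip->mode = mode;
	return 0;	/* caller must call demo_iunlock() */
}

/* Caller: do follow-up work under the same lock, then always unlock. */
int demo_create(struct demo_inode *ip, int mode, bool fail_followup)
{
	int	error;

	error = demo_ialloc(ip, mode);
	if (error)
		return error;	/* in this sketch, failure means no lock held */

	/* Follow-up work that needs the lock, e.g. recording a parent. */
	error = fail_followup ? -EIO : 0;

	demo_iunlock(ip);	/* released on success and on error */
	return error;
}
```

The point of the sketch is the error path: once the allocator has handed back
a locked inode, a later failure in the caller still has to drop that lock,
which is what the extra unlock in the xfs_symlink error path does.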


* [PATCH 3/9] xfs: Add trans toggle to attr routines
  2019-04-12 22:50 [PATCH 0/9] xfs: Delayed Attributes Allison Henderson
  2019-04-12 22:50 ` [PATCH 1/9] xfs: Remove all strlen in all xfs_attr_* functions for attr names Allison Henderson
  2019-04-12 22:50 ` [PATCH 2/9] xfs: Hold inode locks in xfs_ialloc Allison Henderson
@ 2019-04-12 22:50 ` Allison Henderson
  2019-04-18 15:27   ` Brian Foster
  2019-04-12 22:50 ` [PATCH 4/9] xfs: Set up infastructure for deferred attribute operations Allison Henderson
                   ` (5 subsequent siblings)
  8 siblings, 1 reply; 48+ messages in thread
From: Allison Henderson @ 2019-04-12 22:50 UTC (permalink / raw)
  To: linux-xfs

This patch adds a roll_trans parameter to all attribute routines
that may roll a transaction. Calling functions may pass true to
roll transactions as normal, or false to hold them.

This parameter is temporary and will be removed later when all code
paths have been made to pass a false value.  The temporary boolean
helps us introduce changes across multiple smaller patches instead
of handling all affected code paths in one large patch.

Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
---
 fs/xfs/libxfs/xfs_attr.c        | 257 +++++++++++++++++++++++-----------------
 fs/xfs/libxfs/xfs_attr.h        |   5 +-
 fs/xfs/libxfs/xfs_attr_leaf.c   |  20 +++-
 fs/xfs/libxfs/xfs_attr_leaf.h   |   8 +-
 fs/xfs/libxfs/xfs_attr_remote.c |  50 ++++----
 fs/xfs/libxfs/xfs_attr_remote.h |   4 +-
 6 files changed, 203 insertions(+), 141 deletions(-)

diff --git a/fs/xfs/libxfs/xfs_attr.c b/fs/xfs/libxfs/xfs_attr.c
index 3da6b0d..c50bbf6 100644
--- a/fs/xfs/libxfs/xfs_attr.c
+++ b/fs/xfs/libxfs/xfs_attr.c
@@ -49,15 +49,15 @@ STATIC int xfs_attr_shortform_addname(xfs_da_args_t *args);
  * Internal routines when attribute list is one block.
  */
 STATIC int xfs_attr_leaf_get(xfs_da_args_t *args);
-STATIC int xfs_attr_leaf_addname(xfs_da_args_t *args);
-STATIC int xfs_attr_leaf_removename(xfs_da_args_t *args);
+STATIC int xfs_attr_leaf_addname(xfs_da_args_t *args, bool roll_trans);
+STATIC int xfs_attr_leaf_removename(xfs_da_args_t *args, bool roll_trans);
 
 /*
  * Internal routines when attribute list is more than one block.
  */
 STATIC int xfs_attr_node_get(xfs_da_args_t *args);
-STATIC int xfs_attr_node_addname(xfs_da_args_t *args);
-STATIC int xfs_attr_node_removename(xfs_da_args_t *args);
+STATIC int xfs_attr_node_addname(xfs_da_args_t *args, bool roll_trans);
+STATIC int xfs_attr_node_removename(xfs_da_args_t *args, bool roll_trans);
 STATIC int xfs_attr_fillstate(xfs_da_state_t *state);
 STATIC int xfs_attr_refillstate(xfs_da_state_t *state);
 
@@ -196,11 +196,12 @@ xfs_attr_calc_size(
 STATIC int
 xfs_attr_try_sf_addname(
 	struct xfs_inode	*dp,
-	struct xfs_da_args	*args)
+	struct xfs_da_args	*args,
+	bool			roll_trans)
 {
 
 	struct xfs_mount	*mp = dp->i_mount;
-	int			error, error2;
+	int			error, error2 = 0;
 
 	error = xfs_attr_shortform_addname(args);
 	if (error == -ENOSPC)
@@ -216,8 +217,11 @@ xfs_attr_try_sf_addname(
 	if (mp->m_flags & XFS_MOUNT_WSYNC)
 		xfs_trans_set_sync(args->trans);
 
-	error2 = xfs_trans_commit(args->trans);
-	args->trans = NULL;
+	if (roll_trans) {
+		error2 = xfs_trans_commit(args->trans);
+		args->trans = NULL;
+	}
+
 	return error ? error : error2;
 }
 
@@ -227,10 +231,11 @@ xfs_attr_try_sf_addname(
 int
 xfs_attr_set_args(
 	struct xfs_da_args	*args,
-	struct xfs_buf          **leaf_bp)
+	struct xfs_buf          **leaf_bp,
+	bool			roll_trans)
 {
 	struct xfs_inode	*dp = args->dp;
-	int			error;
+	int			error = 0;
 
 	/*
 	 * If the attribute list is non-existent or a shortform list,
@@ -249,7 +254,7 @@ xfs_attr_set_args(
 		/*
 		 * Try to add the attr to the attribute list in the inode.
 		 */
-		error = xfs_attr_try_sf_addname(dp, args);
+		error = xfs_attr_try_sf_addname(dp, args, roll_trans);
 		if (error != -ENOSPC)
 			return error;
 
@@ -261,33 +266,35 @@ xfs_attr_set_args(
 		if (error)
 			return error;
 
-		/*
-		 * Prevent the leaf buffer from being unlocked so that a
-		 * concurrent AIL push cannot grab the half-baked leaf
-		 * buffer and run into problems with the write verifier.
-		 */
-		xfs_trans_bhold(args->trans, *leaf_bp);
+		if (roll_trans) {
+			/*
+			 * Prevent the leaf buffer from being unlocked so that a
+			 * concurrent AIL push cannot grab the half-baked leaf
+			 * buffer and run into problems with the write verifier.
+			 */
+			xfs_trans_bhold(args->trans, *leaf_bp);
 
-		error = xfs_defer_finish(&args->trans);
-		if (error)
-			return error;
+			error = xfs_defer_finish(&args->trans);
+			if (error)
+				return error;
 
-		/*
-		 * Commit the leaf transformation.  We'll need another
-		 * (linked) transaction to add the new attribute to the
-		 * leaf.
-		 */
-		error = xfs_trans_roll_inode(&args->trans, dp);
-		if (error)
-			return error;
-		xfs_trans_bjoin(args->trans, *leaf_bp);
-		*leaf_bp = NULL;
+			/*
+			 * Commit the leaf transformation.  We'll need another
+			 * (linked) transaction to add the new attribute to the
+			 * leaf.
+			 */
+			error = xfs_trans_roll_inode(&args->trans, dp);
+			if (error)
+				return error;
+			xfs_trans_bjoin(args->trans, *leaf_bp);
+			*leaf_bp = NULL;
+		}
 	}
 
 	if (xfs_bmap_one_block(dp, XFS_ATTR_FORK))
-		error = xfs_attr_leaf_addname(args);
+		error = xfs_attr_leaf_addname(args, roll_trans);
 	else
-		error = xfs_attr_node_addname(args);
+		error = xfs_attr_node_addname(args, roll_trans);
 	return error;
 }
 
@@ -296,7 +303,8 @@ xfs_attr_set_args(
  */
 int
 xfs_attr_remove_args(
-	struct xfs_da_args      *args)
+	struct xfs_da_args      *args,
+	bool                    roll_trans)
 {
 	struct xfs_inode	*dp = args->dp;
 	int			error;
@@ -307,9 +315,9 @@ xfs_attr_remove_args(
 		ASSERT(dp->i_afp->if_flags & XFS_IFINLINE);
 		error = xfs_attr_shortform_remove(args);
 	} else if (xfs_bmap_one_block(dp, XFS_ATTR_FORK)) {
-		error = xfs_attr_leaf_removename(args);
+		error = xfs_attr_leaf_removename(args, roll_trans);
 	} else {
-		error = xfs_attr_node_removename(args);
+		error = xfs_attr_node_removename(args, roll_trans);
 	}
 
 	return error;
@@ -384,7 +392,7 @@ xfs_attr_set(
 		goto out_trans_cancel;
 
 	xfs_trans_ijoin(args.trans, dp, 0);
-	error = xfs_attr_set_args(&args, &leaf_bp);
+	error = xfs_attr_set_args(&args, &leaf_bp, true);
 	if (error)
 		goto out_release_leaf;
 	if (!args.trans) {
@@ -473,7 +481,8 @@ xfs_attr_remove(
 	 */
 	xfs_trans_ijoin(args.trans, dp, 0);
 
-	error = xfs_attr_remove_args(&args);
+	error = xfs_attr_remove_args(&args, true);
+
 	if (error)
 		goto out;
 
@@ -563,7 +572,8 @@ xfs_attr_shortform_addname(xfs_da_args_t *args)
  */
 STATIC int
 xfs_attr_leaf_addname(
-	struct xfs_da_args	*args)
+	struct xfs_da_args	*args,
+	bool			roll_trans)
 {
 	struct xfs_inode	*dp;
 	struct xfs_buf		*bp;
@@ -628,32 +638,37 @@ xfs_attr_leaf_addname(
 		error = xfs_attr3_leaf_to_node(args);
 		if (error)
 			return error;
-		error = xfs_defer_finish(&args->trans);
-		if (error)
-			return error;
 
-		/*
-		 * Commit the current trans (including the inode) and start
-		 * a new one.
-		 */
-		error = xfs_trans_roll_inode(&args->trans, dp);
-		if (error)
-			return error;
+		if (roll_trans) {
+			error = xfs_defer_finish(&args->trans);
+			if (error)
+				return error;
+
+			/*
+			 * Commit the current trans (including the inode) and
+			 * start a new one.
+			 */
+			error = xfs_trans_roll_inode(&args->trans, dp);
+			if (error)
+				return error;
+		}
 
 		/*
 		 * Fob the whole rest of the problem off on the Btree code.
 		 */
-		error = xfs_attr_node_addname(args);
+		error = xfs_attr_node_addname(args, roll_trans);
 		return error;
 	}
 
-	/*
-	 * Commit the transaction that added the attr name so that
-	 * later routines can manage their own transactions.
-	 */
-	error = xfs_trans_roll_inode(&args->trans, dp);
-	if (error)
-		return error;
+	if (roll_trans) {
+		/*
+		 * Commit the transaction that added the attr name so that
+		 * later routines can manage their own transactions.
+		 */
+		error = xfs_trans_roll_inode(&args->trans, dp);
+		if (error)
+			return error;
+	}
 
 	/*
 	 * If there was an out-of-line value, allocate the blocks we
@@ -662,7 +677,7 @@ xfs_attr_leaf_addname(
 	 * maximum size of a transaction and/or hit a deadlock.
 	 */
 	if (args->rmtblkno > 0) {
-		error = xfs_attr_rmtval_set(args);
+		error = xfs_attr_rmtval_set(args, roll_trans);
 		if (error)
 			return error;
 	}
@@ -678,7 +693,7 @@ xfs_attr_leaf_addname(
 		 * In a separate transaction, set the incomplete flag on the
 		 * "old" attr and clear the incomplete flag on the "new" attr.
 		 */
-		error = xfs_attr3_leaf_flipflags(args);
+		error = xfs_attr3_leaf_flipflags(args, roll_trans);
 		if (error)
 			return error;
 
@@ -692,7 +707,7 @@ xfs_attr_leaf_addname(
 		args->rmtblkcnt = args->rmtblkcnt2;
 		args->rmtvaluelen = args->rmtvaluelen2;
 		if (args->rmtblkno) {
-			error = xfs_attr_rmtval_remove(args);
+			error = xfs_attr_rmtval_remove(args, roll_trans);
 			if (error)
 				return error;
 		}
@@ -716,21 +731,25 @@ xfs_attr_leaf_addname(
 			/* bp is gone due to xfs_da_shrink_inode */
 			if (error)
 				return error;
-			error = xfs_defer_finish(&args->trans);
-			if (error)
-				return error;
+
+			if (roll_trans) {
+				error = xfs_defer_finish(&args->trans);
+				if (error)
+					return error;
+			}
 		}
 
 		/*
 		 * Commit the remove and start the next trans in series.
 		 */
-		error = xfs_trans_roll_inode(&args->trans, dp);
+		if (roll_trans)
+			error = xfs_trans_roll_inode(&args->trans, dp);
 
 	} else if (args->rmtblkno > 0) {
 		/*
 		 * Added a "remote" value, just clear the incomplete flag.
 		 */
-		error = xfs_attr3_leaf_clearflag(args);
+		error = xfs_attr3_leaf_clearflag(args, roll_trans);
 	}
 	return error;
 }
@@ -743,7 +762,8 @@ xfs_attr_leaf_addname(
  */
 STATIC int
 xfs_attr_leaf_removename(
-	struct xfs_da_args	*args)
+	struct xfs_da_args	*args,
+	bool roll_trans)
 {
 	struct xfs_inode	*dp;
 	struct xfs_buf		*bp;
@@ -776,9 +796,12 @@ xfs_attr_leaf_removename(
 		/* bp is gone due to xfs_da_shrink_inode */
 		if (error)
 			return error;
-		error = xfs_defer_finish(&args->trans);
-		if (error)
-			return error;
+
+		if (roll_trans) {
+			error = xfs_defer_finish(&args->trans);
+			if (error)
+				return error;
+		}
 	}
 	return 0;
 }
@@ -831,7 +854,8 @@ xfs_attr_leaf_get(xfs_da_args_t *args)
  */
 STATIC int
 xfs_attr_node_addname(
-	struct xfs_da_args	*args)
+	struct xfs_da_args	*args,
+	bool			roll_trans)
 {
 	struct xfs_da_state	*state;
 	struct xfs_da_state_blk	*blk;
@@ -899,17 +923,20 @@ xfs_attr_node_addname(
 			error = xfs_attr3_leaf_to_node(args);
 			if (error)
 				goto out;
-			error = xfs_defer_finish(&args->trans);
-			if (error)
-				goto out;
 
-			/*
-			 * Commit the node conversion and start the next
-			 * trans in the chain.
-			 */
-			error = xfs_trans_roll_inode(&args->trans, dp);
-			if (error)
-				goto out;
+			if (roll_trans) {
+				error = xfs_defer_finish(&args->trans);
+				if (error)
+					goto out;
+
+				/*
+				 * Commit the node conversion and start the next
+				 * trans in the chain.
+				 */
+				error = xfs_trans_roll_inode(&args->trans, dp);
+				if (error)
+					goto out;
+			}
 
 			goto restart;
 		}
@@ -923,9 +950,13 @@ xfs_attr_node_addname(
 		error = xfs_da3_split(state);
 		if (error)
 			goto out;
-		error = xfs_defer_finish(&args->trans);
-		if (error)
-			goto out;
+
+		if (roll_trans) {
+			error = xfs_defer_finish(&args->trans);
+			if (error)
+				goto out;
+		}
+
 	} else {
 		/*
 		 * Addition succeeded, update Btree hashvals.
@@ -944,9 +975,11 @@ xfs_attr_node_addname(
 	 * Commit the leaf addition or btree split and start the next
 	 * trans in the chain.
 	 */
-	error = xfs_trans_roll_inode(&args->trans, dp);
-	if (error)
-		goto out;
+	if (roll_trans) {
+		error = xfs_trans_roll_inode(&args->trans, dp);
+		if (error)
+			goto out;
+	}
 
 	/*
 	 * If there was an out-of-line value, allocate the blocks we
@@ -955,7 +988,7 @@ xfs_attr_node_addname(
 	 * maximum size of a transaction and/or hit a deadlock.
 	 */
 	if (args->rmtblkno > 0) {
-		error = xfs_attr_rmtval_set(args);
+		error = xfs_attr_rmtval_set(args, roll_trans);
 		if (error)
 			return error;
 	}
@@ -971,7 +1004,7 @@ xfs_attr_node_addname(
 		 * In a separate transaction, set the incomplete flag on the
 		 * "old" attr and clear the incomplete flag on the "new" attr.
 		 */
-		error = xfs_attr3_leaf_flipflags(args);
+		error = xfs_attr3_leaf_flipflags(args, roll_trans);
 		if (error)
 			goto out;
 
@@ -985,7 +1018,7 @@ xfs_attr_node_addname(
 		args->rmtblkcnt = args->rmtblkcnt2;
 		args->rmtvaluelen = args->rmtvaluelen2;
 		if (args->rmtblkno) {
-			error = xfs_attr_rmtval_remove(args);
+			error = xfs_attr_rmtval_remove(args, roll_trans);
 			if (error)
 				return error;
 		}
@@ -1019,9 +1052,11 @@ xfs_attr_node_addname(
 			error = xfs_da3_join(state);
 			if (error)
 				goto out;
-			error = xfs_defer_finish(&args->trans);
-			if (error)
-				goto out;
+			if (roll_trans) {
+				error = xfs_defer_finish(&args->trans);
+				if (error)
+					goto out;
+			}
 		}
 
 		/*
@@ -1035,7 +1070,7 @@ xfs_attr_node_addname(
 		/*
 		 * Added a "remote" value, just clear the incomplete flag.
 		 */
-		error = xfs_attr3_leaf_clearflag(args);
+		error = xfs_attr3_leaf_clearflag(args, roll_trans);
 		if (error)
 			goto out;
 	}
@@ -1058,7 +1093,8 @@ xfs_attr_node_addname(
  */
 STATIC int
 xfs_attr_node_removename(
-	struct xfs_da_args	*args)
+	struct xfs_da_args	*args,
+	bool			roll_trans)
 {
 	struct xfs_da_state	*state;
 	struct xfs_da_state_blk	*blk;
@@ -1108,10 +1144,10 @@ xfs_attr_node_removename(
 		 * Mark the attribute as INCOMPLETE, then bunmapi() the
 		 * remote value.
 		 */
-		error = xfs_attr3_leaf_setflag(args);
+		error = xfs_attr3_leaf_setflag(args, roll_trans);
 		if (error)
 			goto out;
-		error = xfs_attr_rmtval_remove(args);
+		error = xfs_attr_rmtval_remove(args, roll_trans);
 		if (error)
 			goto out;
 
@@ -1139,15 +1175,19 @@ xfs_attr_node_removename(
 		error = xfs_da3_join(state);
 		if (error)
 			goto out;
-		error = xfs_defer_finish(&args->trans);
-		if (error)
-			goto out;
-		/*
-		 * Commit the Btree join operation and start a new trans.
-		 */
-		error = xfs_trans_roll_inode(&args->trans, dp);
-		if (error)
-			goto out;
+
+		if (roll_trans) {
+			error = xfs_defer_finish(&args->trans);
+			if (error)
+				goto out;
+			/*
+			 * Commit the Btree join operation and start
+			 * a new trans.
+			 */
+			error = xfs_trans_roll_inode(&args->trans, dp);
+			if (error)
+				goto out;
+		}
 	}
 
 	/*
@@ -1170,9 +1210,12 @@ xfs_attr_node_removename(
 			/* bp is gone due to xfs_da_shrink_inode */
 			if (error)
 				goto out;
-			error = xfs_defer_finish(&args->trans);
-			if (error)
-				goto out;
+
+			if (roll_trans) {
+				error = xfs_defer_finish(&args->trans);
+				if (error)
+					goto out;
+			}
 		} else
 			xfs_trans_brelse(args->trans, bp);
 	}
diff --git a/fs/xfs/libxfs/xfs_attr.h b/fs/xfs/libxfs/xfs_attr.h
index 52f63dc..f0e91bf 100644
--- a/fs/xfs/libxfs/xfs_attr.h
+++ b/fs/xfs/libxfs/xfs_attr.h
@@ -142,10 +142,11 @@ int xfs_attr_get(struct xfs_inode *ip, const unsigned char *name,
 int xfs_attr_set(struct xfs_inode *dp, const unsigned char *name,
 		 size_t namelen, unsigned char *value, int valuelen,
 		 int flags);
-int xfs_attr_set_args(struct xfs_da_args *args, struct xfs_buf **leaf_bp);
+int xfs_attr_set_args(struct xfs_da_args *args, struct xfs_buf **leaf_bp,
+		 bool roll_trans);
 int xfs_attr_remove(struct xfs_inode *dp, const unsigned char *name,
 		    size_t namelen, int flags);
-int xfs_attr_remove_args(struct xfs_da_args *args);
+int xfs_attr_remove_args(struct xfs_da_args *args, bool roll_trans);
 int xfs_attr_list(struct xfs_inode *dp, char *buffer, int bufsize,
 		  int flags, struct attrlist_cursor_kern *cursor);
 bool xfs_attr_namecheck(const void *name, size_t length);
diff --git a/fs/xfs/libxfs/xfs_attr_leaf.c b/fs/xfs/libxfs/xfs_attr_leaf.c
index 1f6e396..128bfe9 100644
--- a/fs/xfs/libxfs/xfs_attr_leaf.c
+++ b/fs/xfs/libxfs/xfs_attr_leaf.c
@@ -2637,7 +2637,8 @@ xfs_attr_leaf_newentsize(
  */
 int
 xfs_attr3_leaf_clearflag(
-	struct xfs_da_args	*args)
+	struct xfs_da_args	*args,
+	bool			roll_trans)
 {
 	struct xfs_attr_leafblock *leaf;
 	struct xfs_attr_leaf_entry *entry;
@@ -2698,7 +2699,9 @@ xfs_attr3_leaf_clearflag(
 	/*
 	 * Commit the flag value change and start the next trans in series.
 	 */
-	return xfs_trans_roll_inode(&args->trans, args->dp);
+	if (roll_trans)
+		error = xfs_trans_roll_inode(&args->trans, args->dp);
+	return error;
 }
 
 /*
@@ -2706,7 +2709,8 @@ xfs_attr3_leaf_clearflag(
  */
 int
 xfs_attr3_leaf_setflag(
-	struct xfs_da_args	*args)
+	struct xfs_da_args	*args,
+	bool			roll_trans)
 {
 	struct xfs_attr_leafblock *leaf;
 	struct xfs_attr_leaf_entry *entry;
@@ -2749,7 +2753,9 @@ xfs_attr3_leaf_setflag(
 	/*
 	 * Commit the flag value change and start the next trans in series.
 	 */
-	return xfs_trans_roll_inode(&args->trans, args->dp);
+	if (roll_trans)
+		error = xfs_trans_roll_inode(&args->trans, args->dp);
+	return error;
 }
 
 /*
@@ -2761,7 +2767,8 @@ xfs_attr3_leaf_setflag(
  */
 int
 xfs_attr3_leaf_flipflags(
-	struct xfs_da_args	*args)
+	struct xfs_da_args	*args,
+	bool			roll_trans)
 {
 	struct xfs_attr_leafblock *leaf1;
 	struct xfs_attr_leafblock *leaf2;
@@ -2867,7 +2874,8 @@ xfs_attr3_leaf_flipflags(
 	/*
 	 * Commit the flag value change and start the next trans in series.
 	 */
-	error = xfs_trans_roll_inode(&args->trans, args->dp);
+	if (roll_trans)
+		error = xfs_trans_roll_inode(&args->trans, args->dp);
 
 	return error;
 }
diff --git a/fs/xfs/libxfs/xfs_attr_leaf.h b/fs/xfs/libxfs/xfs_attr_leaf.h
index 7b74e18..9d830ec 100644
--- a/fs/xfs/libxfs/xfs_attr_leaf.h
+++ b/fs/xfs/libxfs/xfs_attr_leaf.h
@@ -49,10 +49,10 @@ void	xfs_attr_fork_remove(struct xfs_inode *ip, struct xfs_trans *tp);
  */
 int	xfs_attr3_leaf_to_node(struct xfs_da_args *args);
 int	xfs_attr3_leaf_to_shortform(struct xfs_buf *bp,
-				   struct xfs_da_args *args, int forkoff);
-int	xfs_attr3_leaf_clearflag(struct xfs_da_args *args);
-int	xfs_attr3_leaf_setflag(struct xfs_da_args *args);
-int	xfs_attr3_leaf_flipflags(struct xfs_da_args *args);
+			struct xfs_da_args *args, int forkoff);
+int	xfs_attr3_leaf_clearflag(struct xfs_da_args *args, bool roll_trans);
+int	xfs_attr3_leaf_setflag(struct xfs_da_args *args, bool roll_trans);
+int	xfs_attr3_leaf_flipflags(struct xfs_da_args *args, bool roll_trans);
 
 /*
  * Routines used for growing the Btree.
diff --git a/fs/xfs/libxfs/xfs_attr_remote.c b/fs/xfs/libxfs/xfs_attr_remote.c
index 65ff600..18fbd22 100644
--- a/fs/xfs/libxfs/xfs_attr_remote.c
+++ b/fs/xfs/libxfs/xfs_attr_remote.c
@@ -435,7 +435,8 @@ xfs_attr_rmtval_get(
  */
 int
 xfs_attr_rmtval_set(
-	struct xfs_da_args	*args)
+	struct xfs_da_args	*args,
+	bool			roll_trans)
 {
 	struct xfs_inode	*dp = args->dp;
 	struct xfs_mount	*mp = dp->i_mount;
@@ -488,9 +489,12 @@ xfs_attr_rmtval_set(
 				  &nmap);
 		if (error)
 			return error;
-		error = xfs_defer_finish(&args->trans);
-		if (error)
-			return error;
+
+		if (roll_trans) {
+			error = xfs_defer_finish(&args->trans);
+			if (error)
+				return error;
+		}
 
 		ASSERT(nmap == 1);
 		ASSERT((map.br_startblock != DELAYSTARTBLOCK) &&
@@ -498,12 +502,14 @@ xfs_attr_rmtval_set(
 		lblkno += map.br_blockcount;
 		blkcnt -= map.br_blockcount;
 
-		/*
-		 * Start the next trans in the chain.
-		 */
-		error = xfs_trans_roll_inode(&args->trans, dp);
-		if (error)
-			return error;
+		if (roll_trans) {
+			/*
+			 * Start the next trans in the chain.
+			 */
+			error = xfs_trans_roll_inode(&args->trans, dp);
+			if (error)
+				return error;
+		}
 	}
 
 	/*
@@ -563,7 +569,8 @@ xfs_attr_rmtval_set(
  */
 int
 xfs_attr_rmtval_remove(
-	struct xfs_da_args	*args)
+	struct xfs_da_args	*args,
+	bool			roll_trans)
 {
 	struct xfs_mount	*mp = args->dp->i_mount;
 	xfs_dablk_t		lblkno;
@@ -625,16 +632,19 @@ xfs_attr_rmtval_remove(
 				    XFS_BMAPI_ATTRFORK, 1, &done);
 		if (error)
 			return error;
-		error = xfs_defer_finish(&args->trans);
-		if (error)
-			return error;
 
-		/*
-		 * Close out trans and start the next one in the chain.
-		 */
-		error = xfs_trans_roll_inode(&args->trans, args->dp);
-		if (error)
-			return error;
+		if (roll_trans) {
+			error = xfs_defer_finish(&args->trans);
+			if (error)
+				return error;
+
+			/*
+			 * Close out trans and start the next one in the chain.
+			 */
+			error = xfs_trans_roll_inode(&args->trans, args->dp);
+			if (error)
+				return error;
+		}
 	}
 	return 0;
 }
diff --git a/fs/xfs/libxfs/xfs_attr_remote.h b/fs/xfs/libxfs/xfs_attr_remote.h
index 9d20b66..c7c073d 100644
--- a/fs/xfs/libxfs/xfs_attr_remote.h
+++ b/fs/xfs/libxfs/xfs_attr_remote.h
@@ -9,7 +9,7 @@
 int xfs_attr3_rmt_blocks(struct xfs_mount *mp, int attrlen);
 
 int xfs_attr_rmtval_get(struct xfs_da_args *args);
-int xfs_attr_rmtval_set(struct xfs_da_args *args);
-int xfs_attr_rmtval_remove(struct xfs_da_args *args);
+int xfs_attr_rmtval_set(struct xfs_da_args *args, bool roll_trans);
+int xfs_attr_rmtval_remove(struct xfs_da_args *args, bool roll_trans);
 
 #endif /* __XFS_ATTR_REMOTE_H__ */
-- 
2.7.4


* [PATCH 4/9] xfs: Set up infrastructure for deferred attribute operations
  2019-04-12 22:50 [PATCH 0/9] xfs: Delayed Attributes Allison Henderson
                   ` (2 preceding siblings ...)
  2019-04-12 22:50 ` [PATCH 3/9] xfs: Add trans toggle to attr routines Allison Henderson
@ 2019-04-12 22:50 ` Allison Henderson
  2019-04-18 15:48   ` Brian Foster
  2019-04-12 22:50 ` [PATCH 5/9] xfs: Add xfs_attr_set_deferred and xfs_attr_remove_deferred Allison Henderson
                   ` (4 subsequent siblings)
  8 siblings, 1 reply; 48+ messages in thread
From: Allison Henderson @ 2019-04-12 22:50 UTC (permalink / raw)
  To: linux-xfs

This patch adds two new log item types for setting or
removing attributes as deferred operations.  The
xfs_attri_log_item logs an intent to set or remove an
attribute.  The corresponding xfs_attrd_log_item holds
a reference to the xfs_attri_log_item and is freed once
the transaction is done.  The intent is described by an
xfs_attri_log_format structure that contains the inode,
the name and value lengths, the attribute flags, and an
op_flags field that indicates whether the operation is a
set or a remove.  The attribute name and value are
logged in their own regions after the format structure.

Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
---
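The two-reference lifetime described in the xfs_attri_log_item comment
can be modelled in userspace roughly as follows.  This is only a sketch
under stated assumptions: struct demo_attri, demo_attri_init() and
demo_attri_release() are hypothetical stand-ins, C11 atomics replace the
kernel's atomic_t, and nothing here touches the AIL.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical userspace stand-in for struct xfs_attri_log_item. */
struct demo_attri {
	atomic_int	refcount;
	bool		*freed;	/* set when the item is freed; illustration only */
};

/* Mirrors xfs_attri_init(): the intent is born with two references. */
static struct demo_attri *
demo_attri_init(bool *freed)
{
	struct demo_attri *attrip = calloc(1, sizeof(*attrip));

	if (!attrip)
		return NULL;
	atomic_init(&attrip->refcount, 2);
	attrip->freed = freed;
	*freed = false;
	return attrip;
}

/*
 * Mirrors xfs_attri_release(): only the last caller frees the item.
 * Returns true if this call freed it.
 */
static bool
demo_attri_release(struct demo_attri *attrip)
{
	assert(atomic_load(&attrip->refcount) > 0);
	/* atomic_fetch_sub() returns the value held before the decrement. */
	if (atomic_fetch_sub(&attrip->refcount, 1) == 1) {
		*attrip->freed = true;
		free(attrip);
		return true;
	}
	return false;
}
```

In this model the first release stands for the log dropping its
reference at unpin time and the second for the ATTRD commit.  Either
order is fine, since whichever caller drops the last reference frees
the item.
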
 fs/xfs/Makefile                |   2 +
 fs/xfs/libxfs/xfs_attr.c       |   5 +-
 fs/xfs/libxfs/xfs_attr.h       |  25 ++
 fs/xfs/libxfs/xfs_defer.c      |   1 +
 fs/xfs/libxfs/xfs_defer.h      |   3 +
 fs/xfs/libxfs/xfs_log_format.h |  44 +++-
 fs/xfs/libxfs/xfs_types.h      |   1 +
 fs/xfs/xfs_attr_item.c         | 558 +++++++++++++++++++++++++++++++++++++++++
 fs/xfs/xfs_attr_item.h         | 103 ++++++++
 fs/xfs/xfs_log_recover.c       | 172 +++++++++++++
 fs/xfs/xfs_ondisk.h            |   2 +
 fs/xfs/xfs_trans.h             |  10 +
 fs/xfs/xfs_trans_attr.c        | 240 ++++++++++++++++++
 13 files changed, 1162 insertions(+), 4 deletions(-)
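
A note on ATTR_NVEC_SIZE in xfs_attr_item.h: as written, the macro pads
a length that is not a multiple of 4 up to the next multiple, but a
length that is already a multiple of 4 (other than 4 itself) gets four
extra bytes.  The sketch below reproduces the macro under a
hypothetical name (DEMO_ATTR_NVEC_SIZE, with its argument
parenthesized) next to a plain round-up helper (demo_round_up4, also
hypothetical) so the two can be compared.  It is an illustration, not
a proposed replacement.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* The macro as posted in xfs_attr_item.h, with the argument parenthesized. */
#define DEMO_ATTR_NVEC_SIZE(size) \
	((size) == sizeof(int32_t) ? sizeof(int32_t) : \
	 (size) + sizeof(int32_t) - ((size) % sizeof(int32_t)))

/* Hypothetical alternative: round up to the next multiple of 4 bytes. */
static inline size_t
demo_round_up4(size_t size)
{
	return (size + sizeof(int32_t) - 1) & ~(sizeof(int32_t) - 1);
}
```

For a 5-byte name both forms give 8.  For an 8-byte name the macro
gives 12 while the round-up helper gives 8.
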

diff --git a/fs/xfs/Makefile b/fs/xfs/Makefile
index 7f96bda..022e0b4 100644
--- a/fs/xfs/Makefile
+++ b/fs/xfs/Makefile
@@ -97,6 +97,7 @@ xfs-y				+= xfs_log.o \
 				   xfs_bmap_item.o \
 				   xfs_buf_item.o \
 				   xfs_extfree_item.o \
+				   xfs_attr_item.o \
 				   xfs_icreate_item.o \
 				   xfs_inode_item.o \
 				   xfs_refcount_item.o \
@@ -106,6 +107,7 @@ xfs-y				+= xfs_log.o \
 				   xfs_trans_bmap.o \
 				   xfs_trans_buf.o \
 				   xfs_trans_extfree.o \
+				   xfs_trans_attr.o \
 				   xfs_trans_inode.o \
 				   xfs_trans_refcount.o \
 				   xfs_trans_rmap.o \
diff --git a/fs/xfs/libxfs/xfs_attr.c b/fs/xfs/libxfs/xfs_attr.c
index c50bbf6..fadd485 100644
--- a/fs/xfs/libxfs/xfs_attr.c
+++ b/fs/xfs/libxfs/xfs_attr.c
@@ -29,6 +29,7 @@
 #include "xfs_quota.h"
 #include "xfs_trans_space.h"
 #include "xfs_trace.h"
+#include "xfs_attr_item.h"
 
 /*
  * xfs_attr.c
@@ -62,7 +63,7 @@ STATIC int xfs_attr_fillstate(xfs_da_state_t *state);
 STATIC int xfs_attr_refillstate(xfs_da_state_t *state);
 
 
-STATIC int
+int
 xfs_attr_args_init(
 	struct xfs_da_args	*args,
 	struct xfs_inode	*dp,
@@ -160,7 +161,7 @@ xfs_attr_get(
 /*
  * Calculate how many blocks we need for the new attribute,
  */
-STATIC int
+int
 xfs_attr_calc_size(
 	struct xfs_da_args	*args,
 	int			*local)
diff --git a/fs/xfs/libxfs/xfs_attr.h b/fs/xfs/libxfs/xfs_attr.h
index f0e91bf..92d9a15 100644
--- a/fs/xfs/libxfs/xfs_attr.h
+++ b/fs/xfs/libxfs/xfs_attr.h
@@ -78,6 +78,28 @@ typedef struct attrlist_ent {	/* data from attr_list() */
 } attrlist_ent_t;
 
 /*
+ * List of attrs to commit later.
+ */
+struct xfs_attr_item {
+	struct xfs_inode  *xattri_ip;
+	uint32_t	  xattri_op_flags;
+	void		  *xattri_value;      /* attr value */
+	uint32_t	  xattri_value_len;   /* length of value */
+	void		  *xattri_name;	      /* attr name */
+	uint32_t	  xattri_name_len;    /* length of name */
+	uint32_t	  xattri_flags;       /* attr flags */
+	struct list_head  xattri_list;
+
+	/*
+	 * A byte array follows the header containing the file name and
+	 * attribute value.
+	 */
+};
+
+#define XFS_ATTR_ITEM_SIZEOF(namelen, valuelen)	\
+	(sizeof(struct xfs_attr_item) + (namelen) + (valuelen))
+
+/*
  * Given a pointer to the (char*) buffer containing the attr_list() result,
  * and an index, return a pointer to the indicated attribute in the buffer.
  */
@@ -150,5 +172,8 @@ int xfs_attr_remove_args(struct xfs_da_args *args, bool roll_trans);
 int xfs_attr_list(struct xfs_inode *dp, char *buffer, int bufsize,
 		  int flags, struct attrlist_cursor_kern *cursor);
 bool xfs_attr_namecheck(const void *name, size_t length);
+int xfs_attr_args_init(struct xfs_da_args *args, struct xfs_inode *dp,
+			const unsigned char *name, size_t namelen, int flags);
+int xfs_attr_calc_size(struct xfs_da_args *args, int *local);
 
 #endif	/* __XFS_ATTR_H__ */
diff --git a/fs/xfs/libxfs/xfs_defer.c b/fs/xfs/libxfs/xfs_defer.c
index 94f0042..fb444bd 100644
--- a/fs/xfs/libxfs/xfs_defer.c
+++ b/fs/xfs/libxfs/xfs_defer.c
@@ -178,6 +178,7 @@ static const struct xfs_defer_op_type *defer_op_types[] = {
 	[XFS_DEFER_OPS_TYPE_RMAP]	= &xfs_rmap_update_defer_type,
 	[XFS_DEFER_OPS_TYPE_FREE]	= &xfs_extent_free_defer_type,
 	[XFS_DEFER_OPS_TYPE_AGFL_FREE]	= &xfs_agfl_free_defer_type,
+	[XFS_DEFER_OPS_TYPE_ATTR]	= &xfs_attr_defer_type,
 };
 
 /*
diff --git a/fs/xfs/libxfs/xfs_defer.h b/fs/xfs/libxfs/xfs_defer.h
index 7c28d76..b9ff7b9 100644
--- a/fs/xfs/libxfs/xfs_defer.h
+++ b/fs/xfs/libxfs/xfs_defer.h
@@ -17,6 +17,7 @@ enum xfs_defer_ops_type {
 	XFS_DEFER_OPS_TYPE_RMAP,
 	XFS_DEFER_OPS_TYPE_FREE,
 	XFS_DEFER_OPS_TYPE_AGFL_FREE,
+	XFS_DEFER_OPS_TYPE_ATTR,
 	XFS_DEFER_OPS_TYPE_MAX,
 };
 
@@ -60,5 +61,7 @@ extern const struct xfs_defer_op_type xfs_refcount_update_defer_type;
 extern const struct xfs_defer_op_type xfs_rmap_update_defer_type;
 extern const struct xfs_defer_op_type xfs_extent_free_defer_type;
 extern const struct xfs_defer_op_type xfs_agfl_free_defer_type;
+extern const struct xfs_defer_op_type xfs_attr_defer_type;
+
 
 #endif /* __XFS_DEFER_H__ */
diff --git a/fs/xfs/libxfs/xfs_log_format.h b/fs/xfs/libxfs/xfs_log_format.h
index e5f97c6..76d42e6 100644
--- a/fs/xfs/libxfs/xfs_log_format.h
+++ b/fs/xfs/libxfs/xfs_log_format.h
@@ -117,7 +117,12 @@ struct xfs_unmount_log_format {
 #define XLOG_REG_TYPE_CUD_FORMAT	24
 #define XLOG_REG_TYPE_BUI_FORMAT	25
 #define XLOG_REG_TYPE_BUD_FORMAT	26
-#define XLOG_REG_TYPE_MAX		26
+#define XLOG_REG_TYPE_ATTRI_FORMAT	27
+#define XLOG_REG_TYPE_ATTRD_FORMAT	28
+#define XLOG_REG_TYPE_ATTR_NAME	29
+#define XLOG_REG_TYPE_ATTR_VALUE	30
+#define XLOG_REG_TYPE_MAX		31
+
 
 /*
  * Flags to log operation header
@@ -240,6 +245,8 @@ typedef struct xfs_trans_header {
 #define	XFS_LI_CUD		0x1243
 #define	XFS_LI_BUI		0x1244	/* bmbt update intent */
 #define	XFS_LI_BUD		0x1245
+#define	XFS_LI_ATTRI		0x1246  /* attr set/remove intent */
+#define	XFS_LI_ATTRD		0x1247  /* attr set/remove done */
 
 #define XFS_LI_TYPE_DESC \
 	{ XFS_LI_EFI,		"XFS_LI_EFI" }, \
@@ -255,7 +262,9 @@ typedef struct xfs_trans_header {
 	{ XFS_LI_CUI,		"XFS_LI_CUI" }, \
 	{ XFS_LI_CUD,		"XFS_LI_CUD" }, \
 	{ XFS_LI_BUI,		"XFS_LI_BUI" }, \
-	{ XFS_LI_BUD,		"XFS_LI_BUD" }
+	{ XFS_LI_BUD,		"XFS_LI_BUD" }, \
+	{ XFS_LI_ATTRI,		"XFS_LI_ATTRI" }, \
+	{ XFS_LI_ATTRD,		"XFS_LI_ATTRD" }
 
 /*
  * Inode Log Item Format definitions.
@@ -853,4 +862,35 @@ struct xfs_icreate_log {
 	__be32		icl_gen;	/* inode generation number to use */
 };
 
+/*
+ * Flags for deferred attribute operations.
+ * Upper bits are flags, lower byte is type code
+ */
+#define XFS_ATTR_OP_FLAGS_SET		1	/* Set the attribute */
+#define XFS_ATTR_OP_FLAGS_REMOVE	2	/* Remove the attribute */
+#define XFS_ATTR_OP_FLAGS_TYPE_MASK	0x0FF	/* Flags type mask */
+
+/*
+ * This is the structure used to lay out an attr log item in the
+ * log.
+ */
+struct xfs_attri_log_format {
+	uint16_t	alfi_type;	/* attri log item type */
+	uint16_t	alfi_size;	/* size of this item */
+	uint32_t	__pad;		/* pad to 64 bit aligned */
+	uint64_t	alfi_id;	/* attri identifier */
+	xfs_ino_t       alfi_ino;	/* the inode for this attr operation */
+	uint32_t        alfi_op_flags;	/* marks the op as a set or remove */
+	uint32_t        alfi_name_len;	/* attr name length */
+	uint32_t        alfi_value_len;	/* attr value length */
+	uint32_t        alfi_attr_flags;/* attr flags */
+};
+
+struct xfs_attrd_log_format {
+	uint16_t	alfd_type;	/* attrd log item type */
+	uint16_t	alfd_size;	/* size of this item */
+	uint32_t	__pad;		/* pad to 64 bit aligned */
+	uint64_t	alfd_alf_id;	/* id of corresponding attri */
+};
+
 #endif /* __XFS_LOG_FORMAT_H__ */
diff --git a/fs/xfs/libxfs/xfs_types.h b/fs/xfs/libxfs/xfs_types.h
index c5a2540..15e928a 100644
--- a/fs/xfs/libxfs/xfs_types.h
+++ b/fs/xfs/libxfs/xfs_types.h
@@ -11,6 +11,7 @@ typedef uint32_t	prid_t;		/* project ID */
 typedef uint32_t	xfs_agblock_t;	/* blockno in alloc. group */
 typedef uint32_t	xfs_agino_t;	/* inode # within allocation grp */
 typedef uint32_t	xfs_extlen_t;	/* extent length in blocks */
+typedef uint32_t	xfs_attrlen_t;	/* attr length */
 typedef uint32_t	xfs_agnumber_t;	/* allocation group number */
 typedef int32_t		xfs_extnum_t;	/* # of extents in a file */
 typedef int16_t		xfs_aextnum_t;	/* # extents in an attribute fork */
diff --git a/fs/xfs/xfs_attr_item.c b/fs/xfs/xfs_attr_item.c
new file mode 100644
index 0000000..0ea19b4
--- /dev/null
+++ b/fs/xfs/xfs_attr_item.c
@@ -0,0 +1,558 @@
+// SPDX-License-Identifier: GPL-2.0+
+/*
+ * Copyright (C) 2019 Oracle.  All Rights Reserved.
+ * Author: Allison Henderson <allison.henderson@oracle.com>
+ */
+#include "xfs.h"
+#include "xfs_fs.h"
+#include "xfs_format.h"
+#include "xfs_log_format.h"
+#include "xfs_trans_resv.h"
+#include "xfs_bit.h"
+#include "xfs_mount.h"
+#include "xfs_trans.h"
+#include "xfs_trans_priv.h"
+#include "xfs_buf_item.h"
+#include "xfs_attr_item.h"
+#include "xfs_log.h"
+#include "xfs_btree.h"
+#include "xfs_rmap.h"
+#include "xfs_inode.h"
+#include "xfs_icache.h"
+#include "xfs_attr.h"
+#include "xfs_shared.h"
+#include "xfs_da_format.h"
+#include "xfs_da_btree.h"
+
+static inline struct xfs_attri_log_item *ATTRI_ITEM(struct xfs_log_item *lip)
+{
+	return container_of(lip, struct xfs_attri_log_item, item);
+}
+
+void
+xfs_attri_item_free(
+	struct xfs_attri_log_item	*attrip)
+{
+	kmem_free(attrip->item.li_lv_shadow);
+	kmem_free(attrip);
+}
+
+/*
+ * This returns the size in bytes of the attri log format structure, which is
+ * logged in the first iovec of an attri item.  The attr name and value, if
+ * present, are sized separately in xfs_attri_item_size().
+ */
+static inline int
+xfs_attri_item_sizeof(
+	struct xfs_attri_log_item *attrip)
+{
+	return sizeof(struct xfs_attri_log_format);
+}
+
+STATIC void
+xfs_attri_item_size(
+	struct xfs_log_item	*lip,
+	int			*nvecs,
+	int			*nbytes)
+{
+	struct xfs_attri_log_item       *attrip = ATTRI_ITEM(lip);
+
+	*nvecs += 1;
+	*nbytes += xfs_attri_item_sizeof(attrip);
+
+	if (attrip->name_len > 0) {
+		*nvecs += 1;
+		*nbytes += ATTR_NVEC_SIZE(attrip->name_len);
+	}
+
+	if (attrip->value_len > 0) {
+		*nvecs += 1;
+		*nbytes += ATTR_NVEC_SIZE(attrip->value_len);
+	}
+}
+
+/*
+ * This is called to fill in the vector of log iovecs for the
+ * given attri log item. The first iovec points at the
+ * attri_log_format structure embedded in the attri item.  The
+ * attr name and value, when present, are each copied into an
+ * additional iovec.
+ */
+STATIC void
+xfs_attri_item_format(
+	struct xfs_log_item	*lip,
+	struct xfs_log_vec	*lv)
+{
+	struct xfs_attri_log_item	*attrip = ATTRI_ITEM(lip);
+	struct xfs_log_iovec	*vecp = NULL;
+
+	attrip->format.alfi_type = XFS_LI_ATTRI;
+	attrip->format.alfi_size = 1;
+	if (attrip->name_len > 0)
+		attrip->format.alfi_size++;
+	if (attrip->value_len > 0)
+		attrip->format.alfi_size++;
+
+	xlog_copy_iovec(lv, &vecp, XLOG_REG_TYPE_ATTRI_FORMAT,
+			&attrip->format,
+			xfs_attri_item_sizeof(attrip));
+	if (attrip->name_len > 0)
+		xlog_copy_iovec(lv, &vecp, XLOG_REG_TYPE_ATTR_NAME,
+				attrip->name, ATTR_NVEC_SIZE(attrip->name_len));
+
+	if (attrip->value_len > 0)
+		xlog_copy_iovec(lv, &vecp, XLOG_REG_TYPE_ATTR_VALUE,
+				attrip->value,
+				ATTR_NVEC_SIZE(attrip->value_len));
+}
+
+
+/*
+ * Pinning has no meaning for an attri item, so just return.
+ */
+STATIC void
+xfs_attri_item_pin(
+	struct xfs_log_item	*lip)
+{
+}
+
+/*
+ * The unpin operation is the last place an ATTRI is manipulated in the log. It
+ * is either inserted in the AIL or aborted in the event of a log I/O error. In
+ * either case, the ATTRI transaction has been successfully committed to make it
+ * this far. Therefore, we expect whoever committed the ATTRI to either
+ * construct and commit the ATTRD or drop the ATTRD's reference in the event of
+ * error. Simply drop the log's ATTRI reference now that the log is done with
+ * it.
+ */
+STATIC void
+xfs_attri_item_unpin(
+	struct xfs_log_item	*lip,
+	int			remove)
+{
+	struct xfs_attri_log_item	*attrip = ATTRI_ITEM(lip);
+
+	xfs_attri_release(attrip);
+}
+
+/*
+ * attri items have no locking or pushing.  However, since ATTRIs are pulled
+ * from the AIL when their corresponding ATTRDs are committed to disk, their
+ * situation is very similar to being pinned.  Return XFS_ITEM_PINNED so that
+ * the caller will eventually flush the log.  This should help in getting the
+ * ATTRI out of the AIL.
+ */
+STATIC uint
+xfs_attri_item_push(
+	struct xfs_log_item	*lip,
+	struct list_head	*buffer_list)
+{
+	return XFS_ITEM_PINNED;
+}
+
+/*
+ * The ATTRI has been either committed or aborted if the transaction has been
+ * cancelled. If the transaction was cancelled, an ATTRD isn't going to be
+ * constructed and thus we free the ATTRI here directly.
+ */
+STATIC void
+xfs_attri_item_unlock(
+	struct xfs_log_item	*lip)
+{
+	if (test_bit(XFS_LI_ABORTED, &lip->li_flags))
+		xfs_attri_release(ATTRI_ITEM(lip));
+}
+
+/*
+ * The ATTRI is logged only once and cannot be moved in the log, so simply
+ * return the lsn at which it's been logged.
+ */
+STATIC xfs_lsn_t
+xfs_attri_item_committed(
+	struct xfs_log_item	*lip,
+	xfs_lsn_t		lsn)
+{
+	return lsn;
+}
+
+STATIC void
+xfs_attri_item_committing(
+	struct xfs_log_item	*lip,
+	xfs_lsn_t		lsn)
+{
+}
+
+/*
+ * This is the ops vector shared by all attri log items.
+ */
+static const struct xfs_item_ops xfs_attri_item_ops = {
+	.iop_size	= xfs_attri_item_size,
+	.iop_format	= xfs_attri_item_format,
+	.iop_pin	= xfs_attri_item_pin,
+	.iop_unpin	= xfs_attri_item_unpin,
+	.iop_unlock	= xfs_attri_item_unlock,
+	.iop_committed	= xfs_attri_item_committed,
+	.iop_push	= xfs_attri_item_push,
+	.iop_committing = xfs_attri_item_committing
+};
+
+
+/*
+ * Allocate and initialize an attri item
+ */
+struct xfs_attri_log_item *
+xfs_attri_init(
+	struct xfs_mount	*mp)
+
+{
+	struct xfs_attri_log_item	*attrip;
+	uint			size;
+
+	size = (uint)(sizeof(struct xfs_attri_log_item));
+	attrip = kmem_zalloc(size, KM_SLEEP);
+
+	xfs_log_item_init(mp, &(attrip->item), XFS_LI_ATTRI,
+			  &xfs_attri_item_ops);
+	attrip->format.alfi_id = (uintptr_t)(void *)attrip;
+	atomic_set(&attrip->refcount, 2);
+
+	return attrip;
+}
+
+/*
+ * Copy an attr format buffer from the given buf, and into the destination
+ * attr format structure.
+ */
+int
+xfs_attri_copy_format(struct xfs_log_iovec *buf,
+		      struct xfs_attri_log_format *dst_attr_fmt)
+{
+	struct xfs_attri_log_format *src_attr_fmt = buf->i_addr;
+	uint len = sizeof(struct xfs_attri_log_format);
+
+	if (buf->i_len == len) {
+		memcpy((char *)dst_attr_fmt, (char *)src_attr_fmt, len);
+		return 0;
+	}
+	return -EFSCORRUPTED;
+}
+
+/*
+ * Copy an attr format buffer from the given buf, and into the destination
+ * attr format structure.
+ */
+int
+xfs_attrd_copy_format(struct xfs_log_iovec *buf,
+		      struct xfs_attrd_log_format *dst_attr_fmt)
+{
+	struct xfs_attrd_log_format *src_attr_fmt = buf->i_addr;
+	uint len = sizeof(struct xfs_attrd_log_format);
+
+	if (buf->i_len == len) {
+		memcpy((char *)dst_attr_fmt, (char *)src_attr_fmt, len);
+		return 0;
+	}
+	return -EFSCORRUPTED;
+}
+
+/*
+ * Freeing the attrip requires that we remove it from the AIL if it has already
+ * been placed there. However, the ATTRI may not yet have been placed in the
+ * AIL when called by xfs_attri_release() from ATTRD processing due to the
+ * ordering of committed vs unpin operations in bulk insert operations. Hence
+ * the reference count to ensure only the last caller frees the ATTRI.
+ */
+void
+xfs_attri_release(
+	struct xfs_attri_log_item	*attrip)
+{
+	ASSERT(atomic_read(&attrip->refcount) > 0);
+	if (atomic_dec_and_test(&attrip->refcount)) {
+		xfs_trans_ail_remove(&attrip->item, SHUTDOWN_LOG_IO_ERROR);
+		xfs_attri_item_free(attrip);
+	}
+}
+
+static inline struct xfs_attrd_log_item *ATTRD_ITEM(struct xfs_log_item *lip)
+{
+	return container_of(lip, struct xfs_attrd_log_item, item);
+}
+
+STATIC void
+xfs_attrd_item_free(struct xfs_attrd_log_item *attrdp)
+{
+	kmem_free(attrdp->item.li_lv_shadow);
+	kmem_free(attrdp);
+}
+
+/*
+ * This returns the size in bytes of the attrd log format structure.  We only
+ * need 1 iovec for an attrd item.  It just logs the attrd_log_format
+ * structure.
+ */
+static inline int
+xfs_attrd_item_sizeof(
+	struct xfs_attrd_log_item *attrdp)
+{
+	return sizeof(struct xfs_attrd_log_format);
+}
+
+STATIC void
+xfs_attrd_item_size(
+	struct xfs_log_item	*lip,
+	int			*nvecs,
+	int			*nbytes)
+{
+	struct xfs_attrd_log_item	*attrdp = ATTRD_ITEM(lip);
+	*nvecs += 1;
+	*nbytes += xfs_attrd_item_sizeof(attrdp);
+}
+
+/*
+ * This is called to fill in the vector of log iovecs for the
+ * given attrd log item. We use only 1 iovec, and we point that
+ * at the attrd_log_format structure embedded in the attrd item.
+ * The format's type and size fields are filled in here before
+ * the structure is copied.
+ */
+STATIC void
+xfs_attrd_item_format(
+	struct xfs_log_item	*lip,
+	struct xfs_log_vec	*lv)
+{
+	struct xfs_attrd_log_item	*attrdp = ATTRD_ITEM(lip);
+	struct xfs_log_iovec	*vecp = NULL;
+
+	attrdp->format.alfd_type = XFS_LI_ATTRD;
+	attrdp->format.alfd_size = 1;
+
+	xlog_copy_iovec(lv, &vecp, XLOG_REG_TYPE_ATTRD_FORMAT,
+			&attrdp->format,
+			xfs_attrd_item_sizeof(attrdp));
+}
+
+/*
+ * Pinning has no meaning for an attrd item, so just return.
+ */
+STATIC void
+xfs_attrd_item_pin(
+	struct xfs_log_item	*lip)
+{
+}
+
+/*
+ * Since pinning has no meaning for an attrd item, unpinning does
+ * not either.
+ */
+STATIC void
+xfs_attrd_item_unpin(
+	struct xfs_log_item	*lip,
+	int			remove)
+{
+}
+
+/*
+ * There isn't much you can do to push on an attrd item.  It is simply stuck
+ * waiting for the log to be flushed to disk.
+ */
+STATIC uint
+xfs_attrd_item_push(
+	struct xfs_log_item	*lip,
+	struct list_head	*buffer_list)
+{
+	return XFS_ITEM_PINNED;
+}
+
+/*
+ * The ATTRD is either committed or aborted if the transaction is cancelled. If
+ * the transaction is cancelled, drop our reference to the ATTRI and free the
+ * ATTRD.
+ */
+STATIC void
+xfs_attrd_item_unlock(
+	struct xfs_log_item	*lip)
+{
+	struct xfs_attrd_log_item	*attrdp = ATTRD_ITEM(lip);
+
+	if (test_bit(XFS_LI_ABORTED, &lip->li_flags)) {
+		xfs_attri_release(attrdp->attrip);
+		xfs_attrd_item_free(attrdp);
+	}
+}
+
+/*
+ * When the attrd item is committed to disk, all we need to do is delete our
+ * reference to our partner attri item and then free ourselves. Since we're
+ * freeing ourselves we must return -1 to keep the transaction code from
+ * further referencing this item.
+ */
+STATIC xfs_lsn_t
+xfs_attrd_item_committed(
+	struct xfs_log_item	*lip,
+	xfs_lsn_t		lsn)
+{
+	struct xfs_attrd_log_item	*attrdp = ATTRD_ITEM(lip);
+
+	/*
+	 * Drop the ATTRI reference regardless of whether the ATTRD has been
+	 * aborted. Once the ATTRD transaction is constructed, it is the sole
+	 * responsibility of the ATTRD to release the ATTRI (even if the ATTRI
+	 * is aborted due to log I/O error).
+	 */
+	xfs_attri_release(attrdp->attrip);
+	xfs_attrd_item_free(attrdp);
+
+	return (xfs_lsn_t)-1;
+}
+
+STATIC void
+xfs_attrd_item_committing(
+	struct xfs_log_item	*lip,
+	xfs_lsn_t		lsn)
+{
+}
+
+/*
+ * This is the ops vector shared by all attrd log items.
+ */
+static const struct xfs_item_ops xfs_attrd_item_ops = {
+	.iop_size	= xfs_attrd_item_size,
+	.iop_format	= xfs_attrd_item_format,
+	.iop_pin	= xfs_attrd_item_pin,
+	.iop_unpin	= xfs_attrd_item_unpin,
+	.iop_unlock	= xfs_attrd_item_unlock,
+	.iop_committed	= xfs_attrd_item_committed,
+	.iop_push	= xfs_attrd_item_push,
+	.iop_committing = xfs_attrd_item_committing
+};
+
+/*
+ * Allocate and initialize an attrd item
+ */
+struct xfs_attrd_log_item *
+xfs_attrd_init(
+	struct xfs_mount	*mp,
+	struct xfs_attri_log_item	*attrip)
+
+{
+	struct xfs_attrd_log_item	*attrdp;
+	uint			size;
+
+	size = (uint)(sizeof(struct xfs_attrd_log_item));
+	attrdp = kmem_zalloc(size, KM_SLEEP);
+
+	xfs_log_item_init(mp, &attrdp->item, XFS_LI_ATTRD,
+			  &xfs_attrd_item_ops);
+	attrdp->attrip = attrip;
+	attrdp->format.alfd_alf_id = attrip->format.alfi_id;
+
+	return attrdp;
+}
+
+/*
+ * Process an attr intent item that was recovered from
+ * the log.  We need to set or remove the attr that it describes.
+ */
+int
+xfs_attri_recover(
+	struct xfs_mount		*mp,
+	struct xfs_attri_log_item	*attrip)
+{
+	struct xfs_inode		*ip;
+	struct xfs_attrd_log_item	*attrdp;
+	struct xfs_da_args		args;
+	struct xfs_attri_log_format	*attrp;
+	struct xfs_trans_res		tres;
+	int				local;
+	int				error = 0;
+	int				rsvd = 0;
+
+	ASSERT(!test_bit(XFS_ATTRI_RECOVERED, &attrip->flags));
+
+	/*
+	 * First check the validity of the attr described by the
+	 * ATTRI.  If any are bad, then assume that all are bad and
+	 * just toss the ATTRI.
+	 */
+	attrp = &attrip->format;
+	if (
+	    /*
+	     * Must have either XFS_ATTR_OP_FLAGS_SET or
+	     * XFS_ATTR_OP_FLAGS_REMOVE set
+	     */
+	    !(attrp->alfi_op_flags == XFS_ATTR_OP_FLAGS_SET ||
+		attrp->alfi_op_flags == XFS_ATTR_OP_FLAGS_REMOVE) ||
+
+	    /* Check size of value and name lengths */
+	    (attrp->alfi_value_len > XATTR_SIZE_MAX ||
+		attrp->alfi_name_len > XATTR_NAME_MAX) ||
+
+	    /*
+	     * If the XFS_ATTR_OP_FLAGS_SET flag is set,
+	     * there must also be a name and value
+	     */
+	    (attrp->alfi_op_flags == XFS_ATTR_OP_FLAGS_SET &&
+		(attrp->alfi_value_len == 0 || attrp->alfi_name_len == 0)) ||
+
+	    /*
+	     * If the XFS_ATTR_OP_FLAGS_REMOVE flag is set,
+	     * there must also be a name
+	     */
+	    (attrp->alfi_op_flags == XFS_ATTR_OP_FLAGS_REMOVE &&
+		(attrp->alfi_name_len == 0))
+	) {
+		/*
+		 * This will pull the ATTRI from the AIL and
+		 * free the memory associated with it.
+		 */
+		set_bit(XFS_ATTRI_RECOVERED, &attrip->flags);
+		xfs_attri_release(attrip);
+		return -EIO;
+	}
+
+	attrp = &attrip->format;
+	error = xfs_iget(mp, 0, attrp->alfi_ino, 0, 0, &ip);
+	if (error)
+		return error;
+
+	error = xfs_attr_args_init(&args, ip, attrip->name,
+			attrp->alfi_name_len, attrp->alfi_attr_flags);
+	if (error)
+		return error;
+
+	args.hashval = xfs_da_hashname(args.name, args.namelen);
+	args.value = attrip->value;
+	args.valuelen = attrp->alfi_value_len;
+	args.op_flags = XFS_DA_OP_OKNOENT;
+	args.total = xfs_attr_calc_size(&args, &local);
+
+	tres.tr_logres = M_RES(mp)->tr_attrsetm.tr_logres +
+			M_RES(mp)->tr_attrsetrt.tr_logres * args.total;
+	tres.tr_logcount = XFS_ATTRSET_LOG_COUNT;
+	tres.tr_logflags = XFS_TRANS_PERM_LOG_RES;
+
+	error = xfs_trans_alloc(mp, &tres, args.total,  0,
+				rsvd ? XFS_TRANS_RESERVE : 0, &args.trans);
+	if (error)
+		return error;
+	attrdp = xfs_trans_get_attrd(args.trans, attrip);
+
+	xfs_ilock(ip, XFS_ILOCK_EXCL);
+
+	xfs_trans_ijoin(args.trans, ip, 0);
+	error = xfs_trans_attr(&args, attrdp, attrp->alfi_op_flags);
+	if (error)
+		goto abort_error;
+
+
+	set_bit(XFS_ATTRI_RECOVERED, &attrip->flags);
+	xfs_trans_log_inode(args.trans, ip, XFS_ILOG_CORE | XFS_ILOG_ADATA);
+	error = xfs_trans_commit(args.trans);
+	xfs_iunlock(ip, XFS_ILOCK_EXCL);
+	return error;
+
+abort_error:
+	xfs_trans_cancel(args.trans);
+	xfs_iunlock(ip, XFS_ILOCK_EXCL);
+	return error;
+}
diff --git a/fs/xfs/xfs_attr_item.h b/fs/xfs/xfs_attr_item.h
new file mode 100644
index 0000000..fce7515
--- /dev/null
+++ b/fs/xfs/xfs_attr_item.h
@@ -0,0 +1,103 @@
+// SPDX-License-Identifier: GPL-2.0+
+/*
+ * Copyright (C) 2019 Oracle.  All Rights Reserved.
+ * Author: Allison Henderson <allison.henderson@oracle.com>
+ */
+#ifndef	__XFS_ATTR_ITEM_H__
+#define	__XFS_ATTR_ITEM_H__
+
+/* kernel only ATTRI/ATTRD definitions */
+
+struct xfs_mount;
+struct kmem_zone;
+
+/*
+ * Max number of attrs in fast allocation path.
+ */
+#define XFS_ATTRI_MAX_FAST_ATTRS        1
+
+
+/*
+ * Define ATTR flag bits. Manipulated by set/clear/test_bit operators.
+ */
+#define	XFS_ATTRI_RECOVERED	1
+
+
+/* nvecs must be in multiples of 4 */
+#define ATTR_NVEC_SIZE(size) (size == sizeof(int32_t) ? sizeof(int32_t) : \
+				size + sizeof(int32_t) - \
+				(size % sizeof(int32_t)))
+
+/*
+ * This is the "attr intention" log item.  It is used to log the fact
+ * that some attrs need to be processed.  It is used in conjunction with the
+ * "attr done" log item described below.
+ *
+ * The ATTRI is reference counted so that it is not freed prior to both the
+ * ATTRI and ATTRD being committed and unpinned. This ensures the ATTRI is
+ * inserted into the AIL even in the event of out of order ATTRI/ATTRD
+ * processing. In other words, an ATTRI is born with two references:
+ *
+ *      1.) an ATTRI held reference to track ATTRI AIL insertion
+ *      2.) an ATTRD held reference to track ATTRD commit
+ *
+ * On allocation, both references are the responsibility of the caller. Once
+ * the ATTRI is added to and dirtied in a transaction, ownership of reference
+ * one transfers to the transaction. The reference is dropped once the ATTRI is
+ * inserted to the AIL or in the event of failure along the way (e.g., commit
+ * failure, log I/O error, etc.). Note that the caller remains responsible for
+ * the ATTRD reference under all circumstances to this point. The caller has no
+ * means to detect failure once the transaction is committed, however.
+ * Therefore, an ATTRD is required after this point, even in the event of
+ * unrelated failure.
+ *
+ * Once an ATTRD is allocated and dirtied in a transaction, reference two
+ * transfers to the transaction. The ATTRD reference is dropped once it reaches
+ * the unpin handler. Similar to the ATTRI, the reference also drops in the
+ * event of commit failure or log I/O errors. Note that the ATTRD is not
+ * inserted in the AIL, so at this point both the ATTRI and ATTRD are freed.
+ */
+struct xfs_attri_log_item {
+	xfs_log_item_t			item;
+	atomic_t			refcount;
+	unsigned long			flags;	/* misc flags */
+	int				name_len;
+	void				*name;
+	int				value_len;
+	void				*value;
+	struct xfs_attri_log_format	format;
+};
+
+/*
+ * This is the "attr done" log item.  It is used to log
+ * the fact that some attrs earlier mentioned in an attri item
+ * have been set or removed.
+ */
+struct xfs_attrd_log_item {
+	struct xfs_log_item		item;
+	struct xfs_attri_log_item	*attrip;
+	struct xfs_attrd_log_format	format;
+};
+
+/*
+ * Max number of attrs in fast allocation path.
+ */
+#define	XFS_ATTRD_MAX_FAST_ATTRS	1
+
+extern struct kmem_zone	*xfs_attri_zone;
+extern struct kmem_zone	*xfs_attrd_zone;
+
+struct xfs_attri_log_item	*xfs_attri_init(struct xfs_mount *mp);
+struct xfs_attrd_log_item	*xfs_attrd_init(struct xfs_mount *mp,
+					struct xfs_attri_log_item *attrip);
+int xfs_attri_copy_format(struct xfs_log_iovec *buf,
+			   struct xfs_attri_log_format *dst_attri_fmt);
+int xfs_attrd_copy_format(struct xfs_log_iovec *buf,
+			   struct xfs_attrd_log_format *dst_attrd_fmt);
+void			xfs_attri_item_free(struct xfs_attri_log_item *attrip);
+void			xfs_attri_release(struct xfs_attri_log_item *attrip);
+
+int			xfs_attri_recover(struct xfs_mount *mp,
+					struct xfs_attri_log_item *attrip);
+
+#endif	/* __XFS_ATTR_ITEM_H__ */
diff --git a/fs/xfs/xfs_log_recover.c b/fs/xfs/xfs_log_recover.c
index 3371d1f..101ab5e 100644
--- a/fs/xfs/xfs_log_recover.c
+++ b/fs/xfs/xfs_log_recover.c
@@ -22,6 +22,7 @@
 #include "xfs_log_recover.h"
 #include "xfs_inode_item.h"
 #include "xfs_extfree_item.h"
+#include "xfs_attr_item.h"
 #include "xfs_trans_priv.h"
 #include "xfs_alloc.h"
 #include "xfs_ialloc.h"
@@ -1965,6 +1966,8 @@ xlog_recover_reorder_trans(
 		case XFS_LI_CUD:
 		case XFS_LI_BUI:
 		case XFS_LI_BUD:
+		case XFS_LI_ATTRI:
+		case XFS_LI_ATTRD:
 			trace_xfs_log_recover_item_reorder_tail(log,
 							trans, item, pass);
 			list_move_tail(&item->ri_list, &inode_list);
@@ -3504,6 +3507,117 @@ xlog_recover_efd_pass2(
 	return 0;
 }
 
+STATIC int
+xlog_recover_attri_pass2(
+	struct xlog                     *log,
+	struct xlog_recover_item        *item,
+	xfs_lsn_t                       lsn)
+{
+	int                             error;
+	struct xfs_mount                *mp = log->l_mp;
+	struct xfs_attri_log_item       *attrip;
+	struct xfs_attri_log_format     *attri_formatp;
+	char				*name = NULL;
+	char				*value = NULL;
+	int				region = 0;
+
+	attri_formatp = item->ri_buf[region].i_addr;
+
+	attrip = xfs_attri_init(mp);
+	error = xfs_attri_copy_format(&item->ri_buf[region], &attrip->format);
+	if (error) {
+		xfs_attri_item_free(attrip);
+		return error;
+	}
+
+	attrip->name_len = attri_formatp->alfi_name_len;
+	attrip->value_len = attri_formatp->alfi_value_len;
+	attrip = kmem_realloc(attrip, sizeof(struct xfs_attri_log_item) +
+				attrip->name_len + attrip->value_len, KM_SLEEP);
+
+	if (attrip->name_len > 0) {
+		region++;
+		name = ((char *)attrip) + sizeof(struct xfs_attri_log_item);
+		memcpy(name, item->ri_buf[region].i_addr,
+			attrip->name_len);
+		attrip->name = name;
+	}
+
+	if (attrip->value_len > 0) {
+		region++;
+		value = ((char *)attrip) + sizeof(struct xfs_attri_log_item) +
+			attrip->name_len;
+		memcpy(value, item->ri_buf[region].i_addr,
+			attrip->value_len);
+		attrip->value = value;
+	}
+
+	spin_lock(&log->l_ailp->ail_lock);
+	/*
+	 * The ATTRI has two references. One for the ATTRD and one for ATTRI to
+	 * ensure it makes it into the AIL. Insert the ATTRI into the AIL
+	 * directly and drop the ATTRI reference. Note that
+	 * xfs_trans_ail_update() drops the AIL lock.
+	 */
+	xfs_trans_ail_update(log->l_ailp, &attrip->item, lsn);
+	xfs_attri_release(attrip);
+	return 0;
+}
+
+
+/*
+ * This routine is called when an ATTRD format structure is found in a committed
+ * transaction in the log. Its purpose is to cancel the corresponding ATTRI if
+ * it was still in the log. To do this it searches the AIL for the ATTRI with
+ * an id equal to that in the ATTRD format structure. If we find it we drop
+ * the ATTRD reference, which removes the ATTRI from the AIL and frees it.
+ */
+STATIC int
+xlog_recover_attrd_pass2(
+	struct xlog                     *log,
+	struct xlog_recover_item        *item)
+{
+	struct xfs_attrd_log_format	*attrd_formatp;
+	struct xfs_attri_log_item	*attrip = NULL;
+	struct xfs_log_item		*lip;
+	uint64_t			attri_id;
+	struct xfs_ail_cursor		cur;
+	struct xfs_ail			*ailp = log->l_ailp;
+
+	attrd_formatp = item->ri_buf[0].i_addr;
+	ASSERT((item->ri_buf[0].i_len ==
+				(sizeof(struct xfs_attrd_log_format))));
+	attri_id = attrd_formatp->alfd_alf_id;
+
+	/*
+	 * Search for the ATTRI with the id in the ATTRD format structure in the
+	 * AIL.
+	 */
+	spin_lock(&ailp->ail_lock);
+	lip = xfs_trans_ail_cursor_first(ailp, &cur, 0);
+	while (lip != NULL) {
+		if (lip->li_type == XFS_LI_ATTRI) {
+			attrip = (struct xfs_attri_log_item *)lip;
+			if (attrip->format.alfi_id == attri_id) {
+				/*
+				 * Drop the ATTRD reference to the ATTRI. This
+				 * removes the ATTRI from the AIL and frees it.
+				 */
+				spin_unlock(&ailp->ail_lock);
+				xfs_attri_release(attrip);
+				spin_lock(&ailp->ail_lock);
+				break;
+			}
+		}
+		lip = xfs_trans_ail_cursor_next(ailp, &cur);
+	}
+
+	xfs_trans_ail_cursor_done(&cur);
+	spin_unlock(&ailp->ail_lock);
+
+	return 0;
+}
+
 /*
  * This routine is called to create an in-core extent rmap update
  * item from the rui format structure which was logged on disk.
@@ -4055,6 +4169,8 @@ xlog_recover_ra_pass2(
 		break;
 	case XFS_LI_EFI:
 	case XFS_LI_EFD:
+	case XFS_LI_ATTRI:
+	case XFS_LI_ATTRD:
 	case XFS_LI_QUOTAOFF:
 	case XFS_LI_RUI:
 	case XFS_LI_RUD:
@@ -4083,6 +4199,8 @@ xlog_recover_commit_pass1(
 	case XFS_LI_INODE:
 	case XFS_LI_EFI:
 	case XFS_LI_EFD:
+	case XFS_LI_ATTRI:
+	case XFS_LI_ATTRD:
 	case XFS_LI_DQUOT:
 	case XFS_LI_ICREATE:
 	case XFS_LI_RUI:
@@ -4121,6 +4239,10 @@ xlog_recover_commit_pass2(
 		return xlog_recover_efi_pass2(log, item, trans->r_lsn);
 	case XFS_LI_EFD:
 		return xlog_recover_efd_pass2(log, item);
+	case XFS_LI_ATTRI:
+		return xlog_recover_attri_pass2(log, item, trans->r_lsn);
+	case XFS_LI_ATTRD:
+		return xlog_recover_attrd_pass2(log, item);
 	case XFS_LI_RUI:
 		return xlog_recover_rui_pass2(log, item, trans->r_lsn);
 	case XFS_LI_RUD:
@@ -4682,6 +4804,48 @@ xlog_recover_cancel_efi(
 	spin_lock(&ailp->ail_lock);
 }
 
+/* Release the ATTRI since we're cancelling everything. */
+STATIC void
+xlog_recover_cancel_attri(
+	struct xfs_mount                *mp,
+	struct xfs_ail                  *ailp,
+	struct xfs_log_item             *lip)
+{
+	struct xfs_attri_log_item         *attrip;
+
+	attrip = container_of(lip, struct xfs_attri_log_item, item);
+
+	spin_unlock(&ailp->ail_lock);
+	xfs_attri_release(attrip);
+	spin_lock(&ailp->ail_lock);
+}
+
+
+/* Recover the ATTRI if necessary. */
+STATIC int
+xlog_recover_process_attri(
+	struct xfs_mount                *mp,
+	struct xfs_ail                  *ailp,
+	struct xfs_log_item             *lip)
+{
+	struct xfs_attri_log_item       *attrip;
+	int                             error;
+
+	/*
+	 * Skip ATTRIs that we've already processed.
+	 */
+	attrip = container_of(lip, struct xfs_attri_log_item, item);
+	if (test_bit(XFS_ATTRI_RECOVERED, &attrip->flags))
+		return 0;
+
+	spin_unlock(&ailp->ail_lock);
+	error = xfs_attri_recover(mp, attrip);
+	spin_lock(&ailp->ail_lock);
+
+	return error;
+}
+
+
 /* Recover the RUI if necessary. */
 STATIC int
 xlog_recover_process_rui(
@@ -4810,6 +4974,7 @@ static inline bool xlog_item_is_intent(struct xfs_log_item *lip)
 	case XFS_LI_RUI:
 	case XFS_LI_CUI:
 	case XFS_LI_BUI:
+	case XFS_LI_ATTRI:
 		return true;
 	default:
 		return false;
@@ -4928,6 +5093,10 @@ xlog_recover_process_intents(
 		case XFS_LI_EFI:
 			error = xlog_recover_process_efi(log->l_mp, ailp, lip);
 			break;
+		case XFS_LI_ATTRI:
+			error = xlog_recover_process_attri(log->l_mp,
+							   ailp, lip);
+			break;
 		case XFS_LI_RUI:
 			error = xlog_recover_process_rui(log->l_mp, ailp, lip);
 			break;
@@ -4994,6 +5163,9 @@ xlog_recover_cancel_intents(
 		case XFS_LI_BUI:
 			xlog_recover_cancel_bui(log->l_mp, ailp, lip);
 			break;
+		case XFS_LI_ATTRI:
+			xlog_recover_cancel_attri(log->l_mp, ailp, lip);
+			break;
 		}
 
 		lip = xfs_trans_ail_cursor_next(ailp, &cur);
diff --git a/fs/xfs/xfs_ondisk.h b/fs/xfs/xfs_ondisk.h
index c8ba98f..ddd04b5 100644
--- a/fs/xfs/xfs_ondisk.h
+++ b/fs/xfs/xfs_ondisk.h
@@ -125,6 +125,8 @@ xfs_check_ondisk_structs(void)
 	XFS_CHECK_STRUCT_SIZE(struct xfs_inode_log_format,	56);
 	XFS_CHECK_STRUCT_SIZE(struct xfs_qoff_logformat,	20);
 	XFS_CHECK_STRUCT_SIZE(struct xfs_trans_header,		16);
+	XFS_CHECK_STRUCT_SIZE(struct xfs_attri_log_format,	40);
+	XFS_CHECK_STRUCT_SIZE(struct xfs_attrd_log_format,	16);
 
 	/*
 	 * The v5 superblock format extended several v4 header structures with
diff --git a/fs/xfs/xfs_trans.h b/fs/xfs/xfs_trans.h
index c6e1c57..7bb9d8e 100644
--- a/fs/xfs/xfs_trans.h
+++ b/fs/xfs/xfs_trans.h
@@ -26,6 +26,9 @@ struct xfs_cui_log_item;
 struct xfs_cud_log_item;
 struct xfs_bui_log_item;
 struct xfs_bud_log_item;
+struct xfs_attrd_log_item;
+struct xfs_attri_log_item;
+struct xfs_da_args;
 
 typedef struct xfs_log_item {
 	struct list_head		li_ail;		/* AIL pointers */
@@ -231,6 +234,13 @@ int		xfs_trans_free_extent(struct xfs_trans *,
 				      xfs_extlen_t,
 				      const struct xfs_owner_info *,
 				      bool);
+struct xfs_attrd_log_item *
+xfs_trans_get_attrd(struct xfs_trans *tp,
+		    struct xfs_attri_log_item *attrip);
+int xfs_trans_attr(struct xfs_da_args *args,
+		   struct xfs_attrd_log_item *attrdp,
+		   uint32_t attr_op_flags);
+
 int		xfs_trans_commit(struct xfs_trans *);
 int		xfs_trans_roll(struct xfs_trans **);
 int		xfs_trans_roll_inode(struct xfs_trans **, struct xfs_inode *);
diff --git a/fs/xfs/xfs_trans_attr.c b/fs/xfs/xfs_trans_attr.c
new file mode 100644
index 0000000..3679348
--- /dev/null
+++ b/fs/xfs/xfs_trans_attr.c
@@ -0,0 +1,240 @@
+// SPDX-License-Identifier: GPL-2.0+
+/*
+ * Copyright (C) 2019 Oracle.  All Rights Reserved.
+ * Author: Allison Henderson <allison.henderson@oracle.com>
+ */
+#include "xfs.h"
+#include "xfs_fs.h"
+#include "xfs_shared.h"
+#include "xfs_format.h"
+#include "xfs_log_format.h"
+#include "xfs_trans_resv.h"
+#include "xfs_bit.h"
+#include "xfs_mount.h"
+#include "xfs_defer.h"
+#include "xfs_trans.h"
+#include "xfs_trans_priv.h"
+#include "xfs_attr_item.h"
+#include "xfs_alloc.h"
+#include "xfs_bmap.h"
+#include "xfs_trace.h"
+#include "libxfs/xfs_da_format.h"
+#include "xfs_da_btree.h"
+#include "xfs_attr.h"
+#include "xfs_inode.h"
+#include "xfs_icache.h"
+#include "xfs_quota.h"
+
+/*
+ * This routine is called to allocate an "attr done"
+ * log item.
+ */
+struct xfs_attrd_log_item *
+xfs_trans_get_attrd(struct xfs_trans		*tp,
+		  struct xfs_attri_log_item	*attrip)
+{
+	struct xfs_attrd_log_item			*attrdp;
+
+	ASSERT(tp != NULL);
+
+	attrdp = xfs_attrd_init(tp->t_mountp, attrip);
+	ASSERT(attrdp != NULL);
+
+	/*
+	 * Get a log_item_desc to point at the new item.
+	 */
+	xfs_trans_add_item(tp, &attrdp->item);
+	return attrdp;
+}
+
+/*
+ * Perform an attr operation (set or remove) and log it to the ATTRD. Note that
+ * the transaction is marked dirty regardless of whether the operation succeeds
+ * or fails, to support the ATTRI/ATTRD lifecycle rules.
+ */
+int
+xfs_trans_attr(
+	struct xfs_da_args		*args,
+	struct xfs_attrd_log_item	*attrdp,
+	uint32_t			op_flags)
+{
+	int				error;
+	struct xfs_buf			*leaf_bp = NULL;
+
+	error = xfs_qm_dqattach_locked(args->dp, 0);
+	if (error)
+		return error;
+
+	switch (op_flags) {
+	case XFS_ATTR_OP_FLAGS_SET:
+		args->op_flags |= XFS_DA_OP_ADDNAME;
+		error = xfs_attr_set_args(args, &leaf_bp, false);
+		break;
+	case XFS_ATTR_OP_FLAGS_REMOVE:
+		ASSERT(XFS_IFORK_Q((args->dp)));
+		error = xfs_attr_remove_args(args, false);
+		break;
+	default:
+		error = -EFSCORRUPTED;
+	}
+
+	if (error) {
+		if (leaf_bp)
+			xfs_trans_brelse(args->trans, leaf_bp);
+	}
+
+	/*
+	 * Mark the transaction dirty, even on error. This ensures the
+	 * transaction is aborted, which:
+	 *
+	 * 1.) releases the ATTRI and frees the ATTRD
+	 * 2.) shuts down the filesystem
+	 */
+	args->trans->t_flags |= XFS_TRANS_DIRTY;
+	set_bit(XFS_LI_DIRTY, &attrdp->item.li_flags);
+
+	attrdp->attrip->name = (void *)args->name;
+	attrdp->attrip->value = (void *)args->value;
+	attrdp->attrip->name_len = args->namelen;
+	attrdp->attrip->value_len = args->valuelen;
+
+	return error;
+}
+
+static int
+xfs_attr_diff_items(
+	void				*priv,
+	struct list_head		*a,
+	struct list_head		*b)
+{
+	return 0;
+}
+
+/* Get an ATTRI. */
+STATIC void *
+xfs_attr_create_intent(
+	struct xfs_trans		*tp,
+	unsigned int			count)
+{
+	struct xfs_attri_log_item		*attrip;
+
+	ASSERT(tp != NULL);
+	ASSERT(count == 1);
+
+	attrip = xfs_attri_init(tp->t_mountp);
+	ASSERT(attrip != NULL);
+
+	/*
+	 * Get a log_item_desc to point at the new item.
+	 */
+	xfs_trans_add_item(tp, &attrip->item);
+	return attrip;
+}
+
+/* Log an attr to the intent item. */
+STATIC void
+xfs_attr_log_item(
+	struct xfs_trans		*tp,
+	void				*intent,
+	struct list_head		*item)
+{
+	struct xfs_attri_log_item	*attrip = intent;
+	struct xfs_attr_item		*attr;
+	struct xfs_attri_log_format	*attrp;
+	char				*name_value;
+
+	attr = container_of(item, struct xfs_attr_item, xattri_list);
+	name_value = ((char *)attr) + sizeof(struct xfs_attr_item);
+
+	tp->t_flags |= XFS_TRANS_DIRTY;
+	set_bit(XFS_LI_DIRTY, &attrip->item.li_flags);
+
+	attrp = &attrip->format;
+	attrp->alfi_ino = attr->xattri_ip->i_ino;
+	attrp->alfi_op_flags = attr->xattri_op_flags;
+	attrp->alfi_value_len = attr->xattri_value_len;
+	attrp->alfi_name_len = attr->xattri_name_len;
+	attrp->alfi_attr_flags = attr->xattri_flags;
+
+	attrip->name = name_value;
+	attrip->value = &name_value[attr->xattri_name_len];
+	attrip->name_len = attr->xattri_name_len;
+	attrip->value_len = attr->xattri_value_len;
+}
+
+/* Get an ATTRD so we can process all the attrs. */
+STATIC void *
+xfs_attr_create_done(
+	struct xfs_trans		*tp,
+	void				*intent,
+	unsigned int			count)
+{
+	return xfs_trans_get_attrd(tp, intent);
+}
+
+/* Process an attr. */
+STATIC int
+xfs_attr_finish_item(
+	struct xfs_trans		*tp,
+	struct list_head		*item,
+	void				*done_item,
+	void				**state)
+{
+	struct xfs_attr_item		*attr;
+	char				*name_value;
+	int				error;
+	int				local;
+	struct xfs_da_args		args;
+
+	attr = container_of(item, struct xfs_attr_item, xattri_list);
+	name_value = ((char *)attr) + sizeof(struct xfs_attr_item);
+
+	error = xfs_attr_args_init(&args, attr->xattri_ip, name_value,
+				   attr->xattri_name_len, attr->xattri_flags);
+	if (error)
+		goto out;
+
+	args.hashval = xfs_da_hashname(args.name, args.namelen);
+	args.value = &name_value[attr->xattri_name_len];
+	args.valuelen = attr->xattri_value_len;
+	args.op_flags = XFS_DA_OP_OKNOENT;
+	args.total = xfs_attr_calc_size(&args, &local);
+	args.trans = tp;
+
+	error = xfs_trans_attr(&args, done_item,
+			attr->xattri_op_flags);
+out:
+	kmem_free(attr);
+	return error;
+}
+
+/* Abort all pending ATTRs. */
+STATIC void
+xfs_attr_abort_intent(
+	void				*intent)
+{
+	xfs_attri_release(intent);
+}
+
+/* Cancel an attr */
+STATIC void
+xfs_attr_cancel_item(
+	struct list_head		*item)
+{
+	struct xfs_attr_item	*attr;
+
+	attr = container_of(item, struct xfs_attr_item, xattri_list);
+	kmem_free(attr);
+}
+
+const struct xfs_defer_op_type xfs_attr_defer_type = {
+	.max_items	= XFS_ATTRI_MAX_FAST_ATTRS,
+	.diff_items	= xfs_attr_diff_items,
+	.create_intent	= xfs_attr_create_intent,
+	.abort_intent	= xfs_attr_abort_intent,
+	.log_item	= xfs_attr_log_item,
+	.create_done	= xfs_attr_create_done,
+	.finish_item	= xfs_attr_finish_item,
+	.cancel_item	= xfs_attr_cancel_item,
+};
+
-- 
2.7.4

* [PATCH 5/9] xfs: Add xfs_attr_set_deferred and xfs_attr_remove_deferred
  2019-04-12 22:50 [PATCH 0/9] xfs: Delayed Attributes Allison Henderson
                   ` (3 preceding siblings ...)
  2019-04-12 22:50 ` [PATCH 4/9] xfs: Set up infastructure for deferred attribute operations Allison Henderson
@ 2019-04-12 22:50 ` Allison Henderson
  2019-04-18 15:49   ` Brian Foster
  2019-04-12 22:50 ` [PATCH 6/9] xfs: Add xfs_has_attr and subroutines Allison Henderson
                   ` (3 subsequent siblings)
  8 siblings, 1 reply; 48+ messages in thread
From: Allison Henderson @ 2019-04-12 22:50 UTC (permalink / raw)
  To: linux-xfs

These routines set up and start a new deferred attribute
operation.  These functions are meant to be called by other
code needing to initiate a deferred attribute operation.  We
will use these routines later in the parent pointer patches.
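
As a rough illustration of the allocation pattern both routines share (one
buffer holding a header followed by the name and then the value), here is a
minimal userspace sketch. It is not the kernel code: struct attr_item,
attr_item_new() and the accessor helpers are hypothetical stand-ins for
struct xfs_attr_item and the kmem_alloc()/memcpy() sequence in the patch, and
calloc() replaces the kernel allocator.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical userspace stand-in for struct xfs_attr_item. */
struct attr_item {
	unsigned int	name_len;
	unsigned int	value_len;
	/* name bytes, then value bytes, follow the header */
};

/* Pointer to the name bytes stored right after the header. */
static unsigned char *attr_item_name(struct attr_item *item)
{
	return (unsigned char *)item + sizeof(*item);
}

/* Pointer to the value bytes stored right after the name. */
static unsigned char *attr_item_value(struct attr_item *item)
{
	return attr_item_name(item) + item->name_len;
}

/*
 * Allocate one zeroed buffer holding header + name + value.
 * Returns NULL for an empty name or on allocation failure.
 */
static struct attr_item *attr_item_new(const void *name, unsigned int name_len,
				       const void *value, unsigned int value_len)
{
	struct attr_item *item;

	if (!name_len)
		return NULL;

	item = calloc(1, sizeof(*item) + name_len + value_len);
	if (!item)
		return NULL;

	item->name_len = name_len;
	item->value_len = value_len;
	memcpy(attr_item_name(item), name, name_len);
	if (value_len > 0)
		memcpy(attr_item_value(item), value, value_len);
	return item;
}
```

A remove request would use the same helper with a NULL value and a value
length of 0, which mirrors how xfs_attr_remove_deferred sizes its item for the
name only.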

Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
---
 fs/xfs/libxfs/xfs_attr.c | 80 ++++++++++++++++++++++++++++++++++++++++++++++++
 fs/xfs/libxfs/xfs_attr.h |  7 +++++
 2 files changed, 87 insertions(+)

diff --git a/fs/xfs/libxfs/xfs_attr.c b/fs/xfs/libxfs/xfs_attr.c
index fadd485..c3477fa7 100644
--- a/fs/xfs/libxfs/xfs_attr.c
+++ b/fs/xfs/libxfs/xfs_attr.c
@@ -30,6 +30,7 @@
 #include "xfs_trans_space.h"
 #include "xfs_trace.h"
 #include "xfs_attr_item.h"
+#include "xfs_attr.h"
 
 /*
  * xfs_attr.c
@@ -429,6 +430,52 @@ xfs_attr_set(
 	goto out_unlock;
 }
 
+/* Sets an attribute for an inode as a deferred operation */
+int
+xfs_attr_set_deferred(
+	struct xfs_inode	*dp,
+	struct xfs_trans	*tp,
+	const unsigned char	*name,
+	unsigned int		namelen,
+	const unsigned char	*value,
+	unsigned int		valuelen,
+	int			flags)
+{
+
+	struct xfs_attr_item	*new;
+	char			*name_value;
+
+	/*
+	 * All set operations must have a name
+	 * but not necessarily a value.
+	 * See xfstests generic/062.
+	 */
+	if (!namelen) {
+		ASSERT(0);
+		return -EFSCORRUPTED;
+	}
+
+	new = kmem_alloc(XFS_ATTR_ITEM_SIZEOF(namelen, valuelen),
+			 KM_SLEEP|KM_NOFS);
+	name_value = ((char *)new) + sizeof(struct xfs_attr_item);
+	memset(new, 0, XFS_ATTR_ITEM_SIZEOF(namelen, valuelen));
+	new->xattri_ip = dp;
+	new->xattri_op_flags = XFS_ATTR_OP_FLAGS_SET;
+	new->xattri_name_len = namelen;
+	new->xattri_value_len = valuelen;
+	new->xattri_flags = flags;
+	memcpy(&name_value[0], name, namelen);
+	new->xattri_name = name_value;
+	new->xattri_value = name_value + namelen;
+
+	if (valuelen > 0)
+		memcpy(&name_value[namelen], value, valuelen);
+
+	xfs_defer_add(tp, XFS_DEFER_OPS_TYPE_ATTR, &new->xattri_list);
+
+	return 0;
+}
+
 /*
  * Generic handler routine to remove a name from an attribute list.
  * Transitions attribute list from Btree to shortform as necessary.
@@ -513,6 +560,39 @@ xfs_attr_remove(
 	return error;
 }
 
+/* Removes an attribute for an inode as a deferred operation */
+int
+xfs_attr_remove_deferred(
+	struct xfs_inode        *dp,
+	struct xfs_trans	*tp,
+	const unsigned char	*name,
+	unsigned int		namelen,
+	int                     flags)
+{
+
+	struct xfs_attr_item	*new;
+	char			*name_value;
+
+	if (!namelen) {
+		ASSERT(0);
+		return -EFSCORRUPTED;
+	}
+
+	new = kmem_alloc(XFS_ATTR_ITEM_SIZEOF(namelen, 0), KM_SLEEP|KM_NOFS);
+	name_value = ((char *)new) + sizeof(struct xfs_attr_item);
+	memset(new, 0, XFS_ATTR_ITEM_SIZEOF(namelen, 0));
+	new->xattri_ip = dp;
+	new->xattri_op_flags = XFS_ATTR_OP_FLAGS_REMOVE;
+	new->xattri_name_len = namelen;
+	new->xattri_value_len = 0;
+	new->xattri_flags = flags;
+	memcpy(name_value, name, namelen);
+
+	xfs_defer_add(tp, XFS_DEFER_OPS_TYPE_ATTR, &new->xattri_list);
+
+	return 0;
+}
+
 /*========================================================================
  * External routines when attribute list is inside the inode
  *========================================================================*/
diff --git a/fs/xfs/libxfs/xfs_attr.h b/fs/xfs/libxfs/xfs_attr.h
index 92d9a15..83b3621 100644
--- a/fs/xfs/libxfs/xfs_attr.h
+++ b/fs/xfs/libxfs/xfs_attr.h
@@ -175,5 +175,12 @@ bool xfs_attr_namecheck(const void *name, size_t length);
 int xfs_attr_args_init(struct xfs_da_args *args, struct xfs_inode *dp,
 			const unsigned char *name, size_t namelen, int flags);
 int xfs_attr_calc_size(struct xfs_da_args *args, int *local);
+int xfs_attr_set_deferred(struct xfs_inode *dp, struct xfs_trans *tp,
+			  const unsigned char *name, unsigned int namelen,
+			  const unsigned char *value, unsigned int valuelen,
+			  int flags);
+int xfs_attr_remove_deferred(struct xfs_inode *dp, struct xfs_trans *tp,
+			    const unsigned char *name, unsigned int namelen,
+			    int flags);
 
 #endif	/* __XFS_ATTR_H__ */
-- 
2.7.4

* [PATCH 6/9] xfs: Add xfs_has_attr and subroutines
  2019-04-12 22:50 [PATCH 0/9] xfs: Delayed Attributes Allison Henderson
                   ` (4 preceding siblings ...)
  2019-04-12 22:50 ` [PATCH 5/9] xfs: Add xfs_attr_set_deferred and xfs_attr_remove_deferred Allison Henderson
@ 2019-04-12 22:50 ` Allison Henderson
  2019-04-15  2:46   ` Su Yue
  2019-04-22 13:00   ` Brian Foster
  2019-04-12 22:50 ` [PATCH 7/9] xfs: Add attr context to log item Allison Henderson
                   ` (2 subsequent siblings)
  8 siblings, 2 replies; 48+ messages in thread
From: Allison Henderson @ 2019-04-12 22:50 UTC (permalink / raw)
  To: linux-xfs

This patch adds new functions to check for the existence of
an attribute.  Subroutines are also added to handle the cases
of leaf blocks, nodes or shortform.  We will need this later
for delayed attributes since delayed operations cannot return
error codes.
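
To illustrate the shortform case, the sketch below walks a packed list of
variable-length entries and reports whether a name is present. It is a
simplified userspace model, not the kernel routine: struct sf_entry, sf_next()
and sf_has_attr() are hypothetical, the namespace flag check is omitted, and
ENODATA is used in place of the kernel's ENOATTR.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical packed entry: 2-byte header, then name and value bytes. */
struct sf_entry {
	unsigned char	namelen;
	unsigned char	valuelen;
	unsigned char	nameval[];	/* name, then value */
};

/* Step to the entry that follows @e in the packed buffer. */
static const struct sf_entry *sf_next(const struct sf_entry *e)
{
	return (const struct sf_entry *)
		((const unsigned char *)e + sizeof(*e) + e->namelen + e->valuelen);
}

/* Return 0 if @name is among the @count entries, or -ENODATA if not. */
static int sf_has_attr(const void *buf, int count,
		       const void *name, unsigned char namelen)
{
	const struct sf_entry *e = buf;
	int i;

	for (i = 0; i < count; i++, e = sf_next(e)) {
		if (e->namelen != namelen)
			continue;	/* cheap length check first */
		if (memcmp(e->nameval, name, namelen) != 0)
			continue;
		return 0;
	}
	return -ENODATA;
}
```

The 0 / negative-errno convention matches what a caller needs in order to
decide, before queuing a deferred operation, whether the attribute already
exists.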

Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
---
 fs/xfs/libxfs/xfs_attr.c      | 78 +++++++++++++++++++++++++++++++++++++++++++
 fs/xfs/libxfs/xfs_attr.h      |  1 +
 fs/xfs/libxfs/xfs_attr_leaf.c | 33 ++++++++++++++++++
 fs/xfs/libxfs/xfs_attr_leaf.h |  1 +
 4 files changed, 113 insertions(+)

diff --git a/fs/xfs/libxfs/xfs_attr.c b/fs/xfs/libxfs/xfs_attr.c
index c3477fa7..0042708 100644
--- a/fs/xfs/libxfs/xfs_attr.c
+++ b/fs/xfs/libxfs/xfs_attr.c
@@ -53,6 +53,7 @@ STATIC int xfs_attr_shortform_addname(xfs_da_args_t *args);
 STATIC int xfs_attr_leaf_get(xfs_da_args_t *args);
 STATIC int xfs_attr_leaf_addname(xfs_da_args_t *args, bool roll_trans);
 STATIC int xfs_attr_leaf_removename(xfs_da_args_t *args, bool roll_trans);
+STATIC int xfs_leaf_has_attr(xfs_da_args_t *args);
 
 /*
  * Internal routines when attribute list is more than one block.
@@ -60,6 +61,7 @@ STATIC int xfs_attr_leaf_removename(xfs_da_args_t *args, bool roll_trans);
 STATIC int xfs_attr_node_get(xfs_da_args_t *args);
 STATIC int xfs_attr_node_addname(xfs_da_args_t *args, bool roll_trans);
 STATIC int xfs_attr_node_removename(xfs_da_args_t *args, bool roll_trans);
+STATIC int xfs_attr_node_hasname(xfs_da_args_t *args);
 STATIC int xfs_attr_fillstate(xfs_da_state_t *state);
 STATIC int xfs_attr_refillstate(xfs_da_state_t *state);
 
@@ -301,6 +303,29 @@ xfs_attr_set_args(
 }
 
 /*
+ * Return 0 if the attr is found, or -ENOATTR if not
+ */
+int
+xfs_has_attr(
+	struct xfs_da_args      *args)
+{
+	struct xfs_inode        *dp = args->dp;
+	int                     error;
+
+	if (!xfs_inode_hasattr(dp))
+		error = -ENOATTR;
+	else if (dp->i_d.di_aformat == XFS_DINODE_FMT_LOCAL) {
+		ASSERT(dp->i_afp->if_flags & XFS_IFINLINE);
+		error = xfs_shortform_has_attr(args);
+	} else if (xfs_bmap_one_block(dp, XFS_ATTR_FORK))
+		error = xfs_leaf_has_attr(args);
+	else
+		error = xfs_attr_node_hasname(args);
+
+	return error;
+}
+
+/*
  * Remove the attribute specified in @args.
  */
 int
@@ -836,6 +861,29 @@ xfs_attr_leaf_addname(
 }
 
 /*
+ * Return 0 if the attr is found, or -ENOATTR if not
+ */
+STATIC int
+xfs_leaf_has_attr(
+	struct xfs_da_args      *args)
+{
+	struct xfs_buf          *bp;
+	int                     error = 0;
+
+	args->blkno = 0;
+	error = xfs_attr3_leaf_read(args->trans, args->dp,
+			args->blkno, -1, &bp);
+	if (error)
+		return error;
+
+	error = xfs_attr3_leaf_lookup_int(bp, args);
+	error = (error == -ENOATTR) ? -ENOATTR : 0;
+	xfs_trans_brelse(args->trans, bp);
+
+	return error;
+}
+
+/*
  * Remove a name from the leaf attribute list structure
  *
  * This leaf block cannot have a "remote" value, we only call this routine
@@ -1166,6 +1214,36 @@ xfs_attr_node_addname(
 }
 
 /*
+ * Return 0 if the attr is found, or -ENOATTR if not
+ */
+STATIC int
+xfs_attr_node_hasname(
+	struct xfs_da_args	*args)
+{
+	struct xfs_da_state	*state;
+	struct xfs_inode	*dp;
+	int			retval, error;
+
+	/*
+	 * Tie a string around our finger to remind us where we are.
+	 */
+	dp = args->dp;
+	state = xfs_da_state_alloc();
+	state->args = args;
+	state->mp = dp->i_mount;
+
+	/*
+	 * Search to see if name exists, and get back a pointer to it.
+	 */
+	error = xfs_da3_node_lookup_int(state, &retval);
+	if (error || (retval != -EEXIST)) {
+		if (error == 0)
+			error = retval;
+	}
+	return error;
+}
+
+/*
  * Remove a name from a B-tree attribute list.
  *
  * This will involve walking down the Btree, and may involve joining
diff --git a/fs/xfs/libxfs/xfs_attr.h b/fs/xfs/libxfs/xfs_attr.h
index 83b3621..974c963 100644
--- a/fs/xfs/libxfs/xfs_attr.h
+++ b/fs/xfs/libxfs/xfs_attr.h
@@ -168,6 +168,7 @@ int xfs_attr_set_args(struct xfs_da_args *args, struct xfs_buf **leaf_bp,
 		 bool roll_trans);
 int xfs_attr_remove(struct xfs_inode *dp, const unsigned char *name,
 		    size_t namelen, int flags);
+int xfs_has_attr(struct xfs_da_args *args);
 int xfs_attr_remove_args(struct xfs_da_args *args, bool roll_trans);
 int xfs_attr_list(struct xfs_inode *dp, char *buffer, int bufsize,
 		  int flags, struct attrlist_cursor_kern *cursor);
diff --git a/fs/xfs/libxfs/xfs_attr_leaf.c b/fs/xfs/libxfs/xfs_attr_leaf.c
index 128bfe9..e9f2f53 100644
--- a/fs/xfs/libxfs/xfs_attr_leaf.c
+++ b/fs/xfs/libxfs/xfs_attr_leaf.c
@@ -622,6 +622,39 @@ xfs_attr_fork_remove(
 }
 
 /*
+ * Return 0 if the attr is found, or -ENOATTR if not
+ */
+int
+xfs_shortform_has_attr(
+	struct xfs_da_args	 *args)
+{
+	struct xfs_attr_shortform *sf;
+	struct xfs_attr_sf_entry *sfe;
+	int			base = sizeof(struct xfs_attr_sf_hdr);
+	int			size = 0;
+	int			end;
+	int			i;
+
+	sf = (struct xfs_attr_shortform *)args->dp->i_afp->if_u1.if_data;
+	sfe = &sf->list[0];
+	end = sf->hdr.count;
+	for (i = 0; i < end; sfe = XFS_ATTR_SF_NEXTENTRY(sfe),
+			base += size, i++) {
+		size = XFS_ATTR_SF_ENTSIZE(sfe);
+		if (sfe->namelen != args->namelen)
+			continue;
+		if (memcmp(sfe->nameval, args->name, args->namelen) != 0)
+			continue;
+		if (!xfs_attr_namesp_match(args->flags, sfe->flags))
+			continue;
+		break;
+	}
+	if (i == end)
+		return -ENOATTR;
+	return 0;
+}
+
+/*
  * Remove an attribute from the shortform attribute list structure.
  */
 int
diff --git a/fs/xfs/libxfs/xfs_attr_leaf.h b/fs/xfs/libxfs/xfs_attr_leaf.h
index 9d830ec..98dd169 100644
--- a/fs/xfs/libxfs/xfs_attr_leaf.h
+++ b/fs/xfs/libxfs/xfs_attr_leaf.h
@@ -39,6 +39,7 @@ int	xfs_attr_shortform_getvalue(struct xfs_da_args *args);
 int	xfs_attr_shortform_to_leaf(struct xfs_da_args *args,
 			struct xfs_buf **leaf_bp);
 int	xfs_attr_shortform_remove(struct xfs_da_args *args);
+int	xfs_shortform_has_attr(struct xfs_da_args *args);
 int	xfs_attr_shortform_allfit(struct xfs_buf *bp, struct xfs_inode *dp);
 int	xfs_attr_shortform_bytesfit(struct xfs_inode *dp, int bytes);
 xfs_failaddr_t xfs_attr_shortform_verify(struct xfs_inode *ip);
-- 
2.7.4

* [PATCH 7/9] xfs: Add attr context to log item
  2019-04-12 22:50 [PATCH 0/9] xfs: Delayed Attributes Allison Henderson
                   ` (5 preceding siblings ...)
  2019-04-12 22:50 ` [PATCH 6/9] xfs: Add xfs_has_attr and subroutines Allison Henderson
@ 2019-04-12 22:50 ` Allison Henderson
  2019-04-15 22:50   ` Darrick J. Wong
  2019-04-22 13:03   ` Brian Foster
  2019-04-12 22:50 ` [PATCH 8/9] xfs: Roll delayed attr operations by returning EAGAIN Allison Henderson
  2019-04-12 22:50 ` [PATCH 9/9] xfs: Remove roll_trans boolean Allison Henderson
  8 siblings, 2 replies; 48+ messages in thread
From: Allison Henderson @ 2019-04-12 22:50 UTC (permalink / raw)
  To: linux-xfs

This patch modifies xfs_attr_item to store an xfs_da_args, an xfs_buf pointer
and a new state type. We will use these in the next patch when
we modify xfs_attr_set_args to roll transactions by returning EAGAIN.
Because the subroutines of this function modify the contents of these
structures, we need to find a place to store them where they remain
instantiated across multiple calls to xfs_attr_set_args.

Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
---
 fs/xfs/libxfs/xfs_attr.h | 18 +++++++++++++++++-
 fs/xfs/scrub/common.c    |  2 ++
 fs/xfs/xfs_acl.c         |  2 ++
 fs/xfs/xfs_attr_item.c   |  2 +-
 fs/xfs/xfs_ioctl.c       |  2 ++
 fs/xfs/xfs_ioctl32.c     |  2 ++
 fs/xfs/xfs_iops.c        |  1 +
 fs/xfs/xfs_xattr.c       |  1 +
 8 files changed, 28 insertions(+), 2 deletions(-)

diff --git a/fs/xfs/libxfs/xfs_attr.h b/fs/xfs/libxfs/xfs_attr.h
index 974c963..4ce3b0a 100644
--- a/fs/xfs/libxfs/xfs_attr.h
+++ b/fs/xfs/libxfs/xfs_attr.h
@@ -77,6 +77,13 @@ typedef struct attrlist_ent {	/* data from attr_list() */
 	char	a_name[1];	/* attr name (NULL terminated) */
 } attrlist_ent_t;
 
+/* Attr state machine types */
+enum xfs_attr_state {
+	XFS_ATTR_STATE1 = 1,
+	XFS_ATTR_STATE2 = 2,
+	XFS_ATTR_STATE3 = 3,
+};
+
 /*
  * List of attrs to commit later.
  */
@@ -88,7 +95,16 @@ struct xfs_attr_item {
 	void		  *xattri_name;	      /* attr name */
 	uint32_t	  xattri_name_len;    /* length of name */
 	uint32_t	  xattri_flags;       /* attr flags */
-	struct list_head  xattri_list;
+
+	/*
+	 * Delayed attr parameters that need to remain instantiated
+	 * across transaction rolls during the defer finish
+	 */
+	struct xfs_buf		*xattri_leaf_bp;  /* Leaf buf to release */
+	enum xfs_attr_state	xattri_state;	  /* state machine marker */
+	struct xfs_da_args	xattri_args;	  /* args context */
+
+	struct list_head	xattri_list;
 
 	/*
 	 * A byte array follows the header containing the file name and
diff --git a/fs/xfs/scrub/common.c b/fs/xfs/scrub/common.c
index 0c54ff5..270c32e 100644
--- a/fs/xfs/scrub/common.c
+++ b/fs/xfs/scrub/common.c
@@ -30,6 +30,8 @@
 #include "xfs_rmap_btree.h"
 #include "xfs_log.h"
 #include "xfs_trans_priv.h"
+#include "xfs_da_format.h"
+#include "xfs_da_btree.h"
 #include "xfs_attr.h"
 #include "xfs_reflink.h"
 #include "scrub/xfs_scrub.h"
diff --git a/fs/xfs/xfs_acl.c b/fs/xfs/xfs_acl.c
index 142de8d..9b1b93e 100644
--- a/fs/xfs/xfs_acl.c
+++ b/fs/xfs/xfs_acl.c
@@ -10,6 +10,8 @@
 #include "xfs_mount.h"
 #include "xfs_inode.h"
 #include "xfs_acl.h"
+#include "xfs_da_format.h"
+#include "xfs_da_btree.h"
 #include "xfs_attr.h"
 #include "xfs_trace.h"
 #include <linux/slab.h>
diff --git a/fs/xfs/xfs_attr_item.c b/fs/xfs/xfs_attr_item.c
index 0ea19b4..36e6d1e 100644
--- a/fs/xfs/xfs_attr_item.c
+++ b/fs/xfs/xfs_attr_item.c
@@ -19,10 +19,10 @@
 #include "xfs_rmap.h"
 #include "xfs_inode.h"
 #include "xfs_icache.h"
-#include "xfs_attr.h"
 #include "xfs_shared.h"
 #include "xfs_da_format.h"
 #include "xfs_da_btree.h"
+#include "xfs_attr.h"
 
 static inline struct xfs_attri_log_item *ATTRI_ITEM(struct xfs_log_item *lip)
 {
diff --git a/fs/xfs/xfs_ioctl.c b/fs/xfs/xfs_ioctl.c
index ab341d6..c8728ca 100644
--- a/fs/xfs/xfs_ioctl.c
+++ b/fs/xfs/xfs_ioctl.c
@@ -16,6 +16,8 @@
 #include "xfs_rtalloc.h"
 #include "xfs_itable.h"
 #include "xfs_error.h"
+#include "xfs_da_format.h"
+#include "xfs_da_btree.h"
 #include "xfs_attr.h"
 #include "xfs_bmap.h"
 #include "xfs_bmap_util.h"
diff --git a/fs/xfs/xfs_ioctl32.c b/fs/xfs/xfs_ioctl32.c
index 5001dca..23f6990 100644
--- a/fs/xfs/xfs_ioctl32.c
+++ b/fs/xfs/xfs_ioctl32.c
@@ -21,6 +21,8 @@
 #include "xfs_fsops.h"
 #include "xfs_alloc.h"
 #include "xfs_rtalloc.h"
+#include "xfs_da_format.h"
+#include "xfs_da_btree.h"
 #include "xfs_attr.h"
 #include "xfs_ioctl.h"
 #include "xfs_ioctl32.h"
diff --git a/fs/xfs/xfs_iops.c b/fs/xfs/xfs_iops.c
index e73c21a..561c467 100644
--- a/fs/xfs/xfs_iops.c
+++ b/fs/xfs/xfs_iops.c
@@ -17,6 +17,7 @@
 #include "xfs_acl.h"
 #include "xfs_quota.h"
 #include "xfs_error.h"
+#include "xfs_da_btree.h"
 #include "xfs_attr.h"
 #include "xfs_trans.h"
 #include "xfs_trace.h"
diff --git a/fs/xfs/xfs_xattr.c b/fs/xfs/xfs_xattr.c
index 3013746..938e81d 100644
--- a/fs/xfs/xfs_xattr.c
+++ b/fs/xfs/xfs_xattr.c
@@ -11,6 +11,7 @@
 #include "xfs_mount.h"
 #include "xfs_da_format.h"
 #include "xfs_inode.h"
+#include "xfs_da_btree.h"
 #include "xfs_attr.h"
 #include "xfs_attr_leaf.h"
 #include "xfs_acl.h"
-- 
2.7.4

* [PATCH 8/9] xfs: Roll delayed attr operations by returning EAGAIN
  2019-04-12 22:50 [PATCH 0/9] xfs: Delayed Attributes Allison Henderson
                   ` (6 preceding siblings ...)
  2019-04-12 22:50 ` [PATCH 7/9] xfs: Add attr context to log item Allison Henderson
@ 2019-04-12 22:50 ` Allison Henderson
  2019-04-15 23:31   ` Darrick J. Wong
  2019-04-23 14:19   ` Brian Foster
  2019-04-12 22:50 ` [PATCH 9/9] xfs: Remove roll_trans boolean Allison Henderson
  8 siblings, 2 replies; 48+ messages in thread
From: Allison Henderson @ 2019-04-12 22:50 UTC (permalink / raw)
  To: linux-xfs

This patch modifies xfs_attr_set_args to return -EAGAIN
when a transaction needs to be rolled.  All functions
currently calling xfs_attr_set_args are modified to either
use the deferred attr operation or handle the -EAGAIN
return code.
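
As a rough illustration of the control flow described above, here is a
minimal user-space sketch of a resumable operation that hands control
back to its caller with -EAGAIN and picks up where it left off on the
next call.  Everything prefixed demo_ is hypothetical and exists only
for this example; it is not the kernel code, and the three "steps" are
placeholders for the real attr work.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not the kernel definitions. */
enum demo_attr_state {
	DEMO_STATE_START = 0,
	DEMO_STATE1,
	DEMO_STATE2,
};

struct demo_op {
	enum demo_attr_state	state;		/* where to resume next call */
	int			steps_done;	/* work completed so far */
};

/*
 * Perform one step of a three-step operation.  Returns -EAGAIN when the
 * caller should roll the transaction and call again, 0 once the whole
 * operation has finished.
 */
static int demo_set_args(struct demo_op *op)
{
	switch (op->state) {
	case DEMO_STATE1:
		goto state1;
	case DEMO_STATE2:
		goto state2;
	default:
		break;
	}

	op->steps_done++;		/* step 1 (placeholder work) */
	op->state = DEMO_STATE1;
	return -EAGAIN;
state1:
	op->steps_done++;		/* step 2 (placeholder work) */
	op->state = DEMO_STATE2;
	return -EAGAIN;
state2:
	op->steps_done++;		/* step 3 (placeholder work) */
	return 0;
}

/*
 * Caller loop: every -EAGAIN marks a point where a real caller would
 * commit the current transaction and start a new one.  Returns the
 * number of rolls on success, or a negative errno on failure.
 */
static int demo_run(struct demo_op *op)
{
	int rolls = 0;
	int error;

	assert(op != NULL);

	while ((error = demo_set_args(op)) == -EAGAIN)
		rolls++;		/* roll the transaction here */

	return error ? error : rolls;
}
```

Keeping the resume point in caller-owned storage (here demo_op.state)
is what lets the callee return in the middle of its work without
losing its place.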

Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
---
 fs/xfs/libxfs/xfs_attr.c | 62 ++++++++++++++++++++++++++++++++++++++++--------
 fs/xfs/libxfs/xfs_attr.h |  2 +-
 fs/xfs/xfs_attr_item.c   | 41 +++++++++++++++++++++++++++-----
 fs/xfs/xfs_trans.h       |  2 ++
 fs/xfs/xfs_trans_attr.c  | 56 +++++++++++++++++++++++++------------------
 5 files changed, 123 insertions(+), 40 deletions(-)

diff --git a/fs/xfs/libxfs/xfs_attr.c b/fs/xfs/libxfs/xfs_attr.c
index 0042708..4ddd86b 100644
--- a/fs/xfs/libxfs/xfs_attr.c
+++ b/fs/xfs/libxfs/xfs_attr.c
@@ -236,10 +236,37 @@ int
 xfs_attr_set_args(
 	struct xfs_da_args	*args,
 	struct xfs_buf          **leaf_bp,
+	enum xfs_attr_state	*state,
 	bool			roll_trans)
 {
 	struct xfs_inode	*dp = args->dp;
 	int			error = 0;
+	int			sf_size;
+
+	switch (*state) {
+	case (XFS_ATTR_STATE1):
+		goto state1;
+	case (XFS_ATTR_STATE2):
+		goto state2;
+	case (XFS_ATTR_STATE3):
+		goto state3;
+	}
+
+	/*
+	 * New inodes may not have an attribute fork yet. So set the attribute
+	 * fork appropriately
+	 */
+	if (XFS_IFORK_Q((args->dp)) == 0) {
+		sf_size = sizeof(struct xfs_attr_sf_hdr) +
+		     XFS_ATTR_SF_ENTSIZE_BYNAME(args->namelen, args->valuelen);
+		xfs_bmap_set_attrforkoff(args->dp, sf_size, NULL);
+		args->dp->i_afp = kmem_zone_zalloc(xfs_ifork_zone, KM_SLEEP);
+		args->dp->i_afp->if_flags = XFS_IFEXTENTS;
+	}
+
+	*state = XFS_ATTR_STATE1;
+	return -EAGAIN;
+state1:
 
 	/*
 	 * If the attribute list is non-existent or a shortform list,
@@ -248,7 +275,6 @@ xfs_attr_set_args(
 	if (dp->i_d.di_aformat == XFS_DINODE_FMT_LOCAL ||
 	    (dp->i_d.di_aformat == XFS_DINODE_FMT_EXTENTS &&
 	     dp->i_d.di_anextents == 0)) {
-
 		/*
 		 * Build initial attribute list (if required).
 		 */
@@ -262,6 +288,9 @@ xfs_attr_set_args(
 		if (error != -ENOSPC)
 			return error;
 
+		*state = XFS_ATTR_STATE2;
+		return -EAGAIN;
+state2:
 		/*
 		 * It won't fit in the shortform, transform to a leaf block.
 		 * GROT: another possible req'mt for a double-split btree op.
@@ -270,14 +299,14 @@ xfs_attr_set_args(
 		if (error)
 			return error;
 
-		if (roll_trans) {
-			/*
-			 * Prevent the leaf buffer from being unlocked so that a
-			 * concurrent AIL push cannot grab the half-baked leaf
-			 * buffer and run into problems with the write verifier.
-			 */
-			xfs_trans_bhold(args->trans, *leaf_bp);
+		/*
+		 * Prevent the leaf buffer from being unlocked so that a
+		 * concurrent AIL push cannot grab the half-baked leaf
+		 * buffer and run into problems with the write verifier.
+		 */
+		xfs_trans_bhold(args->trans, *leaf_bp);
 
+		if (roll_trans) {
 			error = xfs_defer_finish(&args->trans);
 			if (error)
 				return error;
@@ -293,6 +322,12 @@ xfs_attr_set_args(
 			xfs_trans_bjoin(args->trans, *leaf_bp);
 			*leaf_bp = NULL;
 		}
+
+		*state = XFS_ATTR_STATE3;
+		return -EAGAIN;
+state3:
+		if (*leaf_bp != NULL)
+			xfs_trans_brelse(args->trans, *leaf_bp);
 	}
 
 	if (xfs_bmap_one_block(dp, XFS_ATTR_FORK))
@@ -419,7 +454,9 @@ xfs_attr_set(
 		goto out_trans_cancel;
 
 	xfs_trans_ijoin(args.trans, dp, 0);
-	error = xfs_attr_set_args(&args, &leaf_bp, true);
+
+	error = xfs_attr_set_deferred(dp, args.trans, name, namelen,
+			value, valuelen, flags);
 	if (error)
 		goto out_release_leaf;
 	if (!args.trans) {
@@ -554,8 +591,13 @@ xfs_attr_remove(
 	 */
 	xfs_trans_ijoin(args.trans, dp, 0);
 
-	error = xfs_attr_remove_args(&args, true);
+	error = xfs_has_attr(&args);
+	if (error)
+		goto out;
+
 
+	error = xfs_attr_remove_deferred(dp, args.trans,
+			name, namelen, flags);
 	if (error)
 		goto out;
 
diff --git a/fs/xfs/libxfs/xfs_attr.h b/fs/xfs/libxfs/xfs_attr.h
index 4ce3b0a..da95e69 100644
--- a/fs/xfs/libxfs/xfs_attr.h
+++ b/fs/xfs/libxfs/xfs_attr.h
@@ -181,7 +181,7 @@ int xfs_attr_set(struct xfs_inode *dp, const unsigned char *name,
 		 size_t namelen, unsigned char *value, int valuelen,
 		 int flags);
 int xfs_attr_set_args(struct xfs_da_args *args, struct xfs_buf **leaf_bp,
-		 bool roll_trans);
+		 enum xfs_attr_state *state, bool roll_trans);
 int xfs_attr_remove(struct xfs_inode *dp, const unsigned char *name,
 		    size_t namelen, int flags);
 int xfs_has_attr(struct xfs_da_args *args);
diff --git a/fs/xfs/xfs_attr_item.c b/fs/xfs/xfs_attr_item.c
index 36e6d1e..292d608 100644
--- a/fs/xfs/xfs_attr_item.c
+++ b/fs/xfs/xfs_attr_item.c
@@ -464,8 +464,11 @@ xfs_attri_recover(
 	struct xfs_attri_log_format	*attrp;
 	struct xfs_trans_res		tres;
 	int				local;
-	int				error = 0;
+	int				error, err2 = 0;
 	int				rsvd = 0;
+	enum xfs_attr_state		state = 0;
+	struct xfs_buf			*leaf_bp = NULL;
+
 
 	ASSERT(!test_bit(XFS_ATTRI_RECOVERED, &attrip->flags));
 
@@ -540,14 +543,40 @@ xfs_attri_recover(
 	xfs_ilock(ip, XFS_ILOCK_EXCL);
 
 	xfs_trans_ijoin(args.trans, ip, 0);
-	error = xfs_trans_attr(&args, attrdp, attrp->alfi_op_flags);
-	if (error)
-		goto abort_error;
 
+	do {
+		leaf_bp = NULL;
+
+		error = xfs_trans_attr(&args, attrdp, &leaf_bp, &state,
+				attrp->alfi_op_flags);
+		if (error && error != -EAGAIN)
+			goto abort_error;
+
+		xfs_trans_log_inode(args.trans, ip,
+				XFS_ILOG_CORE | XFS_ILOG_ADATA);
+
+		err2 = xfs_trans_commit(args.trans);
+		if (err2) {
+			error = err2;
+			goto abort_error;
+		}
+
+		if (error == -EAGAIN) {
+			err2 = xfs_trans_alloc(mp, &tres, args.total, 0,
+				XFS_TRANS_PERM_LOG_RES, &args.trans);
+			if (err2) {
+				error = err2;
+				goto abort_error;
+			}
+			xfs_trans_ijoin(args.trans, ip, 0);
+		}
+
+	} while (error == -EAGAIN);
+
+	if (leaf_bp)
+		xfs_trans_brelse(args.trans, leaf_bp);
 
 	set_bit(XFS_ATTRI_RECOVERED, &attrip->flags);
-	xfs_trans_log_inode(args.trans, ip, XFS_ILOG_CORE | XFS_ILOG_ADATA);
-	error = xfs_trans_commit(args.trans);
 	xfs_iunlock(ip, XFS_ILOCK_EXCL);
 	return error;
 
diff --git a/fs/xfs/xfs_trans.h b/fs/xfs/xfs_trans.h
index 7bb9d8e..c785cd7 100644
--- a/fs/xfs/xfs_trans.h
+++ b/fs/xfs/xfs_trans.h
@@ -239,6 +239,8 @@ xfs_trans_get_attrd(struct xfs_trans *tp,
 		    struct xfs_attri_log_item *attrip);
 int xfs_trans_attr(struct xfs_da_args *args,
 		   struct xfs_attrd_log_item *attrdp,
+		   struct xfs_buf **leaf_bp,
+		   void *state,
 		   uint32_t attr_op_flags);
 
 int		xfs_trans_commit(struct xfs_trans *);
diff --git a/fs/xfs/xfs_trans_attr.c b/fs/xfs/xfs_trans_attr.c
index 3679348..a3339ea 100644
--- a/fs/xfs/xfs_trans_attr.c
+++ b/fs/xfs/xfs_trans_attr.c
@@ -56,10 +56,11 @@ int
 xfs_trans_attr(
 	struct xfs_da_args		*args,
 	struct xfs_attrd_log_item	*attrdp,
+	struct xfs_buf			**leaf_bp,
+	void				*state,
 	uint32_t			op_flags)
 {
 	int				error;
-	struct xfs_buf			*leaf_bp = NULL;
 
 	error = xfs_qm_dqattach_locked(args->dp, 0);
 	if (error)
@@ -68,7 +69,8 @@ xfs_trans_attr(
 	switch (op_flags) {
 	case XFS_ATTR_OP_FLAGS_SET:
 		args->op_flags |= XFS_DA_OP_ADDNAME;
-		error = xfs_attr_set_args(args, &leaf_bp, false);
+		error = xfs_attr_set_args(args, leaf_bp,
+				(enum xfs_attr_state *)state, false);
 		break;
 	case XFS_ATTR_OP_FLAGS_REMOVE:
 		ASSERT(XFS_IFORK_Q((args->dp)));
@@ -78,11 +80,6 @@ xfs_trans_attr(
 		error = -EFSCORRUPTED;
 	}
 
-	if (error) {
-		if (leaf_bp)
-			xfs_trans_brelse(args->trans, leaf_bp);
-	}
-
 	/*
 	 * Mark the transaction dirty, even on error. This ensures the
 	 * transaction is aborted, which:
@@ -184,27 +181,40 @@ xfs_attr_finish_item(
 	char				*name_value;
 	int				error;
 	int				local;
-	struct xfs_da_args		args;
+	struct xfs_da_args		*args;
 
 	attr = container_of(item, struct xfs_attr_item, xattri_list);
-	name_value = ((char *)attr) + sizeof(struct xfs_attr_item);
-
-	error = xfs_attr_args_init(&args, attr->xattri_ip, name_value,
-				   attr->xattri_name_len, attr->xattri_flags);
-	if (error)
-		goto out;
+	args = &attr->xattri_args;
+
+	if (attr->xattri_state == 0) {
+		/* Only need to initialize args context once */
+		name_value = ((char *)attr) + sizeof(struct xfs_attr_item);
+		error = xfs_attr_args_init(args, attr->xattri_ip, name_value,
+					   attr->xattri_name_len,
+					   attr->xattri_flags);
+		if (error)
+			goto out;
+
+		args->hashval = xfs_da_hashname(args->name, args->namelen);
+		args->value = &name_value[attr->xattri_name_len];
+		args->valuelen = attr->xattri_value_len;
+		args->op_flags = XFS_DA_OP_OKNOENT;
+		args->total = xfs_attr_calc_size(args, &local);
+		attr->xattri_leaf_bp = NULL;
+	}
 
-	args.hashval = xfs_da_hashname(args.name, args.namelen);
-	args.value = &name_value[attr->xattri_name_len];
-	args.valuelen = attr->xattri_value_len;
-	args.op_flags = XFS_DA_OP_OKNOENT;
-	args.total = xfs_attr_calc_size(&args, &local);
-	args.trans = tp;
+	/*
+	 * Always reset trans after EAGAIN cycle
+	 * since the transaction is new
+	 */
+	args->trans = tp;
 
-	error = xfs_trans_attr(&args, done_item,
-			attr->xattri_op_flags);
+	error = xfs_trans_attr(args, done_item, &attr->xattri_leaf_bp,
+			&attr->xattri_state, attr->xattri_op_flags);
 out:
-	kmem_free(attr);
+	if (error != -EAGAIN)
+		kmem_free(attr);
+
 	return error;
 }
 
-- 
2.7.4


* [PATCH 9/9] xfs: Remove roll_trans boolean
  2019-04-12 22:50 [PATCH 0/9] xfs: Delayed Attributes Allison Henderson
                   ` (7 preceding siblings ...)
  2019-04-12 22:50 ` [PATCH 8/9] xfs: Roll delayed attr operations by returning EAGAIN Allison Henderson
@ 2019-04-12 22:50 ` Allison Henderson
  8 siblings, 0 replies; 48+ messages in thread
From: Allison Henderson @ 2019-04-12 22:50 UTC (permalink / raw)
  To: linux-xfs

All calls to functions using this parameter are now passing a
false value.  We can now remove the boolean and all affected
code.

Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
---
 fs/xfs/libxfs/xfs_attr.c        | 178 ++++++----------------------------------
 fs/xfs/libxfs/xfs_attr.h        |   4 +-
 fs/xfs/libxfs/xfs_attr_leaf.c   |  25 +-----
 fs/xfs/libxfs/xfs_attr_leaf.h   |   6 +-
 fs/xfs/libxfs/xfs_attr_remote.c |  34 +-------
 fs/xfs/libxfs/xfs_attr_remote.h |   4 +-
 fs/xfs/xfs_trans_attr.c         |   4 +-
 7 files changed, 41 insertions(+), 214 deletions(-)

diff --git a/fs/xfs/libxfs/xfs_attr.c b/fs/xfs/libxfs/xfs_attr.c
index 4ddd86b..bd8fd32 100644
--- a/fs/xfs/libxfs/xfs_attr.c
+++ b/fs/xfs/libxfs/xfs_attr.c
@@ -51,16 +51,16 @@ STATIC int xfs_attr_shortform_addname(xfs_da_args_t *args);
  * Internal routines when attribute list is one block.
  */
 STATIC int xfs_attr_leaf_get(xfs_da_args_t *args);
-STATIC int xfs_attr_leaf_addname(xfs_da_args_t *args, bool roll_trans);
-STATIC int xfs_attr_leaf_removename(xfs_da_args_t *args, bool roll_trans);
+STATIC int xfs_attr_leaf_addname(xfs_da_args_t *args);
+STATIC int xfs_attr_leaf_removename(xfs_da_args_t *args);
 STATIC int xfs_leaf_has_attr(xfs_da_args_t *args);
 
 /*
  * Internal routines when attribute list is more than one block.
  */
 STATIC int xfs_attr_node_get(xfs_da_args_t *args);
-STATIC int xfs_attr_node_addname(xfs_da_args_t *args, bool roll_trans);
-STATIC int xfs_attr_node_removename(xfs_da_args_t *args, bool roll_trans);
+STATIC int xfs_attr_node_addname(xfs_da_args_t *args);
+STATIC int xfs_attr_node_removename(xfs_da_args_t *args);
 STATIC int xfs_attr_node_hasname(xfs_da_args_t *args);
 STATIC int xfs_attr_fillstate(xfs_da_state_t *state);
 STATIC int xfs_attr_refillstate(xfs_da_state_t *state);
@@ -200,8 +200,7 @@ xfs_attr_calc_size(
 STATIC int
 xfs_attr_try_sf_addname(
 	struct xfs_inode	*dp,
-	struct xfs_da_args	*args,
-	bool			roll_trans)
+	struct xfs_da_args	*args)
 {
 
 	struct xfs_mount	*mp = dp->i_mount;
@@ -221,11 +220,6 @@ xfs_attr_try_sf_addname(
 	if (mp->m_flags & XFS_MOUNT_WSYNC)
 		xfs_trans_set_sync(args->trans);
 
-	if (roll_trans) {
-		error2 = xfs_trans_commit(args->trans);
-		args->trans = NULL;
-	}
-
 	return error ? error : error2;
 }
 
@@ -236,8 +230,7 @@ int
 xfs_attr_set_args(
 	struct xfs_da_args	*args,
 	struct xfs_buf          **leaf_bp,
-	enum xfs_attr_state	*state,
-	bool			roll_trans)
+	enum xfs_attr_state	*state)
 {
 	struct xfs_inode	*dp = args->dp;
 	int			error = 0;
@@ -284,7 +277,7 @@ xfs_attr_set_args(
 		/*
 		 * Try to add the attr to the attribute list in the inode.
 		 */
-		error = xfs_attr_try_sf_addname(dp, args, roll_trans);
+		error = xfs_attr_try_sf_addname(dp, args);
 		if (error != -ENOSPC)
 			return error;
 
@@ -306,23 +299,6 @@ xfs_attr_set_args(
 		 */
 		xfs_trans_bhold(args->trans, *leaf_bp);
 
-		if (roll_trans) {
-			error = xfs_defer_finish(&args->trans);
-			if (error)
-				return error;
-
-			/*
-			 * Commit the leaf transformation.  We'll need another
-			 * (linked) transaction to add the new attribute to the
-			 * leaf.
-			 */
-			error = xfs_trans_roll_inode(&args->trans, dp);
-			if (error)
-				return error;
-			xfs_trans_bjoin(args->trans, *leaf_bp);
-			*leaf_bp = NULL;
-		}
-
 		*state = XFS_ATTR_STATE3;
 		return -EAGAIN;
 state3:
@@ -331,9 +307,9 @@ xfs_attr_set_args(
 	}
 
 	if (xfs_bmap_one_block(dp, XFS_ATTR_FORK))
-		error = xfs_attr_leaf_addname(args, roll_trans);
+		error = xfs_attr_leaf_addname(args);
 	else
-		error = xfs_attr_node_addname(args, roll_trans);
+		error = xfs_attr_node_addname(args);
 	return error;
 }
 
@@ -365,8 +341,7 @@ xfs_has_attr(
  */
 int
 xfs_attr_remove_args(
-	struct xfs_da_args      *args,
-	bool                    roll_trans)
+	struct xfs_da_args	*args)
 {
 	struct xfs_inode	*dp = args->dp;
 	int			error;
@@ -377,9 +352,9 @@ xfs_attr_remove_args(
 		ASSERT(dp->i_afp->if_flags & XFS_IFINLINE);
 		error = xfs_attr_shortform_remove(args);
 	} else if (xfs_bmap_one_block(dp, XFS_ATTR_FORK)) {
-		error = xfs_attr_leaf_removename(args, roll_trans);
+		error = xfs_attr_leaf_removename(args);
 	} else {
-		error = xfs_attr_node_removename(args, roll_trans);
+		error = xfs_attr_node_removename(args);
 	}
 
 	return error;
@@ -720,8 +695,7 @@ xfs_attr_shortform_addname(xfs_da_args_t *args)
  */
 STATIC int
 xfs_attr_leaf_addname(
-	struct xfs_da_args	*args,
-	bool			roll_trans)
+	struct xfs_da_args	*args)
 {
 	struct xfs_inode	*dp;
 	struct xfs_buf		*bp;
@@ -787,37 +761,13 @@ xfs_attr_leaf_addname(
 		if (error)
 			return error;
 
-		if (roll_trans) {
-			error = xfs_defer_finish(&args->trans);
-			if (error)
-				return error;
-
-			/*
-			 * Commit the current trans (including the inode) and
-			 * start a new one.
-			 */
-			error = xfs_trans_roll_inode(&args->trans, dp);
-			if (error)
-				return error;
-		}
-
 		/*
 		 * Fob the whole rest of the problem off on the Btree code.
 		 */
-		error = xfs_attr_node_addname(args, roll_trans);
+		error = xfs_attr_node_addname(args);
 		return error;
 	}
 
-	if (roll_trans) {
-		/*
-		 * Commit the transaction that added the attr name so that
-		 * later routines can manage their own transactions.
-		 */
-		error = xfs_trans_roll_inode(&args->trans, dp);
-		if (error)
-			return error;
-	}
-
 	/*
 	 * If there was an out-of-line value, allocate the blocks we
 	 * identified for its storage and copy the value.  This is done
@@ -825,7 +775,7 @@ xfs_attr_leaf_addname(
 	 * maximum size of a transaction and/or hit a deadlock.
 	 */
 	if (args->rmtblkno > 0) {
-		error = xfs_attr_rmtval_set(args, roll_trans);
+		error = xfs_attr_rmtval_set(args);
 		if (error)
 			return error;
 	}
@@ -841,7 +791,7 @@ xfs_attr_leaf_addname(
 		 * In a separate transaction, set the incomplete flag on the
 		 * "old" attr and clear the incomplete flag on the "new" attr.
 		 */
-		error = xfs_attr3_leaf_flipflags(args, roll_trans);
+		error = xfs_attr3_leaf_flipflags(args);
 		if (error)
 			return error;
 
@@ -855,7 +805,7 @@ xfs_attr_leaf_addname(
 		args->rmtblkcnt = args->rmtblkcnt2;
 		args->rmtvaluelen = args->rmtvaluelen2;
 		if (args->rmtblkno) {
-			error = xfs_attr_rmtval_remove(args, roll_trans);
+			error = xfs_attr_rmtval_remove(args);
 			if (error)
 				return error;
 		}
@@ -879,25 +829,13 @@ xfs_attr_leaf_addname(
 			/* bp is gone due to xfs_da_shrink_inode */
 			if (error)
 				return error;
-
-			if (roll_trans) {
-				error = xfs_defer_finish(&args->trans);
-				if (error)
-					return error;
-			}
 		}
 
-		/*
-		 * Commit the remove and start the next trans in series.
-		 */
-		if (roll_trans)
-			error = xfs_trans_roll_inode(&args->trans, dp);
-
 	} else if (args->rmtblkno > 0) {
 		/*
 		 * Added a "remote" value, just clear the incomplete flag.
 		 */
-		error = xfs_attr3_leaf_clearflag(args, roll_trans);
+		error = xfs_attr3_leaf_clearflag(args);
 	}
 	return error;
 }
@@ -933,8 +871,7 @@ xfs_leaf_has_attr(
  */
 STATIC int
 xfs_attr_leaf_removename(
-	struct xfs_da_args	*args,
-	bool roll_trans)
+	struct xfs_da_args	*args)
 {
 	struct xfs_inode	*dp;
 	struct xfs_buf		*bp;
@@ -967,12 +904,6 @@ xfs_attr_leaf_removename(
 		/* bp is gone due to xfs_da_shrink_inode */
 		if (error)
 			return error;
-
-		if (roll_trans) {
-			error = xfs_defer_finish(&args->trans);
-			if (error)
-				return error;
-		}
 	}
 	return 0;
 }
@@ -1025,8 +956,7 @@ xfs_attr_leaf_get(xfs_da_args_t *args)
  */
 STATIC int
 xfs_attr_node_addname(
-	struct xfs_da_args	*args,
-	bool			roll_trans)
+	struct xfs_da_args	*args)
 {
 	struct xfs_da_state	*state;
 	struct xfs_da_state_blk	*blk;
@@ -1095,20 +1025,6 @@ xfs_attr_node_addname(
 			if (error)
 				goto out;
 
-			if (roll_trans) {
-				error = xfs_defer_finish(&args->trans);
-				if (error)
-					goto out;
-
-				/*
-				 * Commit the node conversion and start the next
-				 * trans in the chain.
-				 */
-				error = xfs_trans_roll_inode(&args->trans, dp);
-				if (error)
-					goto out;
-			}
-
 			goto restart;
 		}
 
@@ -1122,12 +1038,6 @@ xfs_attr_node_addname(
 		if (error)
 			goto out;
 
-		if (roll_trans) {
-			error = xfs_defer_finish(&args->trans);
-			if (error)
-				goto out;
-		}
-
 	} else {
 		/*
 		 * Addition succeeded, update Btree hashvals.
@@ -1143,23 +1053,13 @@ xfs_attr_node_addname(
 	state = NULL;
 
 	/*
-	 * Commit the leaf addition or btree split and start the next
-	 * trans in the chain.
-	 */
-	if (roll_trans) {
-		error = xfs_trans_roll_inode(&args->trans, dp);
-		if (error)
-			goto out;
-	}
-
-	/*
 	 * If there was an out-of-line value, allocate the blocks we
 	 * identified for its storage and copy the value.  This is done
 	 * after we create the attribute so that we don't overflow the
 	 * maximum size of a transaction and/or hit a deadlock.
 	 */
 	if (args->rmtblkno > 0) {
-		error = xfs_attr_rmtval_set(args, roll_trans);
+		error = xfs_attr_rmtval_set(args);
 		if (error)
 			return error;
 	}
@@ -1175,7 +1075,7 @@ xfs_attr_node_addname(
 		 * In a separate transaction, set the incomplete flag on the
 		 * "old" attr and clear the incomplete flag on the "new" attr.
 		 */
-		error = xfs_attr3_leaf_flipflags(args, roll_trans);
+		error = xfs_attr3_leaf_flipflags(args);
 		if (error)
 			goto out;
 
@@ -1189,7 +1089,7 @@ xfs_attr_node_addname(
 		args->rmtblkcnt = args->rmtblkcnt2;
 		args->rmtvaluelen = args->rmtvaluelen2;
 		if (args->rmtblkno) {
-			error = xfs_attr_rmtval_remove(args, roll_trans);
+			error = xfs_attr_rmtval_remove(args);
 			if (error)
 				return error;
 		}
@@ -1223,11 +1123,6 @@ xfs_attr_node_addname(
 			error = xfs_da3_join(state);
 			if (error)
 				goto out;
-			if (roll_trans) {
-				error = xfs_defer_finish(&args->trans);
-				if (error)
-					goto out;
-			}
 		}
 
 		/*
@@ -1241,7 +1136,7 @@ xfs_attr_node_addname(
 		/*
 		 * Added a "remote" value, just clear the incomplete flag.
 		 */
-		error = xfs_attr3_leaf_clearflag(args, roll_trans);
+		error = xfs_attr3_leaf_clearflag(args);
 		if (error)
 			goto out;
 	}
@@ -1294,8 +1189,7 @@ xfs_attr_node_hasname(
  */
 STATIC int
 xfs_attr_node_removename(
-	struct xfs_da_args	*args,
-	bool			roll_trans)
+	struct xfs_da_args	*args)
 {
 	struct xfs_da_state	*state;
 	struct xfs_da_state_blk	*blk;
@@ -1345,10 +1239,10 @@ xfs_attr_node_removename(
 		 * Mark the attribute as INCOMPLETE, then bunmapi() the
 		 * remote value.
 		 */
-		error = xfs_attr3_leaf_setflag(args, roll_trans);
+		error = xfs_attr3_leaf_setflag(args);
 		if (error)
 			goto out;
-		error = xfs_attr_rmtval_remove(args, roll_trans);
+		error = xfs_attr_rmtval_remove(args);
 		if (error)
 			goto out;
 
@@ -1376,19 +1270,6 @@ xfs_attr_node_removename(
 		error = xfs_da3_join(state);
 		if (error)
 			goto out;
-
-		if (roll_trans) {
-			error = xfs_defer_finish(&args->trans);
-			if (error)
-				goto out;
-			/*
-			 * Commit the Btree join operation and start
-			 * a new trans.
-			 */
-			error = xfs_trans_roll_inode(&args->trans, dp);
-			if (error)
-				goto out;
-		}
 	}
 
 	/*
@@ -1412,11 +1293,6 @@ xfs_attr_node_removename(
 			if (error)
 				goto out;
 
-			if (roll_trans) {
-				error = xfs_defer_finish(&args->trans);
-				if (error)
-					goto out;
-			}
 		} else
 			xfs_trans_brelse(args->trans, bp);
 	}
diff --git a/fs/xfs/libxfs/xfs_attr.h b/fs/xfs/libxfs/xfs_attr.h
index da95e69..f369895 100644
--- a/fs/xfs/libxfs/xfs_attr.h
+++ b/fs/xfs/libxfs/xfs_attr.h
@@ -181,11 +181,11 @@ int xfs_attr_set(struct xfs_inode *dp, const unsigned char *name,
 		 size_t namelen, unsigned char *value, int valuelen,
 		 int flags);
 int xfs_attr_set_args(struct xfs_da_args *args, struct xfs_buf **leaf_bp,
-		 enum xfs_attr_state *state, bool roll_trans);
+		 enum xfs_attr_state *state);
 int xfs_attr_remove(struct xfs_inode *dp, const unsigned char *name,
 		    size_t namelen, int flags);
 int xfs_has_attr(struct xfs_da_args *args);
-int xfs_attr_remove_args(struct xfs_da_args *args, bool roll_trans);
+int xfs_attr_remove_args(struct xfs_da_args *args);
 int xfs_attr_list(struct xfs_inode *dp, char *buffer, int bufsize,
 		  int flags, struct attrlist_cursor_kern *cursor);
 bool xfs_attr_namecheck(const void *name, size_t length);
diff --git a/fs/xfs/libxfs/xfs_attr_leaf.c b/fs/xfs/libxfs/xfs_attr_leaf.c
index e9f2f53..90dfa53 100644
--- a/fs/xfs/libxfs/xfs_attr_leaf.c
+++ b/fs/xfs/libxfs/xfs_attr_leaf.c
@@ -2670,8 +2670,7 @@ xfs_attr_leaf_newentsize(
  */
 int
 xfs_attr3_leaf_clearflag(
-	struct xfs_da_args	*args,
-	bool			roll_trans)
+	struct xfs_da_args	*args)
 {
 	struct xfs_attr_leafblock *leaf;
 	struct xfs_attr_leaf_entry *entry;
@@ -2729,11 +2728,6 @@ xfs_attr3_leaf_clearflag(
 			 XFS_DA_LOGRANGE(leaf, name_rmt, sizeof(*name_rmt)));
 	}
 
-	/*
-	 * Commit the flag value change and start the next trans in series.
-	 */
-	if (roll_trans)
-		error = xfs_trans_roll_inode(&args->trans, args->dp);
 	return error;
 }
 
@@ -2742,8 +2736,7 @@ xfs_attr3_leaf_clearflag(
  */
 int
 xfs_attr3_leaf_setflag(
-	struct xfs_da_args	*args,
-	bool			roll_trans)
+	struct xfs_da_args	*args)
 {
 	struct xfs_attr_leafblock *leaf;
 	struct xfs_attr_leaf_entry *entry;
@@ -2783,11 +2776,6 @@ xfs_attr3_leaf_setflag(
 			 XFS_DA_LOGRANGE(leaf, name_rmt, sizeof(*name_rmt)));
 	}
 
-	/*
-	 * Commit the flag value change and start the next trans in series.
-	 */
-	if (roll_trans)
-		error = xfs_trans_roll_inode(&args->trans, args->dp);
 	return error;
 }
 
@@ -2800,8 +2788,7 @@ xfs_attr3_leaf_setflag(
  */
 int
 xfs_attr3_leaf_flipflags(
-	struct xfs_da_args	*args,
-	bool			roll_trans)
+	struct xfs_da_args	*args)
 {
 	struct xfs_attr_leafblock *leaf1;
 	struct xfs_attr_leafblock *leaf2;
@@ -2904,11 +2891,5 @@ xfs_attr3_leaf_flipflags(
 			 XFS_DA_LOGRANGE(leaf2, name_rmt, sizeof(*name_rmt)));
 	}
 
-	/*
-	 * Commit the flag value change and start the next trans in series.
-	 */
-	if (roll_trans)
-		error = xfs_trans_roll_inode(&args->trans, args->dp);
-
 	return error;
 }
diff --git a/fs/xfs/libxfs/xfs_attr_leaf.h b/fs/xfs/libxfs/xfs_attr_leaf.h
index 98dd169..d38c558 100644
--- a/fs/xfs/libxfs/xfs_attr_leaf.h
+++ b/fs/xfs/libxfs/xfs_attr_leaf.h
@@ -51,9 +51,9 @@ void	xfs_attr_fork_remove(struct xfs_inode *ip, struct xfs_trans *tp);
 int	xfs_attr3_leaf_to_node(struct xfs_da_args *args);
 int	xfs_attr3_leaf_to_shortform(struct xfs_buf *bp,
 			struct xfs_da_args *args, int forkoff);
-int	xfs_attr3_leaf_clearflag(struct xfs_da_args *args, bool roll_trans);
-int	xfs_attr3_leaf_setflag(struct xfs_da_args *args, bool roll_trans);
-int	xfs_attr3_leaf_flipflags(struct xfs_da_args *args, bool roll_trans);
+int	xfs_attr3_leaf_clearflag(struct xfs_da_args *args);
+int	xfs_attr3_leaf_setflag(struct xfs_da_args *args);
+int	xfs_attr3_leaf_flipflags(struct xfs_da_args *args);
 
 /*
  * Routines used for growing the Btree.
diff --git a/fs/xfs/libxfs/xfs_attr_remote.c b/fs/xfs/libxfs/xfs_attr_remote.c
index 18fbd22..9054f98 100644
--- a/fs/xfs/libxfs/xfs_attr_remote.c
+++ b/fs/xfs/libxfs/xfs_attr_remote.c
@@ -435,8 +435,7 @@ xfs_attr_rmtval_get(
  */
 int
 xfs_attr_rmtval_set(
-	struct xfs_da_args	*args,
-	bool			roll_trans)
+	struct xfs_da_args	*args)
 {
 	struct xfs_inode	*dp = args->dp;
 	struct xfs_mount	*mp = dp->i_mount;
@@ -490,26 +489,11 @@ xfs_attr_rmtval_set(
 		if (error)
 			return error;
 
-		if (roll_trans) {
-			error = xfs_defer_finish(&args->trans);
-			if (error)
-				return error;
-		}
-
 		ASSERT(nmap == 1);
 		ASSERT((map.br_startblock != DELAYSTARTBLOCK) &&
 		       (map.br_startblock != HOLESTARTBLOCK));
 		lblkno += map.br_blockcount;
 		blkcnt -= map.br_blockcount;
-
-		if (roll_trans) {
-			/*
-			 * Start the next trans in the chain.
-			 */
-			error = xfs_trans_roll_inode(&args->trans, dp);
-			if (error)
-				return error;
-		}
 	}
 
 	/*
@@ -569,8 +553,7 @@ xfs_attr_rmtval_set(
  */
 int
 xfs_attr_rmtval_remove(
-	struct xfs_da_args	*args,
-	bool			roll_trans)
+	struct xfs_da_args	*args)
 {
 	struct xfs_mount	*mp = args->dp->i_mount;
 	xfs_dablk_t		lblkno;
@@ -632,19 +615,6 @@ xfs_attr_rmtval_remove(
 				    XFS_BMAPI_ATTRFORK, 1, &done);
 		if (error)
 			return error;
-
-		if (roll_trans) {
-			error = xfs_defer_finish(&args->trans);
-			if (error)
-				return error;
-
-			/*
-			 * Close out trans and start the next one in the chain.
-			 */
-			error = xfs_trans_roll_inode(&args->trans, args->dp);
-			if (error)
-				return error;
-		}
 	}
 	return 0;
 }
diff --git a/fs/xfs/libxfs/xfs_attr_remote.h b/fs/xfs/libxfs/xfs_attr_remote.h
index c7c073d..9d20b66 100644
--- a/fs/xfs/libxfs/xfs_attr_remote.h
+++ b/fs/xfs/libxfs/xfs_attr_remote.h
@@ -9,7 +9,7 @@
 int xfs_attr3_rmt_blocks(struct xfs_mount *mp, int attrlen);
 
 int xfs_attr_rmtval_get(struct xfs_da_args *args);
-int xfs_attr_rmtval_set(struct xfs_da_args *args, bool roll_trans);
-int xfs_attr_rmtval_remove(struct xfs_da_args *args, bool roll_trans);
+int xfs_attr_rmtval_set(struct xfs_da_args *args);
+int xfs_attr_rmtval_remove(struct xfs_da_args *args);
 
 #endif /* __XFS_ATTR_REMOTE_H__ */
diff --git a/fs/xfs/xfs_trans_attr.c b/fs/xfs/xfs_trans_attr.c
index a3339ea..4144ecf 100644
--- a/fs/xfs/xfs_trans_attr.c
+++ b/fs/xfs/xfs_trans_attr.c
@@ -70,11 +70,11 @@ xfs_trans_attr(
 	case XFS_ATTR_OP_FLAGS_SET:
 		args->op_flags |= XFS_DA_OP_ADDNAME;
 		error = xfs_attr_set_args(args, leaf_bp,
-				(enum xfs_attr_state *)state, false);
+				(enum xfs_attr_state *)state);
 		break;
 	case XFS_ATTR_OP_FLAGS_REMOVE:
 		ASSERT(XFS_IFORK_Q((args->dp)));
-		error = xfs_attr_remove_args(args, false);
+		error = xfs_attr_remove_args(args);
 		break;
 	default:
 		error = -EFSCORRUPTED;
-- 
2.7.4


* Re: [PATCH 1/9] xfs: Remove all strlen in all xfs_attr_* functions for attr names.
  2019-04-12 22:50 ` [PATCH 1/9] xfs: Remove all strlen in all xfs_attr_* functions for attr names Allison Henderson
@ 2019-04-14 23:02   ` Dave Chinner
  2019-04-15 20:08     ` Allison Henderson
  2019-04-17 15:42   ` Brian Foster
  1 sibling, 1 reply; 48+ messages in thread
From: Dave Chinner @ 2019-04-14 23:02 UTC (permalink / raw)
  To: Allison Henderson; +Cc: linux-xfs

On Fri, Apr 12, 2019 at 03:50:28PM -0700, Allison Henderson wrote:
> This helps to pre-simplify the extra handling of the null terminator in
> delayed operations which use memcpy rather than strlen.  Later
> when we introduce parent pointers, attribute names will become binary,
> so strlen will not work at all.  Removing uses of strlen now will
> help reduce complexities later
> 
> Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>

Hmmm. If we are going to pass name/namelen pairs around everywhere,
can we convert these to a struct xfs_name? 

I also wonder whether we should convert the name/namelen pairs in the
attr args struct to an xfs_name, too; then we can just do:

	args->name = *name;

to copy in the string pointer and the name length in one go.

And we might even be able to put the attribute flags (e.g.
ATTR_ROOT) in the name.type field, and get rid of that parameter
being passed around, too...
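
A stand-alone sketch of the idea, for illustration only: demo_name is
a stand-in modelled on struct xfs_name, and demo_args and
demo_args_init are hypothetical names, not existing kernel interfaces.
Carrying flags in the type field is an assumption taken from the
suggestion above, not something the current code does.

```c
#include <assert.h>

/* Stand-in modelled on struct xfs_name; illustrative only. */
struct demo_name {
	const unsigned char	*name;	/* need not be NUL-terminated */
	int			len;	/* explicit length, no strlen */
	int			type;	/* assumed to carry attr flags */
};

/* Hypothetical, trimmed-down args struct that holds the name by value. */
struct demo_args {
	struct demo_name	name;
};

static void demo_args_init(struct demo_args *args,
			   const struct demo_name *name)
{
	assert(name->len >= 0);

	/* Pointer, length and type are all copied by one assignment. */
	args->name = *name;
}
```

Because the length travels with the pointer, this also works for binary
names containing embedded NUL bytes.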

Cheers,

Dave.

> ---
>  fs/xfs/libxfs/xfs_attr.c | 12 ++++++++----
>  fs/xfs/libxfs/xfs_attr.h |  9 ++++++---
>  fs/xfs/xfs_acl.c         | 12 +++++++-----
>  fs/xfs/xfs_ioctl.c       | 13 ++++++++++---
>  fs/xfs/xfs_iops.c        |  6 ++++--
>  fs/xfs/xfs_xattr.c       | 10 ++++++----
>  6 files changed, 41 insertions(+), 21 deletions(-)
> 
> diff --git a/fs/xfs/libxfs/xfs_attr.c b/fs/xfs/libxfs/xfs_attr.c
> index 2dd9ee2..3da6b0d 100644
> --- a/fs/xfs/libxfs/xfs_attr.c
> +++ b/fs/xfs/libxfs/xfs_attr.c
> @@ -67,6 +67,7 @@ xfs_attr_args_init(
>  	struct xfs_da_args	*args,
>  	struct xfs_inode	*dp,
>  	const unsigned char	*name,
> +	size_t			namelen,
>  	int			flags)
>  {
>  
> @@ -79,7 +80,7 @@ xfs_attr_args_init(
>  	args->dp = dp;
>  	args->flags = flags;
>  	args->name = name;
> -	args->namelen = strlen((const char *)name);
> +	args->namelen = namelen;
>  	if (args->namelen >= MAXNAMELEN)
>  		return -EFAULT;		/* match IRIX behaviour */
>  
> @@ -125,6 +126,7 @@ int
>  xfs_attr_get(
>  	struct xfs_inode	*ip,
>  	const unsigned char	*name,
> +	size_t			namelen,
>  	unsigned char		*value,
>  	int			*valuelenp,
>  	int			flags)
> @@ -138,7 +140,7 @@ xfs_attr_get(
>  	if (XFS_FORCED_SHUTDOWN(ip->i_mount))
>  		return -EIO;
>  
> -	error = xfs_attr_args_init(&args, ip, name, flags);
> +	error = xfs_attr_args_init(&args, ip, name, namelen, flags);
>  	if (error)
>  		return error;
>  
> @@ -317,6 +319,7 @@ int
>  xfs_attr_set(
>  	struct xfs_inode	*dp,
>  	const unsigned char	*name,
> +	size_t			namelen,
>  	unsigned char		*value,
>  	int			valuelen,
>  	int			flags)
> @@ -333,7 +336,7 @@ xfs_attr_set(
>  	if (XFS_FORCED_SHUTDOWN(dp->i_mount))
>  		return -EIO;
>  
> -	error = xfs_attr_args_init(&args, dp, name, flags);
> +	error = xfs_attr_args_init(&args, dp, name, namelen, flags);
>  	if (error)
>  		return error;
>  
> @@ -425,6 +428,7 @@ int
>  xfs_attr_remove(
>  	struct xfs_inode	*dp,
>  	const unsigned char	*name,
> +	size_t			namelen,
>  	int			flags)
>  {
>  	struct xfs_mount	*mp = dp->i_mount;
> @@ -436,7 +440,7 @@ xfs_attr_remove(
>  	if (XFS_FORCED_SHUTDOWN(dp->i_mount))
>  		return -EIO;
>  
> -	error = xfs_attr_args_init(&args, dp, name, flags);
> +	error = xfs_attr_args_init(&args, dp, name, namelen, flags);
>  	if (error)
>  		return error;
>  
> diff --git a/fs/xfs/libxfs/xfs_attr.h b/fs/xfs/libxfs/xfs_attr.h
> index 2297d84..52f63dc 100644
> --- a/fs/xfs/libxfs/xfs_attr.h
> +++ b/fs/xfs/libxfs/xfs_attr.h
> @@ -137,11 +137,14 @@ int xfs_attr_list_int(struct xfs_attr_list_context *);
>  int xfs_inode_hasattr(struct xfs_inode *ip);
>  int xfs_attr_get_ilocked(struct xfs_inode *ip, struct xfs_da_args *args);
>  int xfs_attr_get(struct xfs_inode *ip, const unsigned char *name,
> -		 unsigned char *value, int *valuelenp, int flags);
> +		 size_t namelen, unsigned char *value, int *valuelenp,
> +		 int flags);
>  int xfs_attr_set(struct xfs_inode *dp, const unsigned char *name,
> -		 unsigned char *value, int valuelen, int flags);
> +		 size_t namelen, unsigned char *value, int valuelen,
> +		 int flags);
>  int xfs_attr_set_args(struct xfs_da_args *args, struct xfs_buf **leaf_bp);
> -int xfs_attr_remove(struct xfs_inode *dp, const unsigned char *name, int flags);
> +int xfs_attr_remove(struct xfs_inode *dp, const unsigned char *name,
> +		    size_t namelen, int flags);
>  int xfs_attr_remove_args(struct xfs_da_args *args);
>  int xfs_attr_list(struct xfs_inode *dp, char *buffer, int bufsize,
>  		  int flags, struct attrlist_cursor_kern *cursor);
> diff --git a/fs/xfs/xfs_acl.c b/fs/xfs/xfs_acl.c
> index 8039e35..142de8d 100644
> --- a/fs/xfs/xfs_acl.c
> +++ b/fs/xfs/xfs_acl.c
> @@ -141,8 +141,8 @@ xfs_get_acl(struct inode *inode, int type)
>  	if (!xfs_acl)
>  		return ERR_PTR(-ENOMEM);
>  
> -	error = xfs_attr_get(ip, ea_name, (unsigned char *)xfs_acl,
> -							&len, ATTR_ROOT);
> +	error = xfs_attr_get(ip, ea_name, strlen(ea_name),
> +			     (unsigned char *)xfs_acl, &len, ATTR_ROOT);
>  	if (error) {
>  		/*
>  		 * If the attribute doesn't exist make sure we have a negative
> @@ -192,15 +192,17 @@ __xfs_set_acl(struct inode *inode, struct posix_acl *acl, int type)
>  		len -= sizeof(struct xfs_acl_entry) *
>  			 (XFS_ACL_MAX_ENTRIES(ip->i_mount) - acl->a_count);
>  
> -		error = xfs_attr_set(ip, ea_name, (unsigned char *)xfs_acl,
> -				len, ATTR_ROOT);
> +		error = xfs_attr_set(ip, ea_name, strlen(ea_name),
> +				     (unsigned char *)xfs_acl, len, ATTR_ROOT);
>  
>  		kmem_free(xfs_acl);
>  	} else {
>  		/*
>  		 * A NULL ACL argument means we want to remove the ACL.
>  		 */
> -		error = xfs_attr_remove(ip, ea_name, ATTR_ROOT);
> +		error = xfs_attr_remove(ip, ea_name,
> +					strlen(ea_name),
> +					ATTR_ROOT);
>  
>  		/*
>  		 * If the attribute didn't exist to start with that's fine.
> diff --git a/fs/xfs/xfs_ioctl.c b/fs/xfs/xfs_ioctl.c
> index 6ecdbb3..ab341d6 100644
> --- a/fs/xfs/xfs_ioctl.c
> +++ b/fs/xfs/xfs_ioctl.c
> @@ -437,6 +437,7 @@ xfs_attrmulti_attr_get(
>  {
>  	unsigned char		*kbuf;
>  	int			error = -EFAULT;
> +	size_t			namelen;
>  
>  	if (*len > XFS_XATTR_SIZE_MAX)
>  		return -EINVAL;
> @@ -444,7 +445,9 @@ xfs_attrmulti_attr_get(
>  	if (!kbuf)
>  		return -ENOMEM;
>  
> -	error = xfs_attr_get(XFS_I(inode), name, kbuf, (int *)len, flags);
> +	namelen = strlen(name);
> +	error = xfs_attr_get(XFS_I(inode), name, namelen,
> +			     kbuf, (int *)len, flags);
>  	if (error)
>  		goto out_kfree;
>  
> @@ -466,6 +469,7 @@ xfs_attrmulti_attr_set(
>  {
>  	unsigned char		*kbuf;
>  	int			error;
> +	size_t			namelen;
>  
>  	if (IS_IMMUTABLE(inode) || IS_APPEND(inode))
>  		return -EPERM;
> @@ -476,7 +480,8 @@ xfs_attrmulti_attr_set(
>  	if (IS_ERR(kbuf))
>  		return PTR_ERR(kbuf);
>  
> -	error = xfs_attr_set(XFS_I(inode), name, kbuf, len, flags);
> +	namelen = strlen(name);
> +	error = xfs_attr_set(XFS_I(inode), name, namelen, kbuf, len, flags);
>  	if (!error)
>  		xfs_forget_acl(inode, name, flags);
>  	kfree(kbuf);
> @@ -490,10 +495,12 @@ xfs_attrmulti_attr_remove(
>  	uint32_t		flags)
>  {
>  	int			error;
> +	size_t			namelen;
>  
>  	if (IS_IMMUTABLE(inode) || IS_APPEND(inode))
>  		return -EPERM;
> -	error = xfs_attr_remove(XFS_I(inode), name, flags);
> +	namelen = strlen(name);
> +	error = xfs_attr_remove(XFS_I(inode), name, namelen, flags);
>  	if (!error)
>  		xfs_forget_acl(inode, name, flags);
>  	return error;
> diff --git a/fs/xfs/xfs_iops.c b/fs/xfs/xfs_iops.c
> index 74047bd..e73c21a 100644
> --- a/fs/xfs/xfs_iops.c
> +++ b/fs/xfs/xfs_iops.c
> @@ -59,8 +59,10 @@ xfs_initxattrs(
>  	int			error = 0;
>  
>  	for (xattr = xattr_array; xattr->name != NULL; xattr++) {
> -		error = xfs_attr_set(ip, xattr->name, xattr->value,
> -				      xattr->value_len, ATTR_SECURE);
> +		error = xfs_attr_set(ip, xattr->name,
> +				     strlen(xattr->name),
> +				     xattr->value, xattr->value_len,
> +				     ATTR_SECURE);
>  		if (error < 0)
>  			break;
>  	}
> diff --git a/fs/xfs/xfs_xattr.c b/fs/xfs/xfs_xattr.c
> index 9a63016..3013746 100644
> --- a/fs/xfs/xfs_xattr.c
> +++ b/fs/xfs/xfs_xattr.c
> @@ -26,6 +26,7 @@ xfs_xattr_get(const struct xattr_handler *handler, struct dentry *unused,
>  	int xflags = handler->flags;
>  	struct xfs_inode *ip = XFS_I(inode);
>  	int error, asize = size;
> +	size_t namelen = strlen(name);
>  
>  	/* Convert Linux syscall to XFS internal ATTR flags */
>  	if (!size) {
> @@ -33,7 +34,7 @@ xfs_xattr_get(const struct xattr_handler *handler, struct dentry *unused,
>  		value = NULL;
>  	}
>  
> -	error = xfs_attr_get(ip, (unsigned char *)name, value, &asize, xflags);
> +	error = xfs_attr_get(ip, name, namelen, value, &asize, xflags);
>  	if (error)
>  		return error;
>  	return asize;
> @@ -69,6 +70,7 @@ xfs_xattr_set(const struct xattr_handler *handler, struct dentry *unused,
>  	int			xflags = handler->flags;
>  	struct xfs_inode	*ip = XFS_I(inode);
>  	int			error;
> +	size_t			namelen = strlen(name);
>  
>  	/* Convert Linux syscall to XFS internal ATTR flags */
>  	if (flags & XATTR_CREATE)
> @@ -77,9 +79,9 @@ xfs_xattr_set(const struct xattr_handler *handler, struct dentry *unused,
>  		xflags |= ATTR_REPLACE;
>  
>  	if (!value)
> -		return xfs_attr_remove(ip, (unsigned char *)name, xflags);
> -	error = xfs_attr_set(ip, (unsigned char *)name,
> -				(void *)value, size, xflags);
> +		return xfs_attr_remove(ip, name,
> +				       namelen, xflags);
> +	error = xfs_attr_set(ip, name, namelen, (void *)value, size, xflags);
>  	if (!error)
>  		xfs_forget_acl(inode, name, xflags);
>  
> -- 
> 2.7.4
> 
> 

-- 
Dave Chinner
david@fromorbit.com


* Re: [PATCH 6/9] xfs: Add xfs_has_attr and subroutines
  2019-04-12 22:50 ` [PATCH 6/9] xfs: Add xfs_has_attr and subroutines Allison Henderson
@ 2019-04-15  2:46   ` Su Yue
  2019-04-15 20:13     ` Allison Henderson
  2019-04-22 13:00   ` Brian Foster
  1 sibling, 1 reply; 48+ messages in thread
From: Su Yue @ 2019-04-15  2:46 UTC (permalink / raw)
  To: Allison Henderson, linux-xfs



On 2019/4/13 6:50 AM, Allison Henderson wrote:
> This patch adds a new functions to check for the existence of
> an attribute.  Subroutines are also added to handle the cases
> of leaf blocks, nodes or shortform.  We will need this later
> for delayed attributes since delayed operations cannot return
> error codes.
>
> Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
> ---
>   fs/xfs/libxfs/xfs_attr.c      | 78 +++++++++++++++++++++++++++++++++++++++++++
>   fs/xfs/libxfs/xfs_attr.h      |  1 +
>   fs/xfs/libxfs/xfs_attr_leaf.c | 33 ++++++++++++++++++
>   fs/xfs/libxfs/xfs_attr_leaf.h |  1 +
>   4 files changed, 113 insertions(+)
>
> diff --git a/fs/xfs/libxfs/xfs_attr.c b/fs/xfs/libxfs/xfs_attr.c
> index c3477fa7..0042708 100644
> --- a/fs/xfs/libxfs/xfs_attr.c
> +++ b/fs/xfs/libxfs/xfs_attr.c
> @@ -53,6 +53,7 @@ STATIC int xfs_attr_shortform_addname(xfs_da_args_t *args);
>   STATIC int xfs_attr_leaf_get(xfs_da_args_t *args);
>   STATIC int xfs_attr_leaf_addname(xfs_da_args_t *args, bool roll_trans);
>   STATIC int xfs_attr_leaf_removename(xfs_da_args_t *args, bool roll_trans);
> +STATIC int xfs_leaf_has_attr(xfs_da_args_t *args);
>
>   /*
>    * Internal routines when attribute list is more than one block.
> @@ -60,6 +61,7 @@ STATIC int xfs_attr_leaf_removename(xfs_da_args_t *args, bool roll_trans);
>   STATIC int xfs_attr_node_get(xfs_da_args_t *args);
>   STATIC int xfs_attr_node_addname(xfs_da_args_t *args, bool roll_trans);
>   STATIC int xfs_attr_node_removename(xfs_da_args_t *args, bool roll_trans);
> +STATIC int xfs_attr_node_hasname(xfs_da_args_t *args);
>   STATIC int xfs_attr_fillstate(xfs_da_state_t *state);
>   STATIC int xfs_attr_refillstate(xfs_da_state_t *state);
>
> @@ -301,6 +303,29 @@ xfs_attr_set_args(
>   }
>
>   /*
> + * Return successful if attr is found, or ENOATTR if not
> + */
> +int
> +xfs_has_attr(
> +	struct xfs_da_args      *args)
> +{
> +	struct xfs_inode        *dp = args->dp;
> +	int                     error;
> +
> +	if (!xfs_inode_hasattr(dp))
> +		error = -ENOATTR;
> +	else if (dp->i_d.di_aformat == XFS_DINODE_FMT_LOCAL) {
> +		ASSERT(dp->i_afp->if_flags & XFS_IFINLINE);
> +		error = xfs_shortform_has_attr(args);
> +	} else if (xfs_bmap_one_block(dp, XFS_ATTR_FORK))
> +		error = xfs_leaf_has_attr(args);
> +	else
> +		error = xfs_attr_node_hasname(args);
> +
> +	return error;
> +}
> +
> +/*
>    * Remove the attribute specified in @args.
>    */
>   int
> @@ -836,6 +861,29 @@ xfs_attr_leaf_addname(
>   }
>
>   /*
> + * Return successful if attr is found, or ENOATTR if not
> + */
> +STATIC int
> +xfs_leaf_has_attr(
> +	struct xfs_da_args      *args)
> +{
> +	struct xfs_buf          *bp;
> +	int                     error = 0;
> +
> +	args->blkno = 0;
> +	error = xfs_attr3_leaf_read(args->trans, args->dp,
> +			args->blkno, -1, &bp);
> +	if (error)
> +		return error;
> +
> +	error = xfs_attr3_leaf_lookup_int(bp, args);
> +	error = (error == -ENOATTR) ? -ENOATTR : 0;
> +	xfs_trans_brelse(args->trans, bp);
> +
> +	return error;
> +}
> +
> +/*
>    * Remove a name from the leaf attribute list structure
>    *
>    * This leaf block cannot have a "remote" value, we only call this routine
> @@ -1166,6 +1214,36 @@ xfs_attr_node_addname(
>   }
>
>   /*
> + * Return successful if attr is found, or ENOATTR if not
> + */
> +STATIC int
> +xfs_attr_node_hasname(
> +	struct xfs_da_args	*args)
> +{
> +	struct xfs_da_state	*state;
> +	struct xfs_inode	*dp;
> +	int			retval, error;
> +
> +	/*
> +	 * Tie a string around our finger to remind us where we are.
> +	 */
> +	dp = args->dp;
> +	state = xfs_da_state_alloc();

The corresponding xfs_da_state_free(state) seems to be missing...

---
Su
> +	state->args = args;
> +	state->mp = dp->i_mount;
> +
> +	/*
> +	 * Search to see if name exists, and get back a pointer to it.
> +	 */
> +	error = xfs_da3_node_lookup_int(state, &retval);
> +	if (error || (retval != -EEXIST)) {
> +		if (error == 0)
> +			error = retval;
> +	}
> +	return error;
> +}
> +
> +/*
>    * Remove a name from a B-tree attribute list.
>    *
>    * This will involve walking down the Btree, and may involve joining
> diff --git a/fs/xfs/libxfs/xfs_attr.h b/fs/xfs/libxfs/xfs_attr.h
> index 83b3621..974c963 100644
> --- a/fs/xfs/libxfs/xfs_attr.h
> +++ b/fs/xfs/libxfs/xfs_attr.h
> @@ -168,6 +168,7 @@ int xfs_attr_set_args(struct xfs_da_args *args, struct xfs_buf **leaf_bp,
>   		 bool roll_trans);
>   int xfs_attr_remove(struct xfs_inode *dp, const unsigned char *name,
>   		    size_t namelen, int flags);
> +int xfs_has_attr(struct xfs_da_args *args);
>   int xfs_attr_remove_args(struct xfs_da_args *args, bool roll_trans);
>   int xfs_attr_list(struct xfs_inode *dp, char *buffer, int bufsize,
>   		  int flags, struct attrlist_cursor_kern *cursor);
> diff --git a/fs/xfs/libxfs/xfs_attr_leaf.c b/fs/xfs/libxfs/xfs_attr_leaf.c
> index 128bfe9..e9f2f53 100644
> --- a/fs/xfs/libxfs/xfs_attr_leaf.c
> +++ b/fs/xfs/libxfs/xfs_attr_leaf.c
> @@ -622,6 +622,39 @@ xfs_attr_fork_remove(
>   }
>
>   /*
> + * Return successful if attr is found, or ENOATTR if not
> + */
> +int
> +xfs_shortform_has_attr(
> +	struct xfs_da_args	 *args)
> +{
> +	struct xfs_attr_shortform *sf;
> +	struct xfs_attr_sf_entry *sfe;
> +	int			base = sizeof(struct xfs_attr_sf_hdr);
> +	int			size = 0;
> +	int			end;
> +	int			i;
> +
> +	sf = (struct xfs_attr_shortform *)args->dp->i_afp->if_u1.if_data;
> +	sfe = &sf->list[0];
> +	end = sf->hdr.count;
> +	for (i = 0; i < end; sfe = XFS_ATTR_SF_NEXTENTRY(sfe),
> +			base += size, i++) {
> +		size = XFS_ATTR_SF_ENTSIZE(sfe);
> +		if (sfe->namelen != args->namelen)
> +			continue;
> +		if (memcmp(sfe->nameval, args->name, args->namelen) != 0)
> +			continue;
> +		if (!xfs_attr_namesp_match(args->flags, sfe->flags))
> +			continue;
> +		break;
> +	}
> +	if (i == end)
> +		return -ENOATTR;
> +	return 0;
> +}
> +
> +/*
>    * Remove an attribute from the shortform attribute list structure.
>    */
>   int
> diff --git a/fs/xfs/libxfs/xfs_attr_leaf.h b/fs/xfs/libxfs/xfs_attr_leaf.h
> index 9d830ec..98dd169 100644
> --- a/fs/xfs/libxfs/xfs_attr_leaf.h
> +++ b/fs/xfs/libxfs/xfs_attr_leaf.h
> @@ -39,6 +39,7 @@ int	xfs_attr_shortform_getvalue(struct xfs_da_args *args);
>   int	xfs_attr_shortform_to_leaf(struct xfs_da_args *args,
>   			struct xfs_buf **leaf_bp);
>   int	xfs_attr_shortform_remove(struct xfs_da_args *args);
> +int	xfs_shortform_has_attr(struct xfs_da_args *args);
>   int	xfs_attr_shortform_allfit(struct xfs_buf *bp, struct xfs_inode *dp);
>   int	xfs_attr_shortform_bytesfit(struct xfs_inode *dp, int bytes);
>   xfs_failaddr_t xfs_attr_shortform_verify(struct xfs_inode *ip);
>
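
The shortform check quoted above boils down to a linear scan over packed,
variable-length entries, comparing names by explicit length rather than by
strlen.  A minimal userspace sketch of that idea follows.  It is not the
kernel code: the 3-byte entry header, the demo_ names and DEMO_ENOATTR are
assumptions made for illustration, and the namespace flag check from the
patch is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_ENOATTR	61	/* illustrative "no such attribute" value */

/*
 * Hypothetical packed entry layout: a 3-byte header (namelen, valuelen,
 * flags) followed by namelen name bytes and then valuelen value bytes.
 */
#define DEMO_HDR_SIZE	3

/* Return 0 if (name, namelen) is among the count packed entries. */
int
demo_shortform_has_attr(
	const uint8_t	*list,
	int		count,
	const uint8_t	*name,
	size_t		namelen)
{
	const uint8_t	*ent = list;
	int		i;

	for (i = 0; i < count; i++) {
		uint8_t	ent_namelen = ent[0];
		uint8_t	ent_valuelen = ent[1];

		/* Lengths must match before the bytes are compared. */
		if ((size_t)ent_namelen == namelen &&
		    memcmp(ent + DEMO_HDR_SIZE, name, namelen) == 0)
			return 0;

		/* Step over header, name and value to the next entry. */
		ent += DEMO_HDR_SIZE + ent_namelen + ent_valuelen;
	}
	return -DEMO_ENOATTR;
}
```

Because the comparison uses memcmp with an explicit length, a name that
contains a NUL byte is matched in full instead of being cut short.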


* Re: [PATCH 1/9] xfs: Remove all strlen in all xfs_attr_* functions for attr names.
  2019-04-14 23:02   ` Dave Chinner
@ 2019-04-15 20:08     ` Allison Henderson
  2019-04-15 21:18       ` Dave Chinner
  0 siblings, 1 reply; 48+ messages in thread
From: Allison Henderson @ 2019-04-15 20:08 UTC (permalink / raw)
  To: Dave Chinner; +Cc: linux-xfs



On 4/14/19 4:02 PM, Dave Chinner wrote:
> On Fri, Apr 12, 2019 at 03:50:28PM -0700, Allison Henderson wrote:
>> This helps to pre-simplify the extra handling of the null terminator in
>> delayed operations which use memcpy rather than strlen.  Later
>> when we introduce parent pointers, attribute names will become binary,
>> so strlen will not work at all.  Removing uses of strlen now will
>> help reduce complexities later
>>
>> Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
>> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
> 
> Hmmm. If we are going to pass name/namelen pairs around everywhere,
> can we convert these to a struct xfs_name?
> 
> I also wonder where we should convert the name/namelen pairs in the
> attr args struct to an xfs_name, too, then we can just do:
> 
> 	args->name = *name;
> 
> to copy in the string pointer and the name length in one go.
> 
> And we might even be able to put the attribute flags (e.g.
> ATTR_ROOT) in the name.type field, and get rid of that parameter
> being passed around, too...
> 
> Cheers,
> 
> Dave.

I think a new struct xfs_name is reasonable.  Or maybe a general-purpose
xfs_array?  The value/valuelen parameters have a similar relationship and
could make use of that too.  How would people feel about something like
this:

struct xfs_array {
	unsigned char	*bytes;
	size_t		len;
};

struct xfs_attribute {
	struct xfs_array	name;
	int			flags;
};

We could add the value member in here too.  It tends to get passed 
around just as much with the exception of attr remove operations which 
only need the name.  I think changing the actual members of xfs_da_args 
should be another patch though, since that's a bigger change that 
affects a wider scope of code.  I can look into it though.
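
To make the name/length pairing concrete, here is a minimal userspace
sketch, not kernel code.  The demo_name and demo_args types and both
helpers are hypothetical names invented for illustration; they only mirror
the shape being discussed.  The sketch shows that one struct assignment
copies the pointer and the length together, and that comparing by explicit
length keeps a binary name intact.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical name descriptor: pointer and length travel together. */
struct demo_name {
	const unsigned char	*name;
	size_t			len;
};

/* Hypothetical, heavily trimmed stand-in for an args structure. */
struct demo_args {
	struct demo_name	name;
	int			flags;
};

/* One struct assignment copies the pointer and the length in one go. */
void
demo_args_set_name(
	struct demo_args	*args,
	const struct demo_name	*name,
	int			flags)
{
	args->name = *name;
	args->flags = flags;
}

/* Compare by explicit length so embedded NUL bytes are handled. */
int
demo_name_equal(
	const struct demo_name	*a,
	const struct demo_name	*b)
{
	return a->len == b->len && memcmp(a->name, b->name, a->len) == 0;
}
```

For a binary name made of the four bytes 'a', 0, 'b', 'c', strlen() stops
at the first NUL and reports 1, while the len field still says 4, which is
why the length has to be carried alongside the pointer.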

Also, do I still keep the old Reviewed-by on the patch if the patch is
still going through changes?  I have a few signed-off patches in the
parent pointer set that have changed quite a bit since we started, too.

Thanks!
Allison

> 
>> ---
>>   fs/xfs/libxfs/xfs_attr.c | 12 ++++++++----
>>   fs/xfs/libxfs/xfs_attr.h |  9 ++++++---
>>   fs/xfs/xfs_acl.c         | 12 +++++++-----
>>   fs/xfs/xfs_ioctl.c       | 13 ++++++++++---
>>   fs/xfs/xfs_iops.c        |  6 ++++--
>>   fs/xfs/xfs_xattr.c       | 10 ++++++----
>>   6 files changed, 41 insertions(+), 21 deletions(-)
>>
>> diff --git a/fs/xfs/libxfs/xfs_attr.c b/fs/xfs/libxfs/xfs_attr.c
>> index 2dd9ee2..3da6b0d 100644
>> --- a/fs/xfs/libxfs/xfs_attr.c
>> +++ b/fs/xfs/libxfs/xfs_attr.c
>> @@ -67,6 +67,7 @@ xfs_attr_args_init(
>>   	struct xfs_da_args	*args,
>>   	struct xfs_inode	*dp,
>>   	const unsigned char	*name,
>> +	size_t			namelen,
>>   	int			flags)
>>   {
>>   
>> @@ -79,7 +80,7 @@ xfs_attr_args_init(
>>   	args->dp = dp;
>>   	args->flags = flags;
>>   	args->name = name;
>> -	args->namelen = strlen((const char *)name);
>> +	args->namelen = namelen;
>>   	if (args->namelen >= MAXNAMELEN)
>>   		return -EFAULT;		/* match IRIX behaviour */
>>   
>> @@ -125,6 +126,7 @@ int
>>   xfs_attr_get(
>>   	struct xfs_inode	*ip,
>>   	const unsigned char	*name,
>> +	size_t			namelen,
>>   	unsigned char		*value,
>>   	int			*valuelenp,
>>   	int			flags)
>> @@ -138,7 +140,7 @@ xfs_attr_get(
>>   	if (XFS_FORCED_SHUTDOWN(ip->i_mount))
>>   		return -EIO;
>>   
>> -	error = xfs_attr_args_init(&args, ip, name, flags);
>> +	error = xfs_attr_args_init(&args, ip, name, namelen, flags);
>>   	if (error)
>>   		return error;
>>   
>> @@ -317,6 +319,7 @@ int
>>   xfs_attr_set(
>>   	struct xfs_inode	*dp,
>>   	const unsigned char	*name,
>> +	size_t			namelen,
>>   	unsigned char		*value,
>>   	int			valuelen,
>>   	int			flags)
>> @@ -333,7 +336,7 @@ xfs_attr_set(
>>   	if (XFS_FORCED_SHUTDOWN(dp->i_mount))
>>   		return -EIO;
>>   
>> -	error = xfs_attr_args_init(&args, dp, name, flags);
>> +	error = xfs_attr_args_init(&args, dp, name, namelen, flags);
>>   	if (error)
>>   		return error;
>>   
>> @@ -425,6 +428,7 @@ int
>>   xfs_attr_remove(
>>   	struct xfs_inode	*dp,
>>   	const unsigned char	*name,
>> +	size_t			namelen,
>>   	int			flags)
>>   {
>>   	struct xfs_mount	*mp = dp->i_mount;
>> @@ -436,7 +440,7 @@ xfs_attr_remove(
>>   	if (XFS_FORCED_SHUTDOWN(dp->i_mount))
>>   		return -EIO;
>>   
>> -	error = xfs_attr_args_init(&args, dp, name, flags);
>> +	error = xfs_attr_args_init(&args, dp, name, namelen, flags);
>>   	if (error)
>>   		return error;
>>   
>> diff --git a/fs/xfs/libxfs/xfs_attr.h b/fs/xfs/libxfs/xfs_attr.h
>> index 2297d84..52f63dc 100644
>> --- a/fs/xfs/libxfs/xfs_attr.h
>> +++ b/fs/xfs/libxfs/xfs_attr.h
>> @@ -137,11 +137,14 @@ int xfs_attr_list_int(struct xfs_attr_list_context *);
>>   int xfs_inode_hasattr(struct xfs_inode *ip);
>>   int xfs_attr_get_ilocked(struct xfs_inode *ip, struct xfs_da_args *args);
>>   int xfs_attr_get(struct xfs_inode *ip, const unsigned char *name,
>> -		 unsigned char *value, int *valuelenp, int flags);
>> +		 size_t namelen, unsigned char *value, int *valuelenp,
>> +		 int flags);
>>   int xfs_attr_set(struct xfs_inode *dp, const unsigned char *name,
>> -		 unsigned char *value, int valuelen, int flags);
>> +		 size_t namelen, unsigned char *value, int valuelen,
>> +		 int flags);
>>   int xfs_attr_set_args(struct xfs_da_args *args, struct xfs_buf **leaf_bp);
>> -int xfs_attr_remove(struct xfs_inode *dp, const unsigned char *name, int flags);
>> +int xfs_attr_remove(struct xfs_inode *dp, const unsigned char *name,
>> +		    size_t namelen, int flags);
>>   int xfs_attr_remove_args(struct xfs_da_args *args);
>>   int xfs_attr_list(struct xfs_inode *dp, char *buffer, int bufsize,
>>   		  int flags, struct attrlist_cursor_kern *cursor);
>> diff --git a/fs/xfs/xfs_acl.c b/fs/xfs/xfs_acl.c
>> index 8039e35..142de8d 100644
>> --- a/fs/xfs/xfs_acl.c
>> +++ b/fs/xfs/xfs_acl.c
>> @@ -141,8 +141,8 @@ xfs_get_acl(struct inode *inode, int type)
>>   	if (!xfs_acl)
>>   		return ERR_PTR(-ENOMEM);
>>   
>> -	error = xfs_attr_get(ip, ea_name, (unsigned char *)xfs_acl,
>> -							&len, ATTR_ROOT);
>> +	error = xfs_attr_get(ip, ea_name, strlen(ea_name),
>> +			     (unsigned char *)xfs_acl, &len, ATTR_ROOT);
>>   	if (error) {
>>   		/*
>>   		 * If the attribute doesn't exist make sure we have a negative
>> @@ -192,15 +192,17 @@ __xfs_set_acl(struct inode *inode, struct posix_acl *acl, int type)
>>   		len -= sizeof(struct xfs_acl_entry) *
>>   			 (XFS_ACL_MAX_ENTRIES(ip->i_mount) - acl->a_count);
>>   
>> -		error = xfs_attr_set(ip, ea_name, (unsigned char *)xfs_acl,
>> -				len, ATTR_ROOT);
>> +		error = xfs_attr_set(ip, ea_name, strlen(ea_name),
>> +				     (unsigned char *)xfs_acl, len, ATTR_ROOT);
>>   
>>   		kmem_free(xfs_acl);
>>   	} else {
>>   		/*
>>   		 * A NULL ACL argument means we want to remove the ACL.
>>   		 */
>> -		error = xfs_attr_remove(ip, ea_name, ATTR_ROOT);
>> +		error = xfs_attr_remove(ip, ea_name,
>> +					strlen(ea_name),
>> +					ATTR_ROOT);
>>   
>>   		/*
>>   		 * If the attribute didn't exist to start with that's fine.
>> diff --git a/fs/xfs/xfs_ioctl.c b/fs/xfs/xfs_ioctl.c
>> index 6ecdbb3..ab341d6 100644
>> --- a/fs/xfs/xfs_ioctl.c
>> +++ b/fs/xfs/xfs_ioctl.c
>> @@ -437,6 +437,7 @@ xfs_attrmulti_attr_get(
>>   {
>>   	unsigned char		*kbuf;
>>   	int			error = -EFAULT;
>> +	size_t			namelen;
>>   
>>   	if (*len > XFS_XATTR_SIZE_MAX)
>>   		return -EINVAL;
>> @@ -444,7 +445,9 @@ xfs_attrmulti_attr_get(
>>   	if (!kbuf)
>>   		return -ENOMEM;
>>   
>> -	error = xfs_attr_get(XFS_I(inode), name, kbuf, (int *)len, flags);
>> +	namelen = strlen(name);
>> +	error = xfs_attr_get(XFS_I(inode), name, namelen,
>> +			     kbuf, (int *)len, flags);
>>   	if (error)
>>   		goto out_kfree;
>>   
>> @@ -466,6 +469,7 @@ xfs_attrmulti_attr_set(
>>   {
>>   	unsigned char		*kbuf;
>>   	int			error;
>> +	size_t			namelen;
>>   
>>   	if (IS_IMMUTABLE(inode) || IS_APPEND(inode))
>>   		return -EPERM;
>> @@ -476,7 +480,8 @@ xfs_attrmulti_attr_set(
>>   	if (IS_ERR(kbuf))
>>   		return PTR_ERR(kbuf);
>>   
>> -	error = xfs_attr_set(XFS_I(inode), name, kbuf, len, flags);
>> +	namelen = strlen(name);
>> +	error = xfs_attr_set(XFS_I(inode), name, namelen, kbuf, len, flags);
>>   	if (!error)
>>   		xfs_forget_acl(inode, name, flags);
>>   	kfree(kbuf);
>> @@ -490,10 +495,12 @@ xfs_attrmulti_attr_remove(
>>   	uint32_t		flags)
>>   {
>>   	int			error;
>> +	size_t			namelen;
>>   
>>   	if (IS_IMMUTABLE(inode) || IS_APPEND(inode))
>>   		return -EPERM;
>> -	error = xfs_attr_remove(XFS_I(inode), name, flags);
>> +	namelen = strlen(name);
>> +	error = xfs_attr_remove(XFS_I(inode), name, namelen, flags);
>>   	if (!error)
>>   		xfs_forget_acl(inode, name, flags);
>>   	return error;
>> diff --git a/fs/xfs/xfs_iops.c b/fs/xfs/xfs_iops.c
>> index 74047bd..e73c21a 100644
>> --- a/fs/xfs/xfs_iops.c
>> +++ b/fs/xfs/xfs_iops.c
>> @@ -59,8 +59,10 @@ xfs_initxattrs(
>>   	int			error = 0;
>>   
>>   	for (xattr = xattr_array; xattr->name != NULL; xattr++) {
>> -		error = xfs_attr_set(ip, xattr->name, xattr->value,
>> -				      xattr->value_len, ATTR_SECURE);
>> +		error = xfs_attr_set(ip, xattr->name,
>> +				     strlen(xattr->name),
>> +				     xattr->value, xattr->value_len,
>> +				     ATTR_SECURE);
>>   		if (error < 0)
>>   			break;
>>   	}
>> diff --git a/fs/xfs/xfs_xattr.c b/fs/xfs/xfs_xattr.c
>> index 9a63016..3013746 100644
>> --- a/fs/xfs/xfs_xattr.c
>> +++ b/fs/xfs/xfs_xattr.c
>> @@ -26,6 +26,7 @@ xfs_xattr_get(const struct xattr_handler *handler, struct dentry *unused,
>>   	int xflags = handler->flags;
>>   	struct xfs_inode *ip = XFS_I(inode);
>>   	int error, asize = size;
>> +	size_t namelen = strlen(name);
>>   
>>   	/* Convert Linux syscall to XFS internal ATTR flags */
>>   	if (!size) {
>> @@ -33,7 +34,7 @@ xfs_xattr_get(const struct xattr_handler *handler, struct dentry *unused,
>>   		value = NULL;
>>   	}
>>   
>> -	error = xfs_attr_get(ip, (unsigned char *)name, value, &asize, xflags);
>> +	error = xfs_attr_get(ip, name, namelen, value, &asize, xflags);
>>   	if (error)
>>   		return error;
>>   	return asize;
>> @@ -69,6 +70,7 @@ xfs_xattr_set(const struct xattr_handler *handler, struct dentry *unused,
>>   	int			xflags = handler->flags;
>>   	struct xfs_inode	*ip = XFS_I(inode);
>>   	int			error;
>> +	size_t			namelen = strlen(name);
>>   
>>   	/* Convert Linux syscall to XFS internal ATTR flags */
>>   	if (flags & XATTR_CREATE)
>> @@ -77,9 +79,9 @@ xfs_xattr_set(const struct xattr_handler *handler, struct dentry *unused,
>>   		xflags |= ATTR_REPLACE;
>>   
>>   	if (!value)
>> -		return xfs_attr_remove(ip, (unsigned char *)name, xflags);
>> -	error = xfs_attr_set(ip, (unsigned char *)name,
>> -				(void *)value, size, xflags);
>> +		return xfs_attr_remove(ip, name,
>> +				       namelen, xflags);
>> +	error = xfs_attr_set(ip, name, namelen, (void *)value, size, xflags);
>>   	if (!error)
>>   		xfs_forget_acl(inode, name, xflags);
>>   
>> -- 
>> 2.7.4
>>
>>
> 


* Re: [PATCH 6/9] xfs: Add xfs_has_attr and subroutines
  2019-04-15  2:46   ` Su Yue
@ 2019-04-15 20:13     ` Allison Henderson
  0 siblings, 0 replies; 48+ messages in thread
From: Allison Henderson @ 2019-04-15 20:13 UTC (permalink / raw)
  To: Su Yue, linux-xfs

On 4/14/19 7:46 PM, Su Yue wrote:
> 
> 
> On 2019/4/13 6:50 AM, Allison Henderson wrote:
>> This patch adds a new functions to check for the existence of
>> an attribute.  Subroutines are also added to handle the cases
>> of leaf blocks, nodes or shortform.  We will need this later
>> for delayed attributes since delayed operations cannot return
>> error codes.
>>
>> Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
>> ---
>>   fs/xfs/libxfs/xfs_attr.c      | 78 
>> +++++++++++++++++++++++++++++++++++++++++++
>>   fs/xfs/libxfs/xfs_attr.h      |  1 +
>>   fs/xfs/libxfs/xfs_attr_leaf.c | 33 ++++++++++++++++++
>>   fs/xfs/libxfs/xfs_attr_leaf.h |  1 +
>>   4 files changed, 113 insertions(+)
>>
>> diff --git a/fs/xfs/libxfs/xfs_attr.c b/fs/xfs/libxfs/xfs_attr.c
>> index c3477fa7..0042708 100644
>> --- a/fs/xfs/libxfs/xfs_attr.c
>> +++ b/fs/xfs/libxfs/xfs_attr.c
>> @@ -53,6 +53,7 @@ STATIC int xfs_attr_shortform_addname(xfs_da_args_t 
>> *args);
>>   STATIC int xfs_attr_leaf_get(xfs_da_args_t *args);
>>   STATIC int xfs_attr_leaf_addname(xfs_da_args_t *args, bool roll_trans);
>>   STATIC int xfs_attr_leaf_removename(xfs_da_args_t *args, bool 
>> roll_trans);
>> +STATIC int xfs_leaf_has_attr(xfs_da_args_t *args);
>>
>>   /*
>>    * Internal routines when attribute list is more than one block.
>> @@ -60,6 +61,7 @@ STATIC int xfs_attr_leaf_removename(xfs_da_args_t 
>> *args, bool roll_trans);
>>   STATIC int xfs_attr_node_get(xfs_da_args_t *args);
>>   STATIC int xfs_attr_node_addname(xfs_da_args_t *args, bool roll_trans);
>>   STATIC int xfs_attr_node_removename(xfs_da_args_t *args, bool 
>> roll_trans);
>> +STATIC int xfs_attr_node_hasname(xfs_da_args_t *args);
>>   STATIC int xfs_attr_fillstate(xfs_da_state_t *state);
>>   STATIC int xfs_attr_refillstate(xfs_da_state_t *state);
>>
>> @@ -301,6 +303,29 @@ xfs_attr_set_args(
>>   }
>>
>>   /*
>> + * Return successful if attr is found, or ENOATTR if not
>> + */
>> +int
>> +xfs_has_attr(
>> +    struct xfs_da_args      *args)
>> +{
>> +    struct xfs_inode        *dp = args->dp;
>> +    int                     error;
>> +
>> +    if (!xfs_inode_hasattr(dp))
>> +        error = -ENOATTR;
>> +    else if (dp->i_d.di_aformat == XFS_DINODE_FMT_LOCAL) {
>> +        ASSERT(dp->i_afp->if_flags & XFS_IFINLINE);
>> +        error = xfs_shortform_has_attr(args);
>> +    } else if (xfs_bmap_one_block(dp, XFS_ATTR_FORK))
>> +        error = xfs_leaf_has_attr(args);
>> +    else
>> +        error = xfs_attr_node_hasname(args);
>> +
>> +    return error;
>> +}
>> +
>> +/*
>>    * Remove the attribute specified in @args.
>>    */
>>   int
>> @@ -836,6 +861,29 @@ xfs_attr_leaf_addname(
>>   }
>>
>>   /*
>> + * Return successful if attr is found, or ENOATTR if not
>> + */
>> +STATIC int
>> +xfs_leaf_has_attr(
>> +    struct xfs_da_args      *args)
>> +{
>> +    struct xfs_buf          *bp;
>> +    int                     error = 0;
>> +
>> +    args->blkno = 0;
>> +    error = xfs_attr3_leaf_read(args->trans, args->dp,
>> +            args->blkno, -1, &bp);
>> +    if (error)
>> +        return error;
>> +
>> +    error = xfs_attr3_leaf_lookup_int(bp, args);
>> +    error = (error == -ENOATTR) ? -ENOATTR : 0;
>> +    xfs_trans_brelse(args->trans, bp);
>> +
>> +    return error;
>> +}
>> +
>> +/*
>>    * Remove a name from the leaf attribute list structure
>>    *
>>    * This leaf block cannot have a "remote" value, we only call this 
>> routine
>> @@ -1166,6 +1214,36 @@ xfs_attr_node_addname(
>>   }
>>
>>   /*
>> + * Return successful if attr is found, or ENOATTR if not
>> + */
>> +STATIC int
>> +xfs_attr_node_hasname(
>> +    struct xfs_da_args    *args)
>> +{
>> +    struct xfs_da_state    *state;
>> +    struct xfs_inode    *dp;
>> +    int            retval, error;
>> +
>> +    /*
>> +     * Tie a string around our finger to remind us where we are.
>> +     */
>> +    dp = args->dp;
>> +    state = xfs_da_state_alloc();
> 
> The corresponding xfs_da_state_free(state) seems lost...
> 
> ---
> Su

Good catch, you are right.  I will add that in the next version.

Thanks!
Allison
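
For reference, the alloc/free pairing pointed out above can be sketched in
userspace as follows.  This is only an illustration under stated
assumptions: the demo_ types and functions, the mock lookup and the
DEMO_ENOATTR value are all hypothetical and are not the kernel API.  The
point is the single exit path that releases the state whether or not the
name was found.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

#define DEMO_ENOATTR	61	/* illustrative "no such attribute" value */

/* Hypothetical stand-in for the lookup state; not the kernel structure. */
struct demo_state {
	const char	*name;
};

int demo_live_states;	/* allocations that have not been freed yet */

struct demo_state *
demo_state_alloc(void)
{
	struct demo_state	*state = calloc(1, sizeof(*state));

	if (state)
		demo_live_states++;
	return state;
}

void
demo_state_free(
	struct demo_state	*state)
{
	if (!state)
		return;
	free(state);
	demo_live_states--;
}

/* Mock lookup: reports -EEXIST in *retval when the name is found. */
int
demo_node_lookup(
	struct demo_state	*state,
	int			*retval)
{
	*retval = strcmp(state->name, "user.present") == 0 ?
			-EEXIST : -DEMO_ENOATTR;
	return 0;
}

/* Return 0 if the name exists, a negative error otherwise. */
int
demo_node_hasname(
	const char		*name)
{
	struct demo_state	*state;
	int			retval, error;

	state = demo_state_alloc();
	if (!state)
		return -ENOMEM;
	state->name = name;

	error = demo_node_lookup(state, &retval);
	if (!error && retval != -EEXIST)
		error = retval;

	/* Single exit path: the state is released on success and failure. */
	demo_state_free(state);
	return error;
}
```

The demo_live_states counter exists only so the pairing can be checked; it
should be back at zero after any number of lookups.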

>> +    state->args = args;
>> +    state->mp = dp->i_mount;
>> +
>> +    /*
>> +     * Search to see if name exists, and get back a pointer to it.
>> +     */
>> +    error = xfs_da3_node_lookup_int(state, &retval);
>> +    if (error || (retval != -EEXIST)) {
>> +        if (error == 0)
>> +            error = retval;
>> +    }
>> +    return error;
>> +}
>> +
>> +/*
>>    * Remove a name from a B-tree attribute list.
>>    *
>>    * This will involve walking down the Btree, and may involve joining
>> diff --git a/fs/xfs/libxfs/xfs_attr.h b/fs/xfs/libxfs/xfs_attr.h
>> index 83b3621..974c963 100644
>> --- a/fs/xfs/libxfs/xfs_attr.h
>> +++ b/fs/xfs/libxfs/xfs_attr.h
>> @@ -168,6 +168,7 @@ int xfs_attr_set_args(struct xfs_da_args *args, 
>> struct xfs_buf **leaf_bp,
>>            bool roll_trans);
>>   int xfs_attr_remove(struct xfs_inode *dp, const unsigned char *name,
>>               size_t namelen, int flags);
>> +int xfs_has_attr(struct xfs_da_args *args);
>>   int xfs_attr_remove_args(struct xfs_da_args *args, bool roll_trans);
>>   int xfs_attr_list(struct xfs_inode *dp, char *buffer, int bufsize,
>>             int flags, struct attrlist_cursor_kern *cursor);
>> diff --git a/fs/xfs/libxfs/xfs_attr_leaf.c 
>> b/fs/xfs/libxfs/xfs_attr_leaf.c
>> index 128bfe9..e9f2f53 100644
>> --- a/fs/xfs/libxfs/xfs_attr_leaf.c
>> +++ b/fs/xfs/libxfs/xfs_attr_leaf.c
>> @@ -622,6 +622,39 @@ xfs_attr_fork_remove(
>>   }
>>
>>   /*
>> + * Return successful if attr is found, or ENOATTR if not
>> + */
>> +int
>> +xfs_shortform_has_attr(
>> +    struct xfs_da_args     *args)
>> +{
>> +    struct xfs_attr_shortform *sf;
>> +    struct xfs_attr_sf_entry *sfe;
>> +    int            base = sizeof(struct xfs_attr_sf_hdr);
>> +    int            size = 0;
>> +    int            end;
>> +    int            i;
>> +
>> +    sf = (struct xfs_attr_shortform *)args->dp->i_afp->if_u1.if_data;
>> +    sfe = &sf->list[0];
>> +    end = sf->hdr.count;
>> +    for (i = 0; i < end; sfe = XFS_ATTR_SF_NEXTENTRY(sfe),
>> +            base += size, i++) {
>> +        size = XFS_ATTR_SF_ENTSIZE(sfe);
>> +        if (sfe->namelen != args->namelen)
>> +            continue;
>> +        if (memcmp(sfe->nameval, args->name, args->namelen) != 0)
>> +            continue;
>> +        if (!xfs_attr_namesp_match(args->flags, sfe->flags))
>> +            continue;
>> +        break;
>> +    }
>> +    if (i == end)
>> +        return -ENOATTR;
>> +    return 0;
>> +}
>> +
>> +/*
>>    * Remove an attribute from the shortform attribute list structure.
>>    */
>>   int
>> diff --git a/fs/xfs/libxfs/xfs_attr_leaf.h 
>> b/fs/xfs/libxfs/xfs_attr_leaf.h
>> index 9d830ec..98dd169 100644
>> --- a/fs/xfs/libxfs/xfs_attr_leaf.h
>> +++ b/fs/xfs/libxfs/xfs_attr_leaf.h
>> @@ -39,6 +39,7 @@ int    xfs_attr_shortform_getvalue(struct 
>> xfs_da_args *args);
>>   int    xfs_attr_shortform_to_leaf(struct xfs_da_args *args,
>>               struct xfs_buf **leaf_bp);
>>   int    xfs_attr_shortform_remove(struct xfs_da_args *args);
>> +int    xfs_shortform_has_attr(struct xfs_da_args *args);
>>   int    xfs_attr_shortform_allfit(struct xfs_buf *bp, struct 
>> xfs_inode *dp);
>>   int    xfs_attr_shortform_bytesfit(struct xfs_inode *dp, int bytes);
>>   xfs_failaddr_t xfs_attr_shortform_verify(struct xfs_inode *ip);
>>


* Re: [PATCH 1/9] xfs: Remove all strlen in all xfs_attr_* functions for attr names.
  2019-04-15 20:08     ` Allison Henderson
@ 2019-04-15 21:18       ` Dave Chinner
  2019-04-16  1:33         ` Allison Henderson
  0 siblings, 1 reply; 48+ messages in thread
From: Dave Chinner @ 2019-04-15 21:18 UTC (permalink / raw)
  To: Allison Henderson; +Cc: linux-xfs

On Mon, Apr 15, 2019 at 01:08:50PM -0700, Allison Henderson wrote:
> On 4/14/19 4:02 PM, Dave Chinner wrote:
> > On Fri, Apr 12, 2019 at 03:50:28PM -0700, Allison Henderson wrote:
> > > This helps to pre-simplify the extra handling of the null terminator in
> > > delayed operations which use memcpy rather than strlen.  Later
> > > when we introduce parent pointers, attribute names will become binary,
> > > so strlen will not work at all.  Removing uses of strlen now will
> > > help reduce complexities later
> > > 
> > > Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
> > > Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
> > 
> > Hmmm. If we are going to pass name/namelen pairs around everywhere,
> > can we convert these to a struct xfs_name?
> > 
> > I also wonder where we should convert the name/namelen pairs in the
> > attr args struct to an xfs_name, too, then we can just do:
> > 
> > 	args->name = *name;
> > 
> > to copy in the string pointer and the name length in one go.
> > 
> > And we might even be able to put the attribute flags (e.g.
> > ATTR_ROOT) in the name.type field, and get rid of that parameter
> > being passed around, too...
> > 
> > Cheers,
> > 
> > Dave.
> 
> I think a new struct xfs_name is reasonable.  Or maybe a general purpose
> xfs_array?  The value/valuelen parameters have a similar relation that could
> make use of that too.  How would people feel about something like this:
> 
> struct xfs_array {
> 	unsigned char *bytes;
> 	size_t		len;
> }
> 
> struct xfs_attribute {
> 	struct xfs_array	name;
> 	int			flags;
> }

Hmmm. I think that's overkill - we really don't need that much
abstraction and it makes the code more complex rather than
simplifies it. i.e. args->name.name.bytes isn't really a
simplification...

Also the above means we have to discard the const part of the names
we are passed because this construction doesn't appear to be
intended for read only strings. i.e.:

> We could add the value member in here too. 

Which is something that isn't const. :)

> It tends to get passed around
> just as much with the exception of attr remove operations which only need
> the name.  I think changing the actual members of xfs_da_args should be
> another patch though, since that's a bigger change that affects a wider
> scope of code.  I can look into it though.

I don't think we want anything more complex, or to extend it to
include other parts of the attr API - that just means we have a
special snowflake for the API rather than a generic way of encoding
a name string across all the internal interfaces that pass a name/len
pair...

Really, I was just suggesting using the struct xfs_name because
it already exists and encodes/documents exactly what we are passing
here, hence getting rid of some of the mess around passing the attr
names around.
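
As a rough illustration of the point above, here is a minimal userspace
sketch. The struct below is a stand-in that mirrors the shape of the
existing struct xfs_name (name pointer, length, type); demo_args and
demo_args_set_name() are hypothetical names made up for this example and
are not part of the attr code. It only shows that a single struct
assignment carries the const pointer and the length together.

```c
#include <assert.h>

/*
 * Userspace stand-in mirroring the shape of the kernel's struct xfs_name.
 * This is an illustration, not the kernel header itself.
 */
struct xfs_name {
	const unsigned char	*name;	/* read-only name bytes */
	int			len;	/* number of bytes in name */
	int			type;	/* could carry flags such as ATTR_ROOT */
};

/* Hypothetical, cut-down args structure used only for this sketch. */
struct demo_args {
	struct xfs_name		name;
};

/* One struct assignment copies the pointer, length and type in one go. */
static void demo_args_set_name(struct demo_args *args,
			       const struct xfs_name *name)
{
	args->name = *name;
}
```

Because the name member stays a const pointer, callers that pass
read-only names would not need to cast the const away.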

> Also, do I still keep the old reviewed-by on the patch if the patch is still
> going through changes?  I have a few signed-off patches in the parent
> pointer set that have changed quite a bit since we've started too.

If the changes are significant, then it needs review again and so
you should drop it.

But I'd do the name/len -> xfs_name conversion as a separate patch,
so this patch doesn't need changing.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

* Re: [PATCH 7/9] xfs: Add attr context to log item
* Re: [PATCH 7/9] xfs: Add attr context to log item
  2019-04-12 22:50 ` [PATCH 7/9] xfs: Add attr context to log item Allison Henderson
@ 2019-04-15 22:50   ` Darrick J. Wong
  2019-04-16  2:30     ` Allison Henderson
  2019-04-22 13:03   ` Brian Foster
  1 sibling, 1 reply; 48+ messages in thread
From: Darrick J. Wong @ 2019-04-15 22:50 UTC (permalink / raw)
  To: Allison Henderson; +Cc: linux-xfs

On Fri, Apr 12, 2019 at 03:50:34PM -0700, Allison Henderson wrote:
> This patch modifies xfs_attr_item to store a xfs_da_args, a xfs_buf pointer
> and a new state type. We will use these in the next patch when
> we modify xfs_set_attr_args to roll transactions by returning EAGAIN.
> Because the subroutines of this function modify the contents of these
> structures, we need to find a place to store them where they remain
> instantiated across multiple calls to xfs_set_attr_args.
> 
> Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
> ---
>  fs/xfs/libxfs/xfs_attr.h | 18 +++++++++++++++++-
>  fs/xfs/scrub/common.c    |  2 ++
>  fs/xfs/xfs_acl.c         |  2 ++
>  fs/xfs/xfs_attr_item.c   |  2 +-
>  fs/xfs/xfs_ioctl.c       |  2 ++
>  fs/xfs/xfs_ioctl32.c     |  2 ++
>  fs/xfs/xfs_iops.c        |  1 +
>  fs/xfs/xfs_xattr.c       |  1 +
>  8 files changed, 28 insertions(+), 2 deletions(-)
> 
> diff --git a/fs/xfs/libxfs/xfs_attr.h b/fs/xfs/libxfs/xfs_attr.h
> index 974c963..4ce3b0a 100644
> --- a/fs/xfs/libxfs/xfs_attr.h
> +++ b/fs/xfs/libxfs/xfs_attr.h
> @@ -77,6 +77,13 @@ typedef struct attrlist_ent {	/* data from attr_list() */
>  	char	a_name[1];	/* attr name (NULL terminated) */
>  } attrlist_ent_t;
>  
> +/* Attr state machine types */
> +enum xfs_attr_state {
> +	XFS_ATTR_STATE1 = 1,
> +	XFS_ATTR_STATE2 = 2,
> +	XFS_ATTR_STATE3 = 3,

Um... to what states do these refer?

> +};
> +
>  /*
>   * List of attrs to commit later.
>   */
> @@ -88,7 +95,16 @@ struct xfs_attr_item {
>  	void		  *xattri_name;	      /* attr name */
>  	uint32_t	  xattri_name_len;    /* length of name */
>  	uint32_t	  xattri_flags;       /* attr flags */
> -	struct list_head  xattri_list;
> +
> +	/*
> +	 * Delayed attr parameters that need to remain instantiated
> +	 * across transaction rolls during the defer finish
> +	 */
> +	struct xfs_buf		*xattri_leaf_bp;  /* Leaf buf to release */
> +	enum xfs_attr_state	xattri_state;	  /* state machine marker */
> +	struct xfs_da_args	xattri_args;	  /* args context */

Assuming we're keeping xattri_args.trans up to date here?

> +
> +	struct list_head	xattri_list;

What's this for?

--D

>  
>  	/*
>  	 * A byte array follows the header containing the file name and
> diff --git a/fs/xfs/scrub/common.c b/fs/xfs/scrub/common.c
> index 0c54ff5..270c32e 100644
> --- a/fs/xfs/scrub/common.c
> +++ b/fs/xfs/scrub/common.c
> @@ -30,6 +30,8 @@
>  #include "xfs_rmap_btree.h"
>  #include "xfs_log.h"
>  #include "xfs_trans_priv.h"
> +#include "xfs_da_format.h"
> +#include "xfs_da_btree.h"
>  #include "xfs_attr.h"
>  #include "xfs_reflink.h"
>  #include "scrub/xfs_scrub.h"
> diff --git a/fs/xfs/xfs_acl.c b/fs/xfs/xfs_acl.c
> index 142de8d..9b1b93e 100644
> --- a/fs/xfs/xfs_acl.c
> +++ b/fs/xfs/xfs_acl.c
> @@ -10,6 +10,8 @@
>  #include "xfs_mount.h"
>  #include "xfs_inode.h"
>  #include "xfs_acl.h"
> +#include "xfs_da_format.h"
> +#include "xfs_da_btree.h"
>  #include "xfs_attr.h"
>  #include "xfs_trace.h"
>  #include <linux/slab.h>
> diff --git a/fs/xfs/xfs_attr_item.c b/fs/xfs/xfs_attr_item.c
> index 0ea19b4..36e6d1e 100644
> --- a/fs/xfs/xfs_attr_item.c
> +++ b/fs/xfs/xfs_attr_item.c
> @@ -19,10 +19,10 @@
>  #include "xfs_rmap.h"
>  #include "xfs_inode.h"
>  #include "xfs_icache.h"
> -#include "xfs_attr.h"
>  #include "xfs_shared.h"
>  #include "xfs_da_format.h"
>  #include "xfs_da_btree.h"
> +#include "xfs_attr.h"
>  
>  static inline struct xfs_attri_log_item *ATTRI_ITEM(struct xfs_log_item *lip)
>  {
> diff --git a/fs/xfs/xfs_ioctl.c b/fs/xfs/xfs_ioctl.c
> index ab341d6..c8728ca 100644
> --- a/fs/xfs/xfs_ioctl.c
> +++ b/fs/xfs/xfs_ioctl.c
> @@ -16,6 +16,8 @@
>  #include "xfs_rtalloc.h"
>  #include "xfs_itable.h"
>  #include "xfs_error.h"
> +#include "xfs_da_format.h"
> +#include "xfs_da_btree.h"
>  #include "xfs_attr.h"
>  #include "xfs_bmap.h"
>  #include "xfs_bmap_util.h"
> diff --git a/fs/xfs/xfs_ioctl32.c b/fs/xfs/xfs_ioctl32.c
> index 5001dca..23f6990 100644
> --- a/fs/xfs/xfs_ioctl32.c
> +++ b/fs/xfs/xfs_ioctl32.c
> @@ -21,6 +21,8 @@
>  #include "xfs_fsops.h"
>  #include "xfs_alloc.h"
>  #include "xfs_rtalloc.h"
> +#include "xfs_da_format.h"
> +#include "xfs_da_btree.h"
>  #include "xfs_attr.h"
>  #include "xfs_ioctl.h"
>  #include "xfs_ioctl32.h"
> diff --git a/fs/xfs/xfs_iops.c b/fs/xfs/xfs_iops.c
> index e73c21a..561c467 100644
> --- a/fs/xfs/xfs_iops.c
> +++ b/fs/xfs/xfs_iops.c
> @@ -17,6 +17,7 @@
>  #include "xfs_acl.h"
>  #include "xfs_quota.h"
>  #include "xfs_error.h"
> +#include "xfs_da_btree.h"
>  #include "xfs_attr.h"
>  #include "xfs_trans.h"
>  #include "xfs_trace.h"
> diff --git a/fs/xfs/xfs_xattr.c b/fs/xfs/xfs_xattr.c
> index 3013746..938e81d 100644
> --- a/fs/xfs/xfs_xattr.c
> +++ b/fs/xfs/xfs_xattr.c
> @@ -11,6 +11,7 @@
>  #include "xfs_mount.h"
>  #include "xfs_da_format.h"
>  #include "xfs_inode.h"
> +#include "xfs_da_btree.h"
>  #include "xfs_attr.h"
>  #include "xfs_attr_leaf.h"
>  #include "xfs_acl.h"
> -- 
> 2.7.4
> 

* Re: [PATCH 8/9] xfs: Roll delayed attr operations by returning EAGAIN
  2019-04-12 22:50 ` [PATCH 8/9] xfs: Roll delayed attr operations by returning EAGAIN Allison Henderson
@ 2019-04-15 23:31   ` Darrick J. Wong
  2019-04-16 19:54     ` Allison Henderson
  2019-04-23 14:19   ` Brian Foster
  1 sibling, 1 reply; 48+ messages in thread
From: Darrick J. Wong @ 2019-04-15 23:31 UTC (permalink / raw)
  To: Allison Henderson; +Cc: linux-xfs

On Fri, Apr 12, 2019 at 03:50:35PM -0700, Allison Henderson wrote:
> This patch modifies xfs_attr_set_args to return -EAGAIN
> when a transaction needs to be rolled.  All functions
> currently calling xfs_attr_set_args are modified to use
> the deferred attr operation, or handle the -EAGAIN return
> code
> 
> Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
> ---
>  fs/xfs/libxfs/xfs_attr.c | 62 ++++++++++++++++++++++++++++++++++++++++--------
>  fs/xfs/libxfs/xfs_attr.h |  2 +-
>  fs/xfs/xfs_attr_item.c   | 41 +++++++++++++++++++++++++++-----
>  fs/xfs/xfs_trans.h       |  2 ++
>  fs/xfs/xfs_trans_attr.c  | 56 +++++++++++++++++++++++++------------------
>  5 files changed, 123 insertions(+), 40 deletions(-)
> 
> diff --git a/fs/xfs/libxfs/xfs_attr.c b/fs/xfs/libxfs/xfs_attr.c
> index 0042708..4ddd86b 100644
> --- a/fs/xfs/libxfs/xfs_attr.c
> +++ b/fs/xfs/libxfs/xfs_attr.c
> @@ -236,10 +236,37 @@ int
>  xfs_attr_set_args(
>  	struct xfs_da_args	*args,
>  	struct xfs_buf          **leaf_bp,
> +	enum xfs_attr_state	*state,
>  	bool			roll_trans)
>  {
>  	struct xfs_inode	*dp = args->dp;
>  	int			error = 0;
> +	int			sf_size;
> +
> +	switch (*state) {
> +	case (XFS_ATTR_STATE1):
> +		goto state1;
> +	case (XFS_ATTR_STATE2):
> +		goto state2;
> +	case (XFS_ATTR_STATE3):
> +		goto state3;
> +	}

I still don't understand what these three states are, though evidently
if we get to this line then we weren't in any of the three possible
states?

XFS_ATTR_STATE_INIT...?

> +
> +	/*
> +	 * New inodes may not have an attribute fork yet. So set the attribute
> +	 * fork appropriately
> +	 */
> +	if (XFS_IFORK_Q((args->dp)) == 0) {
> +		sf_size = sizeof(struct xfs_attr_sf_hdr) +
> +		     XFS_ATTR_SF_ENTSIZE_BYNAME(args->namelen, args->valuelen);
> +		xfs_bmap_set_attrforkoff(args->dp, sf_size, NULL);
> +		args->dp->i_afp = kmem_zone_zalloc(xfs_ifork_zone, KM_SLEEP);
> +		args->dp->i_afp->if_flags = XFS_IFEXTENTS;
> +	}
> +
> +	*state = XFS_ATTR_STATE1;

XFS_ATTR_STATE_ADDED_FORK...

> +	return -EAGAIN;
> +state1:
>  
>  	/*
>  	 * If the attribute list is non-existent or a shortform list,
> @@ -248,7 +275,6 @@ xfs_attr_set_args(
>  	if (dp->i_d.di_aformat == XFS_DINODE_FMT_LOCAL ||
>  	    (dp->i_d.di_aformat == XFS_DINODE_FMT_EXTENTS &&
>  	     dp->i_d.di_anextents == 0)) {
> -
>  		/*
>  		 * Build initial attribute list (if required).
>  		 */
> @@ -262,6 +288,9 @@ xfs_attr_set_args(
>  		if (error != -ENOSPC)
>  			return error;
>  
> +		*state = XFS_ATTR_STATE2;

XFS_ATTR_STATE_FAILED_SF_ADD...

> +		return -EAGAIN;
> +state2:
>  		/*
>  		 * It won't fit in the shortform, transform to a leaf block.
>  		 * GROT: another possible req'mt for a double-split btree op.
> @@ -270,14 +299,14 @@ xfs_attr_set_args(
>  		if (error)
>  			return error;
>  
> -		if (roll_trans) {
> -			/*
> -			 * Prevent the leaf buffer from being unlocked so that a
> -			 * concurrent AIL push cannot grab the half-baked leaf
> -			 * buffer and run into problems with the write verifier.
> -			 */
> -			xfs_trans_bhold(args->trans, *leaf_bp);
> +		/*
> +		 * Prevent the leaf buffer from being unlocked so that a
> +		 * concurrent AIL push cannot grab the half-baked leaf
> +		 * buffer and run into problems with the write verifier.
> +		 */
> +		xfs_trans_bhold(args->trans, *leaf_bp);
>  
> +		if (roll_trans) {
>  			error = xfs_defer_finish(&args->trans);
>  			if (error)
>  				return error;
> @@ -293,6 +322,12 @@ xfs_attr_set_args(
>  			xfs_trans_bjoin(args->trans, *leaf_bp);
>  			*leaf_bp = NULL;
>  		}
> +
> +		*state = XFS_ATTR_STATE3;

XFS_ATTR_STATE_LEAF_AVAIL...

> +		return -EAGAIN;
> +state3:
> +		if (*leaf_bp != NULL)
> +			xfs_trans_brelse(args->trans, *leaf_bp);
>  	}
>  
>  	if (xfs_bmap_one_block(dp, XFS_ATTR_FORK))
> @@ -419,7 +454,9 @@ xfs_attr_set(
>  		goto out_trans_cancel;
>  
>  	xfs_trans_ijoin(args.trans, dp, 0);
> -	error = xfs_attr_set_args(&args, &leaf_bp, true);
> +
> +	error = xfs_attr_set_deferred(dp, args.trans, name, namelen,
> +			value, valuelen, flags);

Oh, I see, the XFS_ATTR_STATE[1-3] added in the previous patch are
supposed to record restart points when we have to duck out to roll a
transaction or something.
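
A minimal userspace sketch of that restart-point idea, under stated
assumptions: the state names are hypothetical (loosely following the
suggestions made earlier in this reply), the real work of each step is
replaced by a counter, and the sketch dispatches with switch cases rather
than the goto labels the patch uses. It is not the patch's code.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical state names for illustration only. */
enum demo_state {
	DEMO_STATE_INIT = 0,
	DEMO_STATE_ADDED_FORK,
	DEMO_STATE_FAILED_SF_ADD,
	DEMO_STATE_LEAF_AVAIL,
};

/*
 * Each call performs one step, records where it stopped, and returns
 * -EAGAIN so the caller can supply a fresh transaction before the next
 * call. Only the final step returns 0.
 */
static int demo_set_step(enum demo_state *state, int *steps_done)
{
	switch (*state) {
	case DEMO_STATE_INIT:
		(*steps_done)++;	/* stand-in: set up the attr fork */
		*state = DEMO_STATE_ADDED_FORK;
		return -EAGAIN;
	case DEMO_STATE_ADDED_FORK:
		(*steps_done)++;	/* stand-in: shortform add did not fit */
		*state = DEMO_STATE_FAILED_SF_ADD;
		return -EAGAIN;
	case DEMO_STATE_FAILED_SF_ADD:
		(*steps_done)++;	/* stand-in: convert shortform to leaf */
		*state = DEMO_STATE_LEAF_AVAIL;
		return -EAGAIN;
	case DEMO_STATE_LEAF_AVAIL:
		(*steps_done)++;	/* stand-in: add the attr to the leaf */
		return 0;
	}
	return -EINVAL;
}

/* Caller side: count one "roll" per -EAGAIN, then call the step again. */
static int demo_run(int *rolls, int *steps_done)
{
	enum demo_state state = DEMO_STATE_INIT;
	int error;

	*rolls = 0;
	*steps_done = 0;
	while ((error = demo_set_step(&state, steps_done)) == -EAGAIN)
		(*rolls)++;
	return error;
}
```

The point of the pattern is that the function never rolls or commits the
transaction itself; whoever owns the transaction does that between calls.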

Hmm, why does this have to happen?  Is it because the current attr
setting code will allocate and commit transactions, but now that we have
deferred attr items, each of those commits has to turn into backing out
to whomever allocated the transaction to get another?

Oh right, there's that whole mess where the log recovery transaction
isn't supposed to be rolled or committed, ever, so that the defer ops
can be ripped off and run after the recovered items are all more or less
written out.  Ugh.

Uh... meeting time, I'll think about this and continue this reply later.

--D

>  	if (error)
>  		goto out_release_leaf;
>  	if (!args.trans) {
> @@ -554,8 +591,13 @@ xfs_attr_remove(
>  	 */
>  	xfs_trans_ijoin(args.trans, dp, 0);
>  
> -	error = xfs_attr_remove_args(&args, true);
> +	error = xfs_has_attr(&args);
> +	if (error)
> +		goto out;
> +
>  
> +	error = xfs_attr_remove_deferred(dp, args.trans,
> +			name, namelen, flags);
>  	if (error)
>  		goto out;
>  
> diff --git a/fs/xfs/libxfs/xfs_attr.h b/fs/xfs/libxfs/xfs_attr.h
> index 4ce3b0a..da95e69 100644
> --- a/fs/xfs/libxfs/xfs_attr.h
> +++ b/fs/xfs/libxfs/xfs_attr.h
> @@ -181,7 +181,7 @@ int xfs_attr_set(struct xfs_inode *dp, const unsigned char *name,
>  		 size_t namelen, unsigned char *value, int valuelen,
>  		 int flags);
>  int xfs_attr_set_args(struct xfs_da_args *args, struct xfs_buf **leaf_bp,
> -		 bool roll_trans);
> +		 enum xfs_attr_state *state, bool roll_trans);
>  int xfs_attr_remove(struct xfs_inode *dp, const unsigned char *name,
>  		    size_t namelen, int flags);
>  int xfs_has_attr(struct xfs_da_args *args);
> diff --git a/fs/xfs/xfs_attr_item.c b/fs/xfs/xfs_attr_item.c
> index 36e6d1e..292d608 100644
> --- a/fs/xfs/xfs_attr_item.c
> +++ b/fs/xfs/xfs_attr_item.c
> @@ -464,8 +464,11 @@ xfs_attri_recover(
>  	struct xfs_attri_log_format	*attrp;
>  	struct xfs_trans_res		tres;
>  	int				local;
> -	int				error = 0;
> +	int				error, err2 = 0;
>  	int				rsvd = 0;
> +	enum xfs_attr_state		state = 0;
> +	struct xfs_buf			*leaf_bp = NULL;
> +
>  
>  	ASSERT(!test_bit(XFS_ATTRI_RECOVERED, &attrip->flags));
>  
> @@ -540,14 +543,40 @@ xfs_attri_recover(
>  	xfs_ilock(ip, XFS_ILOCK_EXCL);
>  
>  	xfs_trans_ijoin(args.trans, ip, 0);
> -	error = xfs_trans_attr(&args, attrdp, attrp->alfi_op_flags);
> -	if (error)
> -		goto abort_error;
>  
> +	do {
> +		leaf_bp = NULL;
> +
> +		error = xfs_trans_attr(&args, attrdp, &leaf_bp, &state,
> +				attrp->alfi_op_flags);
> +		if (error && error != -EAGAIN)
> +			goto abort_error;
> +
> +		xfs_trans_log_inode(args.trans, ip,
> +				XFS_ILOG_CORE | XFS_ILOG_ADATA);
> +
> +		err2 = xfs_trans_commit(args.trans);
> +		if (err2) {
> +			error = err2;
> +			goto abort_error;
> +		}
> +
> +		if (error == -EAGAIN) {
> +			err2 = xfs_trans_alloc(mp, &tres, args.total, 0,
> +				XFS_TRANS_PERM_LOG_RES, &args.trans);
> +			if (err2) {
> +				error = err2;
> +				goto abort_error;
> +			}
> +			xfs_trans_ijoin(args.trans, ip, 0);
> +		}
> +
> +	} while (error == -EAGAIN);
> +
> +	if (leaf_bp)
> +		xfs_trans_brelse(args.trans, leaf_bp);
>  
>  	set_bit(XFS_ATTRI_RECOVERED, &attrip->flags);
> -	xfs_trans_log_inode(args.trans, ip, XFS_ILOG_CORE | XFS_ILOG_ADATA);
> -	error = xfs_trans_commit(args.trans);
>  	xfs_iunlock(ip, XFS_ILOCK_EXCL);
>  	return error;
>  
> diff --git a/fs/xfs/xfs_trans.h b/fs/xfs/xfs_trans.h
> index 7bb9d8e..c785cd7 100644
> --- a/fs/xfs/xfs_trans.h
> +++ b/fs/xfs/xfs_trans.h
> @@ -239,6 +239,8 @@ xfs_trans_get_attrd(struct xfs_trans *tp,
>  		    struct xfs_attri_log_item *attrip);
>  int xfs_trans_attr(struct xfs_da_args *args,
>  		   struct xfs_attrd_log_item *attrdp,
> +		   struct xfs_buf **leaf_bp,
> +		   void *state,
>  		   uint32_t attr_op_flags);
>  
>  int		xfs_trans_commit(struct xfs_trans *);
> diff --git a/fs/xfs/xfs_trans_attr.c b/fs/xfs/xfs_trans_attr.c
> index 3679348..a3339ea 100644
> --- a/fs/xfs/xfs_trans_attr.c
> +++ b/fs/xfs/xfs_trans_attr.c
> @@ -56,10 +56,11 @@ int
>  xfs_trans_attr(
>  	struct xfs_da_args		*args,
>  	struct xfs_attrd_log_item	*attrdp,
> +	struct xfs_buf			**leaf_bp,
> +	void				*state,
>  	uint32_t			op_flags)
>  {
>  	int				error;
> -	struct xfs_buf			*leaf_bp = NULL;
>  
>  	error = xfs_qm_dqattach_locked(args->dp, 0);
>  	if (error)
> @@ -68,7 +69,8 @@ xfs_trans_attr(
>  	switch (op_flags) {
>  	case XFS_ATTR_OP_FLAGS_SET:
>  		args->op_flags |= XFS_DA_OP_ADDNAME;
> -		error = xfs_attr_set_args(args, &leaf_bp, false);
> +		error = xfs_attr_set_args(args, leaf_bp,
> +				(enum xfs_attr_state *)state, false);
>  		break;
>  	case XFS_ATTR_OP_FLAGS_REMOVE:
>  		ASSERT(XFS_IFORK_Q((args->dp)));
> @@ -78,11 +80,6 @@ xfs_trans_attr(
>  		error = -EFSCORRUPTED;
>  	}
>  
> -	if (error) {
> -		if (leaf_bp)
> -			xfs_trans_brelse(args->trans, leaf_bp);
> -	}
> -
>  	/*
>  	 * Mark the transaction dirty, even on error. This ensures the
>  	 * transaction is aborted, which:
> @@ -184,27 +181,40 @@ xfs_attr_finish_item(
>  	char				*name_value;
>  	int				error;
>  	int				local;
> -	struct xfs_da_args		args;
> +	struct xfs_da_args		*args;
>  
>  	attr = container_of(item, struct xfs_attr_item, xattri_list);
> -	name_value = ((char *)attr) + sizeof(struct xfs_attr_item);
> -
> -	error = xfs_attr_args_init(&args, attr->xattri_ip, name_value,
> -				   attr->xattri_name_len, attr->xattri_flags);
> -	if (error)
> -		goto out;
> +	args = &attr->xattri_args;
> +
> +	if (attr->xattri_state == 0) {
> +		/* Only need to initialize args context once */
> +		name_value = ((char *)attr) + sizeof(struct xfs_attr_item);
> +		error = xfs_attr_args_init(args, attr->xattri_ip, name_value,
> +					   attr->xattri_name_len,
> +					   attr->xattri_flags);
> +		if (error)
> +			goto out;
> +
> +		args->hashval = xfs_da_hashname(args->name, args->namelen);
> +		args->value = &name_value[attr->xattri_name_len];
> +		args->valuelen = attr->xattri_value_len;
> +		args->op_flags = XFS_DA_OP_OKNOENT;
> +		args->total = xfs_attr_calc_size(args, &local);
> +		attr->xattri_leaf_bp = NULL;
> +	}
>  
> -	args.hashval = xfs_da_hashname(args.name, args.namelen);
> -	args.value = &name_value[attr->xattri_name_len];
> -	args.valuelen = attr->xattri_value_len;
> -	args.op_flags = XFS_DA_OP_OKNOENT;
> -	args.total = xfs_attr_calc_size(&args, &local);
> -	args.trans = tp;
> +	/*
> +	 * Always reset trans after EAGAIN cycle
> +	 * since the transaction is new
> +	 */
> +	args->trans = tp;
>  
> -	error = xfs_trans_attr(&args, done_item,
> -			attr->xattri_op_flags);
> +	error = xfs_trans_attr(args, done_item, &attr->xattri_leaf_bp,
> +			&attr->xattri_state, attr->xattri_op_flags);
>  out:
> -	kmem_free(attr);
> +	if (error != -EAGAIN)
> +		kmem_free(attr);
> +
>  	return error;
>  }
>  
> -- 
> 2.7.4
> 

* Re: [PATCH 1/9] xfs: Remove all strlen in all xfs_attr_* functions for attr names.
  2019-04-15 21:18       ` Dave Chinner
@ 2019-04-16  1:33         ` Allison Henderson
  0 siblings, 0 replies; 48+ messages in thread
From: Allison Henderson @ 2019-04-16  1:33 UTC (permalink / raw)
  To: Dave Chinner; +Cc: linux-xfs



On 4/15/19 2:18 PM, Dave Chinner wrote:
> On Mon, Apr 15, 2019 at 01:08:50PM -0700, Allison Henderson wrote:
>> On 4/14/19 4:02 PM, Dave Chinner wrote:
>>> On Fri, Apr 12, 2019 at 03:50:28PM -0700, Allison Henderson wrote:
>>>> This helps to pre-simplify the extra handling of the null terminator in
>>>> delayed operations which use memcpy rather than strlen.  Later
>>>> when we introduce parent pointers, attribute names will become binary,
>>>> so strlen will not work at all.  Removing uses of strlen now will
>>>> help reduce complexities later
>>>>
>>>> Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
>>>> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
>>>
>>> Hmmm. If we are going to pass name/namelen pairs around everywhere,
>>> can we convert these to a struct xfs_name?
>>>
>>> I also wonder where we should convert the name/namelen pairs in the
>>> attr args struct to an xfs_name, too, then we can just do:
>>>
>>> 	args->name = *name;
>>>
>>> to copy in the string pointer and the name length in one go.
>>>
>>> And we might even be able to put the attribute flags (e.g.
>>> ATTR_ROOT) in the name.type field, and get rid of that parameter
>>> being passed around, too...
>>>
>>> Cheers,
>>>
>>> Dave.
>>
>> I think a new struct xfs_name is reasonable.  Or maybe a general purpose
>> xfs_array?  The value/valuelen parameters have a similar relation that could
>> make use of that too.  How would people feel about something like this:
>>
>> struct xfs_array {
>> 	unsigned char *bytes;
>> 	size_t		len;
>> }
>>
>> struct xfs_attribute {
>> 	struct xfs_array	name;
>> 	int			flags;
>> }
> 
> Hmmm. I think that's overkill - we really don't need that much
> abstraction and it makes the code more complex rather than
> simplifies it. i.e. args->name.name.bytes isn't really a
> simplification...
> 
> Also the above means we have to discard the const part of the names
> we are passed because this construction doesn't appear to be
> intended for read only strings. i.e.:
> 
>> We could add the value member in here too.
> 
> Which is something that isn't const. :)
> 
>> It tends to get passed around
>> just as much with the exception of attr remove operations which only need
>> the name.  I think changing the actual members of xfs_da_args should be
>> another patch though, since that's a bigger change that affects a wider
>> scope of code.  I can look into it though.
> 
> I don't think we want anything more complex, or to extend it to
> include other parts of the attr API - that just means we have a
> special snowflake for the API rather than a generic way of encoding
> a name string across all the internal interfaces that pass a name/len
> pair...
> 
> Really, I was just suggesting using the struct xfs_name because
> it already exists and encodes/documents exactly what we are passing
> here, hence getting rid of some of the mess around passing the attr
> names around.

Oh I see.  I thought you were suggesting a new struct; I didn't realize 
it was already there.  That makes sense then.  :-)

> 
>> Also, do I still keep the old reviewed-by on the patch if the patch is still
>> going through changes?  I have a few signed-off patches in the parent
>> pointer set that have changed quite a bit since we've started too.
> 
> If the changes are significant, then it needs review again and so
> you should drop it.
> 
> But I'd do the name/len -> xfsname conversion as a separate patch,
> so this patch doesn't need changing.

Alrighty, I will add in another patch that does that then.

Thanks!
Allison

> 
> Cheers,
> 
> Dave.
> 

* Re: [PATCH 7/9] xfs: Add attr context to log item
  2019-04-15 22:50   ` Darrick J. Wong
@ 2019-04-16  2:30     ` Allison Henderson
  2019-04-16  3:21       ` Allison Henderson
  0 siblings, 1 reply; 48+ messages in thread
From: Allison Henderson @ 2019-04-16  2:30 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: xfs

On 4/15/19 3:50 PM, Darrick J. Wong wrote:
> On Fri, Apr 12, 2019 at 03:50:34PM -0700, Allison Henderson wrote:
>> This patch modifies xfs_attr_item to store a xfs_da_args, a xfs_buf pointer
>> and a new state type. We will use these in the next patch when
>> we modify xfs_set_attr_args to roll transactions by returning EAGAIN.
>> Because the subroutines of this function modify the contents of these
>> structures, we need to find a place to store them where they remain
>> instantiated across multiple calls to xfs_set_attr_args.
>>
>> Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
>> ---
>>   fs/xfs/libxfs/xfs_attr.h | 18 +++++++++++++++++-
>>   fs/xfs/scrub/common.c    |  2 ++
>>   fs/xfs/xfs_acl.c         |  2 ++
>>   fs/xfs/xfs_attr_item.c   |  2 +-
>>   fs/xfs/xfs_ioctl.c       |  2 ++
>>   fs/xfs/xfs_ioctl32.c     |  2 ++
>>   fs/xfs/xfs_iops.c        |  1 +
>>   fs/xfs/xfs_xattr.c       |  1 +
>>   8 files changed, 28 insertions(+), 2 deletions(-)
>>
>> diff --git a/fs/xfs/libxfs/xfs_attr.h b/fs/xfs/libxfs/xfs_attr.h
>> index 974c963..4ce3b0a 100644
>> --- a/fs/xfs/libxfs/xfs_attr.h
>> +++ b/fs/xfs/libxfs/xfs_attr.h
>> @@ -77,6 +77,13 @@ typedef struct attrlist_ent {	/* data from attr_list() */
>>   	char	a_name[1];	/* attr name (NULL terminated) */
>>   } attrlist_ent_t;
>>   
>> +/* Attr state machine types */
>> +enum xfs_attr_state {
>> +	XFS_ATTR_STATE1 = 1,
>> +	XFS_ATTR_STATE2 = 2,
>> +	XFS_ATTR_STATE3 = 3,
> 
> Um... to what states do these refer?

I actually struggled with what to call these other than state machine 
types.  They are a sort of "you were here" bookmark for xfs_attr_set_args.
The idea is that when we return EAGAIN and then get called again with a 
new transaction, we jump back to where we were based on this marker.

> 
>> +};
>> +
>>   /*
>>    * List of attrs to commit later.
>>    */
>> @@ -88,7 +95,16 @@ struct xfs_attr_item {
>>   	void		  *xattri_name;	      /* attr name */
>>   	uint32_t	  xattri_name_len;    /* length of name */
>>   	uint32_t	  xattri_flags;       /* attr flags */
>> -	struct list_head  xattri_list;
>> +
>> +	/*
>> +	 * Delayed attr parameters that need to remain instantiated
>> +	 * across transaction rolls during the defer finish
>> +	 */
>> +	struct xfs_buf		*xattri_leaf_bp;  /* Leaf buf to release */
>> +	enum xfs_attr_state	xattri_state;	  /* state machine marker */
>> +	struct xfs_da_args	xattri_args;	  /* args context */
> 
> Assuming we're keeping xattri_args.trans up to date here?

Yes, that happens in xfs_attr_finish_item in the next patch.
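
A small userspace sketch of that behaviour, with hypothetical names
(demo_item, demo_trans, demo_finish_item) and the attr work replaced by a
counter: the context is set up only on the first call, the transaction
pointer is refreshed on every call, and the item is freed only once the
operation stops returning -EAGAIN. This illustrates the pattern and is
not the code from the next patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

struct demo_trans {
	int			id;
};

struct demo_item {
	int			state;		/* 0 means "not started yet" */
	int			initialized;	/* counts one-time set-ups */
	struct demo_trans	*trans;		/* refreshed on every call */
};

static int demo_finish_item(struct demo_trans *tp, struct demo_item *item,
			    int *freed)
{
	int error;

	/* The context only needs to be initialized once. */
	if (item->state == 0)
		item->initialized++;

	/* Always reset trans, since each call gets a new transaction. */
	item->trans = tp;

	/* Stand-in for the real work: finish on the third call. */
	item->state++;
	error = item->state < 3 ? -EAGAIN : 0;

	/* Keep the item alive while more calls are expected. */
	if (error != -EAGAIN) {
		free(item);
		*freed = 1;
	}
	return error;
}
```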

> 
>> +
>> +	struct list_head	xattri_list;
> 
> What's this for?

xattri_list is introduced in patch 2, which I loosely modeled off other 
delayed items at the time.  It is a list of intents that have been logged 
to this item.  Though it could use a comment :-)

It doesn't relate directly to the "re-roll with EAGAIN" mechanics being 
added in this patch if that's what you are asking.  It just needs to be 
the last member of the struct because it's followed by a byte array.
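
For readers unfamiliar with the layout, here is a minimal userspace
sketch of a header that is followed directly by its name and value bytes
in one allocation. The struct and helper names (demo_attr_item,
demo_item_alloc, demo_item_name, demo_item_value) are hypothetical and
only illustrate the "header plus trailing byte array" idea, using the
same pointer arithmetic style as the patch's name_value calculation.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical header; the name and value bytes follow it in memory. */
struct demo_attr_item {
	unsigned int	name_len;
	unsigned int	value_len;
};

/* Allocate header and payload together, then copy name and value in. */
static struct demo_attr_item *demo_item_alloc(const void *name,
					      unsigned int name_len,
					      const void *value,
					      unsigned int value_len)
{
	struct demo_attr_item *item;
	char *name_value;

	item = malloc(sizeof(*item) + name_len + value_len);
	if (!item)
		return NULL;

	item->name_len = name_len;
	item->value_len = value_len;

	/* The payload starts right after the header. */
	name_value = (char *)item + sizeof(*item);
	memcpy(name_value, name, name_len);
	memcpy(name_value + name_len, value, value_len);
	return item;
}

/* The name is the first name_len bytes of the payload. */
static const char *demo_item_name(const struct demo_attr_item *item)
{
	return (const char *)item + sizeof(*item);
}

/* The value follows the name. */
static const char *demo_item_value(const struct demo_attr_item *item)
{
	return demo_item_name(item) + item->name_len;
}
```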

I hope that helps to explain some.  Let me know if you have any other 
questions.  Thanks!

Allison
> 
> --D
> 
>>   
>>   	/*
>>   	 * A byte array follows the header containing the file name and
>> diff --git a/fs/xfs/scrub/common.c b/fs/xfs/scrub/common.c
>> index 0c54ff5..270c32e 100644
>> --- a/fs/xfs/scrub/common.c
>> +++ b/fs/xfs/scrub/common.c
>> @@ -30,6 +30,8 @@
>>   #include "xfs_rmap_btree.h"
>>   #include "xfs_log.h"
>>   #include "xfs_trans_priv.h"
>> +#include "xfs_da_format.h"
>> +#include "xfs_da_btree.h"
>>   #include "xfs_attr.h"
>>   #include "xfs_reflink.h"
>>   #include "scrub/xfs_scrub.h"
>> diff --git a/fs/xfs/xfs_acl.c b/fs/xfs/xfs_acl.c
>> index 142de8d..9b1b93e 100644
>> --- a/fs/xfs/xfs_acl.c
>> +++ b/fs/xfs/xfs_acl.c
>> @@ -10,6 +10,8 @@
>>   #include "xfs_mount.h"
>>   #include "xfs_inode.h"
>>   #include "xfs_acl.h"
>> +#include "xfs_da_format.h"
>> +#include "xfs_da_btree.h"
>>   #include "xfs_attr.h"
>>   #include "xfs_trace.h"
>>   #include <linux/slab.h>
>> diff --git a/fs/xfs/xfs_attr_item.c b/fs/xfs/xfs_attr_item.c
>> index 0ea19b4..36e6d1e 100644
>> --- a/fs/xfs/xfs_attr_item.c
>> +++ b/fs/xfs/xfs_attr_item.c
>> @@ -19,10 +19,10 @@
>>   #include "xfs_rmap.h"
>>   #include "xfs_inode.h"
>>   #include "xfs_icache.h"
>> -#include "xfs_attr.h"
>>   #include "xfs_shared.h"
>>   #include "xfs_da_format.h"
>>   #include "xfs_da_btree.h"
>> +#include "xfs_attr.h"
>>   
>>   static inline struct xfs_attri_log_item *ATTRI_ITEM(struct xfs_log_item *lip)
>>   {
>> diff --git a/fs/xfs/xfs_ioctl.c b/fs/xfs/xfs_ioctl.c
>> index ab341d6..c8728ca 100644
>> --- a/fs/xfs/xfs_ioctl.c
>> +++ b/fs/xfs/xfs_ioctl.c
>> @@ -16,6 +16,8 @@
>>   #include "xfs_rtalloc.h"
>>   #include "xfs_itable.h"
>>   #include "xfs_error.h"
>> +#include "xfs_da_format.h"
>> +#include "xfs_da_btree.h"
>>   #include "xfs_attr.h"
>>   #include "xfs_bmap.h"
>>   #include "xfs_bmap_util.h"
>> diff --git a/fs/xfs/xfs_ioctl32.c b/fs/xfs/xfs_ioctl32.c
>> index 5001dca..23f6990 100644
>> --- a/fs/xfs/xfs_ioctl32.c
>> +++ b/fs/xfs/xfs_ioctl32.c
>> @@ -21,6 +21,8 @@
>>   #include "xfs_fsops.h"
>>   #include "xfs_alloc.h"
>>   #include "xfs_rtalloc.h"
>> +#include "xfs_da_format.h"
>> +#include "xfs_da_btree.h"
>>   #include "xfs_attr.h"
>>   #include "xfs_ioctl.h"
>>   #include "xfs_ioctl32.h"
>> diff --git a/fs/xfs/xfs_iops.c b/fs/xfs/xfs_iops.c
>> index e73c21a..561c467 100644
>> --- a/fs/xfs/xfs_iops.c
>> +++ b/fs/xfs/xfs_iops.c
>> @@ -17,6 +17,7 @@
>>   #include "xfs_acl.h"
>>   #include "xfs_quota.h"
>>   #include "xfs_error.h"
>> +#include "xfs_da_btree.h"
>>   #include "xfs_attr.h"
>>   #include "xfs_trans.h"
>>   #include "xfs_trace.h"
>> diff --git a/fs/xfs/xfs_xattr.c b/fs/xfs/xfs_xattr.c
>> index 3013746..938e81d 100644
>> --- a/fs/xfs/xfs_xattr.c
>> +++ b/fs/xfs/xfs_xattr.c
>> @@ -11,6 +11,7 @@
>>   #include "xfs_mount.h"
>>   #include "xfs_da_format.h"
>>   #include "xfs_inode.h"
>> +#include "xfs_da_btree.h"
>>   #include "xfs_attr.h"
>>   #include "xfs_attr_leaf.h"
>>   #include "xfs_acl.h"
>> -- 
>> 2.7.4
>>

* Re: [PATCH 7/9] xfs: Add attr context to log item
  2019-04-16  2:30     ` Allison Henderson
@ 2019-04-16  3:21       ` Allison Henderson
  0 siblings, 0 replies; 48+ messages in thread
From: Allison Henderson @ 2019-04-16  3:21 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: xfs



On 4/15/19 7:30 PM, Allison Henderson wrote:
> On 4/15/19 3:50 PM, Darrick J. Wong wrote:
>> On Fri, Apr 12, 2019 at 03:50:34PM -0700, Allison Henderson wrote:
>>> This patch modifies xfs_attr_item to store a xfs_da_args, a xfs_buf 
>>> pointer
>>> and a new state type. We will use these in the next patch when
>>> we modify xfs_set_attr_args to roll transactions by returning EAGAIN.
>>> Because the subroutines of this function modify the contents of these
>>> structures, we need to find a place to store them where they remain
>>> instantiated across multiple calls to xfs_set_attr_args.
>>>
>>> Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
>>> ---
>>>   fs/xfs/libxfs/xfs_attr.h | 18 +++++++++++++++++-
>>>   fs/xfs/scrub/common.c    |  2 ++
>>>   fs/xfs/xfs_acl.c         |  2 ++
>>>   fs/xfs/xfs_attr_item.c   |  2 +-
>>>   fs/xfs/xfs_ioctl.c       |  2 ++
>>>   fs/xfs/xfs_ioctl32.c     |  2 ++
>>>   fs/xfs/xfs_iops.c        |  1 +
>>>   fs/xfs/xfs_xattr.c       |  1 +
>>>   8 files changed, 28 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/fs/xfs/libxfs/xfs_attr.h b/fs/xfs/libxfs/xfs_attr.h
>>> index 974c963..4ce3b0a 100644
>>> --- a/fs/xfs/libxfs/xfs_attr.h
>>> +++ b/fs/xfs/libxfs/xfs_attr.h
>>> @@ -77,6 +77,13 @@ typedef struct attrlist_ent {    /* data from 
>>> attr_list() */
>>>       char    a_name[1];    /* attr name (NULL terminated) */
>>>   } attrlist_ent_t;
>>> +/* Attr state machine types */
>>> +enum xfs_attr_state {
>>> +    XFS_ATTR_STATE1 = 1,
>>> +    XFS_ATTR_STATE2 = 2,
>>> +    XFS_ATTR_STATE3 = 3,
>>
>> Um... to what states do these refer?
> 
> I actually struggled with what to call these other than state machine 
> types.  They are a sort of "you were here" bookmark for 
> xfs_attr_set_args.  The idea is that when we return EAGAIN and then get 
> called again with a new transaction, we jump back to where we were based 
> on this marker.
> 
>>
>>> +};
>>> +
>>>   /*
>>>    * List of attrs to commit later.
>>>    */
>>> @@ -88,7 +95,16 @@ struct xfs_attr_item {
>>>       void          *xattri_name;          /* attr name */
>>>       uint32_t      xattri_name_len;    /* length of name */
>>>       uint32_t      xattri_flags;       /* attr flags */
>>> -    struct list_head  xattri_list;
>>> +
>>> +    /*
>>> +     * Delayed attr parameters that need to remain instantiated
>>> +     * across transaction rolls during the defer finish
>>> +     */
>>> +    struct xfs_buf        *xattri_leaf_bp;  /* Leaf buf to release */
>>> +    enum xfs_attr_state    xattri_state;      /* state machine 
>>> marker */
>>> +    struct xfs_da_args    xattri_args;      /* args context */
>>
>> Assuming we're keeping xattri_args.trans up to date here?
> 
> Yes, that happens in xfs_attr_finish_item in the next patch.
> 
>>
>>> +
>>> +    struct list_head    xattri_list;
>>
>> What's this for?
> 
> xattri_list is introduced in patch 2, which I loosely modeled off other 

Sorry, typo: xattri_list is introduced in patch 4, not 2.

> delayed items at the time.  It is a list of intents that have been logged 
> to this item.  Though it could use a comment :-)
> 
> It doesn't relate directly to the "re-roll with EAGAIN" mechanics being 
> added in this patch if that's what you are asking.  It just needs to be 
> the last member of the struct because it's followed by a byte array.
> 
> I hope that helps to explain some.  Let me know if you have any other 
> questions.  Thanks!
> 
> Allison
>>
>> --D
>>
>>>       /*
>>>        * A byte array follows the header containing the file name and
>>> diff --git a/fs/xfs/scrub/common.c b/fs/xfs/scrub/common.c
>>> index 0c54ff5..270c32e 100644
>>> --- a/fs/xfs/scrub/common.c
>>> +++ b/fs/xfs/scrub/common.c
>>> @@ -30,6 +30,8 @@
>>>   #include "xfs_rmap_btree.h"
>>>   #include "xfs_log.h"
>>>   #include "xfs_trans_priv.h"
>>> +#include "xfs_da_format.h"
>>> +#include "xfs_da_btree.h"
>>>   #include "xfs_attr.h"
>>>   #include "xfs_reflink.h"
>>>   #include "scrub/xfs_scrub.h"
>>> diff --git a/fs/xfs/xfs_acl.c b/fs/xfs/xfs_acl.c
>>> index 142de8d..9b1b93e 100644
>>> --- a/fs/xfs/xfs_acl.c
>>> +++ b/fs/xfs/xfs_acl.c
>>> @@ -10,6 +10,8 @@
>>>   #include "xfs_mount.h"
>>>   #include "xfs_inode.h"
>>>   #include "xfs_acl.h"
>>> +#include "xfs_da_format.h"
>>> +#include "xfs_da_btree.h"
>>>   #include "xfs_attr.h"
>>>   #include "xfs_trace.h"
>>>   #include <linux/slab.h>
>>> diff --git a/fs/xfs/xfs_attr_item.c b/fs/xfs/xfs_attr_item.c
>>> index 0ea19b4..36e6d1e 100644
>>> --- a/fs/xfs/xfs_attr_item.c
>>> +++ b/fs/xfs/xfs_attr_item.c
>>> @@ -19,10 +19,10 @@
>>>   #include "xfs_rmap.h"
>>>   #include "xfs_inode.h"
>>>   #include "xfs_icache.h"
>>> -#include "xfs_attr.h"
>>>   #include "xfs_shared.h"
>>>   #include "xfs_da_format.h"
>>>   #include "xfs_da_btree.h"
>>> +#include "xfs_attr.h"
>>>   static inline struct xfs_attri_log_item *ATTRI_ITEM(struct 
>>> xfs_log_item *lip)
>>>   {
>>> diff --git a/fs/xfs/xfs_ioctl.c b/fs/xfs/xfs_ioctl.c
>>> index ab341d6..c8728ca 100644
>>> --- a/fs/xfs/xfs_ioctl.c
>>> +++ b/fs/xfs/xfs_ioctl.c
>>> @@ -16,6 +16,8 @@
>>>   #include "xfs_rtalloc.h"
>>>   #include "xfs_itable.h"
>>>   #include "xfs_error.h"
>>> +#include "xfs_da_format.h"
>>> +#include "xfs_da_btree.h"
>>>   #include "xfs_attr.h"
>>>   #include "xfs_bmap.h"
>>>   #include "xfs_bmap_util.h"
>>> diff --git a/fs/xfs/xfs_ioctl32.c b/fs/xfs/xfs_ioctl32.c
>>> index 5001dca..23f6990 100644
>>> --- a/fs/xfs/xfs_ioctl32.c
>>> +++ b/fs/xfs/xfs_ioctl32.c
>>> @@ -21,6 +21,8 @@
>>>   #include "xfs_fsops.h"
>>>   #include "xfs_alloc.h"
>>>   #include "xfs_rtalloc.h"
>>> +#include "xfs_da_format.h"
>>> +#include "xfs_da_btree.h"
>>>   #include "xfs_attr.h"
>>>   #include "xfs_ioctl.h"
>>>   #include "xfs_ioctl32.h"
>>> diff --git a/fs/xfs/xfs_iops.c b/fs/xfs/xfs_iops.c
>>> index e73c21a..561c467 100644
>>> --- a/fs/xfs/xfs_iops.c
>>> +++ b/fs/xfs/xfs_iops.c
>>> @@ -17,6 +17,7 @@
>>>   #include "xfs_acl.h"
>>>   #include "xfs_quota.h"
>>>   #include "xfs_error.h"
>>> +#include "xfs_da_btree.h"
>>>   #include "xfs_attr.h"
>>>   #include "xfs_trans.h"
>>>   #include "xfs_trace.h"
>>> diff --git a/fs/xfs/xfs_xattr.c b/fs/xfs/xfs_xattr.c
>>> index 3013746..938e81d 100644
>>> --- a/fs/xfs/xfs_xattr.c
>>> +++ b/fs/xfs/xfs_xattr.c
>>> @@ -11,6 +11,7 @@
>>>   #include "xfs_mount.h"
>>>   #include "xfs_da_format.h"
>>>   #include "xfs_inode.h"
>>> +#include "xfs_da_btree.h"
>>>   #include "xfs_attr.h"
>>>   #include "xfs_attr_leaf.h"
>>>   #include "xfs_acl.h"
>>> -- 
>>> 2.7.4
>>>


* Re: [PATCH 8/9] xfs: Roll delayed attr operations by returning EAGAIN
  2019-04-15 23:31   ` Darrick J. Wong
@ 2019-04-16 19:54     ` Allison Henderson
  0 siblings, 0 replies; 48+ messages in thread
From: Allison Henderson @ 2019-04-16 19:54 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs

On 4/15/19 4:31 PM, Darrick J. Wong wrote:
> On Fri, Apr 12, 2019 at 03:50:35PM -0700, Allison Henderson wrote:
>> This patch modifies xfs_attr_set_args to return -EAGAIN
>> when a transaction needs to be rolled.  All functions
>> currently calling xfs_attr_set_args are modified to use
>> the deferred attr operation, or handle the -EAGAIN return
>> code
>>
>> Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
>> ---
>>   fs/xfs/libxfs/xfs_attr.c | 62 ++++++++++++++++++++++++++++++++++++++++--------
>>   fs/xfs/libxfs/xfs_attr.h |  2 +-
>>   fs/xfs/xfs_attr_item.c   | 41 +++++++++++++++++++++++++++-----
>>   fs/xfs/xfs_trans.h       |  2 ++
>>   fs/xfs/xfs_trans_attr.c  | 56 +++++++++++++++++++++++++------------------
>>   5 files changed, 123 insertions(+), 40 deletions(-)
>>
>> diff --git a/fs/xfs/libxfs/xfs_attr.c b/fs/xfs/libxfs/xfs_attr.c
>> index 0042708..4ddd86b 100644
>> --- a/fs/xfs/libxfs/xfs_attr.c
>> +++ b/fs/xfs/libxfs/xfs_attr.c
>> @@ -236,10 +236,37 @@ int
>>   xfs_attr_set_args(
>>   	struct xfs_da_args	*args,
>>   	struct xfs_buf          **leaf_bp,
>> +	enum xfs_attr_state	*state,
>>   	bool			roll_trans)
>>   {
>>   	struct xfs_inode	*dp = args->dp;
>>   	int			error = 0;
>> +	int			sf_size;
>> +
>> +	switch (*state) {
>> +	case (XFS_ATTR_STATE1):
>> +		goto state1;
>> +	case (XFS_ATTR_STATE2):
>> +		goto state2;
>> +	case (XFS_ATTR_STATE3):
>> +		goto state3;
>> +	}
> 
> I still don't understand what these three states are, though evidently
> if we get to this line then we weren't in any of the three possible
> states?
> 
> XFS_ATTR_STATE_INIT...?
> 
>> +
>> +	/*
>> +	 * New inodes may not have an attribute fork yet. So set the attribute
>> +	 * fork appropriately
>> +	 */
>> +	if (XFS_IFORK_Q((args->dp)) == 0) {
>> +		sf_size = sizeof(struct xfs_attr_sf_hdr) +
>> +		     XFS_ATTR_SF_ENTSIZE_BYNAME(args->namelen, args->valuelen);
>> +		xfs_bmap_set_attrforkoff(args->dp, sf_size, NULL);
>> +		args->dp->i_afp = kmem_zone_zalloc(xfs_ifork_zone, KM_SLEEP);
>> +		args->dp->i_afp->if_flags = XFS_IFEXTENTS;
>> +	}
>> +
>> +	*state = XFS_ATTR_STATE1;
> 
> XFS_ATTR_STATE_ADDED_FORK...
> 
>> +	return -EAGAIN;
>> +state1:
>>   
>>   	/*
>>   	 * If the attribute list is non-existent or a shortform list,
>> @@ -248,7 +275,6 @@ xfs_attr_set_args(
>>   	if (dp->i_d.di_aformat == XFS_DINODE_FMT_LOCAL ||
>>   	    (dp->i_d.di_aformat == XFS_DINODE_FMT_EXTENTS &&
>>   	     dp->i_d.di_anextents == 0)) {
>> -
>>   		/*
>>   		 * Build initial attribute list (if required).
>>   		 */
>> @@ -262,6 +288,9 @@ xfs_attr_set_args(
>>   		if (error != -ENOSPC)
>>   			return error;
>>   
>> +		*state = XFS_ATTR_STATE2;
> 
> XFS_ATTR_STATE_FAILED_SF_ADD...
> 
>> +		return -EAGAIN;
>> +state2:
>>   		/*
>>   		 * It won't fit in the shortform, transform to a leaf block.
>>   		 * GROT: another possible req'mt for a double-split btree op.
>> @@ -270,14 +299,14 @@ xfs_attr_set_args(
>>   		if (error)
>>   			return error;
>>   
>> -		if (roll_trans) {
>> -			/*
>> -			 * Prevent the leaf buffer from being unlocked so that a
>> -			 * concurrent AIL push cannot grab the half-baked leaf
>> -			 * buffer and run into problems with the write verifier.
>> -			 */
>> -			xfs_trans_bhold(args->trans, *leaf_bp);
>> +		/*
>> +		 * Prevent the leaf buffer from being unlocked so that a
>> +		 * concurrent AIL push cannot grab the half-baked leaf
>> +		 * buffer and run into problems with the write verifier.
>> +		 */
>> +		xfs_trans_bhold(args->trans, *leaf_bp);
>>   
>> +		if (roll_trans) {
>>   			error = xfs_defer_finish(&args->trans);
>>   			if (error)
>>   				return error;
>> @@ -293,6 +322,12 @@ xfs_attr_set_args(
>>   			xfs_trans_bjoin(args->trans, *leaf_bp);
>>   			*leaf_bp = NULL;
>>   		}
>> +
>> +		*state = XFS_ATTR_STATE3;
> 
> XFS_ATTR_STATE_LEAF_AVAIL...
> 
>> +		return -EAGAIN;
>> +state3:
>> +		if (*leaf_bp != NULL)
>> +			xfs_trans_brelse(args->trans, *leaf_bp);
>>   	}
>>   
>>   	if (xfs_bmap_one_block(dp, XFS_ATTR_FORK))
>> @@ -419,7 +454,9 @@ xfs_attr_set(
>>   		goto out_trans_cancel;
>>   
>>   	xfs_trans_ijoin(args.trans, dp, 0);
>> -	error = xfs_attr_set_args(&args, &leaf_bp, true);
>> +
>> +	error = xfs_attr_set_deferred(dp, args.trans, name, namelen,
>> +			value, valuelen, flags);
> 
> Oh, I see, the XFS_ATTR_STATE[1-3] added in the previous patch are
> supposed to record restart points when we have to duck out to roll a
> transaction or something.
> 
> Hmm, why does this have to happen?  Is it because the current attr
> setting code will allocate and commit transactions, but now that we have
> deferred attr items, each of those commits has to turn into backing out
> to whomever allocated the transaction to get another?
> 
> Oh right, there's that whole mess where the log recovery transaction
> isn't supposed to be rolled or committed, ever, so that the defer ops
> can be ripped off and run after the recovered items are all more or less
> written out.  Ugh.
> 
> Uh... meeting time, I'll think about this and continue this reply later.
> 
> --D

Sorry, I only just noticed your reply on patch 8; I seem to be having 
some delays with my inbox.  In any case, it seems you have pieced 
together the idea I was aiming for.  To clarify: no, it doesn't have to 
happen.  Patches 7 and 8 could disappear from the set and it should 
still work, but we would end up with a lot of activity on one 
transaction, so we were looking for a way to break it up a bit.  This 
solution appears to work, but to be honest, I do think it's a bit 
convoluted, so I am open to suggestions.
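
For illustration only, here is a minimal userspace sketch of the 
restart-marker pattern discussed above.  It is not the patch code: the 
demo_* names are hypothetical, the "transaction" is just a counter, and 
it resumes through switch cases rather than the goto labels used in the 
patch.  The state names loosely follow the more descriptive naming 
suggested in review.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical restart markers; names are illustrative, not from the patch. */
enum demo_state {
	DEMO_STATE_INIT = 0,	/* nothing done yet */
	DEMO_STATE_FORK_ADDED,	/* step 1 finished, resume at step 2 */
	DEMO_STATE_SF_FULL,	/* step 2 finished, resume at step 3 */
	DEMO_STATE_DONE,	/* all steps finished */
};

/* Context that must outlive each call, like the fields added to the item. */
struct demo_ctx {
	enum demo_state	state;		/* "you were here" bookmark */
	int		steps_run;	/* how many steps have executed */
	int		trans_id;	/* stand-in for a transaction pointer */
};

/*
 * Run exactly one step per call.  After each intermediate step, record
 * where to resume and return -EAGAIN so the caller can supply a fresh
 * transaction before calling again.
 */
static int demo_set_args(struct demo_ctx *ctx)
{
	assert(ctx);

	switch (ctx->state) {
	case DEMO_STATE_INIT:
		ctx->steps_run++;
		ctx->state = DEMO_STATE_FORK_ADDED;
		return -EAGAIN;
	case DEMO_STATE_FORK_ADDED:
		ctx->steps_run++;
		ctx->state = DEMO_STATE_SF_FULL;
		return -EAGAIN;
	case DEMO_STATE_SF_FULL:
		ctx->steps_run++;
		ctx->state = DEMO_STATE_DONE;
		return 0;
	case DEMO_STATE_DONE:
		return 0;
	}
	return -EINVAL;
}

/*
 * Caller loop: every -EAGAIN means "roll the transaction and call me
 * again".  Returns the number of rolls, or a negative errno on failure.
 */
static int demo_finish(struct demo_ctx *ctx)
{
	int	rolls = 0;
	int	error;

	do {
		error = demo_set_args(ctx);
		if (error && error != -EAGAIN)
			return error;
		if (error == -EAGAIN) {
			ctx->trans_id++;	/* pretend: commit old, allocate new */
			rolls++;
		}
	} while (error == -EAGAIN);

	return rolls;
}
```

Starting from a zeroed context, demo_finish() runs the three steps in 
separate calls and rolls the stand-in transaction between them.  The 
cost of the pattern is visible even at this size: every restart point 
has to sit at the top level of the function, which is why spots buried 
in subroutines are hard to reach this way.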

Ideally it would be nice to roll the transaction in all the spots where 
it used to be rolled.  Those spots are easy to find: they are marked by 
all the "roll_trans" booleans that get removed in the next patch.  They 
are just hard to reach as "restart points" because they're buried in 
subroutines.

Alternatively, I had toyed with the idea of using a child thread that 
signals the parent thread to go roll the transaction and then wake the 
child when it's done.  That way the stack doesn't back out, and we don't 
need these state markers.  I'm not sure child thread management is really 
less of a headache though, and it's probably not great for performance 
either.  :-(

Allison

> 
>>   	if (error)
>>   		goto out_release_leaf;
>>   	if (!args.trans) {
>> @@ -554,8 +591,13 @@ xfs_attr_remove(
>>   	 */
>>   	xfs_trans_ijoin(args.trans, dp, 0);
>>   
>> -	error = xfs_attr_remove_args(&args, true);
>> +	error = xfs_has_attr(&args);
>> +	if (error)
>> +		goto out;
>> +
>>   
>> +	error = xfs_attr_remove_deferred(dp, args.trans,
>> +			name, namelen, flags);
>>   	if (error)
>>   		goto out;
>>   
>> diff --git a/fs/xfs/libxfs/xfs_attr.h b/fs/xfs/libxfs/xfs_attr.h
>> index 4ce3b0a..da95e69 100644
>> --- a/fs/xfs/libxfs/xfs_attr.h
>> +++ b/fs/xfs/libxfs/xfs_attr.h
>> @@ -181,7 +181,7 @@ int xfs_attr_set(struct xfs_inode *dp, const unsigned char *name,
>>   		 size_t namelen, unsigned char *value, int valuelen,
>>   		 int flags);
>>   int xfs_attr_set_args(struct xfs_da_args *args, struct xfs_buf **leaf_bp,
>> -		 bool roll_trans);
>> +		 enum xfs_attr_state *state, bool roll_trans);
>>   int xfs_attr_remove(struct xfs_inode *dp, const unsigned char *name,
>>   		    size_t namelen, int flags);
>>   int xfs_has_attr(struct xfs_da_args *args);
>> diff --git a/fs/xfs/xfs_attr_item.c b/fs/xfs/xfs_attr_item.c
>> index 36e6d1e..292d608 100644
>> --- a/fs/xfs/xfs_attr_item.c
>> +++ b/fs/xfs/xfs_attr_item.c
>> @@ -464,8 +464,11 @@ xfs_attri_recover(
>>   	struct xfs_attri_log_format	*attrp;
>>   	struct xfs_trans_res		tres;
>>   	int				local;
>> -	int				error = 0;
>> +	int				error, err2 = 0;
>>   	int				rsvd = 0;
>> +	enum xfs_attr_state		state = 0;
>> +	struct xfs_buf			*leaf_bp = NULL;
>> +
>>   
>>   	ASSERT(!test_bit(XFS_ATTRI_RECOVERED, &attrip->flags));
>>   
>> @@ -540,14 +543,40 @@ xfs_attri_recover(
>>   	xfs_ilock(ip, XFS_ILOCK_EXCL);
>>   
>>   	xfs_trans_ijoin(args.trans, ip, 0);
>> -	error = xfs_trans_attr(&args, attrdp, attrp->alfi_op_flags);
>> -	if (error)
>> -		goto abort_error;
>>   
>> +	do {
>> +		leaf_bp = NULL;
>> +
>> +		error = xfs_trans_attr(&args, attrdp, &leaf_bp, &state,
>> +				attrp->alfi_op_flags);
>> +		if (error && error != -EAGAIN)
>> +			goto abort_error;
>> +
>> +		xfs_trans_log_inode(args.trans, ip,
>> +				XFS_ILOG_CORE | XFS_ILOG_ADATA);
>> +
>> +		err2 = xfs_trans_commit(args.trans);
>> +		if (err2) {
>> +			error = err2;
>> +			goto abort_error;
>> +		}
>> +
>> +		if (error == -EAGAIN) {
>> +			err2 = xfs_trans_alloc(mp, &tres, args.total, 0,
>> +				XFS_TRANS_PERM_LOG_RES, &args.trans);
>> +			if (err2) {
>> +				error = err2;
>> +				goto abort_error;
>> +			}
>> +			xfs_trans_ijoin(args.trans, ip, 0);
>> +		}
>> +
>> +	} while (error == -EAGAIN);
>> +
>> +	if (leaf_bp)
>> +		xfs_trans_brelse(args.trans, leaf_bp);
>>   
>>   	set_bit(XFS_ATTRI_RECOVERED, &attrip->flags);
>> -	xfs_trans_log_inode(args.trans, ip, XFS_ILOG_CORE | XFS_ILOG_ADATA);
>> -	error = xfs_trans_commit(args.trans);
>>   	xfs_iunlock(ip, XFS_ILOCK_EXCL);
>>   	return error;
>>   
>> diff --git a/fs/xfs/xfs_trans.h b/fs/xfs/xfs_trans.h
>> index 7bb9d8e..c785cd7 100644
>> --- a/fs/xfs/xfs_trans.h
>> +++ b/fs/xfs/xfs_trans.h
>> @@ -239,6 +239,8 @@ xfs_trans_get_attrd(struct xfs_trans *tp,
>>   		    struct xfs_attri_log_item *attrip);
>>   int xfs_trans_attr(struct xfs_da_args *args,
>>   		   struct xfs_attrd_log_item *attrdp,
>> +		   struct xfs_buf **leaf_bp,
>> +		   void *state,
>>   		   uint32_t attr_op_flags);
>>   
>>   int		xfs_trans_commit(struct xfs_trans *);
>> diff --git a/fs/xfs/xfs_trans_attr.c b/fs/xfs/xfs_trans_attr.c
>> index 3679348..a3339ea 100644
>> --- a/fs/xfs/xfs_trans_attr.c
>> +++ b/fs/xfs/xfs_trans_attr.c
>> @@ -56,10 +56,11 @@ int
>>   xfs_trans_attr(
>>   	struct xfs_da_args		*args,
>>   	struct xfs_attrd_log_item	*attrdp,
>> +	struct xfs_buf			**leaf_bp,
>> +	void				*state,
>>   	uint32_t			op_flags)
>>   {
>>   	int				error;
>> -	struct xfs_buf			*leaf_bp = NULL;
>>   
>>   	error = xfs_qm_dqattach_locked(args->dp, 0);
>>   	if (error)
>> @@ -68,7 +69,8 @@ xfs_trans_attr(
>>   	switch (op_flags) {
>>   	case XFS_ATTR_OP_FLAGS_SET:
>>   		args->op_flags |= XFS_DA_OP_ADDNAME;
>> -		error = xfs_attr_set_args(args, &leaf_bp, false);
>> +		error = xfs_attr_set_args(args, leaf_bp,
>> +				(enum xfs_attr_state *)state, false);
>>   		break;
>>   	case XFS_ATTR_OP_FLAGS_REMOVE:
>>   		ASSERT(XFS_IFORK_Q((args->dp)));
>> @@ -78,11 +80,6 @@ xfs_trans_attr(
>>   		error = -EFSCORRUPTED;
>>   	}
>>   
>> -	if (error) {
>> -		if (leaf_bp)
>> -			xfs_trans_brelse(args->trans, leaf_bp);
>> -	}
>> -
>>   	/*
>>   	 * Mark the transaction dirty, even on error. This ensures the
>>   	 * transaction is aborted, which:
>> @@ -184,27 +181,40 @@ xfs_attr_finish_item(
>>   	char				*name_value;
>>   	int				error;
>>   	int				local;
>> -	struct xfs_da_args		args;
>> +	struct xfs_da_args		*args;
>>   
>>   	attr = container_of(item, struct xfs_attr_item, xattri_list);
>> -	name_value = ((char *)attr) + sizeof(struct xfs_attr_item);
>> -
>> -	error = xfs_attr_args_init(&args, attr->xattri_ip, name_value,
>> -				   attr->xattri_name_len, attr->xattri_flags);
>> -	if (error)
>> -		goto out;
>> +	args = &attr->xattri_args;
>> +
>> +	if (attr->xattri_state == 0) {
>> +		/* Only need to initialize args context once */
>> +		name_value = ((char *)attr) + sizeof(struct xfs_attr_item);
>> +		error = xfs_attr_args_init(args, attr->xattri_ip, name_value,
>> +					   attr->xattri_name_len,
>> +					   attr->xattri_flags);
>> +		if (error)
>> +			goto out;
>> +
>> +		args->hashval = xfs_da_hashname(args->name, args->namelen);
>> +		args->value = &name_value[attr->xattri_name_len];
>> +		args->valuelen = attr->xattri_value_len;
>> +		args->op_flags = XFS_DA_OP_OKNOENT;
>> +		args->total = xfs_attr_calc_size(args, &local);
>> +		attr->xattri_leaf_bp = NULL;
>> +	}
>>   
>> -	args.hashval = xfs_da_hashname(args.name, args.namelen);
>> -	args.value = &name_value[attr->xattri_name_len];
>> -	args.valuelen = attr->xattri_value_len;
>> -	args.op_flags = XFS_DA_OP_OKNOENT;
>> -	args.total = xfs_attr_calc_size(&args, &local);
>> -	args.trans = tp;
>> +	/*
>> +	 * Always reset trans after EAGAIN cycle
>> +	 * since the transaction is new
>> +	 */
>> +	args->trans = tp;
>>   
>> -	error = xfs_trans_attr(&args, done_item,
>> -			attr->xattri_op_flags);
>> +	error = xfs_trans_attr(args, done_item, &attr->xattri_leaf_bp,
>> +			&attr->xattri_state, attr->xattri_op_flags);
>>   out:
>> -	kmem_free(attr);
>> +	if (error != -EAGAIN)
>> +		kmem_free(attr);
>> +
>>   	return error;
>>   }
>>   
>> -- 
>> 2.7.4
>>


* Re: [PATCH 1/9] xfs: Remove all strlen in all xfs_attr_* functions for attr names.
  2019-04-12 22:50 ` [PATCH 1/9] xfs: Remove all strlen in all xfs_attr_* functions for attr names Allison Henderson
  2019-04-14 23:02   ` Dave Chinner
@ 2019-04-17 15:42   ` Brian Foster
  1 sibling, 0 replies; 48+ messages in thread
From: Brian Foster @ 2019-04-17 15:42 UTC (permalink / raw)
  To: Allison Henderson; +Cc: linux-xfs

On Fri, Apr 12, 2019 at 03:50:28PM -0700, Allison Henderson wrote:
> This helps to pre-simplify the extra handling of the null terminator in
> delayed operations which use memcpy rather than strlen.  Later
> when we introduce parent pointers, attribute names will become binary,
> so strlen will not work at all.  Removing uses of strlen now will
> help reduce complexities later
> 
> Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
> ---

This looks fine to me, Dave's suggestions aside:

Reviewed-by: Brian Foster <bfoster@redhat.com>
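
For illustration only (not part of the patch), a small userspace sketch 
of why an explicit length matters once names can be binary.  The demo_* 
names and the sample byte strings are hypothetical; the point is simply 
that strlen() stops at the first zero byte, while a caller-supplied 
namelen with memcmp() covers the whole name.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for the name fields of an args structure. */
struct demo_args {
	const unsigned char	*name;
	size_t			namelen;
};

/* Take the length from the caller instead of computing it with strlen(). */
static void demo_args_init(struct demo_args *args,
			   const unsigned char *name, size_t namelen)
{
	args->name = name;
	args->namelen = namelen;
}

/* Compare by explicit length so embedded zero bytes are honoured. */
static int demo_name_equal(const struct demo_args *a,
			   const struct demo_args *b)
{
	return a->namelen == b->namelen &&
	       memcmp(a->name, b->name, a->namelen) == 0;
}
```

With a five-byte name that begins with a zero byte, strlen() reports a 
length of 0, whereas demo_args_init() called with sizeof() of the array 
records all five bytes and the comparison can tell two such names apart.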

>  fs/xfs/libxfs/xfs_attr.c | 12 ++++++++----
>  fs/xfs/libxfs/xfs_attr.h |  9 ++++++---
>  fs/xfs/xfs_acl.c         | 12 +++++++-----
>  fs/xfs/xfs_ioctl.c       | 13 ++++++++++---
>  fs/xfs/xfs_iops.c        |  6 ++++--
>  fs/xfs/xfs_xattr.c       | 10 ++++++----
>  6 files changed, 41 insertions(+), 21 deletions(-)
> 
> diff --git a/fs/xfs/libxfs/xfs_attr.c b/fs/xfs/libxfs/xfs_attr.c
> index 2dd9ee2..3da6b0d 100644
> --- a/fs/xfs/libxfs/xfs_attr.c
> +++ b/fs/xfs/libxfs/xfs_attr.c
> @@ -67,6 +67,7 @@ xfs_attr_args_init(
>  	struct xfs_da_args	*args,
>  	struct xfs_inode	*dp,
>  	const unsigned char	*name,
> +	size_t			namelen,
>  	int			flags)
>  {
>  
> @@ -79,7 +80,7 @@ xfs_attr_args_init(
>  	args->dp = dp;
>  	args->flags = flags;
>  	args->name = name;
> -	args->namelen = strlen((const char *)name);
> +	args->namelen = namelen;
>  	if (args->namelen >= MAXNAMELEN)
>  		return -EFAULT;		/* match IRIX behaviour */
>  
> @@ -125,6 +126,7 @@ int
>  xfs_attr_get(
>  	struct xfs_inode	*ip,
>  	const unsigned char	*name,
> +	size_t			namelen,
>  	unsigned char		*value,
>  	int			*valuelenp,
>  	int			flags)
> @@ -138,7 +140,7 @@ xfs_attr_get(
>  	if (XFS_FORCED_SHUTDOWN(ip->i_mount))
>  		return -EIO;
>  
> -	error = xfs_attr_args_init(&args, ip, name, flags);
> +	error = xfs_attr_args_init(&args, ip, name, namelen, flags);
>  	if (error)
>  		return error;
>  
> @@ -317,6 +319,7 @@ int
>  xfs_attr_set(
>  	struct xfs_inode	*dp,
>  	const unsigned char	*name,
> +	size_t			namelen,
>  	unsigned char		*value,
>  	int			valuelen,
>  	int			flags)
> @@ -333,7 +336,7 @@ xfs_attr_set(
>  	if (XFS_FORCED_SHUTDOWN(dp->i_mount))
>  		return -EIO;
>  
> -	error = xfs_attr_args_init(&args, dp, name, flags);
> +	error = xfs_attr_args_init(&args, dp, name, namelen, flags);
>  	if (error)
>  		return error;
>  
> @@ -425,6 +428,7 @@ int
>  xfs_attr_remove(
>  	struct xfs_inode	*dp,
>  	const unsigned char	*name,
> +	size_t			namelen,
>  	int			flags)
>  {
>  	struct xfs_mount	*mp = dp->i_mount;
> @@ -436,7 +440,7 @@ xfs_attr_remove(
>  	if (XFS_FORCED_SHUTDOWN(dp->i_mount))
>  		return -EIO;
>  
> -	error = xfs_attr_args_init(&args, dp, name, flags);
> +	error = xfs_attr_args_init(&args, dp, name, namelen, flags);
>  	if (error)
>  		return error;
>  
> diff --git a/fs/xfs/libxfs/xfs_attr.h b/fs/xfs/libxfs/xfs_attr.h
> index 2297d84..52f63dc 100644
> --- a/fs/xfs/libxfs/xfs_attr.h
> +++ b/fs/xfs/libxfs/xfs_attr.h
> @@ -137,11 +137,14 @@ int xfs_attr_list_int(struct xfs_attr_list_context *);
>  int xfs_inode_hasattr(struct xfs_inode *ip);
>  int xfs_attr_get_ilocked(struct xfs_inode *ip, struct xfs_da_args *args);
>  int xfs_attr_get(struct xfs_inode *ip, const unsigned char *name,
> -		 unsigned char *value, int *valuelenp, int flags);
> +		 size_t namelen, unsigned char *value, int *valuelenp,
> +		 int flags);
>  int xfs_attr_set(struct xfs_inode *dp, const unsigned char *name,
> -		 unsigned char *value, int valuelen, int flags);
> +		 size_t namelen, unsigned char *value, int valuelen,
> +		 int flags);
>  int xfs_attr_set_args(struct xfs_da_args *args, struct xfs_buf **leaf_bp);
> -int xfs_attr_remove(struct xfs_inode *dp, const unsigned char *name, int flags);
> +int xfs_attr_remove(struct xfs_inode *dp, const unsigned char *name,
> +		    size_t namelen, int flags);
>  int xfs_attr_remove_args(struct xfs_da_args *args);
>  int xfs_attr_list(struct xfs_inode *dp, char *buffer, int bufsize,
>  		  int flags, struct attrlist_cursor_kern *cursor);
> diff --git a/fs/xfs/xfs_acl.c b/fs/xfs/xfs_acl.c
> index 8039e35..142de8d 100644
> --- a/fs/xfs/xfs_acl.c
> +++ b/fs/xfs/xfs_acl.c
> @@ -141,8 +141,8 @@ xfs_get_acl(struct inode *inode, int type)
>  	if (!xfs_acl)
>  		return ERR_PTR(-ENOMEM);
>  
> -	error = xfs_attr_get(ip, ea_name, (unsigned char *)xfs_acl,
> -							&len, ATTR_ROOT);
> +	error = xfs_attr_get(ip, ea_name, strlen(ea_name),
> +			     (unsigned char *)xfs_acl, &len, ATTR_ROOT);
>  	if (error) {
>  		/*
>  		 * If the attribute doesn't exist make sure we have a negative
> @@ -192,15 +192,17 @@ __xfs_set_acl(struct inode *inode, struct posix_acl *acl, int type)
>  		len -= sizeof(struct xfs_acl_entry) *
>  			 (XFS_ACL_MAX_ENTRIES(ip->i_mount) - acl->a_count);
>  
> -		error = xfs_attr_set(ip, ea_name, (unsigned char *)xfs_acl,
> -				len, ATTR_ROOT);
> +		error = xfs_attr_set(ip, ea_name, strlen(ea_name),
> +				     (unsigned char *)xfs_acl, len, ATTR_ROOT);
>  
>  		kmem_free(xfs_acl);
>  	} else {
>  		/*
>  		 * A NULL ACL argument means we want to remove the ACL.
>  		 */
> -		error = xfs_attr_remove(ip, ea_name, ATTR_ROOT);
> +		error = xfs_attr_remove(ip, ea_name,
> +					strlen(ea_name),
> +					ATTR_ROOT);
>  
>  		/*
>  		 * If the attribute didn't exist to start with that's fine.
> diff --git a/fs/xfs/xfs_ioctl.c b/fs/xfs/xfs_ioctl.c
> index 6ecdbb3..ab341d6 100644
> --- a/fs/xfs/xfs_ioctl.c
> +++ b/fs/xfs/xfs_ioctl.c
> @@ -437,6 +437,7 @@ xfs_attrmulti_attr_get(
>  {
>  	unsigned char		*kbuf;
>  	int			error = -EFAULT;
> +	size_t			namelen;
>  
>  	if (*len > XFS_XATTR_SIZE_MAX)
>  		return -EINVAL;
> @@ -444,7 +445,9 @@ xfs_attrmulti_attr_get(
>  	if (!kbuf)
>  		return -ENOMEM;
>  
> -	error = xfs_attr_get(XFS_I(inode), name, kbuf, (int *)len, flags);
> +	namelen = strlen(name);
> +	error = xfs_attr_get(XFS_I(inode), name, namelen,
> +			     kbuf, (int *)len, flags);
>  	if (error)
>  		goto out_kfree;
>  
> @@ -466,6 +469,7 @@ xfs_attrmulti_attr_set(
>  {
>  	unsigned char		*kbuf;
>  	int			error;
> +	size_t			namelen;
>  
>  	if (IS_IMMUTABLE(inode) || IS_APPEND(inode))
>  		return -EPERM;
> @@ -476,7 +480,8 @@ xfs_attrmulti_attr_set(
>  	if (IS_ERR(kbuf))
>  		return PTR_ERR(kbuf);
>  
> -	error = xfs_attr_set(XFS_I(inode), name, kbuf, len, flags);
> +	namelen = strlen(name);
> +	error = xfs_attr_set(XFS_I(inode), name, namelen, kbuf, len, flags);
>  	if (!error)
>  		xfs_forget_acl(inode, name, flags);
>  	kfree(kbuf);
> @@ -490,10 +495,12 @@ xfs_attrmulti_attr_remove(
>  	uint32_t		flags)
>  {
>  	int			error;
> +	size_t			namelen;
>  
>  	if (IS_IMMUTABLE(inode) || IS_APPEND(inode))
>  		return -EPERM;
> -	error = xfs_attr_remove(XFS_I(inode), name, flags);
> +	namelen = strlen(name);
> +	error = xfs_attr_remove(XFS_I(inode), name, namelen, flags);
>  	if (!error)
>  		xfs_forget_acl(inode, name, flags);
>  	return error;
> diff --git a/fs/xfs/xfs_iops.c b/fs/xfs/xfs_iops.c
> index 74047bd..e73c21a 100644
> --- a/fs/xfs/xfs_iops.c
> +++ b/fs/xfs/xfs_iops.c
> @@ -59,8 +59,10 @@ xfs_initxattrs(
>  	int			error = 0;
>  
>  	for (xattr = xattr_array; xattr->name != NULL; xattr++) {
> -		error = xfs_attr_set(ip, xattr->name, xattr->value,
> -				      xattr->value_len, ATTR_SECURE);
> +		error = xfs_attr_set(ip, xattr->name,
> +				     strlen(xattr->name),
> +				     xattr->value, xattr->value_len,
> +				     ATTR_SECURE);
>  		if (error < 0)
>  			break;
>  	}
> diff --git a/fs/xfs/xfs_xattr.c b/fs/xfs/xfs_xattr.c
> index 9a63016..3013746 100644
> --- a/fs/xfs/xfs_xattr.c
> +++ b/fs/xfs/xfs_xattr.c
> @@ -26,6 +26,7 @@ xfs_xattr_get(const struct xattr_handler *handler, struct dentry *unused,
>  	int xflags = handler->flags;
>  	struct xfs_inode *ip = XFS_I(inode);
>  	int error, asize = size;
> +	size_t namelen = strlen(name);
>  
>  	/* Convert Linux syscall to XFS internal ATTR flags */
>  	if (!size) {
> @@ -33,7 +34,7 @@ xfs_xattr_get(const struct xattr_handler *handler, struct dentry *unused,
>  		value = NULL;
>  	}
>  
> -	error = xfs_attr_get(ip, (unsigned char *)name, value, &asize, xflags);
> +	error = xfs_attr_get(ip, name, namelen, value, &asize, xflags);
>  	if (error)
>  		return error;
>  	return asize;
> @@ -69,6 +70,7 @@ xfs_xattr_set(const struct xattr_handler *handler, struct dentry *unused,
>  	int			xflags = handler->flags;
>  	struct xfs_inode	*ip = XFS_I(inode);
>  	int			error;
> +	size_t			namelen = strlen(name);
>  
>  	/* Convert Linux syscall to XFS internal ATTR flags */
>  	if (flags & XATTR_CREATE)
> @@ -77,9 +79,9 @@ xfs_xattr_set(const struct xattr_handler *handler, struct dentry *unused,
>  		xflags |= ATTR_REPLACE;
>  
>  	if (!value)
> -		return xfs_attr_remove(ip, (unsigned char *)name, xflags);
> -	error = xfs_attr_set(ip, (unsigned char *)name,
> -				(void *)value, size, xflags);
> +		return xfs_attr_remove(ip, name,
> +				       namelen, xflags);
> +	error = xfs_attr_set(ip, name, namelen, (void *)value, size, xflags);
>  	if (!error)
>  		xfs_forget_acl(inode, name, xflags);
>  
> -- 
> 2.7.4
> 


* Re: [PATCH 2/9] xfs: Hold inode locks in xfs_ialloc
  2019-04-12 22:50 ` [PATCH 2/9] xfs: Hold inode locks in xfs_ialloc Allison Henderson
@ 2019-04-17 15:44   ` Brian Foster
  2019-04-17 17:35     ` Allison Henderson
  0 siblings, 1 reply; 48+ messages in thread
From: Brian Foster @ 2019-04-17 15:44 UTC (permalink / raw)
  To: Allison Henderson; +Cc: linux-xfs

On Fri, Apr 12, 2019 at 03:50:29PM -0700, Allison Henderson wrote:
> Modify xfs_ialloc to hold locks after return.  Caller
> will be responsible for manual unlock.  We will need
> this later to hold locks across parent pointer operations
> 
> Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
> ---

Is this needed for this series, or can it be deferred to the parent
pointer stuff? (If it is needed now, it might be a good idea to update
the commit log to explain why.)

Brian

>  fs/xfs/xfs_inode.c   | 6 +++++-
>  fs/xfs/xfs_qm.c      | 1 +
>  fs/xfs/xfs_symlink.c | 3 +++
>  3 files changed, 9 insertions(+), 1 deletion(-)
> 
> diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c
> index f643a92..30a3130 100644
> --- a/fs/xfs/xfs_inode.c
> +++ b/fs/xfs/xfs_inode.c
> @@ -744,6 +744,8 @@ xfs_lookup(
>   * to attach to or associate with (i.e. pip == NULL) because they
>   * are not linked into the directory structure - they are attached
>   * directly to the superblock - and so have no parent.
> + *
> + * Caller is responsible for unlocking the inode manually upon return
>   */
>  static int
>  xfs_ialloc(
> @@ -942,7 +944,7 @@ xfs_ialloc(
>  	/*
>  	 * Log the new values stuffed into the inode.
>  	 */
> -	xfs_trans_ijoin(tp, ip, XFS_ILOCK_EXCL);
> +	xfs_trans_ijoin(tp, ip, 0);
>  	xfs_trans_log_inode(tp, ip, flags);
>  
>  	/* now that we have an i_mode we can setup the inode structure */
> @@ -1264,6 +1266,7 @@ xfs_create(
>  	xfs_qm_dqrele(pdqp);
>  
>  	*ipp = ip;
> +	xfs_iunlock(ip, XFS_ILOCK_EXCL);
>  	return 0;
>  
>   out_trans_cancel:
> @@ -1359,6 +1362,7 @@ xfs_create_tmpfile(
>  	xfs_qm_dqrele(pdqp);
>  
>  	*ipp = ip;
> +	xfs_iunlock(ip, XFS_ILOCK_EXCL);
>  	return 0;
>  
>   out_trans_cancel:
> diff --git a/fs/xfs/xfs_qm.c b/fs/xfs/xfs_qm.c
> index 52ed790..69006e5 100644
> --- a/fs/xfs/xfs_qm.c
> +++ b/fs/xfs/xfs_qm.c
> @@ -820,6 +820,7 @@ xfs_qm_qino_alloc(
>  	}
>  	if (need_alloc)
>  		xfs_finish_inode_setup(*ip);
> +	xfs_iunlock(*ip, XFS_ILOCK_EXCL);
>  	return error;
>  }
>  
> diff --git a/fs/xfs/xfs_symlink.c b/fs/xfs/xfs_symlink.c
> index b2c1177..13d31fe 100644
> --- a/fs/xfs/xfs_symlink.c
> +++ b/fs/xfs/xfs_symlink.c
> @@ -353,6 +353,7 @@ xfs_symlink(
>  	xfs_qm_dqrele(pdqp);
>  
>  	*ipp = ip;
> +	xfs_iunlock(ip, XFS_ILOCK_EXCL);
>  	return 0;
>  
>  out_trans_cancel:
> @@ -374,6 +375,8 @@ xfs_symlink(
>  
>  	if (unlock_dp_on_error)
>  		xfs_iunlock(dp, XFS_ILOCK_EXCL);
> +	if (ip)
> +		xfs_iunlock(ip, XFS_ILOCK_EXCL);
>  	return error;
>  }
>  
> -- 
> 2.7.4
> 


* Re: [PATCH 2/9] xfs: Hold inode locks in xfs_ialloc
  2019-04-17 15:44   ` Brian Foster
@ 2019-04-17 17:35     ` Allison Henderson
  0 siblings, 0 replies; 48+ messages in thread
From: Allison Henderson @ 2019-04-17 17:35 UTC (permalink / raw)
  To: Brian Foster; +Cc: linux-xfs

On 4/17/19 8:44 AM, Brian Foster wrote:
> On Fri, Apr 12, 2019 at 03:50:29PM -0700, Allison Henderson wrote:
>> Modify xfs_ialloc to hold locks after return.  Caller
>> will be responsible for manual unlock.  We will need
>> this later to hold locks across parent pointer operations
>>
>> Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
>> ---
> 
> Is this needed for this series, or can it be deferred to the parent
> pointer stuff? (If it is needed now, it might be a good idea to update
> the commit log to explain why.)
> 
> Brian

I think I can defer this one.  I seem to be able to take it out and 
still get through the attribute test group, so I don't think it's really 
having an effect just yet.  I will save it for the pptr series then. 
Thanks!

Allison

> 
>>   fs/xfs/xfs_inode.c   | 6 +++++-
>>   fs/xfs/xfs_qm.c      | 1 +
>>   fs/xfs/xfs_symlink.c | 3 +++
>>   3 files changed, 9 insertions(+), 1 deletion(-)
>>
>> diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c
>> index f643a92..30a3130 100644
>> --- a/fs/xfs/xfs_inode.c
>> +++ b/fs/xfs/xfs_inode.c
>> @@ -744,6 +744,8 @@ xfs_lookup(
>>    * to attach to or associate with (i.e. pip == NULL) because they
>>    * are not linked into the directory structure - they are attached
>>    * directly to the superblock - and so have no parent.
>> + *
>> + * Caller is responsible for unlocking the inode manually upon return
>>    */
>>   static int
>>   xfs_ialloc(
>> @@ -942,7 +944,7 @@ xfs_ialloc(
>>   	/*
>>   	 * Log the new values stuffed into the inode.
>>   	 */
>> -	xfs_trans_ijoin(tp, ip, XFS_ILOCK_EXCL);
>> +	xfs_trans_ijoin(tp, ip, 0);
>>   	xfs_trans_log_inode(tp, ip, flags);
>>   
>>   	/* now that we have an i_mode we can setup the inode structure */
>> @@ -1264,6 +1266,7 @@ xfs_create(
>>   	xfs_qm_dqrele(pdqp);
>>   
>>   	*ipp = ip;
>> +	xfs_iunlock(ip, XFS_ILOCK_EXCL);
>>   	return 0;
>>   
>>    out_trans_cancel:
>> @@ -1359,6 +1362,7 @@ xfs_create_tmpfile(
>>   	xfs_qm_dqrele(pdqp);
>>   
>>   	*ipp = ip;
>> +	xfs_iunlock(ip, XFS_ILOCK_EXCL);
>>   	return 0;
>>   
>>    out_trans_cancel:
>> diff --git a/fs/xfs/xfs_qm.c b/fs/xfs/xfs_qm.c
>> index 52ed790..69006e5 100644
>> --- a/fs/xfs/xfs_qm.c
>> +++ b/fs/xfs/xfs_qm.c
>> @@ -820,6 +820,7 @@ xfs_qm_qino_alloc(
>>   	}
>>   	if (need_alloc)
>>   		xfs_finish_inode_setup(*ip);
>> +	xfs_iunlock(*ip, XFS_ILOCK_EXCL);
>>   	return error;
>>   }
>>   
>> diff --git a/fs/xfs/xfs_symlink.c b/fs/xfs/xfs_symlink.c
>> index b2c1177..13d31fe 100644
>> --- a/fs/xfs/xfs_symlink.c
>> +++ b/fs/xfs/xfs_symlink.c
>> @@ -353,6 +353,7 @@ xfs_symlink(
>>   	xfs_qm_dqrele(pdqp);
>>   
>>   	*ipp = ip;
>> +	xfs_iunlock(ip, XFS_ILOCK_EXCL);
>>   	return 0;
>>   
>>   out_trans_cancel:
>> @@ -374,6 +375,8 @@ xfs_symlink(
>>   
>>   	if (unlock_dp_on_error)
>>   		xfs_iunlock(dp, XFS_ILOCK_EXCL);
>> +	if (ip)
>> +		xfs_iunlock(ip, XFS_ILOCK_EXCL);
>>   	return error;
>>   }
>>   
>> -- 
>> 2.7.4
>>


* Re: [PATCH 3/9] xfs: Add trans toggle to attr routines
  2019-04-12 22:50 ` [PATCH 3/9] xfs: Add trans toggle to attr routines Allison Henderson
@ 2019-04-18 15:27   ` Brian Foster
  2019-04-18 21:23     ` Allison Henderson
  0 siblings, 1 reply; 48+ messages in thread
From: Brian Foster @ 2019-04-18 15:27 UTC (permalink / raw)
  To: Allison Henderson; +Cc: linux-xfs

On Fri, Apr 12, 2019 at 03:50:30PM -0700, Allison Henderson wrote:
> This patch adds a roll_trans parameter to all attribute routines
> that may roll a transaction. Calling functions may pass true to
> roll transactions as normal, or false to hold them.
> 
> This patch is temporary and will be removed later when all code
> paths have been made to pass a false value.  The temporary boolean
> assists us to introduce changes across multiple smaller patches instead
> of handling all affected code paths in one large patch.
> 
> Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
> ---

A couple more sentences in the commit log with details on the purpose
for this would be helpful. E.g., the current state of the attr code
rolls the transaction at various places, the implementation we're moving
to can't or shouldn't do this because <reasons>, etc.
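
For readers skimming the diff below, the plumbing pattern can be sketched
in user space roughly as follows. This is only an illustration under
stated assumptions: demo_trans, demo_trans_roll() and demo_attr_setflag()
are hypothetical stand-ins, not kernel functions, and the real helpers
roll via xfs_trans_roll_inode() or xfs_defer_finish().

```c
#include <stdbool.h>

/* Hypothetical stand-in for a transaction; counts how often it was rolled. */
struct demo_trans {
	int	rolls;
};

/* Stand-in for rolling: commit this transaction and start the next one. */
static int demo_trans_roll(struct demo_trans *tp)
{
	tp->rolls++;
	return 0;
}

/*
 * Stand-in for an attr helper that changes metadata and then rolls the
 * transaction only when the caller asks for it.
 */
static int demo_attr_setflag(struct demo_trans *tp, bool roll_trans)
{
	int	error = 0;

	/* ... the metadata change would be logged here ... */

	if (roll_trans)
		error = demo_trans_roll(tp);
	return error;
}
```

Passing true keeps the existing behaviour; passing false leaves the
transaction untouched so the caller decides when it gets rolled.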

>  fs/xfs/libxfs/xfs_attr.c        | 257 +++++++++++++++++++++++-----------------
>  fs/xfs/libxfs/xfs_attr.h        |   5 +-
>  fs/xfs/libxfs/xfs_attr_leaf.c   |  20 +++-
>  fs/xfs/libxfs/xfs_attr_leaf.h   |   8 +-
>  fs/xfs/libxfs/xfs_attr_remote.c |  50 ++++----
>  fs/xfs/libxfs/xfs_attr_remote.h |   4 +-
>  6 files changed, 203 insertions(+), 141 deletions(-)
> 
> diff --git a/fs/xfs/libxfs/xfs_attr.c b/fs/xfs/libxfs/xfs_attr.c
> index 3da6b0d..c50bbf6 100644
> --- a/fs/xfs/libxfs/xfs_attr.c
> +++ b/fs/xfs/libxfs/xfs_attr.c
...
> @@ -743,7 +762,8 @@ xfs_attr_leaf_addname(
>   */
>  STATIC int
>  xfs_attr_leaf_removename(
> -	struct xfs_da_args	*args)
> +	struct xfs_da_args	*args,
> +	bool roll_trans)

Indentation ^

Nits aside this looks like fairly mechanical plumbing, doesn't change
behavior in the current attr codepaths and seems reasonable as a
transient step:

Reviewed-by: Brian Foster <bfoster@redhat.com>

>  {
>  	struct xfs_inode	*dp;
>  	struct xfs_buf		*bp;
> @@ -776,9 +796,12 @@ xfs_attr_leaf_removename(
>  		/* bp is gone due to xfs_da_shrink_inode */
>  		if (error)
>  			return error;
> -		error = xfs_defer_finish(&args->trans);
> -		if (error)
> -			return error;
> +
> +		if (roll_trans) {
> +			error = xfs_defer_finish(&args->trans);
> +			if (error)
> +				return error;
> +		}
>  	}
>  	return 0;
>  }
> @@ -831,7 +854,8 @@ xfs_attr_leaf_get(xfs_da_args_t *args)
>   */
>  STATIC int
>  xfs_attr_node_addname(
> -	struct xfs_da_args	*args)
> +	struct xfs_da_args	*args,
> +	bool			roll_trans)
>  {
>  	struct xfs_da_state	*state;
>  	struct xfs_da_state_blk	*blk;
> @@ -899,17 +923,20 @@ xfs_attr_node_addname(
>  			error = xfs_attr3_leaf_to_node(args);
>  			if (error)
>  				goto out;
> -			error = xfs_defer_finish(&args->trans);
> -			if (error)
> -				goto out;
>  
> -			/*
> -			 * Commit the node conversion and start the next
> -			 * trans in the chain.
> -			 */
> -			error = xfs_trans_roll_inode(&args->trans, dp);
> -			if (error)
> -				goto out;
> +			if (roll_trans) {
> +				error = xfs_defer_finish(&args->trans);
> +				if (error)
> +					goto out;
> +
> +				/*
> +				 * Commit the node conversion and start the next
> +				 * trans in the chain.
> +				 */
> +				error = xfs_trans_roll_inode(&args->trans, dp);
> +				if (error)
> +					goto out;
> +			}
>  
>  			goto restart;
>  		}
> @@ -923,9 +950,13 @@ xfs_attr_node_addname(
>  		error = xfs_da3_split(state);
>  		if (error)
>  			goto out;
> -		error = xfs_defer_finish(&args->trans);
> -		if (error)
> -			goto out;
> +
> +		if (roll_trans) {
> +			error = xfs_defer_finish(&args->trans);
> +			if (error)
> +				goto out;
> +		}
> +
>  	} else {
>  		/*
>  		 * Addition succeeded, update Btree hashvals.
> @@ -944,9 +975,11 @@ xfs_attr_node_addname(
>  	 * Commit the leaf addition or btree split and start the next
>  	 * trans in the chain.
>  	 */
> -	error = xfs_trans_roll_inode(&args->trans, dp);
> -	if (error)
> -		goto out;
> +	if (roll_trans) {
> +		error = xfs_trans_roll_inode(&args->trans, dp);
> +		if (error)
> +			goto out;
> +	}
>  
>  	/*
>  	 * If there was an out-of-line value, allocate the blocks we
> @@ -955,7 +988,7 @@ xfs_attr_node_addname(
>  	 * maximum size of a transaction and/or hit a deadlock.
>  	 */
>  	if (args->rmtblkno > 0) {
> -		error = xfs_attr_rmtval_set(args);
> +		error = xfs_attr_rmtval_set(args, roll_trans);
>  		if (error)
>  			return error;
>  	}
> @@ -971,7 +1004,7 @@ xfs_attr_node_addname(
>  		 * In a separate transaction, set the incomplete flag on the
>  		 * "old" attr and clear the incomplete flag on the "new" attr.
>  		 */
> -		error = xfs_attr3_leaf_flipflags(args);
> +		error = xfs_attr3_leaf_flipflags(args, roll_trans);
>  		if (error)
>  			goto out;
>  
> @@ -985,7 +1018,7 @@ xfs_attr_node_addname(
>  		args->rmtblkcnt = args->rmtblkcnt2;
>  		args->rmtvaluelen = args->rmtvaluelen2;
>  		if (args->rmtblkno) {
> -			error = xfs_attr_rmtval_remove(args);
> +			error = xfs_attr_rmtval_remove(args, roll_trans);
>  			if (error)
>  				return error;
>  		}
> @@ -1019,9 +1052,11 @@ xfs_attr_node_addname(
>  			error = xfs_da3_join(state);
>  			if (error)
>  				goto out;
> -			error = xfs_defer_finish(&args->trans);
> -			if (error)
> -				goto out;
> +			if (roll_trans) {
> +				error = xfs_defer_finish(&args->trans);
> +				if (error)
> +					goto out;
> +			}
>  		}
>  
>  		/*
> @@ -1035,7 +1070,7 @@ xfs_attr_node_addname(
>  		/*
>  		 * Added a "remote" value, just clear the incomplete flag.
>  		 */
> -		error = xfs_attr3_leaf_clearflag(args);
> +		error = xfs_attr3_leaf_clearflag(args, roll_trans);
>  		if (error)
>  			goto out;
>  	}
> @@ -1058,7 +1093,8 @@ xfs_attr_node_addname(
>   */
>  STATIC int
>  xfs_attr_node_removename(
> -	struct xfs_da_args	*args)
> +	struct xfs_da_args	*args,
> +	bool			roll_trans)
>  {
>  	struct xfs_da_state	*state;
>  	struct xfs_da_state_blk	*blk;
> @@ -1108,10 +1144,10 @@ xfs_attr_node_removename(
>  		 * Mark the attribute as INCOMPLETE, then bunmapi() the
>  		 * remote value.
>  		 */
> -		error = xfs_attr3_leaf_setflag(args);
> +		error = xfs_attr3_leaf_setflag(args, roll_trans);
>  		if (error)
>  			goto out;
> -		error = xfs_attr_rmtval_remove(args);
> +		error = xfs_attr_rmtval_remove(args, roll_trans);
>  		if (error)
>  			goto out;
>  
> @@ -1139,15 +1175,19 @@ xfs_attr_node_removename(
>  		error = xfs_da3_join(state);
>  		if (error)
>  			goto out;
> -		error = xfs_defer_finish(&args->trans);
> -		if (error)
> -			goto out;
> -		/*
> -		 * Commit the Btree join operation and start a new trans.
> -		 */
> -		error = xfs_trans_roll_inode(&args->trans, dp);
> -		if (error)
> -			goto out;
> +
> +		if (roll_trans) {
> +			error = xfs_defer_finish(&args->trans);
> +			if (error)
> +				goto out;
> +			/*
> +			 * Commit the Btree join operation and start
> +			 * a new trans.
> +			 */
> +			error = xfs_trans_roll_inode(&args->trans, dp);
> +			if (error)
> +				goto out;
> +		}
>  	}
>  
>  	/*
> @@ -1170,9 +1210,12 @@ xfs_attr_node_removename(
>  			/* bp is gone due to xfs_da_shrink_inode */
>  			if (error)
>  				goto out;
> -			error = xfs_defer_finish(&args->trans);
> -			if (error)
> -				goto out;
> +
> +			if (roll_trans) {
> +				error = xfs_defer_finish(&args->trans);
> +				if (error)
> +					goto out;
> +			}
>  		} else
>  			xfs_trans_brelse(args->trans, bp);
>  	}
> diff --git a/fs/xfs/libxfs/xfs_attr.h b/fs/xfs/libxfs/xfs_attr.h
> index 52f63dc..f0e91bf 100644
> --- a/fs/xfs/libxfs/xfs_attr.h
> +++ b/fs/xfs/libxfs/xfs_attr.h
> @@ -142,10 +142,11 @@ int xfs_attr_get(struct xfs_inode *ip, const unsigned char *name,
>  int xfs_attr_set(struct xfs_inode *dp, const unsigned char *name,
>  		 size_t namelen, unsigned char *value, int valuelen,
>  		 int flags);
> -int xfs_attr_set_args(struct xfs_da_args *args, struct xfs_buf **leaf_bp);
> +int xfs_attr_set_args(struct xfs_da_args *args, struct xfs_buf **leaf_bp,
> +		 bool roll_trans);
>  int xfs_attr_remove(struct xfs_inode *dp, const unsigned char *name,
>  		    size_t namelen, int flags);
> -int xfs_attr_remove_args(struct xfs_da_args *args);
> +int xfs_attr_remove_args(struct xfs_da_args *args, bool roll_trans);
>  int xfs_attr_list(struct xfs_inode *dp, char *buffer, int bufsize,
>  		  int flags, struct attrlist_cursor_kern *cursor);
>  bool xfs_attr_namecheck(const void *name, size_t length);
> diff --git a/fs/xfs/libxfs/xfs_attr_leaf.c b/fs/xfs/libxfs/xfs_attr_leaf.c
> index 1f6e396..128bfe9 100644
> --- a/fs/xfs/libxfs/xfs_attr_leaf.c
> +++ b/fs/xfs/libxfs/xfs_attr_leaf.c
> @@ -2637,7 +2637,8 @@ xfs_attr_leaf_newentsize(
>   */
>  int
>  xfs_attr3_leaf_clearflag(
> -	struct xfs_da_args	*args)
> +	struct xfs_da_args	*args,
> +	bool			roll_trans)
>  {
>  	struct xfs_attr_leafblock *leaf;
>  	struct xfs_attr_leaf_entry *entry;
> @@ -2698,7 +2699,9 @@ xfs_attr3_leaf_clearflag(
>  	/*
>  	 * Commit the flag value change and start the next trans in series.
>  	 */
> -	return xfs_trans_roll_inode(&args->trans, args->dp);
> +	if (roll_trans)
> +		error = xfs_trans_roll_inode(&args->trans, args->dp);
> +	return error;
>  }
>  
>  /*
> @@ -2706,7 +2709,8 @@ xfs_attr3_leaf_clearflag(
>   */
>  int
>  xfs_attr3_leaf_setflag(
> -	struct xfs_da_args	*args)
> +	struct xfs_da_args	*args,
> +	bool			roll_trans)
>  {
>  	struct xfs_attr_leafblock *leaf;
>  	struct xfs_attr_leaf_entry *entry;
> @@ -2749,7 +2753,9 @@ xfs_attr3_leaf_setflag(
>  	/*
>  	 * Commit the flag value change and start the next trans in series.
>  	 */
> -	return xfs_trans_roll_inode(&args->trans, args->dp);
> +	if (roll_trans)
> +		error = xfs_trans_roll_inode(&args->trans, args->dp);
> +	return error;
>  }
>  
>  /*
> @@ -2761,7 +2767,8 @@ xfs_attr3_leaf_setflag(
>   */
>  int
>  xfs_attr3_leaf_flipflags(
> -	struct xfs_da_args	*args)
> +	struct xfs_da_args	*args,
> +	bool			roll_trans)
>  {
>  	struct xfs_attr_leafblock *leaf1;
>  	struct xfs_attr_leafblock *leaf2;
> @@ -2867,7 +2874,8 @@ xfs_attr3_leaf_flipflags(
>  	/*
>  	 * Commit the flag value change and start the next trans in series.
>  	 */
> -	error = xfs_trans_roll_inode(&args->trans, args->dp);
> +	if (roll_trans)
> +		error = xfs_trans_roll_inode(&args->trans, args->dp);
>  
>  	return error;
>  }
> diff --git a/fs/xfs/libxfs/xfs_attr_leaf.h b/fs/xfs/libxfs/xfs_attr_leaf.h
> index 7b74e18..9d830ec 100644
> --- a/fs/xfs/libxfs/xfs_attr_leaf.h
> +++ b/fs/xfs/libxfs/xfs_attr_leaf.h
> @@ -49,10 +49,10 @@ void	xfs_attr_fork_remove(struct xfs_inode *ip, struct xfs_trans *tp);
>   */
>  int	xfs_attr3_leaf_to_node(struct xfs_da_args *args);
>  int	xfs_attr3_leaf_to_shortform(struct xfs_buf *bp,
> -				   struct xfs_da_args *args, int forkoff);
> -int	xfs_attr3_leaf_clearflag(struct xfs_da_args *args);
> -int	xfs_attr3_leaf_setflag(struct xfs_da_args *args);
> -int	xfs_attr3_leaf_flipflags(struct xfs_da_args *args);
> +			struct xfs_da_args *args, int forkoff);
> +int	xfs_attr3_leaf_clearflag(struct xfs_da_args *args, bool roll_trans);
> +int	xfs_attr3_leaf_setflag(struct xfs_da_args *args, bool roll_trans);
> +int	xfs_attr3_leaf_flipflags(struct xfs_da_args *args, bool roll_trans);
>  
>  /*
>   * Routines used for growing the Btree.
> diff --git a/fs/xfs/libxfs/xfs_attr_remote.c b/fs/xfs/libxfs/xfs_attr_remote.c
> index 65ff600..18fbd22 100644
> --- a/fs/xfs/libxfs/xfs_attr_remote.c
> +++ b/fs/xfs/libxfs/xfs_attr_remote.c
> @@ -435,7 +435,8 @@ xfs_attr_rmtval_get(
>   */
>  int
>  xfs_attr_rmtval_set(
> -	struct xfs_da_args	*args)
> +	struct xfs_da_args	*args,
> +	bool			roll_trans)
>  {
>  	struct xfs_inode	*dp = args->dp;
>  	struct xfs_mount	*mp = dp->i_mount;
> @@ -488,9 +489,12 @@ xfs_attr_rmtval_set(
>  				  &nmap);
>  		if (error)
>  			return error;
> -		error = xfs_defer_finish(&args->trans);
> -		if (error)
> -			return error;
> +
> +		if (roll_trans) {
> +			error = xfs_defer_finish(&args->trans);
> +			if (error)
> +				return error;
> +		}
>  
>  		ASSERT(nmap == 1);
>  		ASSERT((map.br_startblock != DELAYSTARTBLOCK) &&
> @@ -498,12 +502,14 @@ xfs_attr_rmtval_set(
>  		lblkno += map.br_blockcount;
>  		blkcnt -= map.br_blockcount;
>  
> -		/*
> -		 * Start the next trans in the chain.
> -		 */
> -		error = xfs_trans_roll_inode(&args->trans, dp);
> -		if (error)
> -			return error;
> +		if (roll_trans) {
> +			/*
> +			 * Start the next trans in the chain.
> +			 */
> +			error = xfs_trans_roll_inode(&args->trans, dp);
> +			if (error)
> +				return error;
> +		}
>  	}
>  
>  	/*
> @@ -563,7 +569,8 @@ xfs_attr_rmtval_set(
>   */
>  int
>  xfs_attr_rmtval_remove(
> -	struct xfs_da_args	*args)
> +	struct xfs_da_args	*args,
> +	bool			roll_trans)
>  {
>  	struct xfs_mount	*mp = args->dp->i_mount;
>  	xfs_dablk_t		lblkno;
> @@ -625,16 +632,19 @@ xfs_attr_rmtval_remove(
>  				    XFS_BMAPI_ATTRFORK, 1, &done);
>  		if (error)
>  			return error;
> -		error = xfs_defer_finish(&args->trans);
> -		if (error)
> -			return error;
>  
> -		/*
> -		 * Close out trans and start the next one in the chain.
> -		 */
> -		error = xfs_trans_roll_inode(&args->trans, args->dp);
> -		if (error)
> -			return error;
> +		if (roll_trans) {
> +			error = xfs_defer_finish(&args->trans);
> +			if (error)
> +				return error;
> +
> +			/*
> +			 * Close out trans and start the next one in the chain.
> +			 */
> +			error = xfs_trans_roll_inode(&args->trans, args->dp);
> +			if (error)
> +				return error;
> +		}
>  	}
>  	return 0;
>  }
> diff --git a/fs/xfs/libxfs/xfs_attr_remote.h b/fs/xfs/libxfs/xfs_attr_remote.h
> index 9d20b66..c7c073d 100644
> --- a/fs/xfs/libxfs/xfs_attr_remote.h
> +++ b/fs/xfs/libxfs/xfs_attr_remote.h
> @@ -9,7 +9,7 @@
>  int xfs_attr3_rmt_blocks(struct xfs_mount *mp, int attrlen);
>  
>  int xfs_attr_rmtval_get(struct xfs_da_args *args);
> -int xfs_attr_rmtval_set(struct xfs_da_args *args);
> -int xfs_attr_rmtval_remove(struct xfs_da_args *args);
> +int xfs_attr_rmtval_set(struct xfs_da_args *args, bool roll_trans);
> +int xfs_attr_rmtval_remove(struct xfs_da_args *args, bool roll_trans);
>  
>  #endif /* __XFS_ATTR_REMOTE_H__ */
> -- 
> 2.7.4
> 


* Re: [PATCH 4/9] xfs: Set up infastructure for deferred attribute operations
  2019-04-12 22:50 ` [PATCH 4/9] xfs: Set up infastructure for deferred attribute operations Allison Henderson
@ 2019-04-18 15:48   ` Brian Foster
  2019-04-18 21:27     ` Allison Henderson
  0 siblings, 1 reply; 48+ messages in thread
From: Brian Foster @ 2019-04-18 15:48 UTC (permalink / raw)
  To: Allison Henderson; +Cc: linux-xfs

On Fri, Apr 12, 2019 at 03:50:31PM -0700, Allison Henderson wrote:
> This patch adds two new log item types for setting or
> removing attributes as deferred operations.  The
> xfs_attri_log_item logs an intent to set or remove an
> attribute.  The corresponding xfs_attrd_log_item holds
> a reference to the xfs_attri_log_item and is freed once
> the transaction is done.  Both log items use a generic
> xfs_attr_log_format structure that contains the attribute
> name, value, flags, inode, and an op_flag that indicates
> if the operation is a set or remove.
> 
> Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
> ---

This mostly looks sane to me on a first high level pass. We're adding
the intent/done log item infrastructure for attrs, associated dfops
processing code and log recovery hooks. I'll probably have to go back
through this once I get further through the series and have grokked more
context, but so far I think I just have some various nits and aesthetic
comments.
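
As a rough illustration of the intent/done lifetime rule used by the
patch (the intent item starts with a refcount of 2, and whichever side
drops the last reference frees it), here is a minimal user-space sketch.
The demo_* names are hypothetical, and C11 atomics stand in for the
kernel's atomic_t; it is not the kernel implementation.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for the intent (ATTRI) log item. */
struct demo_intent {
	atomic_int	refcount;
};

/* Allocate an intent holding two references: one for the log, one for the
 * done item that will eventually be paired with it. */
static struct demo_intent *demo_intent_init(void)
{
	struct demo_intent	*ip = calloc(1, sizeof(*ip));

	if (!ip)
		return NULL;
	atomic_init(&ip->refcount, 2);
	return ip;
}

/* Drop one reference. Returns true if this call freed the intent. */
static bool demo_intent_release(struct demo_intent *ip)
{
	/* atomic_fetch_sub() returns the value held before the decrement. */
	if (atomic_fetch_sub(&ip->refcount, 1) == 1) {
		free(ip);
		return true;
	}
	return false;
}
```

The two releases may arrive in either order (unpin first or done item
first), which is why a count is used rather than a fixed owner.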

Firstly, note that git complained about an extra blank line at EOF of
xfs_trans_attr.c when I applied this patch. Also, the commit log above
looks like it could be widened (I think 68 chars is the standard) and
could probably include a bit more context on the big picture changes
associated with this work. In general, I think the commit log should
(briefly) explain 1.) how attrs currently work 2.) how things are
expected to work based on this infrastructure and 3.) the advantage(s)
of doing so.

For example, one thing that is glossed over is that this implies we'll
be logging xattr values even in remote attribute block cases. BTW, do we
need to update the transaction reservation to account for that? I didn't
notice that being changed anywhere (yet).

>  fs/xfs/Makefile                |   2 +
>  fs/xfs/libxfs/xfs_attr.c       |   5 +-
>  fs/xfs/libxfs/xfs_attr.h       |  25 ++
>  fs/xfs/libxfs/xfs_defer.c      |   1 +
>  fs/xfs/libxfs/xfs_defer.h      |   3 +
>  fs/xfs/libxfs/xfs_log_format.h |  44 +++-
>  fs/xfs/libxfs/xfs_types.h      |   1 +
>  fs/xfs/xfs_attr_item.c         | 558 +++++++++++++++++++++++++++++++++++++++++
>  fs/xfs/xfs_attr_item.h         | 103 ++++++++
>  fs/xfs/xfs_log_recover.c       | 172 +++++++++++++
>  fs/xfs/xfs_ondisk.h            |   2 +
>  fs/xfs/xfs_trans.h             |  10 +
>  fs/xfs/xfs_trans_attr.c        | 240 ++++++++++++++++++
>  13 files changed, 1162 insertions(+), 4 deletions(-)
> 
...
> diff --git a/fs/xfs/xfs_attr_item.c b/fs/xfs/xfs_attr_item.c
> new file mode 100644
> index 0000000..0ea19b4
> --- /dev/null
> +++ b/fs/xfs/xfs_attr_item.c
> @@ -0,0 +1,558 @@
> +// SPDX-License-Identifier: GPL-2.0+
> +/*
> + * Copyright (C) 2019 Oracle.  All Rights Reserved.
> + * Author: Allison Henderson <allison.henderson@oracle.com>
> + */
> +#include "xfs.h"
> +#include "xfs_fs.h"
> +#include "xfs_format.h"
> +#include "xfs_log_format.h"
> +#include "xfs_trans_resv.h"
> +#include "xfs_bit.h"
> +#include "xfs_mount.h"
> +#include "xfs_trans.h"
> +#include "xfs_trans_priv.h"
> +#include "xfs_buf_item.h"
> +#include "xfs_attr_item.h"
> +#include "xfs_log.h"
> +#include "xfs_btree.h"
> +#include "xfs_rmap.h"
> +#include "xfs_inode.h"
> +#include "xfs_icache.h"
> +#include "xfs_attr.h"
> +#include "xfs_shared.h"
> +#include "xfs_da_format.h"
> +#include "xfs_da_btree.h"
> +
> +static inline struct xfs_attri_log_item *ATTRI_ITEM(struct xfs_log_item *lip)
> +{
> +	return container_of(lip, struct xfs_attri_log_item, item);
> +}
> +
> +void
> +xfs_attri_item_free(
> +	struct xfs_attri_log_item	*attrip)
> +{
> +	kmem_free(attrip->item.li_lv_shadow);
> +	kmem_free(attrip);
> +}
> +
> +/*
> + * This returns the number of iovecs needed to log the given attri item.
> + * We only need 1 iovec for an attri item.  It just logs the attr_log_format
> + * structure.
> + */
> +static inline int
> +xfs_attri_item_sizeof(
> +	struct xfs_attri_log_item *attrip)
> +{
> +	return sizeof(struct xfs_attri_log_format);
> +}
> +
> +STATIC void
> +xfs_attri_item_size(
> +	struct xfs_log_item	*lip,
> +	int			*nvecs,
> +	int			*nbytes)
> +{
> +	struct xfs_attri_log_item       *attrip = ATTRI_ITEM(lip);
> +
> +	*nvecs += 1;
> +	*nbytes += xfs_attri_item_sizeof(attrip);
> +
> +	if (attrip->name_len > 0) {
> +		*nvecs += 1;
> +		*nbytes += ATTR_NVEC_SIZE(attrip->name_len);
> +	}
> +
> +	if (attrip->value_len > 0) {
> +		*nvecs += 1;
> +		*nbytes += ATTR_NVEC_SIZE(attrip->value_len);
> +	}
> +}
> +
> +/*
> + * This is called to fill in the vector of log iovecs for the
> + * given attri log item. We use only 1 iovec, and we point that
> + * at the attri_log_format structure embedded in the attri item.
> + * It is at this point that we assert that all of the attr
> + * slots in the attri item have been filled.
> + */

I see a bunch of places throughout this patch such as above where the
line length formatting looks inconsistent. The above comment should be
widened to 80 chars. I'm sure much of this code was boilerplate brought
over from other log items and such, but we should take the opportunity
to properly format the new code we're adding.

> +STATIC void
> +xfs_attri_item_format(
> +	struct xfs_log_item	*lip,
> +	struct xfs_log_vec	*lv)
> +{
> +	struct xfs_attri_log_item	*attrip = ATTRI_ITEM(lip);
> +	struct xfs_log_iovec	*vecp = NULL;
> +
> +	attrip->format.alfi_type = XFS_LI_ATTRI;
> +	attrip->format.alfi_size = 1;
> +	if (attrip->name_len > 0)
> +		attrip->format.alfi_size++;
> +	if (attrip->value_len > 0)
> +		attrip->format.alfi_size++;
> +

I'd move these alfi_size updates to the equivalent if checks below.
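
A sketch of what that folding might look like, written as a user-space
mock under stated assumptions: demo_attri and demo_attri_format() are
hypothetical, nvecs plays the role of alfi_size, and the "copied" counter
stands in for the xlog_copy_iovec() calls.

```c
#include <stddef.h>

/* Hypothetical stand-in for the attri item; nvecs mirrors alfi_size. */
struct demo_attri {
	size_t	name_len;
	size_t	value_len;
	int	nvecs;
};

/*
 * Count each region in the same branch that emits it, so the advertised
 * count and the regions actually copied cannot drift apart.
 * Returns the number of regions copied.
 */
static int demo_attri_format(struct demo_attri *ap)
{
	int	copied = 0;

	/* The format structure itself is always logged. */
	ap->nvecs = 1;
	copied++;

	if (ap->name_len > 0) {
		ap->nvecs++;
		copied++;	/* name region */
	}

	if (ap->value_len > 0) {
		ap->nvecs++;
		copied++;	/* value region */
	}
	return copied;
}
```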

> +	xlog_copy_iovec(lv, &vecp, XLOG_REG_TYPE_ATTRI_FORMAT,
> +			&attrip->format,
> +			xfs_attri_item_sizeof(attrip));
> +	if (attrip->name_len > 0)
> +		xlog_copy_iovec(lv, &vecp, XLOG_REG_TYPE_ATTR_NAME,
> +				attrip->name, ATTR_NVEC_SIZE(attrip->name_len));
> +
> +	if (attrip->value_len > 0)
> +		xlog_copy_iovec(lv, &vecp, XLOG_REG_TYPE_ATTR_VALUE,
> +				attrip->value,
> +				ATTR_NVEC_SIZE(attrip->value_len));
> +}
> +
> +
> +/*
> + * Pinning has no meaning for an attri item, so just return.
> + */
> +STATIC void
> +xfs_attri_item_pin(
> +	struct xfs_log_item	*lip)
> +{
> +}
> +
> +/*
> + * The unpin operation is the last place an ATTRI is manipulated in the log. It
> + * is either inserted in the AIL or aborted in the event of a log I/O error. In
> + * either case, the ATTRI transaction has been successfully committed to make it
> + * this far. Therefore, we expect whoever committed the ATTRI to either
> + * construct and commit the ATTRD or drop the ATTRD's reference in the event of
> + * error. Simply drop the log's ATTRI reference now that the log is done with
> + * it.
> + */
> +STATIC void
> +xfs_attri_item_unpin(
> +	struct xfs_log_item	*lip,
> +	int			remove)
> +{
> +	struct xfs_attri_log_item	*attrip = ATTRI_ITEM(lip);
> +
> +	xfs_attri_release(attrip);
> +}
> +
> +/*
> + * attri items have no locking or pushing.  However, since ATTRIs are pulled
> + * from the AIL when their corresponding ATTRDs are committed to disk, their
> + * situation is very similar to being pinned.  Return XFS_ITEM_PINNED so that
> + * the caller will eventually flush the log.  This should help in getting the
> + * ATTRI out of the AIL.
> + */
> +STATIC uint
> +xfs_attri_item_push(
> +	struct xfs_log_item	*lip,
> +	struct list_head	*buffer_list)
> +{
> +	return XFS_ITEM_PINNED;
> +}
> +
> +/*
> + * The ATTRI has been either committed or aborted if the transaction has been
> + * cancelled. If the transaction was cancelled, an ATTRD isn't going to be
> + * constructed and thus we free the ATTRI here directly.
> + */
> +STATIC void
> +xfs_attri_item_unlock(
> +	struct xfs_log_item	*lip)
> +{
> +	if (test_bit(XFS_LI_ABORTED, &lip->li_flags))
> +		xfs_attri_release(ATTRI_ITEM(lip));
> +}
> +
> +/*
> + * The ATTRI is logged only once and cannot be moved in the log, so simply
> + * return the lsn at which it's been logged.
> + */
> +STATIC xfs_lsn_t
> +xfs_attri_item_committed(
> +	struct xfs_log_item	*lip,
> +	xfs_lsn_t		lsn)
> +{
> +	return lsn;
> +}
> +
> +STATIC void
> +xfs_attri_item_committing(
> +	struct xfs_log_item	*lip,
> +	xfs_lsn_t		lsn)
> +{
> +}
> +
> +/*
> + * This is the ops vector shared by all attri log items.
> + */
> +static const struct xfs_item_ops xfs_attri_item_ops = {
> +	.iop_size	= xfs_attri_item_size,
> +	.iop_format	= xfs_attri_item_format,
> +	.iop_pin	= xfs_attri_item_pin,
> +	.iop_unpin	= xfs_attri_item_unpin,
> +	.iop_unlock	= xfs_attri_item_unlock,
> +	.iop_committed	= xfs_attri_item_committed,
> +	.iop_push	= xfs_attri_item_push,
> +	.iop_committing = xfs_attri_item_committing
> +};
> +
> +
> +/*
> + * Allocate and initialize an attri item
> + */
> +struct xfs_attri_log_item *
> +xfs_attri_init(
> +	struct xfs_mount	*mp)
> +
> +{
> +	struct xfs_attri_log_item	*attrip;
> +	uint			size;
> +
> +	size = (uint)(sizeof(struct xfs_attri_log_item));
> +	attrip = kmem_zalloc(size, KM_SLEEP);
> +
> +	xfs_log_item_init(mp, &(attrip->item), XFS_LI_ATTRI,
> +			  &xfs_attri_item_ops);

No need for those parentheses around attrip->item, and with those removed we
can reduce this to a single line.

> +	attrip->format.alfi_id = (uintptr_t)(void *)attrip;
> +	atomic_set(&attrip->refcount, 2);
> +
> +	return attrip;
> +}
> +
> +/*
> + * Copy an attr format buffer from the given buf, and into the destination
> + * attr format structure.
> + */
> +int
> +xfs_attri_copy_format(struct xfs_log_iovec *buf,
> +		      struct xfs_attri_log_format *dst_attr_fmt)
> +{
> +	struct xfs_attri_log_format *src_attr_fmt = buf->i_addr;
> +	uint len = sizeof(struct xfs_attri_log_format);
> +
> +	if (buf->i_len == len) {
> +		memcpy((char *)dst_attr_fmt, (char *)src_attr_fmt, len);
> +		return 0;
> +	}
> +	return -EFSCORRUPTED;

Can we invert the logic flow here (and below)? I.e.,

	...
	if (buf->i_len != len)
		return -EFSCORRUPTED;
	memcpy(...);
	return 0;
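
Spelled out as a compilable user-space sketch, under assumptions: the
demo_* types are hypothetical simplifications of the log iovec and format
structures, and DEMO_EFSCORRUPTED is a placeholder error value rather
than the kernel's definition.

```c
#include <stddef.h>
#include <string.h>

/* Placeholder error code for this sketch only. */
#define DEMO_EFSCORRUPTED	117

/* Hypothetical, simplified stand-ins for the iovec and format structs. */
struct demo_iovec {
	void	*i_addr;
	int	i_len;
};

struct demo_format {
	unsigned int		type;
	unsigned int		size;
	unsigned long long	id;
};

/* Reject a wrongly sized buffer first, then copy on the common path. */
static int demo_copy_format(const struct demo_iovec *buf,
			    struct demo_format *dst)
{
	size_t	len = sizeof(*dst);

	if (buf->i_len < 0 || (size_t)buf->i_len != len)
		return -DEMO_EFSCORRUPTED;

	memcpy(dst, buf->i_addr, len);
	return 0;
}
```

With the early return, the success path stays unindented and the
destination is left untouched when the length check fails.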

> +}
> +
> +/*
> + * Copy an attr format buffer from the given buf, and into the destination
> + * attr format structure.
> + */
> +int
> +xfs_attrd_copy_format(struct xfs_log_iovec *buf,
> +		      struct xfs_attrd_log_format *dst_attr_fmt)
> +{
> +	struct xfs_attrd_log_format *src_attr_fmt = buf->i_addr;
> +	uint len = sizeof(struct xfs_attrd_log_format);
> +
> +	if (buf->i_len == len) {
> +		memcpy((char *)dst_attr_fmt, (char *)src_attr_fmt, len);
> +		return 0;
> +	}
> +	return -EFSCORRUPTED;
> +}
> +

This function appears to be unused. The recover code looks like it just
casts the iovec buffer directly to an attrd_log_format to determine the
id.

> +/*
> + * Freeing the attrip requires that we remove it from the AIL if it has already
> + * been placed there. However, the ATTRI may not yet have been placed in the
> + * AIL when called by xfs_attri_release() from ATTRD processing due to the
> + * ordering of committed vs unpin operations in bulk insert operations. Hence
> + * the reference count to ensure only the last caller frees the ATTRI.
> + */
> +void
> +xfs_attri_release(
> +	struct xfs_attri_log_item	*attrip)
> +{
> +	ASSERT(atomic_read(&attrip->refcount) > 0);
> +	if (atomic_dec_and_test(&attrip->refcount)) {
> +		xfs_trans_ail_remove(&attrip->item, SHUTDOWN_LOG_IO_ERROR);
> +		xfs_attri_item_free(attrip);
> +	}
> +}
> +
> +static inline struct xfs_attrd_log_item *ATTRD_ITEM(struct xfs_log_item *lip)
> +{
> +	return container_of(lip, struct xfs_attrd_log_item, item);
> +}
> +
> +STATIC void
> +xfs_attrd_item_free(struct xfs_attrd_log_item *attrdp)
> +{
> +	kmem_free(attrdp->item.li_lv_shadow);
> +	kmem_free(attrdp);
> +}
> +
> +/*
> + * This returns the number of iovecs needed to log the given attrd item.
> + * We only need 1 iovec for an attrd item.  It just logs the attr_log_format
> + * structure.
> + */
> +static inline int
> +xfs_attrd_item_sizeof(
> +	struct xfs_attrd_log_item *attrdp)
> +{
> +	return sizeof(struct xfs_attrd_log_format);
> +}
> +
> +STATIC void
> +xfs_attrd_item_size(
> +	struct xfs_log_item	*lip,
> +	int			*nvecs,
> +	int			*nbytes)
> +{
> +	struct xfs_attrd_log_item	*attrdp = ATTRD_ITEM(lip);
> +	*nvecs += 1;
> +	*nbytes += xfs_attrd_item_sizeof(attrdp);
> +}
> +
> +/*
> + * This is called to fill in the vector of log iovecs for the
> + * given attrd log item. We use only 1 iovec, and we point that
> + * at the attr_log_format structure embedded in the attrd item.
> + * It is at this point that we assert that all of the attr
> + * slots in the attrd item have been filled.
> + */
> +STATIC void
> +xfs_attrd_item_format(
> +	struct xfs_log_item	*lip,
> +	struct xfs_log_vec	*lv)
> +{
> +	struct xfs_attrd_log_item	*attrdp = ATTRD_ITEM(lip);
> +	struct xfs_log_iovec	*vecp = NULL;
> +
> +	attrdp->format.alfd_type = XFS_LI_ATTRD;
> +	attrdp->format.alfd_size = 1;
> +
> +	xlog_copy_iovec(lv, &vecp, XLOG_REG_TYPE_ATTRD_FORMAT,
> +			&attrdp->format,
> +			xfs_attrd_item_sizeof(attrdp));

The above looks like it could be shrunk to 2 lines as well after 80 char
widening. Note that I'm sure I haven't caught all of these, just
pointing out some examples as I notice them.

FWIW, if you happen to use vim, I sometimes use ':set cc=80' to draw an
80 char line in the viewer that helps to quickly eyeball new code for
this kind of thing.

> +}
> +
> +/*
> + * Pinning has no meaning for an attrd item, so just return.
> + */
> +STATIC void
> +xfs_attrd_item_pin(
> +	struct xfs_log_item	*lip)
> +{
> +}
> +
> +/*
> + * Since pinning has no meaning for an attrd item, unpinning does
> + * not either.
> + */
> +STATIC void
> +xfs_attrd_item_unpin(
> +	struct xfs_log_item	*lip,
> +	int			remove)
> +{
> +}
> +
> +/*
> + * There isn't much you can do to push on an attrd item.  It is simply stuck
> + * waiting for the log to be flushed to disk.
> + */
> +STATIC uint
> +xfs_attrd_item_push(
> +	struct xfs_log_item	*lip,
> +	struct list_head	*buffer_list)
> +{
> +	return XFS_ITEM_PINNED;
> +}
> +
> +/*
> + * The ATTRD is either committed or aborted if the transaction is cancelled. If
> + * the transaction is cancelled, drop our reference to the ATTRI and free the
> + * ATTRD.
> + */
> +STATIC void
> +xfs_attrd_item_unlock(
> +	struct xfs_log_item	*lip)
> +{
> +	struct xfs_attrd_log_item	*attrdp = ATTRD_ITEM(lip);
> +
> +	if (test_bit(XFS_LI_ABORTED, &lip->li_flags)) {
> +		xfs_attri_release(attrdp->attrip);
> +		xfs_attrd_item_free(attrdp);
> +	}
> +}
> +
> +/*
> + * When the attrd item is committed to disk, all we need to do is delete our
> + * reference to our partner attri item and then free ourselves. Since we're
> + * freeing ourselves we must return -1 to keep the transaction code from
> + * further referencing this item.
> + */
> +STATIC xfs_lsn_t
> +xfs_attrd_item_committed(
> +	struct xfs_log_item	*lip,
> +	xfs_lsn_t		lsn)
> +{
> +	struct xfs_attrd_log_item	*attrdp = ATTRD_ITEM(lip);
> +
> +	/*
> +	 * Drop the ATTRI reference regardless of whether the ATTRD has been
> +	 * aborted. Once the ATTRD transaction is constructed, it is the sole
> +	 * responsibility of the ATTRD to release the ATTRI (even if the ATTRI
> +	 * is aborted due to log I/O error).
> +	 */
> +	xfs_attri_release(attrdp->attrip);
> +	xfs_attrd_item_free(attrdp);
> +
> +	return (xfs_lsn_t)-1;
> +}
> +
> +STATIC void
> +xfs_attrd_item_committing(
> +	struct xfs_log_item	*lip,
> +	xfs_lsn_t		lsn)
> +{
> +}
> +
> +/*
> + * This is the ops vector shared by all attrd log items.
> + */
> +static const struct xfs_item_ops xfs_attrd_item_ops = {
> +	.iop_size	= xfs_attrd_item_size,
> +	.iop_format	= xfs_attrd_item_format,
> +	.iop_pin	= xfs_attrd_item_pin,
> +	.iop_unpin	= xfs_attrd_item_unpin,
> +	.iop_unlock	= xfs_attrd_item_unlock,
> +	.iop_committed	= xfs_attrd_item_committed,
> +	.iop_push	= xfs_attrd_item_push,
> +	.iop_committing = xfs_attrd_item_committing
> +};
> +
> +/*
> + * Allocate and initialize an attrd item
> + */
> +struct xfs_attrd_log_item *
> +xfs_attrd_init(
> +	struct xfs_mount	*mp,
> +	struct xfs_attri_log_item	*attrip)
> +
> +{
> +	struct xfs_attrd_log_item	*attrdp;
> +	uint			size;
> +
> +	size = (uint)(sizeof(struct xfs_attrd_log_item));
> +	attrdp = kmem_zalloc(size, KM_SLEEP);
> +
> +	xfs_log_item_init(mp, &attrdp->item, XFS_LI_ATTRD,
> +			  &xfs_attrd_item_ops);
> +	attrdp->attrip = attrip;
> +	attrdp->format.alfd_alf_id = attrip->format.alfi_id;
> +
> +	return attrdp;
> +}
> +
> +/*
> + * Process an attr intent item that was recovered from
> + * the log.  We need to delete the attr that it describes.
> + */

^^^ :)

> +int
> +xfs_attri_recover(
> +	struct xfs_mount		*mp,
> +	struct xfs_attri_log_item	*attrip)
> +{
> +	struct xfs_inode		*ip;
> +	struct xfs_attrd_log_item	*attrdp;
> +	struct xfs_da_args		args;
> +	struct xfs_attri_log_format	*attrp;
> +	struct xfs_trans_res		tres;
> +	int				local;
> +	int				error = 0;
> +	int				rsvd = 0;
> +
> +	ASSERT(!test_bit(XFS_ATTRI_RECOVERED, &attrip->flags));
> +
> +	/*
> +	 * First check the validity of the attr described by the
> +	 * ATTRI.  If any are bad, then assume that all are bad and
> +	 * just toss the ATTRI.
> +	 */
> +	attrp = &attrip->format;
> +	if (
> +	    /*
> +	     * Must have either XFS_ATTR_OP_FLAGS_SET or
> +	     * XFS_ATTR_OP_FLAGS_REMOVE set
> +	     */
> +	    !(attrp->alfi_op_flags == XFS_ATTR_OP_FLAGS_SET ||
> +		attrp->alfi_op_flags == XFS_ATTR_OP_FLAGS_REMOVE) ||
> +
> +	    /* Check size of value and name lengths */
> +	    (attrp->alfi_value_len > XATTR_SIZE_MAX ||
> +		attrp->alfi_name_len > XATTR_NAME_MAX) ||
> +
> +	    /*
> +	     * If the XFS_ATTR_OP_FLAGS_SET flag is set,
> +	     * there must also be a name and value
> +	     */
> +	    (attrp->alfi_op_flags == XFS_ATTR_OP_FLAGS_SET &&
> +		(attrp->alfi_value_len == 0 || attrp->alfi_name_len == 0)) ||

It's been a while since I've played with any attribute stuff, but is
this always the case or can we not have an empty attribute?

> +
> +	    /*
> +	     * If the XFS_ATTR_OP_FLAGS_REMOVE flag is set,
> +	     * there must also be a name
> +	     */
> +	    (attrp->alfi_op_flags == XFS_ATTR_OP_FLAGS_REMOVE &&
> +		(attrp->alfi_name_len == 0))
> +	) {

Comments are always nice of course, but interspersing them with logic
like this makes the whole thing hard to read. I'd suggest generalizing
the comment to cover whatever is non-obvious, condensing the if logic,
and leaving the comment above it.
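
As a rough illustration only (this is not code from the patch), the
check could be condensed along these lines. The struct, the helper name
attri_format_is_valid() and the two op flag values are stand-ins so the
snippet builds in userspace. It keeps the posted rule that a set needs a
non-empty value, which is the open question above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-in values so the sketch builds outside the kernel tree. */
#define XATTR_NAME_MAX			255
#define XATTR_SIZE_MAX			65536
#define XFS_ATTR_OP_FLAGS_SET		1	/* assumed value */
#define XFS_ATTR_OP_FLAGS_REMOVE	2	/* assumed value */

/* Hypothetical subset of struct xfs_attri_log_format. */
struct attri_format {
	uint32_t	alfi_op_flags;
	uint32_t	alfi_name_len;
	uint32_t	alfi_value_len;
};

/*
 * A recovered ATTRI is valid if the op is a set or a remove, the name is
 * non-empty, both lengths are within the xattr limits, and (as in the
 * posted patch) a set carries a non-empty value.
 */
static bool
attri_format_is_valid(const struct attri_format *attrp)
{
	if (attrp->alfi_op_flags != XFS_ATTR_OP_FLAGS_SET &&
	    attrp->alfi_op_flags != XFS_ATTR_OP_FLAGS_REMOVE)
		return false;
	if (attrp->alfi_name_len == 0 ||
	    attrp->alfi_name_len > XATTR_NAME_MAX ||
	    attrp->alfi_value_len > XATTR_SIZE_MAX)
		return false;
	if (attrp->alfi_op_flags == XFS_ATTR_OP_FLAGS_SET &&
	    attrp->alfi_value_len == 0)
		return false;
	return true;
}
```

The recovery path would then reduce to a single
"if (!attri_format_is_valid(attrp))" with one comment above it.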

> +		/*
> +		 * This will pull the ATTRI from the AIL and
> +		 * free the memory associated with it.
> +		 */
> +		set_bit(XFS_ATTRI_RECOVERED, &attrip->flags);
> +		xfs_attri_release(attrip);
> +		return -EIO;
> +	}
> +
> +	attrp = &attrip->format;
> +	error = xfs_iget(mp, 0, attrp->alfi_ino, 0, 0, &ip);
> +	if (error)
> +		return error;
> +
> +	error = xfs_attr_args_init(&args, ip, attrip->name,
> +			attrp->alfi_name_len, attrp->alfi_attr_flags);
> +	if (error)
> +		return error;
> +
> +	args.hashval = xfs_da_hashname(args.name, args.namelen);
> +	args.value = attrip->value;
> +	args.valuelen = attrp->alfi_value_len;
> +	args.op_flags = XFS_DA_OP_OKNOENT;
> +	args.total = xfs_attr_calc_size(&args, &local);
> +
> +	tres.tr_logres = M_RES(mp)->tr_attrsetm.tr_logres +
> +			M_RES(mp)->tr_attrsetrt.tr_logres * args.total;
> +	tres.tr_logcount = XFS_ATTRSET_LOG_COUNT;
> +	tres.tr_logflags = XFS_TRANS_PERM_LOG_RES;
> +
> +	error = xfs_trans_alloc(mp, &tres, args.total,  0,
> +				rsvd ? XFS_TRANS_RESERVE : 0, &args.trans);
> +	if (error)
> +		return error;
> +	attrdp = xfs_trans_get_attrd(args.trans, attrip);
> +
> +	xfs_ilock(ip, XFS_ILOCK_EXCL);
> +
> +	xfs_trans_ijoin(args.trans, ip, 0);
> +	error = xfs_trans_attr(&args, attrdp, attrp->alfi_op_flags);
> +	if (error)
> +		goto abort_error;
> +
> +
> +	set_bit(XFS_ATTRI_RECOVERED, &attrip->flags);
> +	xfs_trans_log_inode(args.trans, ip, XFS_ILOG_CORE | XFS_ILOG_ADATA);
> +	error = xfs_trans_commit(args.trans);
> +	xfs_iunlock(ip, XFS_ILOCK_EXCL);
> +	return error;
> +
> +abort_error:
> +	xfs_trans_cancel(args.trans);
> +	xfs_iunlock(ip, XFS_ILOCK_EXCL);
> +	return error;
> +}
> diff --git a/fs/xfs/xfs_attr_item.h b/fs/xfs/xfs_attr_item.h
> new file mode 100644
> index 0000000..fce7515
> --- /dev/null
> +++ b/fs/xfs/xfs_attr_item.h
> @@ -0,0 +1,103 @@
> +// SPDX-License-Identifier: GPL-2.0+
> +/*
> + * Copyright (C) 2019 Oracle.  All Rights Reserved.
> + * Author: Allison Henderson <allison.henderson@oracle.com>
> + */
> +#ifndef	__XFS_ATTR_ITEM_H__
> +#define	__XFS_ATTR_ITEM_H__
> +
> +/* kernel only ATTRI/ATTRD definitions */
> +
> +struct xfs_mount;
> +struct kmem_zone;
> +
> +/*
> + * Max number of attrs in fast allocation path.
> + */
> +#define XFS_ATTRI_MAX_FAST_ATTRS        1
> +
> +
> +/*
> + * Define ATTR flag bits. Manipulated by set/clear/test_bit operators.
> + */
> +#define	XFS_ATTRI_RECOVERED	1
> +
> +
> +/* nvecs must be in multiples of 4 */
> +#define ATTR_NVEC_SIZE(size) (size == sizeof(int32_t) ? sizeof(int32_t) : \
> +				size + sizeof(int32_t) - \
> +				(size % sizeof(int32_t)))
> +

Why? Also, any reason we couldn't use round_up() or some such here?
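
To make the question concrete, here is a small userspace comparison.
round_up4() is a hypothetical stand-in for the kernel's round_up() with
a 4 byte alignment, and the macro is copied from the patch with
parentheses added around the argument. Note that the posted macro adds a
full extra word for sizes that are already a multiple of 4 (other than
exactly 4), which may or may not be intended.

```c
#include <stddef.h>
#include <stdint.h>

/* The macro from the patch, with the argument parenthesized. */
#define ATTR_NVEC_SIZE(size) ((size) == sizeof(int32_t) ? sizeof(int32_t) : \
				(size) + sizeof(int32_t) - \
				((size) % sizeof(int32_t)))

/*
 * Hypothetical stand-in for round_up(size, 4): round up to the next
 * multiple of sizeof(int32_t), leaving aligned sizes unchanged.
 */
static inline size_t
round_up4(size_t size)
{
	return (size + sizeof(int32_t) - 1) & ~(sizeof(int32_t) - 1);
}
```

For a size of 5 both give 8, but for a size of 8 the macro gives 12
while the round-up helper leaves it at 8.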

> +/*
> + * This is the "attr intention" log item.  It is used to log the fact
> + * that some attrs need to be processed.  It is used in conjunction with the
> + * "attr done" log item described below.
> + *
> + * The ATTRI is reference counted so that it is not freed prior to both the
> + * ATTRI and ATTRD being committed and unpinned. This ensures the ATTRI is
> + * inserted into the AIL even in the event of out of order ATTRI/ATTRD
> + * processing. In other words, an ATTRI is born with two references:
> + *
> + *      1.) an ATTRI held reference to track ATTRI AIL insertion
> + *      2.) an ATTRD held reference to track ATTRD commit
> + *
> + * On allocation, both references are the responsibility of the caller. Once
> + * the ATTRI is added to and dirtied in a transaction, ownership of reference
> + * one transfers to the transaction. The reference is dropped once the ATTRI is
> + * inserted to the AIL or in the event of failure along the way (e.g., commit
> + * failure, log I/O error, etc.). Note that the caller remains responsible for
> + * the ATTRD reference under all circumstances to this point. The caller has no
> + * means to detect failure once the transaction is committed, however.
> + * Therefore, an ATTRD is required after this point, even in the event of
> + * unrelated failure.
> + *
> + * Once an ATTRD is allocated and dirtied in a transaction, reference two
> + * transfers to the transaction. The ATTRD reference is dropped once it reaches
> + * the unpin handler. Similar to the ATTRI, the reference also drops in the
> + * event of commit failure or log I/O errors. Note that the ATTRD is not
> + * inserted in the AIL, so at this point both the ATTRI and ATTRD are freed.
> + */
> +struct xfs_attri_log_item {
> +	xfs_log_item_t			item;
> +	atomic_t			refcount;
> +	unsigned long			flags;	/* misc flags */
> +	int				name_len;
> +	void				*name;
> +	int				value_len;
> +	void				*value;
> +	struct xfs_attri_log_format	format;
> +};

I think we usually try to use field prefix names in these various
structures (as you've done in other places). I.e., attri_item,
attrd_item, etc. would probably be consistent with similar structures
like the efi/efd log items.

> +
> +/*
> + * This is the "attr done" log item.  It is used to log
> + * the fact that some attrs earlier mentioned in an attri item
> + * have been freed.
> + */
> +struct xfs_attrd_log_item {
> +	struct xfs_log_item		item;
> +	struct xfs_attri_log_item	*attrip;
> +	struct xfs_attrd_log_format	format;
> +};
> +
> +/*
> + * Max number of attrs in fast allocation path.
> + */
> +#define	XFS_ATTRD_MAX_FAST_ATTRS	1
> +
> +extern struct kmem_zone	*xfs_attri_zone;
> +extern struct kmem_zone	*xfs_attrd_zone;
> +
> +struct xfs_attri_log_item	*xfs_attri_init(struct xfs_mount *mp);
> +struct xfs_attrd_log_item	*xfs_attrd_init(struct xfs_mount *mp,
> +					struct xfs_attri_log_item *attrip);
> +int xfs_attri_copy_format(struct xfs_log_iovec *buf,
> +			   struct xfs_attri_log_format *dst_attri_fmt);
> +int xfs_attrd_copy_format(struct xfs_log_iovec *buf,
> +			   struct xfs_attrd_log_format *dst_attrd_fmt);
> +void			xfs_attri_item_free(struct xfs_attri_log_item *attrip);
> +void			xfs_attri_release(struct xfs_attri_log_item *attrip);
> +
> +int			xfs_attri_recover(struct xfs_mount *mp,
> +					struct xfs_attri_log_item *attrip);
> +
> +#endif	/* __XFS_ATTR_ITEM_H__ */
...
> diff --git a/fs/xfs/xfs_trans_attr.c b/fs/xfs/xfs_trans_attr.c
> new file mode 100644
> index 0000000..3679348
> --- /dev/null
> +++ b/fs/xfs/xfs_trans_attr.c
> @@ -0,0 +1,240 @@
> +// SPDX-License-Identifier: GPL-2.0+
> +/*
> + * Copyright (C) 2019 Oracle.  All Rights Reserved.
> + * Author: Allison Henderson <allison.henderson@oracle.com>
> + */
> +#include "xfs.h"
> +#include "xfs_fs.h"
> +#include "xfs_shared.h"
> +#include "xfs_format.h"
> +#include "xfs_log_format.h"
> +#include "xfs_trans_resv.h"
> +#include "xfs_bit.h"
> +#include "xfs_mount.h"
> +#include "xfs_defer.h"
> +#include "xfs_trans.h"
> +#include "xfs_trans_priv.h"
> +#include "xfs_attr_item.h"
> +#include "xfs_alloc.h"
> +#include "xfs_bmap.h"
> +#include "xfs_trace.h"
> +#include "libxfs/xfs_da_format.h"
> +#include "xfs_da_btree.h"
> +#include "xfs_attr.h"
> +#include "xfs_inode.h"
> +#include "xfs_icache.h"
> +#include "xfs_quota.h"
> +
> +/*
> + * This routine is called to allocate an "attr free done"
> + * log item.
> + */
> +struct xfs_attrd_log_item *
> +xfs_trans_get_attrd(struct xfs_trans		*tp,
> +		  struct xfs_attri_log_item	*attrip)
> +{
> +	struct xfs_attrd_log_item			*attrdp;
> +
> +	ASSERT(tp != NULL);
> +
> +	attrdp = xfs_attrd_init(tp->t_mountp, attrip);
> +	ASSERT(attrdp != NULL);
> +
> +	/*
> +	 * Get a log_item_desc to point at the new item.
> +	 */
> +	xfs_trans_add_item(tp, &attrdp->item);
> +	return attrdp;
> +}
> +
> +/*
> + * Delete an attr and log it to the ATTRD. Note that the transaction is marked
> + * dirty regardless of whether the attr delete succeeds or fails to support the
> + * ATTRI/ATTRD lifecycle rules.
> + */
> +int
> +xfs_trans_attr(
> +	struct xfs_da_args		*args,
> +	struct xfs_attrd_log_item	*attrdp,
> +	uint32_t			op_flags)
> +{
> +	int				error;
> +	struct xfs_buf			*leaf_bp = NULL;
> +
> +	error = xfs_qm_dqattach_locked(args->dp, 0);
> +	if (error)
> +		return error;
> +
> +	switch (op_flags) {
> +	case XFS_ATTR_OP_FLAGS_SET:
> +		args->op_flags |= XFS_DA_OP_ADDNAME;
> +		error = xfs_attr_set_args(args, &leaf_bp, false);
> +		break;
> +	case XFS_ATTR_OP_FLAGS_REMOVE:
> +		ASSERT(XFS_IFORK_Q((args->dp)));
> +		error = xfs_attr_remove_args(args, false);
> +		break;
> +	default:
> +		error = -EFSCORRUPTED;
> +	}
> +
> +	if (error) {
> +		if (leaf_bp)
> +			xfs_trans_brelse(args->trans, leaf_bp);
> +	}
> +
> +	/*
> +	 * Mark the transaction dirty, even on error. This ensures the
> +	 * transaction is aborted, which:
> +	 *
> +	 * 1.) releases the ATTRI and frees the ATTRD
> +	 * 2.) shuts down the filesystem
> +	 */
> +	args->trans->t_flags |= XFS_TRANS_DIRTY;
> +	set_bit(XFS_LI_DIRTY, &attrdp->item.li_flags);
> +
> +	attrdp->attrip->name = (void *)args->name;
> +	attrdp->attrip->value = (void *)args->value;
> +	attrdp->attrip->name_len = args->namelen;
> +	attrdp->attrip->value_len = args->valuelen;
> +

What's the reason for updating the attri here? It's already been
committed by the time we get around to the attrd. Is this used again
somewhere?

> +	return error;
> +}
> +
> +static int
> +xfs_attr_diff_items(
> +	void				*priv,
> +	struct list_head		*a,
> +	struct list_head		*b)
> +{
> +	return 0;
> +}
> +
> +/* Get an ATTRI. */
> +STATIC void *
> +xfs_attr_create_intent(
> +	struct xfs_trans		*tp,
> +	unsigned int			count)
> +{
> +	struct xfs_attri_log_item		*attrip;
> +
> +	ASSERT(tp != NULL);
> +	ASSERT(count == 1);
> +
> +	attrip = xfs_attri_init(tp->t_mountp);
> +	ASSERT(attrip != NULL);
> +
> +	/*
> +	 * Get a log_item_desc to point at the new item.
> +	 */
> +	xfs_trans_add_item(tp, &attrip->item);
> +	return attrip;
> +}
> +
> +/* Log an attr to the intent item. */
> +STATIC void
> +xfs_attr_log_item(
> +	struct xfs_trans		*tp,
> +	void				*intent,
> +	struct list_head		*item)
> +{
> +	struct xfs_attri_log_item	*attrip = intent;
> +	struct xfs_attr_item		*attr;
> +	struct xfs_attri_log_format	*attrp;
> +	char				*name_value;
> +
> +	attr = container_of(item, struct xfs_attr_item, xattri_list);
> +	name_value = ((char *)attr) + sizeof(struct xfs_attr_item);
> +
> +	tp->t_flags |= XFS_TRANS_DIRTY;
> +	set_bit(XFS_LI_DIRTY, &attrip->item.li_flags);
> +
> +	attrp = &attrip->format;
> +	attrp->alfi_ino = attr->xattri_ip->i_ino;
> +	attrp->alfi_op_flags = attr->xattri_op_flags;
> +	attrp->alfi_value_len = attr->xattri_value_len;
> +	attrp->alfi_name_len = attr->xattri_name_len;
> +	attrp->alfi_attr_flags = attr->xattri_flags;
> +
> +	attrip->name = name_value;
> +	attrip->value = &name_value[attr->xattri_name_len];
> +	attrip->name_len = attr->xattri_name_len;
> +	attrip->value_len = attr->xattri_value_len;

So once we're at this point, we've constructed an xfs_attr_item to
describe the high level deferred operation, created an intent log item
and we're now logging that xfs_attri_log_item. We fill in the log format
structure based on the xfs_attr_item and point the xfs_attri_log_item
name/value pointers at the xfs_attr_item memory. It's thus important to
note we've established a subtle relationship between these two data
structures because they may have different lifecycles.

> +}
> +
> +/* Get an ATTRD so we can process all the attrs. */
> +STATIC void *
> +xfs_attr_create_done(
> +	struct xfs_trans		*tp,
> +	void				*intent,
> +	unsigned int			count)
> +{
> +	return xfs_trans_get_attrd(tp, intent);
> +}
> +
> +/* Process an attr. */
> +STATIC int
> +xfs_attr_finish_item(
> +	struct xfs_trans		*tp,
> +	struct list_head		*item,
> +	void				*done_item,
> +	void				**state)
> +{
> +	struct xfs_attr_item		*attr;
> +	char				*name_value;
> +	int				error;
> +	int				local;
> +	struct xfs_da_args		args;
> +
> +	attr = container_of(item, struct xfs_attr_item, xattri_list);
> +	name_value = ((char *)attr) + sizeof(struct xfs_attr_item);
> +
> +	error = xfs_attr_args_init(&args, attr->xattri_ip, name_value,
> +				   attr->xattri_name_len, attr->xattri_flags);
> +	if (error)
> +		goto out;
> +
> +	args.hashval = xfs_da_hashname(args.name, args.namelen);
> +	args.value = &name_value[attr->xattri_name_len];
> +	args.valuelen = attr->xattri_value_len;
> +	args.op_flags = XFS_DA_OP_OKNOENT;
> +	args.total = xfs_attr_calc_size(&args, &local);
> +	args.trans = tp;
> +
> +	error = xfs_trans_attr(&args, done_item,
> +			attr->xattri_op_flags);

So now we've committed/rolled our xfs_attri_log_item intent and
created/attached the xfs_attrd_log_item and thus we're free to perform
the operation...

> +out:
> +	kmem_free(attr);

... and here is where we end up freeing the xfs_attr_item created for
the dfops infrastructure that holds our name and value memory.

Hmm.. I think this means our name/value memory accesses are safe because
the xfs_attri_log_item only accesses them in the ->iop_format()
callback, which occurs during transaction commit of the intent and we're
long past that.

That said, the attri/attrd log items themselves outlive the current
transaction commit sequence (i.e. until the attrd is physically
logged/committed and we free both). That means that once we free the
attr above we technically have an attri passing through the log
infrastructure with a couple invalid pointers, they just don't happen to
be used. It might be worth thinking about how we can clean that up,
whether it be clearing those pointers here, or allocating the name/val
memory separately and transferring it to the attri, etc. Whatever we end
up doing, we should probably add a comment somewhere to explain exactly
what's going on as well.
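
One way the "transfer it to the attri" option might look, sketched in
userspace C. Everything here is hypothetical (the struct names,
attr_item_init(), attri_take_name_value(), attri_free(), and malloc/free
in place of the kmem helpers) and is only meant to show the ownership
hand-off, not the real log item code.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the dfops-level xfs_attr_item. */
struct attr_item {
	void	*name_value;	/* separately allocated name + value buffer */
	size_t	name_len;
	size_t	value_len;
};

/* Hypothetical stand-in for xfs_attri_log_item. */
struct attri_log_item {
	void	*name;
	void	*value;
	size_t	name_len;
	size_t	value_len;
};

/* Build the buffer once; returns 0 on success, -1 on allocation failure. */
static int
attr_item_init(struct attr_item *attr, const void *name, size_t name_len,
	       const void *value, size_t value_len)
{
	attr->name_value = malloc(name_len + value_len);
	if (!attr->name_value)
		return -1;
	memcpy(attr->name_value, name, name_len);
	if (value_len)
		memcpy((char *)attr->name_value + name_len, value, value_len);
	attr->name_len = name_len;
	attr->value_len = value_len;
	return 0;
}

/*
 * Move the buffer to the intent. The attr_item gives up its pointer, so
 * freeing the attr_item later cannot leave the intent dangling.
 */
static void
attri_take_name_value(struct attri_log_item *attrip, struct attr_item *attr)
{
	attrip->name = attr->name_value;
	attrip->value = (char *)attr->name_value + attr->name_len;
	attrip->name_len = attr->name_len;
	attrip->value_len = attr->value_len;
	attr->name_value = NULL;
}

/* The intent now owns the buffer and releases it when it goes away. */
static void
attri_free(struct attri_log_item *attrip)
{
	free(attrip->name);	/* name is the start of the allocation */
	attrip->name = NULL;
	attrip->value = NULL;
}
```

With something like this, the finish path would have to reach the
name/value through the intent rather than the xfs_attr_item, so it is
not a drop-in change.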

Brian

> +	return error;
> +}
> +
> +/* Abort all pending ATTRs. */
> +STATIC void
> +xfs_attr_abort_intent(
> +	void				*intent)
> +{
> +	xfs_attri_release(intent);
> +}
> +
> +/* Cancel an attr */
> +STATIC void
> +xfs_attr_cancel_item(
> +	struct list_head		*item)
> +{
> +	struct xfs_attr_item	*attr;
> +
> +	attr = container_of(item, struct xfs_attr_item, xattri_list);
> +	kmem_free(attr);
> +}
> +
> +const struct xfs_defer_op_type xfs_attr_defer_type = {
> +	.max_items	= XFS_ATTRI_MAX_FAST_ATTRS,
> +	.diff_items	= xfs_attr_diff_items,
> +	.create_intent	= xfs_attr_create_intent,
> +	.abort_intent	= xfs_attr_abort_intent,
> +	.log_item	= xfs_attr_log_item,
> +	.create_done	= xfs_attr_create_done,
> +	.finish_item	= xfs_attr_finish_item,
> +	.cancel_item	= xfs_attr_cancel_item,
> +};
> +
> -- 
> 2.7.4
> 


* Re: [PATCH 5/9] xfs: Add xfs_attr_set_deferred and xfs_attr_remove_deferred
  2019-04-12 22:50 ` [PATCH 5/9] xfs: Add xfs_attr_set_deferred and xfs_attr_remove_deferred Allison Henderson
@ 2019-04-18 15:49   ` Brian Foster
  2019-04-18 21:28     ` Allison Henderson
  0 siblings, 1 reply; 48+ messages in thread
From: Brian Foster @ 2019-04-18 15:49 UTC (permalink / raw)
  To: Allison Henderson; +Cc: linux-xfs

On Fri, Apr 12, 2019 at 03:50:32PM -0700, Allison Henderson wrote:
> These routines set up and start a new deferred attribute
> operation.  These functions are meant to be called by other
> code needing to initiate a deferred attribute operation.  We
> will use these routines later in the parent pointer patches.
> 

We probably don't need to reference the parent pointer stuff any more
for this, right? I'm assuming we'll be converting generic attr
infrastructure over to this mechanism in subsequent patches..?

> Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
> ---
>  fs/xfs/libxfs/xfs_attr.c | 80 ++++++++++++++++++++++++++++++++++++++++++++++++
>  fs/xfs/libxfs/xfs_attr.h |  7 +++++
>  2 files changed, 87 insertions(+)
> 
> diff --git a/fs/xfs/libxfs/xfs_attr.c b/fs/xfs/libxfs/xfs_attr.c
> index fadd485..c3477fa7 100644
> --- a/fs/xfs/libxfs/xfs_attr.c
> +++ b/fs/xfs/libxfs/xfs_attr.c
> @@ -30,6 +30,7 @@
>  #include "xfs_trans_space.h"
>  #include "xfs_trace.h"
>  #include "xfs_attr_item.h"
> +#include "xfs_attr.h"
>  
>  /*
>   * xfs_attr.c
> @@ -429,6 +430,52 @@ xfs_attr_set(
>  	goto out_unlock;
>  }
>  
> +/* Sets an attribute for an inode as a deferred operation */
> +int
> +xfs_attr_set_deferred(
> +	struct xfs_inode	*dp,
> +	struct xfs_trans	*tp,
> +	const unsigned char	*name,
> +	unsigned int		namelen,
> +	const unsigned char	*value,
> +	unsigned int		valuelen,
> +	int			flags)
> +{
> +
> +	struct xfs_attr_item	*new;
> +	char			*name_value;
> +
> +	/*
> +	 * All set operations must have a name
> +	 * but not necessarily a value.
> +	 * Generic 062

Comment formatting, also looks like there's some stale text or
something.

> +	 */
> +	if (!namelen) {
> +		ASSERT(0);
> +		return -EFSCORRUPTED;

This is essentially a requested operation from userspace, right? If so,
I'd think -EINVAL or something makes more sense than -EFSCORRUPTED.

> +	}
> +
> +	new = kmem_alloc(XFS_ATTR_ITEM_SIZEOF(namelen, valuelen),
> +			 KM_SLEEP|KM_NOFS);

This could get interesting with larger attrs (up to 64k IIRC). We might
want to consider kmem_alloc_large().
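
For a sense of scale, a back-of-the-envelope sketch. attr_item_sizeof()
and the 64 byte header are made-up stand-ins for XFS_ATTR_ITEM_SIZEOF and
sizeof(struct xfs_attr_item), and 4096 is an assumed page size. Only the
255/65536 limits come from the xattr definitions in linux/limits.h.

```c
#include <stddef.h>

#define XATTR_NAME_MAX	255	/* longest xattr name, per linux/limits.h */
#define XATTR_SIZE_MAX	65536	/* largest xattr value, per linux/limits.h */
#define ASSUMED_HDR	64	/* placeholder for sizeof(struct xfs_attr_item) */
#define ASSUMED_PAGE	4096	/* placeholder page size */

/* Hypothetical stand-in for XFS_ATTR_ITEM_SIZEOF(namelen, valuelen). */
static inline size_t
attr_item_sizeof(size_t namelen, size_t valuelen)
{
	return ASSUMED_HDR + namelen + valuelen;
}

/* Number of pages a single allocation of this size spans. */
static inline size_t
attr_item_pages(size_t namelen, size_t valuelen)
{
	return (attr_item_sizeof(namelen, valuelen) + ASSUMED_PAGE - 1) /
		ASSUMED_PAGE;
}

/* Largest request a single deferred set could make under these numbers. */
static inline size_t
attr_item_worst_case_pages(void)
{
	return attr_item_pages(XATTR_NAME_MAX, XATTR_SIZE_MAX);
}
```

With these placeholder numbers the worst case is a single allocation
spanning 17 pages, versus one page for a typical small attr.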

> +	name_value = ((char *)new) + sizeof(struct xfs_attr_item);
> +	memset(new, 0, XFS_ATTR_ITEM_SIZEOF(namelen, valuelen));
> +	new->xattri_ip = dp;
> +	new->xattri_op_flags = XFS_ATTR_OP_FLAGS_SET;
> +	new->xattri_name_len = namelen;
> +	new->xattri_value_len = valuelen;
> +	new->xattri_flags = flags;
> +	memcpy(&name_value[0], name, namelen);

name_value is just a char pointer. Do we need the whole array index just
to deref the thing here? Meh, I guess it's consistent with the value copy
below. No big deal.

> +	new->xattri_name = name_value;
> +	new->xattri_value = name_value + namelen;
> +
> +	if (valuelen > 0)
> +		memcpy(&name_value[namelen], value, valuelen);
> +
> +	xfs_defer_add(tp, XFS_DEFER_OPS_TYPE_ATTR, &new->xattri_list);
> +
> +	return 0;
> +}
> +
>  /*
>   * Generic handler routine to remove a name from an attribute list.
>   * Transitions attribute list from Btree to shortform as necessary.
> @@ -513,6 +560,39 @@ xfs_attr_remove(
>  	return error;
>  }
>  
> +/* Removes an attribute for an inode as a deferred operation */
> +int
> +xfs_attr_remove_deferred(

Hmm.. I'm kind of wondering if we actually need to defer attr removes.
Do we have the same kind of challenges for attr removal as for attr
creation, or is there some future scenario where this is needed?

> +	struct xfs_inode        *dp,
> +	struct xfs_trans	*tp,
> +	const unsigned char	*name,
> +	unsigned int		namelen,
> +	int                     flags)
> +{
> +
> +	struct xfs_attr_item	*new;
> +	char			*name_value;
> +
> +	if (!namelen) {
> +		ASSERT(0);
> +		return -EFSCORRUPTED;

Similar comment around -EFSCORRUPTED vs. -EINVAL (or something else..).

Brian

> +	}
> +
> +	new = kmem_alloc(XFS_ATTR_ITEM_SIZEOF(namelen, 0), KM_SLEEP|KM_NOFS);
> +	name_value = ((char *)new) + sizeof(struct xfs_attr_item);
> +	memset(new, 0, XFS_ATTR_ITEM_SIZEOF(namelen, 0));
> +	new->xattri_ip = dp;
> +	new->xattri_op_flags = XFS_ATTR_OP_FLAGS_REMOVE;
> +	new->xattri_name_len = namelen;
> +	new->xattri_value_len = 0;
> +	new->xattri_flags = flags;
> +	memcpy(name_value, name, namelen);
> +
> +	xfs_defer_add(tp, XFS_DEFER_OPS_TYPE_ATTR, &new->xattri_list);
> +
> +	return 0;
> +}
> +
>  /*========================================================================
>   * External routines when attribute list is inside the inode
>   *========================================================================*/
> diff --git a/fs/xfs/libxfs/xfs_attr.h b/fs/xfs/libxfs/xfs_attr.h
> index 92d9a15..83b3621 100644
> --- a/fs/xfs/libxfs/xfs_attr.h
> +++ b/fs/xfs/libxfs/xfs_attr.h
> @@ -175,5 +175,12 @@ bool xfs_attr_namecheck(const void *name, size_t length);
>  int xfs_attr_args_init(struct xfs_da_args *args, struct xfs_inode *dp,
>  			const unsigned char *name, size_t namelen, int flags);
>  int xfs_attr_calc_size(struct xfs_da_args *args, int *local);
> +int xfs_attr_set_deferred(struct xfs_inode *dp, struct xfs_trans *tp,
> +			  const unsigned char *name, unsigned int name_len,
> +			  const unsigned char *value, unsigned int valuelen,
> +			  int flags);
> +int xfs_attr_remove_deferred(struct xfs_inode *dp, struct xfs_trans *tp,
> +			    const unsigned char *name, unsigned int namelen,
> +			    int flags);
>  
>  #endif	/* __XFS_ATTR_H__ */
> -- 
> 2.7.4
> 


* Re: [PATCH 3/9] xfs: Add trans toggle to attr routines
  2019-04-18 15:27   ` Brian Foster
@ 2019-04-18 21:23     ` Allison Henderson
  0 siblings, 0 replies; 48+ messages in thread
From: Allison Henderson @ 2019-04-18 21:23 UTC (permalink / raw)
  To: Brian Foster; +Cc: linux-xfs



On 4/18/19 8:27 AM, Brian Foster wrote:
> On Fri, Apr 12, 2019 at 03:50:30PM -0700, Allison Henderson wrote:
>> This patch adds a roll_trans parameter to all attribute routines
>> that may roll a transaction. Calling functions may pass true to
>> roll transactions as normal, or false to hold them.
>>
>> This patch is temporary and will be removed later when all code
>> paths have been made to pass a false value.  The temporary boolean
>> assists us to introduce changes across multiple smaller patches instead
>> of handling all affected code paths in one large patch.
>>
>> Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
>> ---
> 
> A couple more sentences in the commit log with details on the purpose
> for this would be helpful. E.g., the current state of the attr code
> rolls the transaction at various places, the implementation we're moving
> to can't or shouldn't do this because <reasons>, etc.

Sure, I will expand the summary a little bit here.  It's mostly because 
the existing infrastructure for recording the logs needs to control the 
transaction, so we can't be creating or rolling transactions that it 
doesn't know about.

> 
>>   fs/xfs/libxfs/xfs_attr.c        | 257 +++++++++++++++++++++++-----------------
>>   fs/xfs/libxfs/xfs_attr.h        |   5 +-
>>   fs/xfs/libxfs/xfs_attr_leaf.c   |  20 +++-
>>   fs/xfs/libxfs/xfs_attr_leaf.h   |   8 +-
>>   fs/xfs/libxfs/xfs_attr_remote.c |  50 ++++----
>>   fs/xfs/libxfs/xfs_attr_remote.h |   4 +-
>>   6 files changed, 203 insertions(+), 141 deletions(-)
>>
>> diff --git a/fs/xfs/libxfs/xfs_attr.c b/fs/xfs/libxfs/xfs_attr.c
>> index 3da6b0d..c50bbf6 100644
>> --- a/fs/xfs/libxfs/xfs_attr.c
>> +++ b/fs/xfs/libxfs/xfs_attr.c
> ...
>> @@ -743,7 +762,8 @@ xfs_attr_leaf_addname(
>>    */
>>   STATIC int
>>   xfs_attr_leaf_removename(
>> -	struct xfs_da_args	*args)
>> +	struct xfs_da_args	*args,
>> +	bool roll_trans)
> 
> Indentation ^
> 
> Nits aside this looks like fairly mechanical plumbing, doesn't change
> behavior in the current attr codepaths and seems reasonable as a
> transient step:
> 
> Reviewed-by: Brian Foster <bfoster@redhat.com>

Alrighty, I will get these updated and sent out in the next revision.

Thanks!
Allison

> 
>>   {
>>   	struct xfs_inode	*dp;
>>   	struct xfs_buf		*bp;
>> @@ -776,9 +796,12 @@ xfs_attr_leaf_removename(
>>   		/* bp is gone due to xfs_da_shrink_inode */
>>   		if (error)
>>   			return error;
>> -		error = xfs_defer_finish(&args->trans);
>> -		if (error)
>> -			return error;
>> +
>> +		if (roll_trans) {
>> +			error = xfs_defer_finish(&args->trans);
>> +			if (error)
>> +				return error;
>> +		}
>>   	}
>>   	return 0;
>>   }
>> @@ -831,7 +854,8 @@ xfs_attr_leaf_get(xfs_da_args_t *args)
>>    */
>>   STATIC int
>>   xfs_attr_node_addname(
>> -	struct xfs_da_args	*args)
>> +	struct xfs_da_args	*args,
>> +	bool			roll_trans)
>>   {
>>   	struct xfs_da_state	*state;
>>   	struct xfs_da_state_blk	*blk;
>> @@ -899,17 +923,20 @@ xfs_attr_node_addname(
>>   			error = xfs_attr3_leaf_to_node(args);
>>   			if (error)
>>   				goto out;
>> -			error = xfs_defer_finish(&args->trans);
>> -			if (error)
>> -				goto out;
>>   
>> -			/*
>> -			 * Commit the node conversion and start the next
>> -			 * trans in the chain.
>> -			 */
>> -			error = xfs_trans_roll_inode(&args->trans, dp);
>> -			if (error)
>> -				goto out;
>> +			if (roll_trans) {
>> +				error = xfs_defer_finish(&args->trans);
>> +				if (error)
>> +					goto out;
>> +
>> +				/*
>> +				 * Commit the node conversion and start the next
>> +				 * trans in the chain.
>> +				 */
>> +				error = xfs_trans_roll_inode(&args->trans, dp);
>> +				if (error)
>> +					goto out;
>> +			}
>>   
>>   			goto restart;
>>   		}
>> @@ -923,9 +950,13 @@ xfs_attr_node_addname(
>>   		error = xfs_da3_split(state);
>>   		if (error)
>>   			goto out;
>> -		error = xfs_defer_finish(&args->trans);
>> -		if (error)
>> -			goto out;
>> +
>> +		if (roll_trans) {
>> +			error = xfs_defer_finish(&args->trans);
>> +			if (error)
>> +				goto out;
>> +		}
>> +
>>   	} else {
>>   		/*
>>   		 * Addition succeeded, update Btree hashvals.
>> @@ -944,9 +975,11 @@ xfs_attr_node_addname(
>>   	 * Commit the leaf addition or btree split and start the next
>>   	 * trans in the chain.
>>   	 */
>> -	error = xfs_trans_roll_inode(&args->trans, dp);
>> -	if (error)
>> -		goto out;
>> +	if (roll_trans) {
>> +		error = xfs_trans_roll_inode(&args->trans, dp);
>> +		if (error)
>> +			goto out;
>> +	}
>>   
>>   	/*
>>   	 * If there was an out-of-line value, allocate the blocks we
>> @@ -955,7 +988,7 @@ xfs_attr_node_addname(
>>   	 * maximum size of a transaction and/or hit a deadlock.
>>   	 */
>>   	if (args->rmtblkno > 0) {
>> -		error = xfs_attr_rmtval_set(args);
>> +		error = xfs_attr_rmtval_set(args, roll_trans);
>>   		if (error)
>>   			return error;
>>   	}
>> @@ -971,7 +1004,7 @@ xfs_attr_node_addname(
>>   		 * In a separate transaction, set the incomplete flag on the
>>   		 * "old" attr and clear the incomplete flag on the "new" attr.
>>   		 */
>> -		error = xfs_attr3_leaf_flipflags(args);
>> +		error = xfs_attr3_leaf_flipflags(args, roll_trans);
>>   		if (error)
>>   			goto out;
>>   
>> @@ -985,7 +1018,7 @@ xfs_attr_node_addname(
>>   		args->rmtblkcnt = args->rmtblkcnt2;
>>   		args->rmtvaluelen = args->rmtvaluelen2;
>>   		if (args->rmtblkno) {
>> -			error = xfs_attr_rmtval_remove(args);
>> +			error = xfs_attr_rmtval_remove(args, roll_trans);
>>   			if (error)
>>   				return error;
>>   		}
>> @@ -1019,9 +1052,11 @@ xfs_attr_node_addname(
>>   			error = xfs_da3_join(state);
>>   			if (error)
>>   				goto out;
>> -			error = xfs_defer_finish(&args->trans);
>> -			if (error)
>> -				goto out;
>> +			if (roll_trans) {
>> +				error = xfs_defer_finish(&args->trans);
>> +				if (error)
>> +					goto out;
>> +			}
>>   		}
>>   
>>   		/*
>> @@ -1035,7 +1070,7 @@ xfs_attr_node_addname(
>>   		/*
>>   		 * Added a "remote" value, just clear the incomplete flag.
>>   		 */
>> -		error = xfs_attr3_leaf_clearflag(args);
>> +		error = xfs_attr3_leaf_clearflag(args, roll_trans);
>>   		if (error)
>>   			goto out;
>>   	}
>> @@ -1058,7 +1093,8 @@ xfs_attr_node_addname(
>>    */
>>   STATIC int
>>   xfs_attr_node_removename(
>> -	struct xfs_da_args	*args)
>> +	struct xfs_da_args	*args,
>> +	bool			roll_trans)
>>   {
>>   	struct xfs_da_state	*state;
>>   	struct xfs_da_state_blk	*blk;
>> @@ -1108,10 +1144,10 @@ xfs_attr_node_removename(
>>   		 * Mark the attribute as INCOMPLETE, then bunmapi() the
>>   		 * remote value.
>>   		 */
>> -		error = xfs_attr3_leaf_setflag(args);
>> +		error = xfs_attr3_leaf_setflag(args, roll_trans);
>>   		if (error)
>>   			goto out;
>> -		error = xfs_attr_rmtval_remove(args);
>> +		error = xfs_attr_rmtval_remove(args, roll_trans);
>>   		if (error)
>>   			goto out;
>>   
>> @@ -1139,15 +1175,19 @@ xfs_attr_node_removename(
>>   		error = xfs_da3_join(state);
>>   		if (error)
>>   			goto out;
>> -		error = xfs_defer_finish(&args->trans);
>> -		if (error)
>> -			goto out;
>> -		/*
>> -		 * Commit the Btree join operation and start a new trans.
>> -		 */
>> -		error = xfs_trans_roll_inode(&args->trans, dp);
>> -		if (error)
>> -			goto out;
>> +
>> +		if (roll_trans) {
>> +			error = xfs_defer_finish(&args->trans);
>> +			if (error)
>> +				goto out;
>> +			/*
>> +			 * Commit the Btree join operation and start
>> +			 * a new trans.
>> +			 */
>> +			error = xfs_trans_roll_inode(&args->trans, dp);
>> +			if (error)
>> +				goto out;
>> +		}
>>   	}
>>   
>>   	/*
>> @@ -1170,9 +1210,12 @@ xfs_attr_node_removename(
>>   			/* bp is gone due to xfs_da_shrink_inode */
>>   			if (error)
>>   				goto out;
>> -			error = xfs_defer_finish(&args->trans);
>> -			if (error)
>> -				goto out;
>> +
>> +			if (roll_trans) {
>> +				error = xfs_defer_finish(&args->trans);
>> +				if (error)
>> +					goto out;
>> +			}
>>   		} else
>>   			xfs_trans_brelse(args->trans, bp);
>>   	}
>> diff --git a/fs/xfs/libxfs/xfs_attr.h b/fs/xfs/libxfs/xfs_attr.h
>> index 52f63dc..f0e91bf 100644
>> --- a/fs/xfs/libxfs/xfs_attr.h
>> +++ b/fs/xfs/libxfs/xfs_attr.h
>> @@ -142,10 +142,11 @@ int xfs_attr_get(struct xfs_inode *ip, const unsigned char *name,
>>   int xfs_attr_set(struct xfs_inode *dp, const unsigned char *name,
>>   		 size_t namelen, unsigned char *value, int valuelen,
>>   		 int flags);
>> -int xfs_attr_set_args(struct xfs_da_args *args, struct xfs_buf **leaf_bp);
>> +int xfs_attr_set_args(struct xfs_da_args *args, struct xfs_buf **leaf_bp,
>> +		 bool roll_trans);
>>   int xfs_attr_remove(struct xfs_inode *dp, const unsigned char *name,
>>   		    size_t namelen, int flags);
>> -int xfs_attr_remove_args(struct xfs_da_args *args);
>> +int xfs_attr_remove_args(struct xfs_da_args *args, bool roll_trans);
>>   int xfs_attr_list(struct xfs_inode *dp, char *buffer, int bufsize,
>>   		  int flags, struct attrlist_cursor_kern *cursor);
>>   bool xfs_attr_namecheck(const void *name, size_t length);
>> diff --git a/fs/xfs/libxfs/xfs_attr_leaf.c b/fs/xfs/libxfs/xfs_attr_leaf.c
>> index 1f6e396..128bfe9 100644
>> --- a/fs/xfs/libxfs/xfs_attr_leaf.c
>> +++ b/fs/xfs/libxfs/xfs_attr_leaf.c
>> @@ -2637,7 +2637,8 @@ xfs_attr_leaf_newentsize(
>>    */
>>   int
>>   xfs_attr3_leaf_clearflag(
>> -	struct xfs_da_args	*args)
>> +	struct xfs_da_args	*args,
>> +	bool			roll_trans)
>>   {
>>   	struct xfs_attr_leafblock *leaf;
>>   	struct xfs_attr_leaf_entry *entry;
>> @@ -2698,7 +2699,9 @@ xfs_attr3_leaf_clearflag(
>>   	/*
>>   	 * Commit the flag value change and start the next trans in series.
>>   	 */
>> -	return xfs_trans_roll_inode(&args->trans, args->dp);
>> +	if (roll_trans)
>> +		error = xfs_trans_roll_inode(&args->trans, args->dp);
>> +	return error;
>>   }
>>   
>>   /*
>> @@ -2706,7 +2709,8 @@ xfs_attr3_leaf_clearflag(
>>    */
>>   int
>>   xfs_attr3_leaf_setflag(
>> -	struct xfs_da_args	*args)
>> +	struct xfs_da_args	*args,
>> +	bool			roll_trans)
>>   {
>>   	struct xfs_attr_leafblock *leaf;
>>   	struct xfs_attr_leaf_entry *entry;
>> @@ -2749,7 +2753,9 @@ xfs_attr3_leaf_setflag(
>>   	/*
>>   	 * Commit the flag value change and start the next trans in series.
>>   	 */
>> -	return xfs_trans_roll_inode(&args->trans, args->dp);
>> +	if (roll_trans)
>> +		error = xfs_trans_roll_inode(&args->trans, args->dp);
>> +	return error;
>>   }
>>   
>>   /*
>> @@ -2761,7 +2767,8 @@ xfs_attr3_leaf_setflag(
>>    */
>>   int
>>   xfs_attr3_leaf_flipflags(
>> -	struct xfs_da_args	*args)
>> +	struct xfs_da_args	*args,
>> +	bool			roll_trans)
>>   {
>>   	struct xfs_attr_leafblock *leaf1;
>>   	struct xfs_attr_leafblock *leaf2;
>> @@ -2867,7 +2874,8 @@ xfs_attr3_leaf_flipflags(
>>   	/*
>>   	 * Commit the flag value change and start the next trans in series.
>>   	 */
>> -	error = xfs_trans_roll_inode(&args->trans, args->dp);
>> +	if (roll_trans)
>> +		error = xfs_trans_roll_inode(&args->trans, args->dp);
>>   
>>   	return error;
>>   }
>> diff --git a/fs/xfs/libxfs/xfs_attr_leaf.h b/fs/xfs/libxfs/xfs_attr_leaf.h
>> index 7b74e18..9d830ec 100644
>> --- a/fs/xfs/libxfs/xfs_attr_leaf.h
>> +++ b/fs/xfs/libxfs/xfs_attr_leaf.h
>> @@ -49,10 +49,10 @@ void	xfs_attr_fork_remove(struct xfs_inode *ip, struct xfs_trans *tp);
>>    */
>>   int	xfs_attr3_leaf_to_node(struct xfs_da_args *args);
>>   int	xfs_attr3_leaf_to_shortform(struct xfs_buf *bp,
>> -				   struct xfs_da_args *args, int forkoff);
>> -int	xfs_attr3_leaf_clearflag(struct xfs_da_args *args);
>> -int	xfs_attr3_leaf_setflag(struct xfs_da_args *args);
>> -int	xfs_attr3_leaf_flipflags(struct xfs_da_args *args);
>> +			struct xfs_da_args *args, int forkoff);
>> +int	xfs_attr3_leaf_clearflag(struct xfs_da_args *args, bool roll_trans);
>> +int	xfs_attr3_leaf_setflag(struct xfs_da_args *args, bool roll_trans);
>> +int	xfs_attr3_leaf_flipflags(struct xfs_da_args *args, bool roll_trans);
>>   
>>   /*
>>    * Routines used for growing the Btree.
>> diff --git a/fs/xfs/libxfs/xfs_attr_remote.c b/fs/xfs/libxfs/xfs_attr_remote.c
>> index 65ff600..18fbd22 100644
>> --- a/fs/xfs/libxfs/xfs_attr_remote.c
>> +++ b/fs/xfs/libxfs/xfs_attr_remote.c
>> @@ -435,7 +435,8 @@ xfs_attr_rmtval_get(
>>    */
>>   int
>>   xfs_attr_rmtval_set(
>> -	struct xfs_da_args	*args)
>> +	struct xfs_da_args	*args,
>> +	bool			roll_trans)
>>   {
>>   	struct xfs_inode	*dp = args->dp;
>>   	struct xfs_mount	*mp = dp->i_mount;
>> @@ -488,9 +489,12 @@ xfs_attr_rmtval_set(
>>   				  &nmap);
>>   		if (error)
>>   			return error;
>> -		error = xfs_defer_finish(&args->trans);
>> -		if (error)
>> -			return error;
>> +
>> +		if (roll_trans) {
>> +			error = xfs_defer_finish(&args->trans);
>> +			if (error)
>> +				return error;
>> +		}
>>   
>>   		ASSERT(nmap == 1);
>>   		ASSERT((map.br_startblock != DELAYSTARTBLOCK) &&
>> @@ -498,12 +502,14 @@ xfs_attr_rmtval_set(
>>   		lblkno += map.br_blockcount;
>>   		blkcnt -= map.br_blockcount;
>>   
>> -		/*
>> -		 * Start the next trans in the chain.
>> -		 */
>> -		error = xfs_trans_roll_inode(&args->trans, dp);
>> -		if (error)
>> -			return error;
>> +		if (roll_trans) {
>> +			/*
>> +			 * Start the next trans in the chain.
>> +			 */
>> +			error = xfs_trans_roll_inode(&args->trans, dp);
>> +			if (error)
>> +				return error;
>> +		}
>>   	}
>>   
>>   	/*
>> @@ -563,7 +569,8 @@ xfs_attr_rmtval_set(
>>    */
>>   int
>>   xfs_attr_rmtval_remove(
>> -	struct xfs_da_args	*args)
>> +	struct xfs_da_args	*args,
>> +	bool			roll_trans)
>>   {
>>   	struct xfs_mount	*mp = args->dp->i_mount;
>>   	xfs_dablk_t		lblkno;
>> @@ -625,16 +632,19 @@ xfs_attr_rmtval_remove(
>>   				    XFS_BMAPI_ATTRFORK, 1, &done);
>>   		if (error)
>>   			return error;
>> -		error = xfs_defer_finish(&args->trans);
>> -		if (error)
>> -			return error;
>>   
>> -		/*
>> -		 * Close out trans and start the next one in the chain.
>> -		 */
>> -		error = xfs_trans_roll_inode(&args->trans, args->dp);
>> -		if (error)
>> -			return error;
>> +		if (roll_trans) {
>> +			error = xfs_defer_finish(&args->trans);
>> +			if (error)
>> +				return error;
>> +
>> +			/*
>> +			 * Close out trans and start the next one in the chain.
>> +			 */
>> +			error = xfs_trans_roll_inode(&args->trans, args->dp);
>> +			if (error)
>> +				return error;
>> +		}
>>   	}
>>   	return 0;
>>   }
>> diff --git a/fs/xfs/libxfs/xfs_attr_remote.h b/fs/xfs/libxfs/xfs_attr_remote.h
>> index 9d20b66..c7c073d 100644
>> --- a/fs/xfs/libxfs/xfs_attr_remote.h
>> +++ b/fs/xfs/libxfs/xfs_attr_remote.h
>> @@ -9,7 +9,7 @@
>>   int xfs_attr3_rmt_blocks(struct xfs_mount *mp, int attrlen);
>>   
>>   int xfs_attr_rmtval_get(struct xfs_da_args *args);
>> -int xfs_attr_rmtval_set(struct xfs_da_args *args);
>> -int xfs_attr_rmtval_remove(struct xfs_da_args *args);
>> +int xfs_attr_rmtval_set(struct xfs_da_args *args, bool roll_trans);
>> +int xfs_attr_rmtval_remove(struct xfs_da_args *args, bool roll_trans);
>>   
>>   #endif /* __XFS_ATTR_REMOTE_H__ */
>> -- 
>> 2.7.4
>>

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 4/9] xfs: Set up infastructure for deferred attribute operations
  2019-04-18 15:48   ` Brian Foster
@ 2019-04-18 21:27     ` Allison Henderson
  2019-04-22 11:00       ` Brian Foster
  0 siblings, 1 reply; 48+ messages in thread
From: Allison Henderson @ 2019-04-18 21:27 UTC (permalink / raw)
  To: Brian Foster; +Cc: linux-xfs

On 4/18/19 8:48 AM, Brian Foster wrote:
> On Fri, Apr 12, 2019 at 03:50:31PM -0700, Allison Henderson wrote:
>> This patch adds two new log item types for setting or
>> removing attributes as deferred operations.  The
>> xfs_attri_log_item logs an intent to set or remove an
>> attribute.  The corresponding xfs_attrd_log_item holds
>> a reference to the xfs_attri_log_item and is freed once
>> the transaction is done.  Both log items use a generic
>> xfs_attr_log_format structure that contains the attribute
>> name, value, flags, inode, and an op_flag that indicates
>> if the operations is a set or remove.
>>
>> Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
>> ---
> 
> This mostly looks sane to me on a first high level pass. We're adding
> the intent/done log item infrastructure for attrs, associated dfops
> processing code and log recovery hooks. I'll probably have to go back
> through this once I get further through the series and have grokked more
> context, but so far I think I just have some various nits and aesthetic
> comments.
> 
> Firstly, note that git complained about an extra blank line at EOF of
> xfs_trans_attr.c when I applied this patch. Also, the commit log above
> looks like it could be widened (I think 68 chars is the standard) and
> could probably include a bit more context on the big picture changes
> associated with this work. In general, I think the commit log should
> (briefly) explain 1.) how attrs currently work 2.) how things are
> expected to work based on this infrastructure and 3.) the advantage(s)
> of doing so.

Sure, I will get these suggestions added in the next update


> 
> For example, one thing that is glossed over is that this implies we'll
> be logging xattr values even in remote attribute block cases. BTW, do we
> need to update the transaction reservation to account for that? I didn't
> notice that being changed anywhere (yet)..

Hmm, the pptr set does some accounting for the extra attrs in create,
move and remove operations, but I don't think there are any new adjustments
for remote attribute blocks.  I will look into that.  Thx!

> 
>>   fs/xfs/Makefile                |   2 +
>>   fs/xfs/libxfs/xfs_attr.c       |   5 +-
>>   fs/xfs/libxfs/xfs_attr.h       |  25 ++
>>   fs/xfs/libxfs/xfs_defer.c      |   1 +
>>   fs/xfs/libxfs/xfs_defer.h      |   3 +
>>   fs/xfs/libxfs/xfs_log_format.h |  44 +++-
>>   fs/xfs/libxfs/xfs_types.h      |   1 +
>>   fs/xfs/xfs_attr_item.c         | 558 +++++++++++++++++++++++++++++++++++++++++
>>   fs/xfs/xfs_attr_item.h         | 103 ++++++++
>>   fs/xfs/xfs_log_recover.c       | 172 +++++++++++++
>>   fs/xfs/xfs_ondisk.h            |   2 +
>>   fs/xfs/xfs_trans.h             |  10 +
>>   fs/xfs/xfs_trans_attr.c        | 240 ++++++++++++++++++
>>   13 files changed, 1162 insertions(+), 4 deletions(-)
>>
> ...
>> diff --git a/fs/xfs/xfs_attr_item.c b/fs/xfs/xfs_attr_item.c
>> new file mode 100644
>> index 0000000..0ea19b4
>> --- /dev/null
>> +++ b/fs/xfs/xfs_attr_item.c
>> @@ -0,0 +1,558 @@
>> +// SPDX-License-Identifier: GPL-2.0+
>> +/*
>> + * Copyright (C) 2019 Oracle.  All Rights Reserved.
>> + * Author: Allison Henderson <allison.henderson@oracle.com>
>> + */
>> +#include "xfs.h"
>> +#include "xfs_fs.h"
>> +#include "xfs_format.h"
>> +#include "xfs_log_format.h"
>> +#include "xfs_trans_resv.h"
>> +#include "xfs_bit.h"
>> +#include "xfs_mount.h"
>> +#include "xfs_trans.h"
>> +#include "xfs_trans_priv.h"
>> +#include "xfs_buf_item.h"
>> +#include "xfs_attr_item.h"
>> +#include "xfs_log.h"
>> +#include "xfs_btree.h"
>> +#include "xfs_rmap.h"
>> +#include "xfs_inode.h"
>> +#include "xfs_icache.h"
>> +#include "xfs_attr.h"
>> +#include "xfs_shared.h"
>> +#include "xfs_da_format.h"
>> +#include "xfs_da_btree.h"
>> +
>> +static inline struct xfs_attri_log_item *ATTRI_ITEM(struct xfs_log_item *lip)
>> +{
>> +	return container_of(lip, struct xfs_attri_log_item, item);
>> +}
>> +
>> +void
>> +xfs_attri_item_free(
>> +	struct xfs_attri_log_item	*attrip)
>> +{
>> +	kmem_free(attrip->item.li_lv_shadow);
>> +	kmem_free(attrip);
>> +}
>> +
>> +/*
>> + * This returns the number of iovecs needed to log the given attri item.
>> + * We only need 1 iovec for an attri item.  It just logs the attr_log_format
>> + * structure.
>> + */
>> +static inline int
>> +xfs_attri_item_sizeof(
>> +	struct xfs_attri_log_item *attrip)
>> +{
>> +	return sizeof(struct xfs_attri_log_format);
>> +}
>> +
>> +STATIC void
>> +xfs_attri_item_size(
>> +	struct xfs_log_item	*lip,
>> +	int			*nvecs,
>> +	int			*nbytes)
>> +{
>> +	struct xfs_attri_log_item       *attrip = ATTRI_ITEM(lip);
>> +
>> +	*nvecs += 1;
>> +	*nbytes += xfs_attri_item_sizeof(attrip);
>> +
>> +	if (attrip->name_len > 0) {
>> +		*nvecs += 1;
>> +		*nbytes += ATTR_NVEC_SIZE(attrip->name_len);
>> +	}
>> +
>> +	if (attrip->value_len > 0) {
>> +		*nvecs += 1;
>> +		*nbytes += ATTR_NVEC_SIZE(attrip->value_len);
>> +	}
>> +}
>> +
>> +/*
>> + * This is called to fill in the vector of log iovecs for the
>> + * given attri log item. We use only 1 iovec, and we point that
>> + * at the attri_log_format structure embedded in the attri item.
>> + * It is at this point that we assert that all of the attr
>> + * slots in the attri item have been filled.
>> + */
> 
> I see a bunch of places throughout this patch such as above where the
> line length formatting looks inconsistent. The above comment should be
> widened to 80 chars. I'm sure much of this code was boilerplate brought
> over from other log items and such, but we should take the opportunity
> to properly format the new code we're adding.
Yes, I loosely modeled it off the efi code at the time.  I will go
through and do some clean up with the line lengths.

> 
>> +STATIC void
>> +xfs_attri_item_format(
>> +	struct xfs_log_item	*lip,
>> +	struct xfs_log_vec	*lv)
>> +{
>> +	struct xfs_attri_log_item	*attrip = ATTRI_ITEM(lip);
>> +	struct xfs_log_iovec	*vecp = NULL;
>> +
>> +	attrip->format.alfi_type = XFS_LI_ATTRI;
>> +	attrip->format.alfi_size = 1;
>> +	if (attrip->name_len > 0)
>> +		attrip->format.alfi_size++;
>> +	if (attrip->value_len > 0)
>> +		attrip->format.alfi_size++;
>> +
> 
> I'd move these alfi_size updates to the equivalent if checks below.
Alrighty, will do

> 
>> +	xlog_copy_iovec(lv, &vecp, XLOG_REG_TYPE_ATTRI_FORMAT,
>> +			&attrip->format,
>> +			xfs_attri_item_sizeof(attrip));
>> +	if (attrip->name_len > 0)
>> +		xlog_copy_iovec(lv, &vecp, XLOG_REG_TYPE_ATTR_NAME,
>> +				attrip->name, ATTR_NVEC_SIZE(attrip->name_len));
>> +
>> +	if (attrip->value_len > 0)
>> +		xlog_copy_iovec(lv, &vecp, XLOG_REG_TYPE_ATTR_VALUE,
>> +				attrip->value,
>> +				ATTR_NVEC_SIZE(attrip->value_len));
>> +}
>> +
>> +
>> +/*
>> + * Pinning has no meaning for an attri item, so just return.
>> + */
>> +STATIC void
>> +xfs_attri_item_pin(
>> +	struct xfs_log_item	*lip)
>> +{
>> +}
>> +
>> +/*
>> + * The unpin operation is the last place an ATTRI is manipulated in the log. It
>> + * is either inserted in the AIL or aborted in the event of a log I/O error. In
>> + * either case, the ATTRI transaction has been successfully committed to make it
>> + * this far. Therefore, we expect whoever committed the ATTRI to either
>> + * construct and commit the ATTRD or drop the ATTRD's reference in the event of
>> + * error. Simply drop the log's ATTRI reference now that the log is done with
>> + * it.
>> + */
>> +STATIC void
>> +xfs_attri_item_unpin(
>> +	struct xfs_log_item	*lip,
>> +	int			remove)
>> +{
>> +	struct xfs_attri_log_item	*attrip = ATTRI_ITEM(lip);
>> +
>> +	xfs_attri_release(attrip);
>> +}
>> +
>> +/*
>> + * attri items have no locking or pushing.  However, since ATTRIs are pulled
>> + * from the AIL when their corresponding ATTRDs are committed to disk, their
>> + * situation is very similar to being pinned.  Return XFS_ITEM_PINNED so that
>> + * the caller will eventually flush the log.  This should help in getting the
>> + * ATTRI out of the AIL.
>> + */
>> +STATIC uint
>> +xfs_attri_item_push(
>> +	struct xfs_log_item	*lip,
>> +	struct list_head	*buffer_list)
>> +{
>> +	return XFS_ITEM_PINNED;
>> +}
>> +
>> +/*
>> + * The ATTRI has been either committed or aborted if the transaction has been
>> + * cancelled. If the transaction was cancelled, an ATTRD isn't going to be
>> + * constructed and thus we free the ATTRI here directly.
>> + */
>> +STATIC void
>> +xfs_attri_item_unlock(
>> +	struct xfs_log_item	*lip)
>> +{
>> +	if (test_bit(XFS_LI_ABORTED, &lip->li_flags))
>> +		xfs_attri_release(ATTRI_ITEM(lip));
>> +}
>> +
>> +/*
>> + * The ATTRI is logged only once and cannot be moved in the log, so simply
>> + * return the lsn at which it's been logged.
>> + */
>> +STATIC xfs_lsn_t
>> +xfs_attri_item_committed(
>> +	struct xfs_log_item	*lip,
>> +	xfs_lsn_t		lsn)
>> +{
>> +	return lsn;
>> +}
>> +
>> +STATIC void
>> +xfs_attri_item_committing(
>> +	struct xfs_log_item	*lip,
>> +	xfs_lsn_t		lsn)
>> +{
>> +}
>> +
>> +/*
>> + * This is the ops vector shared by all attri log items.
>> + */
>> +static const struct xfs_item_ops xfs_attri_item_ops = {
>> +	.iop_size	= xfs_attri_item_size,
>> +	.iop_format	= xfs_attri_item_format,
>> +	.iop_pin	= xfs_attri_item_pin,
>> +	.iop_unpin	= xfs_attri_item_unpin,
>> +	.iop_unlock	= xfs_attri_item_unlock,
>> +	.iop_committed	= xfs_attri_item_committed,
>> +	.iop_push	= xfs_attri_item_push,
>> +	.iop_committing = xfs_attri_item_committing
>> +};
>> +
>> +
>> +/*
>> + * Allocate and initialize an attri item
>> + */
>> +struct xfs_attri_log_item *
>> +xfs_attri_init(
>> +	struct xfs_mount	*mp)
>> +
>> +{
>> +	struct xfs_attri_log_item	*attrip;
>> +	uint			size;
>> +
>> +	size = (uint)(sizeof(struct xfs_attri_log_item));
>> +	attrip = kmem_zalloc(size, KM_SLEEP);
>> +
>> +	xfs_log_item_init(mp, &(attrip->item), XFS_LI_ATTRI,
>> +			  &xfs_attri_item_ops);
> 
> No need for those braces around attrip->item, and with those removed we
> can reduce this to a single line.
> 
>> +	attrip->format.alfi_id = (uintptr_t)(void *)attrip;
>> +	atomic_set(&attrip->refcount, 2);
>> +
>> +	return attrip;
>> +}
>> +
>> +/*
>> + * Copy an attr format buffer from the given buf, and into the destination
>> + * attr format structure.
>> + */
>> +int
>> +xfs_attri_copy_format(struct xfs_log_iovec *buf,
>> +		      struct xfs_attri_log_format *dst_attr_fmt)
>> +{
>> +	struct xfs_attri_log_format *src_attr_fmt = buf->i_addr;
>> +	uint len = sizeof(struct xfs_attri_log_format);
>> +
>> +	if (buf->i_len == len) {
>> +		memcpy((char *)dst_attr_fmt, (char *)src_attr_fmt, len);
>> +		return 0;
>> +	}
>> +	return -EFSCORRUPTED;
> 
> Can we invert the logic flow here (and below)? I.e.,
> 
> 	...
> 	if (buf->i_len != len)
> 		return -EFSCORRUPTED;
> 	memcpy(...);
> 	return 0;
Sure, I think that looks simpler too.
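
For reference, a minimal standalone sketch of the early-return form
suggested above.  This is userspace C, not the kernel code: demo_iovec,
demo_attri_format and DEMO_EFSCORRUPTED are hypothetical stand-ins for
the real XFS types and error code, so take it as an illustration only.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for struct xfs_log_iovec (only the fields used). */
struct demo_iovec {
	void	*i_addr;
	int	i_len;
};

/* Hypothetical stand-in for struct xfs_attri_log_format. */
struct demo_attri_format {
	unsigned short	type;
	unsigned short	size;
	unsigned int	name_len;
	unsigned int	value_len;
};

/* Assumption: mirror XFS, which maps EFSCORRUPTED onto EUCLEAN. */
#ifdef EUCLEAN
#define DEMO_EFSCORRUPTED	EUCLEAN
#else
#define DEMO_EFSCORRUPTED	EIO
#endif

/* Reject a wrongly sized buffer first, then do the copy on the main path. */
static int demo_copy_format(const struct demo_iovec *buf,
			    struct demo_attri_format *dst)
{
	size_t len = sizeof(*dst);

	assert(buf != NULL && dst != NULL);

	if (buf->i_len < 0 || (size_t)buf->i_len != len)
		return -DEMO_EFSCORRUPTED;

	memcpy(dst, buf->i_addr, len);
	return 0;
}
```

The error check sits at the top, so the success path reads straight
through without nesting.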

> 
>> +}
>> +
>> +/*
>> + * Copy an attr format buffer from the given buf, and into the destination
>> + * attr format structure.
>> + */
>> +int
>> +xfs_attrd_copy_format(struct xfs_log_iovec *buf,
>> +		      struct xfs_attrd_log_format *dst_attr_fmt)
>> +{
>> +	struct xfs_attrd_log_format *src_attr_fmt = buf->i_addr;
>> +	uint len = sizeof(struct xfs_attrd_log_format);
>> +
>> +	if (buf->i_len == len) {
>> +		memcpy((char *)dst_attr_fmt, (char *)src_attr_fmt, len);
>> +		return 0;
>> +	}
>> +	return -EFSCORRUPTED;
>> +}
>> +
> 
> This function appears to be unused. The recover code looks like it just
> casts the iovec buffer directly to an attrd_log_format to determine the
> id.
Ok, I will see if I can take it out then.

> 
>> +/*
>> + * Freeing the attrip requires that we remove it from the AIL if it has already
>> + * been placed there. However, the ATTRI may not yet have been placed in the
>> + * AIL when called by xfs_attri_release() from ATTRD processing due to the
>> + * ordering of committed vs unpin operations in bulk insert operations. Hence
>> + * the reference count to ensure only the last caller frees the ATTRI.
>> + */
>> +void
>> +xfs_attri_release(
>> +	struct xfs_attri_log_item	*attrip)
>> +{
>> +	ASSERT(atomic_read(&attrip->refcount) > 0);
>> +	if (atomic_dec_and_test(&attrip->refcount)) {
>> +		xfs_trans_ail_remove(&attrip->item, SHUTDOWN_LOG_IO_ERROR);
>> +		xfs_attri_item_free(attrip);
>> +	}
>> +}
>> +
>> +static inline struct xfs_attrd_log_item *ATTRD_ITEM(struct xfs_log_item *lip)
>> +{
>> +	return container_of(lip, struct xfs_attrd_log_item, item);
>> +}
>> +
>> +STATIC void
>> +xfs_attrd_item_free(struct xfs_attrd_log_item *attrdp)
>> +{
>> +	kmem_free(attrdp->item.li_lv_shadow);
>> +	kmem_free(attrdp);
>> +}
>> +
>> +/*
>> + * This returns the number of iovecs needed to log the given attrd item.
>> + * We only need 1 iovec for an attrd item.  It just logs the attr_log_format
>> + * structure.
>> + */
>> +static inline int
>> +xfs_attrd_item_sizeof(
>> +	struct xfs_attrd_log_item *attrdp)
>> +{
>> +	return sizeof(struct xfs_attrd_log_format);
>> +}
>> +
>> +STATIC void
>> +xfs_attrd_item_size(
>> +	struct xfs_log_item	*lip,
>> +	int			*nvecs,
>> +	int			*nbytes)
>> +{
>> +	struct xfs_attrd_log_item	*attrdp = ATTRD_ITEM(lip);
>> +	*nvecs += 1;
>> +	*nbytes += xfs_attrd_item_sizeof(attrdp);
>> +}
>> +
>> +/*
>> + * This is called to fill in the vector of log iovecs for the
>> + * given attrd log item. We use only 1 iovec, and we point that
>> + * at the attr_log_format structure embedded in the attrd item.
>> + * It is at this point that we assert that all of the attr
>> + * slots in the attrd item have been filled.
>> + */
>> +STATIC void
>> +xfs_attrd_item_format(
>> +	struct xfs_log_item	*lip,
>> +	struct xfs_log_vec	*lv)
>> +{
>> +	struct xfs_attrd_log_item	*attrdp = ATTRD_ITEM(lip);
>> +	struct xfs_log_iovec	*vecp = NULL;
>> +
>> +	attrdp->format.alfd_type = XFS_LI_ATTRD;
>> +	attrdp->format.alfd_size = 1;
>> +
>> +	xlog_copy_iovec(lv, &vecp, XLOG_REG_TYPE_ATTRD_FORMAT,
>> +			&attrdp->format,
>> +			xfs_attrd_item_sizeof(attrdp));
> 
> The above looks like it could be shrunk to 2 lines as well after 80 char
> widening. Note that I'm sure I haven't caught all of these, just
> pointing out some examples as I notice them.
> 
> FWIW, if you happen to use vim, I sometimes use ':set cc=80' to draw an
> 80 char line in the viewer that helps to quickly eyeball new code for
> this kind of thing.
I do use vim, so this is very helpful!  I will add that to my config.  Thx!

> 
>> +}
>> +
>> +/*
>> + * Pinning has no meaning for an attrd item, so just return.
>> + */
>> +STATIC void
>> +xfs_attrd_item_pin(
>> +	struct xfs_log_item	*lip)
>> +{
>> +}
>> +
>> +/*
>> + * Since pinning has no meaning for an attrd item, unpinning does
>> + * not either.
>> + */
>> +STATIC void
>> +xfs_attrd_item_unpin(
>> +	struct xfs_log_item	*lip,
>> +	int			remove)
>> +{
>> +}
>> +
>> +/*
>> + * There isn't much you can do to push on an attrd item.  It is simply stuck
>> + * waiting for the log to be flushed to disk.
>> + */
>> +STATIC uint
>> +xfs_attrd_item_push(
>> +	struct xfs_log_item	*lip,
>> +	struct list_head	*buffer_list)
>> +{
>> +	return XFS_ITEM_PINNED;
>> +}
>> +
>> +/*
>> + * The ATTRD is either committed or aborted if the transaction is cancelled. If
>> + * the transaction is cancelled, drop our reference to the ATTRI and free the
>> + * ATTRD.
>> + */
>> +STATIC void
>> +xfs_attrd_item_unlock(
>> +	struct xfs_log_item	*lip)
>> +{
>> +	struct xfs_attrd_log_item	*attrdp = ATTRD_ITEM(lip);
>> +
>> +	if (test_bit(XFS_LI_ABORTED, &lip->li_flags)) {
>> +		xfs_attri_release(attrdp->attrip);
>> +		xfs_attrd_item_free(attrdp);
>> +	}
>> +}
>> +
>> +/*
>> + * When the attrd item is committed to disk, all we need to do is delete our
>> + * reference to our partner attri item and then free ourselves. Since we're
>> + * freeing ourselves we must return -1 to keep the transaction code from
>> + * further referencing this item.
>> + */
>> +STATIC xfs_lsn_t
>> +xfs_attrd_item_committed(
>> +	struct xfs_log_item	*lip,
>> +	xfs_lsn_t		lsn)
>> +{
>> +	struct xfs_attrd_log_item	*attrdp = ATTRD_ITEM(lip);
>> +
>> +	/*
>> +	 * Drop the ATTRI reference regardless of whether the ATTRD has been
>> +	 * aborted. Once the ATTRD transaction is constructed, it is the sole
>> +	 * responsibility of the ATTRD to release the ATTRI (even if the ATTRI
>> +	 * is aborted due to log I/O error).
>> +	 */
>> +	xfs_attri_release(attrdp->attrip);
>> +	xfs_attrd_item_free(attrdp);
>> +
>> +	return (xfs_lsn_t)-1;
>> +}
>> +
>> +STATIC void
>> +xfs_attrd_item_committing(
>> +	struct xfs_log_item	*lip,
>> +	xfs_lsn_t		lsn)
>> +{
>> +}
>> +
>> +/*
>> + * This is the ops vector shared by all attrd log items.
>> + */
>> +static const struct xfs_item_ops xfs_attrd_item_ops = {
>> +	.iop_size	= xfs_attrd_item_size,
>> +	.iop_format	= xfs_attrd_item_format,
>> +	.iop_pin	= xfs_attrd_item_pin,
>> +	.iop_unpin	= xfs_attrd_item_unpin,
>> +	.iop_unlock	= xfs_attrd_item_unlock,
>> +	.iop_committed	= xfs_attrd_item_committed,
>> +	.iop_push	= xfs_attrd_item_push,
>> +	.iop_committing = xfs_attrd_item_committing
>> +};
>> +
>> +/*
>> + * Allocate and initialize an attrd item
>> + */
>> +struct xfs_attrd_log_item *
>> +xfs_attrd_init(
>> +	struct xfs_mount	*mp,
>> +	struct xfs_attri_log_item	*attrip)
>> +
>> +{
>> +	struct xfs_attrd_log_item	*attrdp;
>> +	uint			size;
>> +
>> +	size = (uint)(sizeof(struct xfs_attrd_log_item));
>> +	attrdp = kmem_zalloc(size, KM_SLEEP);
>> +
>> +	xfs_log_item_init(mp, &attrdp->item, XFS_LI_ATTRD,
>> +			  &xfs_attrd_item_ops);
>> +	attrdp->attrip = attrip;
>> +	attrdp->format.alfd_alf_id = attrip->format.alfi_id;
>> +
>> +	return attrdp;
>> +}
>> +
>> +/*
>> + * Process an attr intent item that was recovered from
>> + * the log.  We need to delete the attr that it describes.
>> + */
> 
> ^^^ :)
> 
>> +int
>> +xfs_attri_recover(
>> +	struct xfs_mount		*mp,
>> +	struct xfs_attri_log_item	*attrip)
>> +{
>> +	struct xfs_inode		*ip;
>> +	struct xfs_attrd_log_item	*attrdp;
>> +	struct xfs_da_args		args;
>> +	struct xfs_attri_log_format	*attrp;
>> +	struct xfs_trans_res		tres;
>> +	int				local;
>> +	int				error = 0;
>> +	int				rsvd = 0;
>> +
>> +	ASSERT(!test_bit(XFS_ATTRI_RECOVERED, &attrip->flags));
>> +
>> +	/*
>> +	 * First check the validity of the attr described by the
>> +	 * ATTRI.  If any are bad, then assume that all are bad and
>> +	 * just toss the ATTRI.
>> +	 */
>> +	attrp = &attrip->format;
>> +	if (
>> +	    /*
>> +	     * Must have either XFS_ATTR_OP_FLAGS_SET or
>> +	     * XFS_ATTR_OP_FLAGS_REMOVE set
>> +	     */
>> +	    !(attrp->alfi_op_flags == XFS_ATTR_OP_FLAGS_SET ||
>> +		attrp->alfi_op_flags == XFS_ATTR_OP_FLAGS_REMOVE) ||
>> +
>> +	    /* Check size of value and name lengths */
>> +	    (attrp->alfi_value_len > XATTR_SIZE_MAX ||
>> +		attrp->alfi_name_len > XATTR_NAME_MAX) ||
>> +
>> +	    /*
>> +	     * If the XFS_ATTR_OP_FLAGS_SET flag is set,
>> +	     * there must also be a name and value
>> +	     */
>> +	    (attrp->alfi_op_flags == XFS_ATTR_OP_FLAGS_SET &&
>> +		(attrp->alfi_value_len == 0 || attrp->alfi_name_len == 0)) ||
> 
> It's been a while since I've played with any attribute stuff, but is
> this always the case or can we not have an empty attribute?

I remember us having some discussion about this in an older review,
wherein we thought all set operations have to have a value.  But after
digging around a bit, I think generic/062 does expect that you can set
an attribute to nothing.

Since the test does not force a recovery, we probably have never 
encountered the scenario of recovering an attribute with no value. So I 
think we got away with the alfi_value_len == 0 check even though we 
should not have.

I will adjust the logic here.  Maybe when we get this set finished out, 
it might be a good idea to have a test case that checks for that?

Thx for the catch!
> 
>> +
>> +	    /*
>> +	     * If the XFS_ATTR_OP_FLAGS_REMOVE flag is set,
>> +	     * there must also be a name
>> +	     */
>> +	    (attrp->alfi_op_flags == XFS_ATTR_OP_FLAGS_REMOVE &&
>> +		(attrp->alfi_name_len == 0))
>> +	) {
> 
> Comments are always nice of course, but interspersed with logic like
> this makes the whole thing hard to read. I'd suggest to just generalize
> the comment to include whatever things are non-obvious, condense the if
> logic and leave the comment above it.
Ok, I think we probably only need to check namelen anyway, based on the
above observation.
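
As a rough sketch of what the condensed check could look like, assuming
empty values do turn out to be legal for a set: the helper below is
standalone userspace C, and demo_attri_format, the DEMO_OP_* values and
demo_attri_validate() are hypothetical names for illustration, not what
the next version of the patch will necessarily contain.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical op values; the real ones are defined by the patch. */
#define DEMO_OP_SET	1
#define DEMO_OP_REMOVE	2

/* Same limits as XATTR_NAME_MAX / XATTR_SIZE_MAX in linux/limits.h. */
#define DEMO_NAME_MAX	255
#define DEMO_SIZE_MAX	65536

/* Hypothetical stand-in for the fields of xfs_attri_log_format used here. */
struct demo_attri_format {
	uint32_t	op_flags;
	uint32_t	name_len;
	uint32_t	value_len;
};

/*
 * A recovered intent is usable if it names exactly one known operation,
 * its lengths are within the xattr limits, and it carries a name.  A
 * zero-length value is accepted (assumption: empty values are legal).
 */
static bool demo_attri_validate(const struct demo_attri_format *f)
{
	assert(f != NULL);

	if (f->op_flags != DEMO_OP_SET && f->op_flags != DEMO_OP_REMOVE)
		return false;
	if (f->value_len > DEMO_SIZE_MAX || f->name_len > DEMO_NAME_MAX)
		return false;
	if (f->name_len == 0)
		return false;
	return true;
}
```

With the checks pulled into one helper, the caller is left with a single
"if (!valid)" branch and one comment above it.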

> 
>> +		/*
>> +		 * This will pull the ATTRI from the AIL and
>> +		 * free the memory associated with it.
>> +		 */
>> +		set_bit(XFS_ATTRI_RECOVERED, &attrip->flags);
>> +		xfs_attri_release(attrip);
>> +		return -EIO;
>> +	}
>> +
>> +	attrp = &attrip->format;
>> +	error = xfs_iget(mp, 0, attrp->alfi_ino, 0, 0, &ip);
>> +	if (error)
>> +		return error;
>> +
>> +	error = xfs_attr_args_init(&args, ip, attrip->name,
>> +			attrp->alfi_name_len, attrp->alfi_attr_flags);
>> +	if (error)
>> +		return error;
>> +
>> +	args.hashval = xfs_da_hashname(args.name, args.namelen);
>> +	args.value = attrip->value;
>> +	args.valuelen = attrp->alfi_value_len;
>> +	args.op_flags = XFS_DA_OP_OKNOENT;
>> +	args.total = xfs_attr_calc_size(&args, &local);
>> +
>> +	tres.tr_logres = M_RES(mp)->tr_attrsetm.tr_logres +
>> +			M_RES(mp)->tr_attrsetrt.tr_logres * args.total;
>> +	tres.tr_logcount = XFS_ATTRSET_LOG_COUNT;
>> +	tres.tr_logflags = XFS_TRANS_PERM_LOG_RES;
>> +
>> +	error = xfs_trans_alloc(mp, &tres, args.total,  0,
>> +				rsvd ? XFS_TRANS_RESERVE : 0, &args.trans);
>> +	if (error)
>> +		return error;
>> +	attrdp = xfs_trans_get_attrd(args.trans, attrip);
>> +
>> +	xfs_ilock(ip, XFS_ILOCK_EXCL);
>> +
>> +	xfs_trans_ijoin(args.trans, ip, 0);
>> +	error = xfs_trans_attr(&args, attrdp, attrp->alfi_op_flags);
>> +	if (error)
>> +		goto abort_error;
>> +
>> +
>> +	set_bit(XFS_ATTRI_RECOVERED, &attrip->flags);
>> +	xfs_trans_log_inode(args.trans, ip, XFS_ILOG_CORE | XFS_ILOG_ADATA);
>> +	error = xfs_trans_commit(args.trans);
>> +	xfs_iunlock(ip, XFS_ILOCK_EXCL);
>> +	return error;
>> +
>> +abort_error:
>> +	xfs_trans_cancel(args.trans);
>> +	xfs_iunlock(ip, XFS_ILOCK_EXCL);
>> +	return error;
>> +}
>> diff --git a/fs/xfs/xfs_attr_item.h b/fs/xfs/xfs_attr_item.h
>> new file mode 100644
>> index 0000000..fce7515
>> --- /dev/null
>> +++ b/fs/xfs/xfs_attr_item.h
>> @@ -0,0 +1,103 @@
>> +// SPDX-License-Identifier: GPL-2.0+
>> +/*
>> + * Copyright (C) 2019 Oracle.  All Rights Reserved.
>> + * Author: Allison Henderson <allison.henderson@oracle.com>
>> + */
>> +#ifndef	__XFS_ATTR_ITEM_H__
>> +#define	__XFS_ATTR_ITEM_H__
>> +
>> +/* kernel only ATTRI/ATTRD definitions */
>> +
>> +struct xfs_mount;
>> +struct kmem_zone;
>> +
>> +/*
>> + * Max number of attrs in fast allocation path.
>> + */
>> +#define XFS_ATTRI_MAX_FAST_ATTRS        1
>> +
>> +
>> +/*
>> + * Define ATTR flag bits. Manipulated by set/clear/test_bit operators.
>> + */
>> +#define	XFS_ATTRI_RECOVERED	1
>> +
>> +
>> +/* nvecs must be in multiples of 4 */
>> +#define ATTR_NVEC_SIZE(size) (size == sizeof(int32_t) ? sizeof(int32_t) : \
>> +				size + sizeof(int32_t) - \
>> +				(size % sizeof(int32_t)))
>> +
> 
> Why? Also, any reason we couldn't use round_up() or some such here?
There's an assertion that checks for this in the recovery.  Without this 
padding I can quickly recreate it:

Assertion failed: reg->i_len % sizeof(int32_t) == 0, file: 
fs/xfs/xfs_log.c, line: 2484

It wasn't entirely clear from the context as to why; I assumed it must
have something to do with not wanting log items falling onto oddball byte
alignments?
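
To compare the two approaches, here is a small standalone sketch
(userspace C).  ATTR_NVEC_SIZE is the macro from the patch with its
argument parenthesised; demo_round_up() is a hypothetical userspace
equivalent of the kernel's round_up() for power-of-two alignments.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* The macro from the patch, with the argument parenthesised. */
#define ATTR_NVEC_SIZE(size) \
	((size) == sizeof(int32_t) ? sizeof(int32_t) : \
	 (size) + sizeof(int32_t) - ((size) % sizeof(int32_t)))

/* Round x up to the next multiple of align (align must be a power of 2). */
static size_t demo_round_up(size_t x, size_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return (x + align - 1) & ~(align - 1);
}
```

Both produce a multiple of 4, so either would satisfy the assertion.
The difference is that the macro special-cases only a length of exactly
4: a length of 8 becomes 12, whereas rounding up leaves 8 as 8.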


> 
>> +/*
>> + * This is the "attr intention" log item.  It is used to log the fact
>> + * that some attrs need to be processed.  It is used in conjunction with the
>> + * "attr done" log item described below.
>> + *
>> + * The ATTRI is reference counted so that it is not freed prior to both the
>> + * ATTRI and ATTRD being committed and unpinned. This ensures the ATTRI is
>> + * inserted into the AIL even in the event of out of order ATTRI/ATTRD
>> + * processing. In other words, an ATTRI is born with two references:
>> + *
>> + *      1.) an ATTRI held reference to track ATTRI AIL insertion
>> + *      2.) an ATTRD held reference to track ATTRD commit
>> + *
>> + * On allocation, both references are the responsibility of the caller. Once
>> + * the ATTRI is added to and dirtied in a transaction, ownership of reference
>> + * one transfers to the transaction. The reference is dropped once the ATTRI is
>> + * inserted to the AIL or in the event of failure along the way (e.g., commit
>> + * failure, log I/O error, etc.). Note that the caller remains responsible for
>> + * the ATTRD reference under all circumstances to this point. The caller has no
>> + * means to detect failure once the transaction is committed, however.
>> + * Therefore, an ATTRD is required after this point, even in the event of
>> + * unrelated failure.
>> + *
>> + * Once an ATTRD is allocated and dirtied in a transaction, reference two
>> + * transfers to the transaction. The ATTRD reference is dropped once it reaches
>> + * the unpin handler. Similar to the ATTRI, the reference also drops in the
>> + * event of commit failure or log I/O errors. Note that the ATTRD is not
>> + * inserted in the AIL, so at this point both the ATTRI and ATTRD are freed.
>> + */
>> +struct xfs_attri_log_item {
>> +	xfs_log_item_t			item;
>> +	atomic_t			refcount;
>> +	unsigned long			flags;	/* misc flags */
>> +	int				name_len;
>> +	void				*name;
>> +	int				value_len;
>> +	void				*value;
>> +	struct xfs_attri_log_format	format;
>> +};
> 
> I think we usually try to use field prefix names in these various
> structures (as you've done in other places). I.e., attri_item,
> attrd_item, etc. would probably be consistent with similar structures
> like the efi/efd log items.
Sure, I can tack on the attri_* prefix here.

> 
>> +
>> +/*
>> + * This is the "attr done" log item.  It is used to log
>> + * the fact that some attrs earlier mentioned in an attri item
>> + * have been freed.
>> + */
>> +struct xfs_attrd_log_item {
>> +	struct xfs_log_item		item;
>> +	struct xfs_attri_log_item	*attrip;
>> +	struct xfs_attrd_log_format	format;
>> +};
>> +
>> +/*
>> + * Max number of attrs in fast allocation path.
>> + */
>> +#define	XFS_ATTRD_MAX_FAST_ATTRS	1
>> +
>> +extern struct kmem_zone	*xfs_attri_zone;
>> +extern struct kmem_zone	*xfs_attrd_zone;
>> +
>> +struct xfs_attri_log_item	*xfs_attri_init(struct xfs_mount *mp);
>> +struct xfs_attrd_log_item	*xfs_attrd_init(struct xfs_mount *mp,
>> +					struct xfs_attri_log_item *attrip);
>> +int xfs_attri_copy_format(struct xfs_log_iovec *buf,
>> +			   struct xfs_attri_log_format *dst_attri_fmt);
>> +int xfs_attrd_copy_format(struct xfs_log_iovec *buf,
>> +			   struct xfs_attrd_log_format *dst_attrd_fmt);
>> +void			xfs_attri_item_free(struct xfs_attri_log_item *attrip);
>> +void			xfs_attri_release(struct xfs_attri_log_item *attrip);
>> +
>> +int			xfs_attri_recover(struct xfs_mount *mp,
>> +					struct xfs_attri_log_item *attrip);
>> +
>> +#endif	/* __XFS_ATTR_ITEM_H__ */
> ...
>> diff --git a/fs/xfs/xfs_trans_attr.c b/fs/xfs/xfs_trans_attr.c
>> new file mode 100644
>> index 0000000..3679348
>> --- /dev/null
>> +++ b/fs/xfs/xfs_trans_attr.c
>> @@ -0,0 +1,240 @@
>> +// SPDX-License-Identifier: GPL-2.0+
>> +/*
>> + * Copyright (C) 2019 Oracle.  All Rights Reserved.
>> + * Author: Allison Henderson <allison.henderson@oracle.com>
>> + */
>> +#include "xfs.h"
>> +#include "xfs_fs.h"
>> +#include "xfs_shared.h"
>> +#include "xfs_format.h"
>> +#include "xfs_log_format.h"
>> +#include "xfs_trans_resv.h"
>> +#include "xfs_bit.h"
>> +#include "xfs_mount.h"
>> +#include "xfs_defer.h"
>> +#include "xfs_trans.h"
>> +#include "xfs_trans_priv.h"
>> +#include "xfs_attr_item.h"
>> +#include "xfs_alloc.h"
>> +#include "xfs_bmap.h"
>> +#include "xfs_trace.h"
>> +#include "libxfs/xfs_da_format.h"
>> +#include "xfs_da_btree.h"
>> +#include "xfs_attr.h"
>> +#include "xfs_inode.h"
>> +#include "xfs_icache.h"
>> +#include "xfs_quota.h"
>> +
>> +/*
>> + * This routine is called to allocate an "attr free done"
>> + * log item.
>> + */
>> +struct xfs_attrd_log_item *
>> +xfs_trans_get_attrd(struct xfs_trans		*tp,
>> +		  struct xfs_attri_log_item	*attrip)
>> +{
>> +	struct xfs_attrd_log_item			*attrdp;
>> +
>> +	ASSERT(tp != NULL);
>> +
>> +	attrdp = xfs_attrd_init(tp->t_mountp, attrip);
>> +	ASSERT(attrdp != NULL);
>> +
>> +	/*
>> +	 * Get a log_item_desc to point at the new item.
>> +	 */
>> +	xfs_trans_add_item(tp, &attrdp->item);
>> +	return attrdp;
>> +}
>> +
>> +/*
>> + * Delete an attr and log it to the ATTRD. Note that the transaction is marked
>> + * dirty regardless of whether the attr delete succeeds or fails to support the
>> + * ATTRI/ATTRD lifecycle rules.
>> + */
>> +int
>> +xfs_trans_attr(
>> +	struct xfs_da_args		*args,
>> +	struct xfs_attrd_log_item	*attrdp,
>> +	uint32_t			op_flags)
>> +{
>> +	int				error;
>> +	struct xfs_buf			*leaf_bp = NULL;
>> +
>> +	error = xfs_qm_dqattach_locked(args->dp, 0);
>> +	if (error)
>> +		return error;
>> +
>> +	switch (op_flags) {
>> +	case XFS_ATTR_OP_FLAGS_SET:
>> +		args->op_flags |= XFS_DA_OP_ADDNAME;
>> +		error = xfs_attr_set_args(args, &leaf_bp, false);
>> +		break;
>> +	case XFS_ATTR_OP_FLAGS_REMOVE:
>> +		ASSERT(XFS_IFORK_Q((args->dp)));
>> +		error = xfs_attr_remove_args(args, false);
>> +		break;
>> +	default:
>> +		error = -EFSCORRUPTED;
>> +	}
>> +
>> +	if (error) {
>> +		if (leaf_bp)
>> +			xfs_trans_brelse(args->trans, leaf_bp);
>> +	}
>> +
>> +	/*
>> +	 * Mark the transaction dirty, even on error. This ensures the
>> +	 * transaction is aborted, which:
>> +	 *
>> +	 * 1.) releases the ATTRI and frees the ATTRD
>> +	 * 2.) shuts down the filesystem
>> +	 */
>> +	args->trans->t_flags |= XFS_TRANS_DIRTY;
>> +	set_bit(XFS_LI_DIRTY, &attrdp->item.li_flags);
>> +
>> +	attrdp->attrip->name = (void *)args->name;
>> +	attrdp->attrip->value = (void *)args->value;
>> +	attrdp->attrip->name_len = args->namelen;
>> +	attrdp->attrip->value_len = args->valuelen;
>> +
> 
> What's the reason for updating the attri here? It's already been
> committed by the time we get around to the attrd. Is this used again
> somewhere?
I think I may have observed it in other code I was using as a model at
the time. It seems to be able to get along without it though, so I don't
think it's used again.  I will go ahead and take it out.

> 
>> +	return error;
>> +}
>> +
>> +static int
>> +xfs_attr_diff_items(
>> +	void				*priv,
>> +	struct list_head		*a,
>> +	struct list_head		*b)
>> +{
>> +	return 0;
>> +}
>> +
>> +/* Get an ATTRI. */
>> +STATIC void *
>> +xfs_attr_create_intent(
>> +	struct xfs_trans		*tp,
>> +	unsigned int			count)
>> +{
>> +	struct xfs_attri_log_item		*attrip;
>> +
>> +	ASSERT(tp != NULL);
>> +	ASSERT(count == 1);
>> +
>> +	attrip = xfs_attri_init(tp->t_mountp);
>> +	ASSERT(attrip != NULL);
>> +
>> +	/*
>> +	 * Get a log_item_desc to point at the new item.
>> +	 */
>> +	xfs_trans_add_item(tp, &attrip->item);
>> +	return attrip;
>> +}
>> +
>> +/* Log an attr to the intent item. */
>> +STATIC void
>> +xfs_attr_log_item(
>> +	struct xfs_trans		*tp,
>> +	void				*intent,
>> +	struct list_head		*item)
>> +{
>> +	struct xfs_attri_log_item	*attrip = intent;
>> +	struct xfs_attr_item		*attr;
>> +	struct xfs_attri_log_format	*attrp;
>> +	char				*name_value;
>> +
>> +	attr = container_of(item, struct xfs_attr_item, xattri_list);
>> +	name_value = ((char *)attr) + sizeof(struct xfs_attr_item);
>> +
>> +	tp->t_flags |= XFS_TRANS_DIRTY;
>> +	set_bit(XFS_LI_DIRTY, &attrip->item.li_flags);
>> +
>> +	attrp = &attrip->format;
>> +	attrp->alfi_ino = attr->xattri_ip->i_ino;
>> +	attrp->alfi_op_flags = attr->xattri_op_flags;
>> +	attrp->alfi_value_len = attr->xattri_value_len;
>> +	attrp->alfi_name_len = attr->xattri_name_len;
>> +	attrp->alfi_attr_flags = attr->xattri_flags;
>> +
>> +	attrip->name = name_value;
>> +	attrip->value = &name_value[attr->xattri_name_len];
>> +	attrip->name_len = attr->xattri_name_len;
>> +	attrip->value_len = attr->xattri_value_len;
> 
> So once we're at this point, we've constructed an xfs_attr_item to
> describe the high level deferred operation, created an intent log item
> and we're now logging that xfs_attri_log_item. We fill in the log format
> structure based on the xfs_attr_item and point the xfs_attri_log_item
> name/value pointers at the xfs_attr_item memory. It's thus important to
> note we've established a subtle relationship between these two data
> structures because they may have different lifecycles.

Right, I can add some comments if you like?  I guess I assume people
have seen these patterns enough to not need them, but the extra
explaining never hurts I suppose :-)
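
To make the relationship concrete, here is a minimal userspace sketch,
assuming simplified stand-ins for the two structures (struct attr_item
and struct attri_item are illustrative, not the real xfs_attr_item and
xfs_attri_log_item). The item owns one allocation that holds the
header, then the name bytes, then the value bytes. The intent only
borrows pointers into that buffer.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Simplified stand-in for struct xfs_attr_item; fields are illustrative. */
struct attr_item {
	size_t	name_len;
	size_t	value_len;
	/* name bytes, then value bytes, follow the struct in memory */
};

/* Simplified stand-in for the name/value fields of the intent item. */
struct attri_item {
	const void	*name;
	size_t		name_len;
	const void	*value;
	size_t		value_len;
};

/* Allocate the item and copy name/value into the trailing buffer. */
static struct attr_item *attr_item_alloc(const void *name, size_t name_len,
					 const void *value, size_t value_len)
{
	struct attr_item *attr;
	char *name_value;

	attr = calloc(1, sizeof(*attr) + name_len + value_len);
	if (!attr)
		return NULL;

	name_value = (char *)attr + sizeof(*attr);
	attr->name_len = name_len;
	attr->value_len = value_len;
	memcpy(name_value, name, name_len);
	if (value_len > 0)
		memcpy(name_value + name_len, value, value_len);
	return attr;
}

/* Point the intent at the item's buffer: borrowed, not owned. */
static void attri_log(struct attri_item *attrip, struct attr_item *attr)
{
	char *name_value = (char *)attr + sizeof(*attr);

	attrip->name = name_value;
	attrip->name_len = attr->name_len;
	attrip->value = name_value + attr->name_len;
	attrip->value_len = attr->value_len;
}
```

Since the intent never copies the bytes, its pointers are only valid
for as long as the item allocation stays around.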

> 
>> +}
>> +
>> +/* Get an ATTRD so we can process all the attrs. */
>> +STATIC void *
>> +xfs_attr_create_done(
>> +	struct xfs_trans		*tp,
>> +	void				*intent,
>> +	unsigned int			count)
>> +{
>> +	return xfs_trans_get_attrd(tp, intent);
>> +}
>> +
>> +/* Process an attr. */
>> +STATIC int
>> +xfs_attr_finish_item(
>> +	struct xfs_trans		*tp,
>> +	struct list_head		*item,
>> +	void				*done_item,
>> +	void				**state)
>> +{
>> +	struct xfs_attr_item		*attr;
>> +	char				*name_value;
>> +	int				error;
>> +	int				local;
>> +	struct xfs_da_args		args;
>> +
>> +	attr = container_of(item, struct xfs_attr_item, xattri_list);
>> +	name_value = ((char *)attr) + sizeof(struct xfs_attr_item);
>> +
>> +	error = xfs_attr_args_init(&args, attr->xattri_ip, name_value,
>> +				   attr->xattri_name_len, attr->xattri_flags);
>> +	if (error)
>> +		goto out;
>> +
>> +	args.hashval = xfs_da_hashname(args.name, args.namelen);
>> +	args.value = &name_value[attr->xattri_name_len];
>> +	args.valuelen = attr->xattri_value_len;
>> +	args.op_flags = XFS_DA_OP_OKNOENT;
>> +	args.total = xfs_attr_calc_size(&args, &local);
>> +	args.trans = tp;
>> +
>> +	error = xfs_trans_attr(&args, done_item,
>> +			attr->xattri_op_flags);
> 
> So now we've committed/rolled our xfs_attri_log_item intent and
> created/attached the xfs_attrd_log_item and thus we're free to perform
> the operation...
> 
>> +out:
>> +	kmem_free(attr);
> 
> ... and here is where we end up freeing the xfs_attr_item created for
> the dfops infrastructure that holds our name and value memory.
> 
> Hmm.. I think this means our name/value memory accesses are safe because
> the xfs_attri_log_item only accesses them in the ->iop_format()
> callback, which occurs during transaction commit of the intent and we're
> long past that.
> 
> That said, the attri/attrd log items themselves outlive the current
> transaction commit sequence (i.e. until the attrd is physically
> logged/committed and we free both). That means that once we free the
> attr above we technically have an attri passing through the log
> infrastructure with a couple invalid pointers, they just don't happen to
> be used. It might be worth thinking about how we can clean that up,
> whether it be clearing those pointers here, or allocating the name/val
> memory separately and transferring it to the attri, etc. Whatever we end
> up doing, we should probably add a comment somewhere to explain exactly
> what's going on as well.
> 
> Brian

I see, that's a good observation.  I'll see if I can work in some
cleanup code and be sure to add some commentary to point it out.  Thanks
for the thorough review!!  Much appreciated!!
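
As a rough idea of the first option (clearing the pointers), here is a
userspace sketch under assumed names. struct attri_item stands in for
the name/value fields of the intent item, and attr_finish_cleanup() is
a hypothetical tail for a finish_item-style routine, not the actual
fix. In the patch the intent would be reached through the done item
(attrdp->attrip).

```c
#include <assert.h>
#include <stdlib.h>

/* Simplified stand-in for the name/value fields of xfs_attri_log_item. */
struct attri_item {
	const void	*name;
	size_t		name_len;
	const void	*value;
	size_t		value_len;
};

/* Forget the borrowed buffer so no stale pointer stays in the intent. */
static void attri_clear_name_value(struct attri_item *attrip)
{
	attrip->name = NULL;
	attrip->name_len = 0;
	attrip->value = NULL;
	attrip->value_len = 0;
}

/*
 * Hypothetical tail of a finish_item-style routine: detach the intent
 * from the buffer first, then free the item that owns the buffer.
 */
static void attr_finish_cleanup(struct attri_item *attrip, void *attr)
{
	attri_clear_name_value(attrip);
	free(attr);
}
```

Zeroing the lengths along with the pointers keeps the intent
self-consistent if anything sizes or formats it again later.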

Allison

> 
>> +	return error;
>> +}
>> +
>> +/* Abort all pending ATTRs. */
>> +STATIC void
>> +xfs_attr_abort_intent(
>> +	void				*intent)
>> +{
>> +	xfs_attri_release(intent);
>> +}
>> +
>> +/* Cancel an attr */
>> +STATIC void
>> +xfs_attr_cancel_item(
>> +	struct list_head		*item)
>> +{
>> +	struct xfs_attr_item	*attr;
>> +
>> +	attr = container_of(item, struct xfs_attr_item, xattri_list);
>> +	kmem_free(attr);
>> +}
>> +
>> +const struct xfs_defer_op_type xfs_attr_defer_type = {
>> +	.max_items	= XFS_ATTRI_MAX_FAST_ATTRS,
>> +	.diff_items	= xfs_attr_diff_items,
>> +	.create_intent	= xfs_attr_create_intent,
>> +	.abort_intent	= xfs_attr_abort_intent,
>> +	.log_item	= xfs_attr_log_item,
>> +	.create_done	= xfs_attr_create_done,
>> +	.finish_item	= xfs_attr_finish_item,
>> +	.cancel_item	= xfs_attr_cancel_item,
>> +};
>> +
>> -- 
>> 2.7.4
>>


* Re: [PATCH 5/9] xfs: Add xfs_attr_set_deferred and xfs_attr_remove_deferred
  2019-04-18 15:49   ` Brian Foster
@ 2019-04-18 21:28     ` Allison Henderson
  2019-04-22 11:01       ` Brian Foster
  0 siblings, 1 reply; 48+ messages in thread
From: Allison Henderson @ 2019-04-18 21:28 UTC (permalink / raw)
  To: Brian Foster; +Cc: linux-xfs

On 4/18/19 8:49 AM, Brian Foster wrote:
> On Fri, Apr 12, 2019 at 03:50:32PM -0700, Allison Henderson wrote:
>> These routines set up and start a new deferred attribute
>> operation.  These functions are meant to be called by other
>> code needing to initiate a deferred attribute operation.  We
>> will use these routines later in the parent pointer patches.
>>
> 
> We probably don't need to reference the parent pointer stuff any more
> for this, right? I'm assuming we'll be converting generic attr
> infrastructure over to this mechanism in subsequent patches..?

Right, some of these comments are a little stale.  I will clean them up
a bit.

> 
>> Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
>> ---
>>   fs/xfs/libxfs/xfs_attr.c | 80 ++++++++++++++++++++++++++++++++++++++++++++++++
>>   fs/xfs/libxfs/xfs_attr.h |  7 +++++
>>   2 files changed, 87 insertions(+)
>>
>> diff --git a/fs/xfs/libxfs/xfs_attr.c b/fs/xfs/libxfs/xfs_attr.c
>> index fadd485..c3477fa7 100644
>> --- a/fs/xfs/libxfs/xfs_attr.c
>> +++ b/fs/xfs/libxfs/xfs_attr.c
>> @@ -30,6 +30,7 @@
>>   #include "xfs_trans_space.h"
>>   #include "xfs_trace.h"
>>   #include "xfs_attr_item.h"
>> +#include "xfs_attr.h"
>>   
>>   /*
>>    * xfs_attr.c
>> @@ -429,6 +430,52 @@ xfs_attr_set(
>>   	goto out_unlock;
>>   }
>>   
>> +/* Sets an attribute for an inode as a deferred operation */
>> +int
>> +xfs_attr_set_deferred(
>> +	struct xfs_inode	*dp,
>> +	struct xfs_trans	*tp,
>> +	const unsigned char	*name,
>> +	unsigned int		namelen,
>> +	const unsigned char	*value,
>> +	unsigned int		valuelen,
>> +	int			flags)
>> +{
>> +
>> +	struct xfs_attr_item	*new;
>> +	char			*name_value;
>> +
>> +	/*
>> +	 * All set operations must have a name
>> +	 * but not necessarily a value.
>> +	 * Generic 062
> 
> Comment formatting, also looks like there's some stale text or
> something.
I think I left that as a reminder to myself at one point and forgot to 
take it out :-)  I believe there was some discussion in earlier reviews 
about checking both name and value length, but later I ran into test 
cases that expect to be able to set an attribute with no value, so I 
guess not.  In any case, I will clean up the commentary here.

> 
>> +	 */
>> +	if (!namelen) {
>> +		ASSERT(0);
>> +		return -EFSCORRUPTED;
> 
> This is essentially a requested operation from userspace, right? If so,
> I'd think -EINVAL or something makes more sense than -EFSCORRUPTED.

Yeah, I think initially the plan was to have only parent pointers use 
the defer operations, but since now we are using them for all attr 
operations, it should probably be EINVAL.
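
Something along these lines, as a userspace sketch only.
attr_set_check() is a hypothetical helper name, and it just shows the
intended return codes: a missing name is a bad request from the caller
rather than on-disk corruption, while an empty value is still allowed.

```c
#include <assert.h>
#include <errno.h>

/*
 * Hypothetical argument check for a deferred set: a name is required,
 * a value is optional.
 */
static int attr_set_check(unsigned int namelen, unsigned int valuelen)
{
	(void)valuelen;		/* zero length values are valid */

	if (!namelen)
		return -EINVAL;
	return 0;
}
```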

> 
>> +	}
>> +
>> +	new = kmem_alloc(XFS_ATTR_ITEM_SIZEOF(namelen, valuelen),
>> +			 KM_SLEEP|KM_NOFS);
> 
> This could get interesting with larger attrs (up to 64k IIRC). We might
> want to consider kmem_alloc_large().

That's a good point, I'll move it to the larger allocation.  Maybe I can
make a test case for it as well.
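
For a sense of scale, here is a back-of-the-envelope sketch. All the
constants are assumptions for illustration: a 255 byte name limit, a
64k value limit, a 4k page, and a placeholder 64 byte header. It also
assumes XFS_ATTR_ITEM_SIZEOF() is simply header plus name plus value
length, which is not shown in this patch.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_NAME_MAX		255u	/* assumed max attr name length */
#define SKETCH_VALUE_MAX	65536u	/* assumed max attr value length */
#define SKETCH_PAGE_SIZE	4096u	/* assumed page size */
#define SKETCH_HDR_SIZE		64u	/* placeholder item header size */

/* Assumed shape of the item size: header + name bytes + value bytes. */
static size_t attr_item_bytes(size_t hdr, size_t namelen, size_t valuelen)
{
	return hdr + namelen + valuelen;
}

/* Number of whole pages needed to hold the given byte count. */
static size_t pages_needed(size_t bytes)
{
	return (bytes + SKETCH_PAGE_SIZE - 1) / SKETCH_PAGE_SIZE;
}
```

Under those assumptions a worst case item needs 17 contiguous pages,
versus a single page for a typical small attr.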

> 
>> +	name_value = ((char *)new) + sizeof(struct xfs_attr_item);
>> +	memset(new, 0, XFS_ATTR_ITEM_SIZEOF(namelen, valuelen));
>> +	new->xattri_ip = dp;
>> +	new->xattri_op_flags = XFS_ATTR_OP_FLAGS_SET;
>> +	new->xattri_name_len = namelen;
>> +	new->xattri_value_len = valuelen;
>> +	new->xattri_flags = flags;
>> +	memcpy(&name_value[0], name, namelen);
> 
> name_value is just a char pointer. Do we need the whole array index just
> to deref thing here? Meh, I guess it's consistent with the value copy
> below. No big deal.
It's not needed.  I guess it just looks a little more consistent since 
we have things getting copied out at different offsets in the buffer.

> 
>> +	new->xattri_name = name_value;
>> +	new->xattri_value = name_value + namelen;
>> +
>> +	if (valuelen > 0)
>> +		memcpy(&name_value[namelen], value, valuelen);
>> +
>> +	xfs_defer_add(tp, XFS_DEFER_OPS_TYPE_ATTR, &new->xattri_list);
>> +
>> +	return 0;
>> +}
>> +
>>   /*
>>    * Generic handler routine to remove a name from an attribute list.
>>    * Transitions attribute list from Btree to shortform as necessary.
>> @@ -513,6 +560,39 @@ xfs_attr_remove(
>>   	return error;
>>   }
>>   
>> +/* Removes an attribute for an inode as a deferred operation */
>> +int
>> +xfs_attr_remove_deferred(
> 
> Hmm.. I'm kind of wondering if we actually need to defer attr removes.
> Do we have the same kind of challenges for attr removal as for attr
> creation, or is there some future scenario where this is needed?

I suppose we don't have to have it?  The motivation was to help break up 
the amount of transaction activity that happens on inode 
create/rename/remove operations once pptrs go in.  Attr remove does not 
look as complex as attr set, but I suppose it helps to some degree?

> 
>> +	struct xfs_inode        *dp,
>> +	struct xfs_trans	*tp,
>> +	const unsigned char	*name,
>> +	unsigned int		namelen,
>> +	int                     flags)
>> +{
>> +
>> +	struct xfs_attr_item	*new;
>> +	char			*name_value;
>> +
>> +	if (!namelen) {
>> +		ASSERT(0);
>> +		return -EFSCORRUPTED;
> 
> Similar comment around -EFSCORRUPTED vs. -EINVAL (or something else..).
Ok, I will change to EINVAL here too.

Thanks again for the reviews!!  They are very helpful!

Allison
> 
> Brian
> 
>> +	}
>> +
>> +	new = kmem_alloc(XFS_ATTR_ITEM_SIZEOF(namelen, 0), KM_SLEEP|KM_NOFS);
>> +	name_value = ((char *)new) + sizeof(struct xfs_attr_item);
>> +	memset(new, 0, XFS_ATTR_ITEM_SIZEOF(namelen, 0));
>> +	new->xattri_ip = dp;
>> +	new->xattri_op_flags = XFS_ATTR_OP_FLAGS_REMOVE;
>> +	new->xattri_name_len = namelen;
>> +	new->xattri_value_len = 0;
>> +	new->xattri_flags = flags;
>> +	memcpy(name_value, name, namelen);
>> +
>> +	xfs_defer_add(tp, XFS_DEFER_OPS_TYPE_ATTR, &new->xattri_list);
>> +
>> +	return 0;
>> +}
>> +
>>   /*========================================================================
>>    * External routines when attribute list is inside the inode
>>    *========================================================================*/
>> diff --git a/fs/xfs/libxfs/xfs_attr.h b/fs/xfs/libxfs/xfs_attr.h
>> index 92d9a15..83b3621 100644
>> --- a/fs/xfs/libxfs/xfs_attr.h
>> +++ b/fs/xfs/libxfs/xfs_attr.h
>> @@ -175,5 +175,12 @@ bool xfs_attr_namecheck(const void *name, size_t length);
>>   int xfs_attr_args_init(struct xfs_da_args *args, struct xfs_inode *dp,
>>   			const unsigned char *name, size_t namelen, int flags);
>>   int xfs_attr_calc_size(struct xfs_da_args *args, int *local);
>> +int xfs_attr_set_deferred(struct xfs_inode *dp, struct xfs_trans *tp,
>> +			  const unsigned char *name, unsigned int name_len,
>> +			  const unsigned char *value, unsigned int valuelen,
>> +			  int flags);
>> +int xfs_attr_remove_deferred(struct xfs_inode *dp, struct xfs_trans *tp,
>> +			    const unsigned char *name, unsigned int namelen,
>> +			    int flags);
>>   
>>   #endif	/* __XFS_ATTR_H__ */
>> -- 
>> 2.7.4
>>


* Re: [PATCH 4/9] xfs: Set up infastructure for deferred attribute operations
  2019-04-18 21:27     ` Allison Henderson
@ 2019-04-22 11:00       ` Brian Foster
  2019-04-22 22:00         ` Allison Henderson
  0 siblings, 1 reply; 48+ messages in thread
From: Brian Foster @ 2019-04-22 11:00 UTC (permalink / raw)
  To: Allison Henderson; +Cc: linux-xfs

On Thu, Apr 18, 2019 at 02:27:15PM -0700, Allison Henderson wrote:
> On 4/18/19 8:48 AM, Brian Foster wrote:
> > On Fri, Apr 12, 2019 at 03:50:31PM -0700, Allison Henderson wrote:
> > > This patch adds two new log item types for setting or
> > > removing attributes as deferred operations.  The
> > > xfs_attri_log_item logs an intent to set or remove an
> > > attribute.  The corresponding xfs_attrd_log_item holds
> > > a reference to the xfs_attri_log_item and is freed once
> > > the transaction is done.  Both log items use a generic
> > > xfs_attr_log_format structure that contains the attribute
> > > name, value, flags, inode, and an op_flag that indicates
> > > if the operations is a set or remove.
> > > 
> > > Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
> > > ---
> > 
> > This mostly looks sane to me on a first high level pass. We're adding
> > the intent/done log item infrastructure for attrs, associated dfops
> > processing code and log recovery hooks. I'll probably have to go back
> > through this once I get further through the series and have grokked more
> > context, but so far I think I just have some various nits and aesthetic
> > comments.
> > 
> > Firstly, note that git complained about an extra blank line at EOF of
> > xfs_trans_attr.c when I applied this patch. Also, the commit log above
> > looks like it could be widened (I think 68 chars is the standard) and
> > could probably include a bit more context on the big picture changes
> > associated with this work. In general, I think the commit log should
> > (briefly) explain 1.) how attrs currently work 2.) how things are
> > expected to work based on this infrastructure and 3.) the advantage(s)
> > of doing so.
> 
> Sure, I will get these suggestions added in the next update
> 
> 
> > 
> > For example, one thing that is glossed over is that this implies we'll
> > be logging xattr values even in remote attribute block cases. BTW, do we
> > need to update the transaction reservation to account for that? I didn't
> > notice that being changed anywhere (yet)..
> 
> Hmm, the pptr set does some accounting for the extra attrs in create, move
> and remove operations, but I don't think there are any new adjustments for
> remote attribute blocks.  I will look into that.  Thx!
> 
> > 
> > >   fs/xfs/Makefile                |   2 +
> > >   fs/xfs/libxfs/xfs_attr.c       |   5 +-
> > >   fs/xfs/libxfs/xfs_attr.h       |  25 ++
> > >   fs/xfs/libxfs/xfs_defer.c      |   1 +
> > >   fs/xfs/libxfs/xfs_defer.h      |   3 +
> > >   fs/xfs/libxfs/xfs_log_format.h |  44 +++-
> > >   fs/xfs/libxfs/xfs_types.h      |   1 +
> > >   fs/xfs/xfs_attr_item.c         | 558 +++++++++++++++++++++++++++++++++++++++++
> > >   fs/xfs/xfs_attr_item.h         | 103 ++++++++
> > >   fs/xfs/xfs_log_recover.c       | 172 +++++++++++++
> > >   fs/xfs/xfs_ondisk.h            |   2 +
> > >   fs/xfs/xfs_trans.h             |  10 +
> > >   fs/xfs/xfs_trans_attr.c        | 240 ++++++++++++++++++
> > >   13 files changed, 1162 insertions(+), 4 deletions(-)
> > > 
> > ...
> > > diff --git a/fs/xfs/xfs_attr_item.c b/fs/xfs/xfs_attr_item.c
> > > new file mode 100644
> > > index 0000000..0ea19b4
> > > --- /dev/null
> > > +++ b/fs/xfs/xfs_attr_item.c
> > > @@ -0,0 +1,558 @@
> > > +// SPDX-License-Identifier: GPL-2.0+
> > > +/*
> > > + * Copyright (C) 2019 Oracle.  All Rights Reserved.
> > > + * Author: Allison Henderson <allison.henderson@oracle.com>
> > > + */
> > > +#include "xfs.h"
> > > +#include "xfs_fs.h"
> > > +#include "xfs_format.h"
> > > +#include "xfs_log_format.h"
> > > +#include "xfs_trans_resv.h"
> > > +#include "xfs_bit.h"
> > > +#include "xfs_mount.h"
> > > +#include "xfs_trans.h"
> > > +#include "xfs_trans_priv.h"
> > > +#include "xfs_buf_item.h"
> > > +#include "xfs_attr_item.h"
> > > +#include "xfs_log.h"
> > > +#include "xfs_btree.h"
> > > +#include "xfs_rmap.h"
> > > +#include "xfs_inode.h"
> > > +#include "xfs_icache.h"
> > > +#include "xfs_attr.h"
> > > +#include "xfs_shared.h"
> > > +#include "xfs_da_format.h"
> > > +#include "xfs_da_btree.h"
> > > +
> > > +static inline struct xfs_attri_log_item *ATTRI_ITEM(struct xfs_log_item *lip)
> > > +{
> > > +	return container_of(lip, struct xfs_attri_log_item, item);
> > > +}
> > > +
> > > +void
> > > +xfs_attri_item_free(
> > > +	struct xfs_attri_log_item	*attrip)
> > > +{
> > > +	kmem_free(attrip->item.li_lv_shadow);
> > > +	kmem_free(attrip);
> > > +}
> > > +
> > > +/*
> > > + * This returns the number of iovecs needed to log the given attri item.
> > > + * We only need 1 iovec for an attri item.  It just logs the attr_log_format
> > > + * structure.
> > > + */
> > > +static inline int
> > > +xfs_attri_item_sizeof(
> > > +	struct xfs_attri_log_item *attrip)
> > > +{
> > > +	return sizeof(struct xfs_attri_log_format);
> > > +}
> > > +
> > > +STATIC void
> > > +xfs_attri_item_size(
> > > +	struct xfs_log_item	*lip,
> > > +	int			*nvecs,
> > > +	int			*nbytes)
> > > +{
> > > +	struct xfs_attri_log_item       *attrip = ATTRI_ITEM(lip);
> > > +
> > > +	*nvecs += 1;
> > > +	*nbytes += xfs_attri_item_sizeof(attrip);
> > > +
> > > +	if (attrip->name_len > 0) {
> > > +		*nvecs += 1;
> > > +		*nbytes += ATTR_NVEC_SIZE(attrip->name_len);
> > > +	}
> > > +
> > > +	if (attrip->value_len > 0) {
> > > +		*nvecs += 1;
> > > +		*nbytes += ATTR_NVEC_SIZE(attrip->value_len);
> > > +	}
> > > +}
> > > +
> > > +/*
> > > + * This is called to fill in the vector of log iovecs for the
> > > + * given attri log item. We use only 1 iovec, and we point that
> > > + * at the attri_log_format structure embedded in the attri item.
> > > + * It is at this point that we assert that all of the attr
> > > + * slots in the attri item have been filled.
> > > + */
> > 
> > I see a bunch of places throughout this patch such as above where the
> > line length formatting looks inconsistent. The above comment should be
> > widened to 80 chars. I'm sure much of this code was boilerplate brought
> > over from other log items and such, but we should take the opportunity
> > to properly format the new code we're adding.
> Yes, I loosely modeled it off the efi code at the time.  I will go through
> and do some clean up with the line lengths.
> 
> > 
> > > +STATIC void
> > > +xfs_attri_item_format(
> > > +	struct xfs_log_item	*lip,
> > > +	struct xfs_log_vec	*lv)
> > > +{
> > > +	struct xfs_attri_log_item	*attrip = ATTRI_ITEM(lip);
> > > +	struct xfs_log_iovec	*vecp = NULL;
> > > +
> > > +	attrip->format.alfi_type = XFS_LI_ATTRI;
> > > +	attrip->format.alfi_size = 1;
> > > +	if (attrip->name_len > 0)
> > > +		attrip->format.alfi_size++;
> > > +	if (attrip->value_len > 0)
> > > +		attrip->format.alfi_size++;
> > > +
> > 
> > I'd move these alfi_size updates to the equivalent if checks below.
> Alrighty, will do
> 
> > 
> > > +	xlog_copy_iovec(lv, &vecp, XLOG_REG_TYPE_ATTRI_FORMAT,
> > > +			&attrip->format,
> > > +			xfs_attri_item_sizeof(attrip));
> > > +	if (attrip->name_len > 0)
> > > +		xlog_copy_iovec(lv, &vecp, XLOG_REG_TYPE_ATTR_NAME,
> > > +				attrip->name, ATTR_NVEC_SIZE(attrip->name_len));
> > > +
> > > +	if (attrip->value_len > 0)
> > > +		xlog_copy_iovec(lv, &vecp, XLOG_REG_TYPE_ATTR_VALUE,
> > > +				attrip->value,
> > > +				ATTR_NVEC_SIZE(attrip->value_len));
> > > +}
> > > +
> > > +
> > > +/*
> > > + * Pinning has no meaning for an attri item, so just return.
> > > + */
> > > +STATIC void
> > > +xfs_attri_item_pin(
> > > +	struct xfs_log_item	*lip)
> > > +{
> > > +}
> > > +
> > > +/*
> > > + * The unpin operation is the last place an ATTRI is manipulated in the log. It
> > > + * is either inserted in the AIL or aborted in the event of a log I/O error. In
> > > + * either case, the ATTRI transaction has been successfully committed to make it
> > > + * this far. Therefore, we expect whoever committed the ATTRI to either
> > > + * construct and commit the ATTRD or drop the ATTRD's reference in the event of
> > > + * error. Simply drop the log's ATTRI reference now that the log is done with
> > > + * it.
> > > + */
> > > +STATIC void
> > > +xfs_attri_item_unpin(
> > > +	struct xfs_log_item	*lip,
> > > +	int			remove)
> > > +{
> > > +	struct xfs_attri_log_item	*attrip = ATTRI_ITEM(lip);
> > > +
> > > +	xfs_attri_release(attrip);
> > > +}
> > > +
> > > +/*
> > > + * attri items have no locking or pushing.  However, since ATTRIs are pulled
> > > + * from the AIL when their corresponding ATTRDs are committed to disk, their
> > > + * situation is very similar to being pinned.  Return XFS_ITEM_PINNED so that
> > > + * the caller will eventually flush the log.  This should help in getting the
> > > + * ATTRI out of the AIL.
> > > + */
> > > +STATIC uint
> > > +xfs_attri_item_push(
> > > +	struct xfs_log_item	*lip,
> > > +	struct list_head	*buffer_list)
> > > +{
> > > +	return XFS_ITEM_PINNED;
> > > +}
> > > +
> > > +/*
> > > + * The ATTRI has been either committed or aborted if the transaction has been
> > > + * cancelled. If the transaction was cancelled, an ATTRD isn't going to be
> > > + * constructed and thus we free the ATTRI here directly.
> > > + */
> > > +STATIC void
> > > +xfs_attri_item_unlock(
> > > +	struct xfs_log_item	*lip)
> > > +{
> > > +	if (test_bit(XFS_LI_ABORTED, &lip->li_flags))
> > > +		xfs_attri_release(ATTRI_ITEM(lip));
> > > +}
> > > +
> > > +/*
> > > + * The ATTRI is logged only once and cannot be moved in the log, so simply
> > > + * return the lsn at which it's been logged.
> > > + */
> > > +STATIC xfs_lsn_t
> > > +xfs_attri_item_committed(
> > > +	struct xfs_log_item	*lip,
> > > +	xfs_lsn_t		lsn)
> > > +{
> > > +	return lsn;
> > > +}
> > > +
> > > +STATIC void
> > > +xfs_attri_item_committing(
> > > +	struct xfs_log_item	*lip,
> > > +	xfs_lsn_t		lsn)
> > > +{
> > > +}
> > > +
> > > +/*
> > > + * This is the ops vector shared by all attri log items.
> > > + */
> > > +static const struct xfs_item_ops xfs_attri_item_ops = {
> > > +	.iop_size	= xfs_attri_item_size,
> > > +	.iop_format	= xfs_attri_item_format,
> > > +	.iop_pin	= xfs_attri_item_pin,
> > > +	.iop_unpin	= xfs_attri_item_unpin,
> > > +	.iop_unlock	= xfs_attri_item_unlock,
> > > +	.iop_committed	= xfs_attri_item_committed,
> > > +	.iop_push	= xfs_attri_item_push,
> > > +	.iop_committing = xfs_attri_item_committing
> > > +};
> > > +
> > > +
> > > +/*
> > > + * Allocate and initialize an attri item
> > > + */
> > > +struct xfs_attri_log_item *
> > > +xfs_attri_init(
> > > +	struct xfs_mount	*mp)
> > > +
> > > +{
> > > +	struct xfs_attri_log_item	*attrip;
> > > +	uint			size;
> > > +
> > > +	size = (uint)(sizeof(struct xfs_attri_log_item));
> > > +	attrip = kmem_zalloc(size, KM_SLEEP);
> > > +
> > > +	xfs_log_item_init(mp, &(attrip->item), XFS_LI_ATTRI,
> > > +			  &xfs_attri_item_ops);
> > 
> > No need for those braces around attrip->item, and with those removed we
> > can reduce this to a single line.
> > 
> > > +	attrip->format.alfi_id = (uintptr_t)(void *)attrip;
> > > +	atomic_set(&attrip->refcount, 2);
> > > +
> > > +	return attrip;
> > > +}
> > > +
> > > +/*
> > > + * Copy an attr format buffer from the given buf, and into the destination
> > > + * attr format structure.
> > > + */
> > > +int
> > > +xfs_attri_copy_format(struct xfs_log_iovec *buf,
> > > +		      struct xfs_attri_log_format *dst_attr_fmt)
> > > +{
> > > +	struct xfs_attri_log_format *src_attr_fmt = buf->i_addr;
> > > +	uint len = sizeof(struct xfs_attri_log_format);
> > > +
> > > +	if (buf->i_len == len) {
> > > +		memcpy((char *)dst_attr_fmt, (char *)src_attr_fmt, len);
> > > +		return 0;
> > > +	}
> > > +	return -EFSCORRUPTED;
> > 
> > Can we invert the logic flow here (and below)? I.e.,
> > 
> > 	...
> > 	if (buf->i_len != len)
> > 		return -EFSCORRUPTED;
> > 	memcpy(...);
> > 	return 0;
> Sure, I think that looks simpler too.
> 
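
For reference, a minimal userspace sketch of the inverted flow. The
struct layouts are simplified placeholders rather than the real
xfs_log_iovec and xfs_attri_log_format, and SKETCH_EFSCORRUPTED is a
local stand-in for the kernel's EFSCORRUPTED (which XFS maps to
EUCLEAN, 117 on Linux).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_EFSCORRUPTED	117	/* stand-in for EFSCORRUPTED */

/* Simplified placeholder for struct xfs_log_iovec. */
struct log_iovec {
	void	*i_addr;
	int	i_len;
};

/* Simplified placeholder for struct xfs_attri_log_format. */
struct attri_log_format {
	unsigned long long	id;
	unsigned int		name_len;
	unsigned int		value_len;
};

/* Reject a wrongly sized buffer up front, then copy on the main path. */
static int attri_copy_format(const struct log_iovec *buf,
			     struct attri_log_format *dst)
{
	size_t len = sizeof(*dst);

	if ((size_t)buf->i_len != len)
		return -SKETCH_EFSCORRUPTED;

	memcpy(dst, buf->i_addr, len);
	return 0;
}
```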
> > 
> > > +}
> > > +
> > > +/*
> > > + * Copy an attr format buffer from the given buf, and into the destination
> > > + * attr format structure.
> > > + */
> > > +int
> > > +xfs_attrd_copy_format(struct xfs_log_iovec *buf,
> > > +		      struct xfs_attrd_log_format *dst_attr_fmt)
> > > +{
> > > +	struct xfs_attrd_log_format *src_attr_fmt = buf->i_addr;
> > > +	uint len = sizeof(struct xfs_attrd_log_format);
> > > +
> > > +	if (buf->i_len == len) {
> > > +		memcpy((char *)dst_attr_fmt, (char *)src_attr_fmt, len);
> > > +		return 0;
> > > +	}
> > > +	return -EFSCORRUPTED;
> > > +}
> > > +
> > 
> > This function appears to be unused. The recover code looks like it just
> > casts the iovec buffer directly to an attrd_log_format to determine the
> > id.
> Ok, I will see if I can take it out then.
> 
> > 
> > > +/*
> > > + * Freeing the attrip requires that we remove it from the AIL if it has already
> > > + * been placed there. However, the ATTRI may not yet have been placed in the
> > > + * AIL when called by xfs_attri_release() from ATTRD processing due to the
> > > + * ordering of committed vs unpin operations in bulk insert operations. Hence
> > > + * the reference count to ensure only the last caller frees the ATTRI.
> > > + */
> > > +void
> > > +xfs_attri_release(
> > > +	struct xfs_attri_log_item	*attrip)
> > > +{
> > > +	ASSERT(atomic_read(&attrip->refcount) > 0);
> > > +	if (atomic_dec_and_test(&attrip->refcount)) {
> > > +		xfs_trans_ail_remove(&attrip->item, SHUTDOWN_LOG_IO_ERROR);
> > > +		xfs_attri_item_free(attrip);
> > > +	}
> > > +}
> > > +
> > > +static inline struct xfs_attrd_log_item *ATTRD_ITEM(struct xfs_log_item *lip)
> > > +{
> > > +	return container_of(lip, struct xfs_attrd_log_item, item);
> > > +}
> > > +
> > > +STATIC void
> > > +xfs_attrd_item_free(struct xfs_attrd_log_item *attrdp)
> > > +{
> > > +	kmem_free(attrdp->item.li_lv_shadow);
> > > +	kmem_free(attrdp);
> > > +}
> > > +
> > > +/*
> > > + * This returns the number of iovecs needed to log the given attrd item.
> > > + * We only need 1 iovec for an attrd item.  It just logs the attr_log_format
> > > + * structure.
> > > + */
> > > +static inline int
> > > +xfs_attrd_item_sizeof(
> > > +	struct xfs_attrd_log_item *attrdp)
> > > +{
> > > +	return sizeof(struct xfs_attrd_log_format);
> > > +}
> > > +
> > > +STATIC void
> > > +xfs_attrd_item_size(
> > > +	struct xfs_log_item	*lip,
> > > +	int			*nvecs,
> > > +	int			*nbytes)
> > > +{
> > > +	struct xfs_attrd_log_item	*attrdp = ATTRD_ITEM(lip);
> > > +	*nvecs += 1;
> > > +	*nbytes += xfs_attrd_item_sizeof(attrdp);
> > > +}
> > > +
> > > +/*
> > > + * This is called to fill in the vector of log iovecs for the
> > > + * given attrd log item. We use only 1 iovec, and we point that
> > > + * at the attr_log_format structure embedded in the attrd item.
> > > + * It is at this point that we assert that all of the attr
> > > + * slots in the attrd item have been filled.
> > > + */
> > > +STATIC void
> > > +xfs_attrd_item_format(
> > > +	struct xfs_log_item	*lip,
> > > +	struct xfs_log_vec	*lv)
> > > +{
> > > +	struct xfs_attrd_log_item	*attrdp = ATTRD_ITEM(lip);
> > > +	struct xfs_log_iovec	*vecp = NULL;
> > > +
> > > +	attrdp->format.alfd_type = XFS_LI_ATTRD;
> > > +	attrdp->format.alfd_size = 1;
> > > +
> > > +	xlog_copy_iovec(lv, &vecp, XLOG_REG_TYPE_ATTRD_FORMAT,
> > > +			&attrdp->format,
> > > +			xfs_attrd_item_sizeof(attrdp));
> > 
> > The above looks like it could be shrunk to 2 lines as well after 80 char
> > widening. Note that I'm sure I haven't caught all of these, just
> > pointing out some examples as I notice them.
> > 
> > FWIW, if you happen to use vim, I sometimes use ':set cc=80' to draw an
> > 80 char line in the viewer that helps to quickly eyeball new code for
> > this kind of thing.
> I do use vim, so this is very helpful!  I will add that to my config.  Thx!
> 

I mentioned it on IRC, but FYI see the following link for how to easily
rewrap text in vim as well:

https://thoughtbot.com/blog/wrap-existing-text-at-80-characters-in-vim

> > 
> > > +}
> > > +
> > > +/*
> > > + * Pinning has no meaning for an attrd item, so just return.
> > > + */
> > > +STATIC void
> > > +xfs_attrd_item_pin(
> > > +	struct xfs_log_item	*lip)
> > > +{
> > > +}
> > > +
> > > +/*
> > > + * Since pinning has no meaning for an attrd item, unpinning does
> > > + * not either.
> > > + */
> > > +STATIC void
> > > +xfs_attrd_item_unpin(
> > > +	struct xfs_log_item	*lip,
> > > +	int			remove)
> > > +{
> > > +}
> > > +
> > > +/*
> > > + * There isn't much you can do to push on an attrd item.  It is simply stuck
> > > + * waiting for the log to be flushed to disk.
> > > + */
> > > +STATIC uint
> > > +xfs_attrd_item_push(
> > > +	struct xfs_log_item	*lip,
> > > +	struct list_head	*buffer_list)
> > > +{
> > > +	return XFS_ITEM_PINNED;
> > > +}
> > > +
> > > +/*
> > > + * The ATTRD is either committed or aborted if the transaction is cancelled. If
> > > + * the transaction is cancelled, drop our reference to the ATTRI and free the
> > > + * ATTRD.
> > > + */
> > > +STATIC void
> > > +xfs_attrd_item_unlock(
> > > +	struct xfs_log_item	*lip)
> > > +{
> > > +	struct xfs_attrd_log_item	*attrdp = ATTRD_ITEM(lip);
> > > +
> > > +	if (test_bit(XFS_LI_ABORTED, &lip->li_flags)) {
> > > +		xfs_attri_release(attrdp->attrip);
> > > +		xfs_attrd_item_free(attrdp);
> > > +	}
> > > +}
> > > +
> > > +/*
> > > + * When the attrd item is committed to disk, all we need to do is delete our
> > > + * reference to our partner attri item and then free ourselves. Since we're
> > > + * freeing ourselves we must return -1 to keep the transaction code from
> > > + * further referencing this item.
> > > + */
> > > +STATIC xfs_lsn_t
> > > +xfs_attrd_item_committed(
> > > +	struct xfs_log_item	*lip,
> > > +	xfs_lsn_t		lsn)
> > > +{
> > > +	struct xfs_attrd_log_item	*attrdp = ATTRD_ITEM(lip);
> > > +
> > > +	/*
> > > +	 * Drop the ATTRI reference regardless of whether the ATTRD has been
> > > +	 * aborted. Once the ATTRD transaction is constructed, it is the sole
> > > +	 * responsibility of the ATTRD to release the ATTRI (even if the ATTRI
> > > +	 * is aborted due to log I/O error).
> > > +	 */
> > > +	xfs_attri_release(attrdp->attrip);
> > > +	xfs_attrd_item_free(attrdp);
> > > +
> > > +	return (xfs_lsn_t)-1;
> > > +}
> > > +
> > > +STATIC void
> > > +xfs_attrd_item_committing(
> > > +	struct xfs_log_item	*lip,
> > > +	xfs_lsn_t		lsn)
> > > +{
> > > +}
> > > +
> > > +/*
> > > + * This is the ops vector shared by all attrd log items.
> > > + */
> > > +static const struct xfs_item_ops xfs_attrd_item_ops = {
> > > +	.iop_size	= xfs_attrd_item_size,
> > > +	.iop_format	= xfs_attrd_item_format,
> > > +	.iop_pin	= xfs_attrd_item_pin,
> > > +	.iop_unpin	= xfs_attrd_item_unpin,
> > > +	.iop_unlock	= xfs_attrd_item_unlock,
> > > +	.iop_committed	= xfs_attrd_item_committed,
> > > +	.iop_push	= xfs_attrd_item_push,
> > > +	.iop_committing = xfs_attrd_item_committing
> > > +};
> > > +
> > > +/*
> > > + * Allocate and initialize an attrd item
> > > + */
> > > +struct xfs_attrd_log_item *
> > > +xfs_attrd_init(
> > > +	struct xfs_mount	*mp,
> > > +	struct xfs_attri_log_item	*attrip)
> > > +
> > > +{
> > > +	struct xfs_attrd_log_item	*attrdp;
> > > +	uint			size;
> > > +
> > > +	size = (uint)(sizeof(struct xfs_attrd_log_item));
> > > +	attrdp = kmem_zalloc(size, KM_SLEEP);
> > > +
> > > +	xfs_log_item_init(mp, &attrdp->item, XFS_LI_ATTRD,
> > > +			  &xfs_attrd_item_ops);
> > > +	attrdp->attrip = attrip;
> > > +	attrdp->format.alfd_alf_id = attrip->format.alfi_id;
> > > +
> > > +	return attrdp;
> > > +}
> > > +
> > > +/*
> > > + * Process an attr intent item that was recovered from
> > > + * the log.  We need to delete the attr that it describes.
> > > + */
> > 
> > ^^^ :)
> > 
> > > +int
> > > +xfs_attri_recover(
> > > +	struct xfs_mount		*mp,
> > > +	struct xfs_attri_log_item	*attrip)
> > > +{
> > > +	struct xfs_inode		*ip;
> > > +	struct xfs_attrd_log_item	*attrdp;
> > > +	struct xfs_da_args		args;
> > > +	struct xfs_attri_log_format	*attrp;
> > > +	struct xfs_trans_res		tres;
> > > +	int				local;
> > > +	int				error = 0;
> > > +	int				rsvd = 0;
> > > +
> > > +	ASSERT(!test_bit(XFS_ATTRI_RECOVERED, &attrip->flags));
> > > +
> > > +	/*
> > > +	 * First check the validity of the attr described by the
> > > +	 * ATTRI.  If any are bad, then assume that all are bad and
> > > +	 * just toss the ATTRI.
> > > +	 */
> > > +	attrp = &attrip->format;
> > > +	if (
> > > +	    /*
> > > +	     * Must have either XFS_ATTR_OP_FLAGS_SET or
> > > +	     * XFS_ATTR_OP_FLAGS_REMOVE set
> > > +	     */
> > > +	    !(attrp->alfi_op_flags == XFS_ATTR_OP_FLAGS_SET ||
> > > +		attrp->alfi_op_flags == XFS_ATTR_OP_FLAGS_REMOVE) ||
> > > +
> > > +	    /* Check size of value and name lengths */
> > > +	    (attrp->alfi_value_len > XATTR_SIZE_MAX ||
> > > +		attrp->alfi_name_len > XATTR_NAME_MAX) ||
> > > +
> > > +	    /*
> > > +	     * If the XFS_ATTR_OP_FLAGS_SET flag is set,
> > > +	     * there must also be a name and value
> > > +	     */
> > > +	    (attrp->alfi_op_flags == XFS_ATTR_OP_FLAGS_SET &&
> > > +		(attrp->alfi_value_len == 0 || attrp->alfi_name_len == 0)) ||
> > 
> > It's been a while since I've played with any attribute stuff, but is
> > this always the case or can we not have an empty attribute?
> 
> I remember us having some discussion about this in an older review, wherein
> we thought all set operations have to have a value.  But after digging
> around a bit, I think generic 062 does expect that you can set an attribute
> to nothing.
> 
> Since the test does not force a recovery, we probably have never encountered
> the scenario of recovering an attribute with no value. So I think we got
> away with the alfi_value_len == 0 check even though we should not have.
> 
> I will adjust the logic here.  Maybe when we get this set finished out, it
> might be a good idea to have a test case that checks for that?
> 

Indeed, this is probably a good opportunity to audit our xattr test
coverage for this kind of thing. In particular, I think we should make
sure we have good xattr log recovery coverage. I know we have a few
general log recovery tests, but I'm not sure off the top of my head if
they perform xattr ops, and if so, whether they'll introduce xattrs
large enough for remote blocks and whatnot (where this series is most
notably changing behavior). A new test might be useful to fill any gaps.

> Thx for the catch!
> > 
> > > +
> > > +	    /*
> > > +	     * If the XFS_ATTR_OP_FLAGS_REMOVE flag is set,
> > > +	     * there must also be a name
> > > +	     */
> > > +	    (attrp->alfi_op_flags == XFS_ATTR_OP_FLAGS_REMOVE &&
> > > +		(attrp->alfi_name_len == 0))
> > > +	) {
> > 
> > Comments are always nice of course, but interspersed with logic like
> > this makes the whole thing hard to read. I'd suggest to just generalize
> > the comment to include whatever things are non-obvious, condense the if
> > logic and leave the comment above it.
> Ok, I think probably we only need to check namelen anyway based off the
> above observation too.
> 
> > 
> > > +		/*
> > > +		 * This will pull the ATTRI from the AIL and
> > > +		 * free the memory associated with it.
> > > +		 */
> > > +		set_bit(XFS_ATTRI_RECOVERED, &attrip->flags);
> > > +		xfs_attri_release(attrip);
> > > +		return -EIO;
> > > +	}
> > > +
> > > +	attrp = &attrip->format;
> > > +	error = xfs_iget(mp, 0, attrp->alfi_ino, 0, 0, &ip);
> > > +	if (error)
> > > +		return error;
> > > +
> > > +	error = xfs_attr_args_init(&args, ip, attrip->name,
> > > +			attrp->alfi_name_len, attrp->alfi_attr_flags);
> > > +	if (error)
> > > +		return error;
> > > +
> > > +	args.hashval = xfs_da_hashname(args.name, args.namelen);
> > > +	args.value = attrip->value;
> > > +	args.valuelen = attrp->alfi_value_len;
> > > +	args.op_flags = XFS_DA_OP_OKNOENT;
> > > +	args.total = xfs_attr_calc_size(&args, &local);
> > > +
> > > +	tres.tr_logres = M_RES(mp)->tr_attrsetm.tr_logres +
> > > +			M_RES(mp)->tr_attrsetrt.tr_logres * args.total;
> > > +	tres.tr_logcount = XFS_ATTRSET_LOG_COUNT;
> > > +	tres.tr_logflags = XFS_TRANS_PERM_LOG_RES;
> > > +
> > > +	error = xfs_trans_alloc(mp, &tres, args.total,  0,
> > > +				rsvd ? XFS_TRANS_RESERVE : 0, &args.trans);
> > > +	if (error)
> > > +		return error;
> > > +	attrdp = xfs_trans_get_attrd(args.trans, attrip);
> > > +
> > > +	xfs_ilock(ip, XFS_ILOCK_EXCL);
> > > +
> > > +	xfs_trans_ijoin(args.trans, ip, 0);
> > > +	error = xfs_trans_attr(&args, attrdp, attrp->alfi_op_flags);
> > > +	if (error)
> > > +		goto abort_error;
> > > +
> > > +
> > > +	set_bit(XFS_ATTRI_RECOVERED, &attrip->flags);
> > > +	xfs_trans_log_inode(args.trans, ip, XFS_ILOG_CORE | XFS_ILOG_ADATA);
> > > +	error = xfs_trans_commit(args.trans);
> > > +	xfs_iunlock(ip, XFS_ILOCK_EXCL);
> > > +	return error;
> > > +
> > > +abort_error:
> > > +	xfs_trans_cancel(args.trans);
> > > +	xfs_iunlock(ip, XFS_ILOCK_EXCL);
> > > +	return error;
> > > +}
> > > diff --git a/fs/xfs/xfs_attr_item.h b/fs/xfs/xfs_attr_item.h
> > > new file mode 100644
> > > index 0000000..fce7515
> > > --- /dev/null
> > > +++ b/fs/xfs/xfs_attr_item.h
> > > @@ -0,0 +1,103 @@
> > > +// SPDX-License-Identifier: GPL-2.0+
> > > +/*
> > > + * Copyright (C) 2019 Oracle.  All Rights Reserved.
> > > + * Author: Allison Henderson <allison.henderson@oracle.com>
> > > + */
> > > +#ifndef	__XFS_ATTR_ITEM_H__
> > > +#define	__XFS_ATTR_ITEM_H__
> > > +
> > > +/* kernel only ATTRI/ATTRD definitions */
> > > +
> > > +struct xfs_mount;
> > > +struct kmem_zone;
> > > +
> > > +/*
> > > + * Max number of attrs in fast allocation path.
> > > + */
> > > +#define XFS_ATTRI_MAX_FAST_ATTRS        1
> > > +
> > > +
> > > +/*
> > > + * Define ATTR flag bits. Manipulated by set/clear/test_bit operators.
> > > + */
> > > +#define	XFS_ATTRI_RECOVERED	1
> > > +
> > > +
> > > +/* nvecs must be in multiples of 4 */
> > > +#define ATTR_NVEC_SIZE(size) (size == sizeof(int32_t) ? sizeof(int32_t) : \
> > > +				size + sizeof(int32_t) - \
> > > +				(size % sizeof(int32_t)))
> > > +
> > 
> > Why? Also, any reason we couldn't use round_up() or some such here?
> There's an assertion that checks for this in the recovery.  Without this
> padding I can quickly recreate it:
> 
> Assertion failed: reg->i_len % sizeof(int32_t) == 0, file: fs/xfs/xfs_log.c,
> line: 2484
> 
> It wasn't entirely clear from the context as to why, I assumed it must be
> something to do with not wanting log items falling onto oddball byte
> alignments?
> 

Yeah, probably something like that :P. I figured it was here for a
reason, that reason just wasn't clear to me. TBH I'm still not
explicitly sure without digging around further, but that assert at least
documents that the log writing infrastructure expects 32-bit aligned
iovec lengths. I guess it makes sense that we'd need to do that here
where name/value lengths are byte aligned (and user defined) and most
other log item sizes are based on (presumably) properly sized data
structures.

Could we update the comment a bit? For example:

	/* iovec length must be 32-bit aligned */
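
To make the alignment point concrete, here is a small user-space sketch (not kernel code) that compares the arithmetic in the patch's ATTR_NVEC_SIZE() with a plain round-up to a multiple of 4. The function names attr_nvec_size_orig(), attr_nvec_size_rounded(), is_iovec_aligned() and check_lengths() are made up for illustration; in the kernel I'd assume the existing round_up() helper would replace the open-coded version shown here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* The arithmetic from the patch's ATTR_NVEC_SIZE(), written as a function. */
static size_t attr_nvec_size_orig(size_t size)
{
	return size == sizeof(int32_t) ? sizeof(int32_t) :
		size + sizeof(int32_t) - (size % sizeof(int32_t));
}

/* Open-coded round-up to the next multiple of 4 (user-space stand-in). */
static size_t attr_nvec_size_rounded(size_t size)
{
	return (size + sizeof(int32_t) - 1) & ~(sizeof(int32_t) - 1);
}

/* Mirrors the condition checked by the assert quoted above. */
static int is_iovec_aligned(size_t len)
{
	return len % sizeof(int32_t) == 0;
}

/* Both variants must produce a length the log code would accept. */
static void check_lengths(size_t size)
{
	assert(is_iovec_aligned(attr_nvec_size_orig(size)));
	assert(is_iovec_aligned(attr_nvec_size_rounded(size)));
}
```

Both variants return 32-bit aligned lengths. The difference is that the original arithmetic adds a further 4 bytes to sizes that are already aligned (other than exactly 4), so 8 becomes 12, whereas the round-up leaves aligned sizes unchanged.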

> 
> > 
> > > +/*
> > > + * This is the "attr intention" log item.  It is used to log the fact
> > > + * that some attrs need to be processed.  It is used in conjunction with the
> > > + * "attr done" log item described below.
> > > + *
> > > + * The ATTRI is reference counted so that it is not freed prior to both the
> > > + * ATTRI and ATTRD being committed and unpinned. This ensures the ATTRI is
> > > + * inserted into the AIL even in the event of out of order ATTRI/ATTRD
> > > + * processing. In other words, an ATTRI is born with two references:
> > > + *
> > > + *      1.) an ATTRI held reference to track ATTRI AIL insertion
> > > + *      2.) an ATTRD held reference to track ATTRD commit
> > > + *
> > > + * On allocation, both references are the responsibility of the caller. Once
> > > + * the ATTRI is added to and dirtied in a transaction, ownership of reference
> > > + * one transfers to the transaction. The reference is dropped once the ATTRI is
> > > + * inserted to the AIL or in the event of failure along the way (e.g., commit
> > > + * failure, log I/O error, etc.). Note that the caller remains responsible for
> > > + * the ATTRD reference under all circumstances to this point. The caller has no
> > > + * means to detect failure once the transaction is committed, however.
> > > + * Therefore, an ATTRD is required after this point, even in the event of
> > > + * unrelated failure.
> > > + *
> > > + * Once an ATTRD is allocated and dirtied in a transaction, reference two
> > > + * transfers to the transaction. The ATTRD reference is dropped once it reaches
> > > + * the unpin handler. Similar to the ATTRI, the reference also drops in the
> > > + * event of commit failure or log I/O errors. Note that the ATTRD is not
> > > + * inserted in the AIL, so at this point both the ATTI and ATTRD are freed.
> > > + */
> > > +struct xfs_attri_log_item {
> > > +	xfs_log_item_t			item;
> > > +	atomic_t			refcount;
> > > +	unsigned long			flags;	/* misc flags */
> > > +	int				name_len;
> > > +	void				*name;
> > > +	int				value_len;
> > > +	void				*value;
> > > +	struct xfs_attri_log_format	format;
> > > +};
> > 
> > I think we usually try to use field prefix names in these various
> > structures (as you've done in other places). I.e., attri_item,
> > attrd_item, etc. would probably be consistent with similar structures
> > like the efi/efd log items.
> Sure, I can tack on the attri_* prefix here
> 
> > 
> > > +
> > > +/*
> > > + * This is the "attr done" log item.  It is used to log
> > > + * the fact that some attrs earlier mentioned in an attri item
> > > + * have been freed.
> > > + */
> > > +struct xfs_attrd_log_item {
> > > +	struct xfs_log_item		item;
> > > +	struct xfs_attri_log_item	*attrip;
> > > +	struct xfs_attrd_log_format	format;
> > > +};
> > > +
> > > +/*
> > > + * Max number of attrs in fast allocation path.
> > > + */
> > > +#define	XFS_ATTRD_MAX_FAST_ATTRS	1
> > > +
> > > +extern struct kmem_zone	*xfs_attri_zone;
> > > +extern struct kmem_zone	*xfs_attrd_zone;
> > > +
> > > +struct xfs_attri_log_item	*xfs_attri_init(struct xfs_mount *mp);
> > > +struct xfs_attrd_log_item	*xfs_attrd_init(struct xfs_mount *mp,
> > > +					struct xfs_attri_log_item *attrip);
> > > +int xfs_attri_copy_format(struct xfs_log_iovec *buf,
> > > +			   struct xfs_attri_log_format *dst_attri_fmt);
> > > +int xfs_attrd_copy_format(struct xfs_log_iovec *buf,
> > > +			   struct xfs_attrd_log_format *dst_attrd_fmt);
> > > +void			xfs_attri_item_free(struct xfs_attri_log_item *attrip);
> > > +void			xfs_attri_release(struct xfs_attri_log_item *attrip);
> > > +
> > > +int			xfs_attri_recover(struct xfs_mount *mp,
> > > +					struct xfs_attri_log_item *attrip);
> > > +
> > > +#endif	/* __XFS_ATTR_ITEM_H__ */
> > ...
> > > diff --git a/fs/xfs/xfs_trans_attr.c b/fs/xfs/xfs_trans_attr.c
> > > new file mode 100644
> > > index 0000000..3679348
> > > --- /dev/null
> > > +++ b/fs/xfs/xfs_trans_attr.c
> > > @@ -0,0 +1,240 @@
> > > +// SPDX-License-Identifier: GPL-2.0+
> > > +/*
> > > + * Copyright (C) 2019 Oracle.  All Rights Reserved.
> > > + * Author: Allison Henderson <allison.henderson@oracle.com>
> > > + */
> > > +#include "xfs.h"
> > > +#include "xfs_fs.h"
> > > +#include "xfs_shared.h"
> > > +#include "xfs_format.h"
> > > +#include "xfs_log_format.h"
> > > +#include "xfs_trans_resv.h"
> > > +#include "xfs_bit.h"
> > > +#include "xfs_mount.h"
> > > +#include "xfs_defer.h"
> > > +#include "xfs_trans.h"
> > > +#include "xfs_trans_priv.h"
> > > +#include "xfs_attr_item.h"
> > > +#include "xfs_alloc.h"
> > > +#include "xfs_bmap.h"
> > > +#include "xfs_trace.h"
> > > +#include "libxfs/xfs_da_format.h"
> > > +#include "xfs_da_btree.h"
> > > +#include "xfs_attr.h"
> > > +#include "xfs_inode.h"
> > > +#include "xfs_icache.h"
> > > +#include "xfs_quota.h"
> > > +
> > > +/*
> > > + * This routine is called to allocate an "attr free done"
> > > + * log item.
> > > + */
> > > +struct xfs_attrd_log_item *
> > > +xfs_trans_get_attrd(struct xfs_trans		*tp,
> > > +		  struct xfs_attri_log_item	*attrip)
> > > +{
> > > +	struct xfs_attrd_log_item			*attrdp;
> > > +
> > > +	ASSERT(tp != NULL);
> > > +
> > > +	attrdp = xfs_attrd_init(tp->t_mountp, attrip);
> > > +	ASSERT(attrdp != NULL);
> > > +
> > > +	/*
> > > +	 * Get a log_item_desc to point at the new item.
> > > +	 */
> > > +	xfs_trans_add_item(tp, &attrdp->item);
> > > +	return attrdp;
> > > +}
> > > +
> > > +/*
> > > + * Delete an attr and log it to the ATTRD. Note that the transaction is marked
> > > + * dirty regardless of whether the attr delete succeeds or fails to support the
> > > + * ATTRI/ATTRD lifecycle rules.
> > > + */
> > > +int
> > > +xfs_trans_attr(
> > > +	struct xfs_da_args		*args,
> > > +	struct xfs_attrd_log_item	*attrdp,
> > > +	uint32_t			op_flags)
> > > +{
> > > +	int				error;
> > > +	struct xfs_buf			*leaf_bp = NULL;
> > > +
> > > +	error = xfs_qm_dqattach_locked(args->dp, 0);
> > > +	if (error)
> > > +		return error;
> > > +
> > > +	switch (op_flags) {
> > > +	case XFS_ATTR_OP_FLAGS_SET:
> > > +		args->op_flags |= XFS_DA_OP_ADDNAME;
> > > +		error = xfs_attr_set_args(args, &leaf_bp, false);
> > > +		break;
> > > +	case XFS_ATTR_OP_FLAGS_REMOVE:
> > > +		ASSERT(XFS_IFORK_Q((args->dp)));
> > > +		error = xfs_attr_remove_args(args, false);
> > > +		break;
> > > +	default:
> > > +		error = -EFSCORRUPTED;
> > > +	}
> > > +
> > > +	if (error) {
> > > +		if (leaf_bp)
> > > +			xfs_trans_brelse(args->trans, leaf_bp);
> > > +	}
> > > +
> > > +	/*
> > > +	 * Mark the transaction dirty, even on error. This ensures the
> > > +	 * transaction is aborted, which:
> > > +	 *
> > > +	 * 1.) releases the ATTRI and frees the ATTRD
> > > +	 * 2.) shuts down the filesystem
> > > +	 */
> > > +	args->trans->t_flags |= XFS_TRANS_DIRTY;
> > > +	set_bit(XFS_LI_DIRTY, &attrdp->item.li_flags);
> > > +
> > > +	attrdp->attrip->name = (void *)args->name;
> > > +	attrdp->attrip->value = (void *)args->value;
> > > +	attrdp->attrip->name_len = args->namelen;
> > > +	attrdp->attrip->value_len = args->valuelen;
> > > +
> > 
> > What's the reason for updating the attri here? It's already been
> > committed by the time we get around to the attrd. Is this used again
> > somewhere?
> I think I may have observed it in other code I was using as a model at the
> time. It seems to be able to get along without it though, so I don't think
> it's used again.  I will go ahead and take it out.
> 
> > 
> > > +	return error;
> > > +}
> > > +
> > > +static int
> > > +xfs_attr_diff_items(
> > > +	void				*priv,
> > > +	struct list_head		*a,
> > > +	struct list_head		*b)
> > > +{
> > > +	return 0;
> > > +}
> > > +
> > > +/* Get an ATTRI. */
> > > +STATIC void *
> > > +xfs_attr_create_intent(
> > > +	struct xfs_trans		*tp,
> > > +	unsigned int			count)
> > > +{
> > > +	struct xfs_attri_log_item		*attrip;
> > > +
> > > +	ASSERT(tp != NULL);
> > > +	ASSERT(count == 1);
> > > +
> > > +	attrip = xfs_attri_init(tp->t_mountp);
> > > +	ASSERT(attrip != NULL);
> > > +
> > > +	/*
> > > +	 * Get a log_item_desc to point at the new item.
> > > +	 */
> > > +	xfs_trans_add_item(tp, &attrip->item);
> > > +	return attrip;
> > > +}
> > > +
> > > +/* Log an attr to the intent item. */
> > > +STATIC void
> > > +xfs_attr_log_item(
> > > +	struct xfs_trans		*tp,
> > > +	void				*intent,
> > > +	struct list_head		*item)
> > > +{
> > > +	struct xfs_attri_log_item	*attrip = intent;
> > > +	struct xfs_attr_item		*attr;
> > > +	struct xfs_attri_log_format	*attrp;
> > > +	char				*name_value;
> > > +
> > > +	attr = container_of(item, struct xfs_attr_item, xattri_list);
> > > +	name_value = ((char *)attr) + sizeof(struct xfs_attr_item);
> > > +
> > > +	tp->t_flags |= XFS_TRANS_DIRTY;
> > > +	set_bit(XFS_LI_DIRTY, &attrip->item.li_flags);
> > > +
> > > +	attrp = &attrip->format;
> > > +	attrp->alfi_ino = attr->xattri_ip->i_ino;
> > > +	attrp->alfi_op_flags = attr->xattri_op_flags;
> > > +	attrp->alfi_value_len = attr->xattri_value_len;
> > > +	attrp->alfi_name_len = attr->xattri_name_len;
> > > +	attrp->alfi_attr_flags = attr->xattri_flags;
> > > +
> > > +	attrip->name = name_value;
> > > +	attrip->value = &name_value[attr->xattri_name_len];
> > > +	attrip->name_len = attr->xattri_name_len;
> > > +	attrip->value_len = attr->xattri_value_len;
> > 
> > So once we're at this point, we've constructed an xfs_attr_item to
> > describe the high level deferred operation, created an intent log item
> > and we're now logging that xfs_attri_log_item. We fill in the log format
> > structure based on the xfs_attr_item and point the xfs_attri_log_item
> > name/value pointers at the xfs_attr_item memory. It's thus important to
> > note we've established a subtle relationship between these two data
> > structures because they may have different lifecycles.
> 
> Right, I can add some comments if you like?  I guess I assume people have
> seen these patterns enough to not need them, but the extra explaining never
> hurts I suppose :-)
> 

No need to re-document the common patterns, but I think that the log
item pointing at the attr item memory is fairly unique and subtle. It's
not self-documenting because the attr structure isn't reference counted
or anything, it's just allocated and supplied into the dfops
infrastructure for consumption. A comment somewhere to document that
dependency would definitely be useful, either in the header or where
those pointers are set/cleared.
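
As a rough illustration of the dependency, here is a user-space model (not the actual XFS code) of a log item that borrows name/value pointers from a separately allocated attr item, along with one possible cleanup that clears the borrowed pointers before the backing memory is freed. All of the struct and function names below are hypothetical stand-ins.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for xfs_attr_item: a header followed by name and value bytes. */
struct attr_item {
	size_t	name_len;
	size_t	value_len;
	char	data[];		/* name bytes, then value bytes */
};

/* Stand-in for xfs_attri_log_item: borrows the buffers, owns nothing. */
struct attri_log_item {
	const char	*name;
	const char	*value;
	size_t		name_len;
	size_t		value_len;
};

static struct attr_item *attr_item_alloc(const char *name, size_t name_len,
					 const char *value, size_t value_len)
{
	struct attr_item *attr = calloc(1, sizeof(*attr) + name_len + value_len);

	if (!attr)
		return NULL;
	attr->name_len = name_len;
	attr->value_len = value_len;
	if (name_len)
		memcpy(attr->data, name, name_len);
	if (value_len)
		memcpy(attr->data + name_len, value, value_len);
	return attr;
}

/*
 * Point the intent at the attr item's buffers. The pointers are only
 * valid until attr_item_finish() frees the attr item.
 */
static void attri_log(struct attri_log_item *attrip, struct attr_item *attr)
{
	assert(attr != NULL);
	attrip->name = attr->data;
	attrip->value = attr->data + attr->name_len;
	attrip->name_len = attr->name_len;
	attrip->value_len = attr->value_len;
}

/* Drop the borrowed pointers before freeing the memory behind them. */
static void attr_item_finish(struct attri_log_item *attrip,
			     struct attr_item *attr)
{
	attrip->name = NULL;
	attrip->value = NULL;
	attrip->name_len = 0;
	attrip->value_len = 0;
	free(attr);
}
```

Clearing the pointers is only one option. Allocating the name/value memory separately and handing ownership to the log item would remove the lifetime mismatch instead of just hiding it.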

Brian

> > 
> > > +}
> > > +
> > > +/* Get an ATTRD so we can process all the attrs. */
> > > +STATIC void *
> > > +xfs_attr_create_done(
> > > +	struct xfs_trans		*tp,
> > > +	void				*intent,
> > > +	unsigned int			count)
> > > +{
> > > +	return xfs_trans_get_attrd(tp, intent);
> > > +}
> > > +
> > > +/* Process an attr. */
> > > +STATIC int
> > > +xfs_attr_finish_item(
> > > +	struct xfs_trans		*tp,
> > > +	struct list_head		*item,
> > > +	void				*done_item,
> > > +	void				**state)
> > > +{
> > > +	struct xfs_attr_item		*attr;
> > > +	char				*name_value;
> > > +	int				error;
> > > +	int				local;
> > > +	struct xfs_da_args		args;
> > > +
> > > +	attr = container_of(item, struct xfs_attr_item, xattri_list);
> > > +	name_value = ((char *)attr) + sizeof(struct xfs_attr_item);
> > > +
> > > +	error = xfs_attr_args_init(&args, attr->xattri_ip, name_value,
> > > +				   attr->xattri_name_len, attr->xattri_flags);
> > > +	if (error)
> > > +		goto out;
> > > +
> > > +	args.hashval = xfs_da_hashname(args.name, args.namelen);
> > > +	args.value = &name_value[attr->xattri_name_len];
> > > +	args.valuelen = attr->xattri_value_len;
> > > +	args.op_flags = XFS_DA_OP_OKNOENT;
> > > +	args.total = xfs_attr_calc_size(&args, &local);
> > > +	args.trans = tp;
> > > +
> > > +	error = xfs_trans_attr(&args, done_item,
> > > +			attr->xattri_op_flags);
> > 
> > So now we've committed/rolled our xfs_attri_log_item intent and
> > created/attached the xfs_attrd_log_item and thus we're free to perform
> > the operation...
> > 
> > > +out:
> > > +	kmem_free(attr);
> > 
> > ... and here is where we end up freeing the xfs_attr_item created for
> > the dfops infrastructure that holds our name and value memory.
> > 
> > Hmm.. I think this means our name/value memory accesses are safe because
> > the xfs_attri_log_item only accesses them in the ->iop_format()
> > callback, which occurs during transaction commit of the intent and we're
> > long past that.
> > 
> > That said, the attri/attrd log items themselves outlive the current
> > transaction commit sequence (i.e. until the attrd is physically
> > logged/committed and we free both). That means that once we free the
> > attr above we technically have an attri passing through the log
> > infrastructure with a couple invalid pointers, they just don't happen to
> > be used. It might be worth thinking about how we can clean that up,
> > whether it be clearing those pointers here, or allocating the name/val
> > memory separately and transferring it to the attri, etc. Whatever we end
> > up doing, we should probably add a comment somewhere to explain exactly
> > what's going on as well.
> > 
> > Brian
> 
> I see, that's a good observation.  I'll see if I can work in some clean up
> code and be sure to add some commentary to point it out.  Thanks for the
> thorough review!!  Much appreciated!!
> 
> Allison
> 
> > 
> > > +	return error;
> > > +}
> > > +
> > > +/* Abort all pending ATTRs. */
> > > +STATIC void
> > > +xfs_attr_abort_intent(
> > > +	void				*intent)
> > > +{
> > > +	xfs_attri_release(intent);
> > > +}
> > > +
> > > +/* Cancel an attr */
> > > +STATIC void
> > > +xfs_attr_cancel_item(
> > > +	struct list_head		*item)
> > > +{
> > > +	struct xfs_attr_item	*attr;
> > > +
> > > +	attr = container_of(item, struct xfs_attr_item, xattri_list);
> > > +	kmem_free(attr);
> > > +}
> > > +
> > > +const struct xfs_defer_op_type xfs_attr_defer_type = {
> > > +	.max_items	= XFS_ATTRI_MAX_FAST_ATTRS,
> > > +	.diff_items	= xfs_attr_diff_items,
> > > +	.create_intent	= xfs_attr_create_intent,
> > > +	.abort_intent	= xfs_attr_abort_intent,
> > > +	.log_item	= xfs_attr_log_item,
> > > +	.create_done	= xfs_attr_create_done,
> > > +	.finish_item	= xfs_attr_finish_item,
> > > +	.cancel_item	= xfs_attr_cancel_item,
> > > +};
> > > +
> > > -- 
> > > 2.7.4
> > > 


* Re: [PATCH 5/9] xfs: Add xfs_attr_set_deferred and xfs_attr_remove_deferred
  2019-04-18 21:28     ` Allison Henderson
@ 2019-04-22 11:01       ` Brian Foster
  2019-04-22 22:01         ` Allison Henderson
  0 siblings, 1 reply; 48+ messages in thread
From: Brian Foster @ 2019-04-22 11:01 UTC (permalink / raw)
  To: Allison Henderson; +Cc: linux-xfs

On Thu, Apr 18, 2019 at 02:28:00PM -0700, Allison Henderson wrote:
> On 4/18/19 8:49 AM, Brian Foster wrote:
> > On Fri, Apr 12, 2019 at 03:50:32PM -0700, Allison Henderson wrote:
> > > These routines set up and start a new deferred attribute
> > > operation.  These functions are meant to be called by other
> > > code needing to initiate a deferred attribute operation.  We
> > > will use these routines later in the parent pointer patches.
> > > 
> > 
> > We probably don't need to reference the parent pointer stuff any more
> > for this, right? I'm assuming we'll be converting generic attr
> > infrastructure over to this mechanism in subsequent patches..?
> 
> Right, some of these comments are a little stale.  I will clean them up a
> bit.
> 
> > 
> > > Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
> > > ---
> > >   fs/xfs/libxfs/xfs_attr.c | 80 ++++++++++++++++++++++++++++++++++++++++++++++++
> > >   fs/xfs/libxfs/xfs_attr.h |  7 +++++
> > >   2 files changed, 87 insertions(+)
> > > 
> > > diff --git a/fs/xfs/libxfs/xfs_attr.c b/fs/xfs/libxfs/xfs_attr.c
> > > index fadd485..c3477fa7 100644
> > > --- a/fs/xfs/libxfs/xfs_attr.c
> > > +++ b/fs/xfs/libxfs/xfs_attr.c
...
> > > @@ -513,6 +560,39 @@ xfs_attr_remove(
> > >   	return error;
> > >   }
> > > +/* Removes an attribute for an inode as a deferred operation */
> > > +int
> > > +xfs_attr_remove_deferred(
> > 
> > Hmm.. I'm kind of wondering if we actually need to defer attr removes.
> > Do we have the same kind of challenges for attr removal as for attr
> > creation, or is there some future scenario where this is needed?
> 
> I suppose we don't have to have it?  The motivation was to help break up the
> amount of transaction activity that happens on inode create/rename/remove
> operations once pptrs go in.  Attr remove does not look as complex as attr
> set, but I suppose it helps to some degree?
> 

Ok, this probably needs more thought. On one hand, I'm not a huge fan of
using complex infrastructure where not required just because it's there.
On the other, it could just be more simple to have consistency between
xattr ops. As you note above, perhaps we do want the ability to defer
xattr removes so we can use it in particular contexts (parent pointer
updates) and not others (direct xattr remove requests from userspace).
Perhaps the right thing to do for the time being is to continue on with
the support for deferred xattr remove but don't invoke it from the
direct xattr remove codepath..?

Note that if we took that approach, we could add a DEBUG option and/or
an errortag to (randomly) defer xattr removes in the common path for
test coverage purposes.
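
Something along these lines is what I have in mind for the decision logic, written as a standalone user-space sketch. None of these names exist in the tree: enum remove_caller, should_defer_remove() and the random_factor knob are all hypothetical, and 'sample' stands in for whatever random source a real errortag would consult.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical: who is asking for the xattr remove. */
enum remove_caller {
	REMOVE_FROM_USERSPACE,		/* direct xattr remove request */
	REMOVE_FROM_PPTR_UPDATE,	/* internal parent pointer update */
};

/*
 * Decide whether a remove goes through the deferred path.
 *
 * random_factor == 0 leaves the debug knob off; N means defer roughly
 * one in N direct removes, purely for test coverage.
 */
static bool should_defer_remove(enum remove_caller caller,
				unsigned int random_factor,
				unsigned int sample)
{
	/* Internal users always take the deferred path. */
	if (caller == REMOVE_FROM_PPTR_UPDATE)
		return true;

	/* Direct removes stay on the existing path unless the knob is set. */
	if (random_factor == 0)
		return false;

	return sample % random_factor == 0;
}
```

With the knob off, the direct remove codepath behaves exactly as it does today, so the extra coverage only costs anything on debug or error-injection runs.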

Brian

> > 
> > > +	struct xfs_inode        *dp,
> > > +	struct xfs_trans	*tp,
> > > +	const unsigned char	*name,
> > > +	unsigned int		namelen,
> > > +	int                     flags)
> > > +{
> > > +
> > > +	struct xfs_attr_item	*new;
> > > +	char			*name_value;
> > > +
> > > +	if (!namelen) {
> > > +		ASSERT(0);
> > > +		return -EFSCORRUPTED;
> > 
> > Similar comment around -EFSCORRUPTED vs. -EINVAL (or something else..).
> Ok, I will change to EINVAL here too.
> 
> Thanks again for the reviews!!  They are very helpful!
> 
> Allison
> > 
> > Brian
> > 
> > > +	}
> > > +
> > > +	new = kmem_alloc(XFS_ATTR_ITEM_SIZEOF(namelen, 0), KM_SLEEP|KM_NOFS);
> > > +	name_value = ((char *)new) + sizeof(struct xfs_attr_item);
> > > +	memset(new, 0, XFS_ATTR_ITEM_SIZEOF(namelen, 0));
> > > +	new->xattri_ip = dp;
> > > +	new->xattri_op_flags = XFS_ATTR_OP_FLAGS_REMOVE;
> > > +	new->xattri_name_len = namelen;
> > > +	new->xattri_value_len = 0;
> > > +	new->xattri_flags = flags;
> > > +	memcpy(name_value, name, namelen);
> > > +
> > > +	xfs_defer_add(tp, XFS_DEFER_OPS_TYPE_ATTR, &new->xattri_list);
> > > +
> > > +	return 0;
> > > +}
> > > +
> > >   /*========================================================================
> > >    * External routines when attribute list is inside the inode
> > >    *========================================================================*/
> > > diff --git a/fs/xfs/libxfs/xfs_attr.h b/fs/xfs/libxfs/xfs_attr.h
> > > index 92d9a15..83b3621 100644
> > > --- a/fs/xfs/libxfs/xfs_attr.h
> > > +++ b/fs/xfs/libxfs/xfs_attr.h
> > > @@ -175,5 +175,12 @@ bool xfs_attr_namecheck(const void *name, size_t length);
> > >   int xfs_attr_args_init(struct xfs_da_args *args, struct xfs_inode *dp,
> > >   			const unsigned char *name, size_t namelen, int flags);
> > >   int xfs_attr_calc_size(struct xfs_da_args *args, int *local);
> > > +int xfs_attr_set_deferred(struct xfs_inode *dp, struct xfs_trans *tp,
> > > +			  const unsigned char *name, unsigned int name_len,
> > > +			  const unsigned char *value, unsigned int valuelen,
> > > +			  int flags);
> > > +int xfs_attr_remove_deferred(struct xfs_inode *dp, struct xfs_trans *tp,
> > > +			    const unsigned char *name, unsigned int namelen,
> > > +			    int flags);
> > >   #endif	/* __XFS_ATTR_H__ */
> > > -- 
> > > 2.7.4
> > > 


* Re: [PATCH 6/9] xfs: Add xfs_has_attr and subroutines
  2019-04-12 22:50 ` [PATCH 6/9] xfs: Add xfs_has_attr and subroutines Allison Henderson
  2019-04-15  2:46   ` Su Yue
@ 2019-04-22 13:00   ` Brian Foster
  2019-04-22 22:01     ` Allison Henderson
  1 sibling, 1 reply; 48+ messages in thread
From: Brian Foster @ 2019-04-22 13:00 UTC (permalink / raw)
  To: Allison Henderson; +Cc: linux-xfs

On Fri, Apr 12, 2019 at 03:50:33PM -0700, Allison Henderson wrote:
> This patch adds new functions to check for the existence of
> an attribute.  Subroutines are also added to handle the cases
> of leaf blocks, nodes or shortform.  We will need this later
> for delayed attributes since delayed operations cannot return
> error codes.
> 
> Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
> ---
>  fs/xfs/libxfs/xfs_attr.c      | 78 +++++++++++++++++++++++++++++++++++++++++++
>  fs/xfs/libxfs/xfs_attr.h      |  1 +
>  fs/xfs/libxfs/xfs_attr_leaf.c | 33 ++++++++++++++++++
>  fs/xfs/libxfs/xfs_attr_leaf.h |  1 +
>  4 files changed, 113 insertions(+)
> 
> diff --git a/fs/xfs/libxfs/xfs_attr.c b/fs/xfs/libxfs/xfs_attr.c
> index c3477fa7..0042708 100644
> --- a/fs/xfs/libxfs/xfs_attr.c
> +++ b/fs/xfs/libxfs/xfs_attr.c
> @@ -53,6 +53,7 @@ STATIC int xfs_attr_shortform_addname(xfs_da_args_t *args);
>  STATIC int xfs_attr_leaf_get(xfs_da_args_t *args);
>  STATIC int xfs_attr_leaf_addname(xfs_da_args_t *args, bool roll_trans);
>  STATIC int xfs_attr_leaf_removename(xfs_da_args_t *args, bool roll_trans);
> +STATIC int xfs_leaf_has_attr(xfs_da_args_t *args);
>  
>  /*
>   * Internal routines when attribute list is more than one block.
> @@ -60,6 +61,7 @@ STATIC int xfs_attr_leaf_removename(xfs_da_args_t *args, bool roll_trans);
>  STATIC int xfs_attr_node_get(xfs_da_args_t *args);
>  STATIC int xfs_attr_node_addname(xfs_da_args_t *args, bool roll_trans);
>  STATIC int xfs_attr_node_removename(xfs_da_args_t *args, bool roll_trans);
> +STATIC int xfs_attr_node_hasname(xfs_da_args_t *args);
>  STATIC int xfs_attr_fillstate(xfs_da_state_t *state);
>  STATIC int xfs_attr_refillstate(xfs_da_state_t *state);
>  
> @@ -301,6 +303,29 @@ xfs_attr_set_args(
>  }
>  
>  /*
> + * Return successful if attr is found, or ENOATTR if not
> + */
> +int
> +xfs_has_attr(
> +	struct xfs_da_args      *args)
> +{
> +	struct xfs_inode        *dp = args->dp;
> +	int                     error;
> +
> +	if (!xfs_inode_hasattr(dp))
> +		error = -ENOATTR;
> +	else if (dp->i_d.di_aformat == XFS_DINODE_FMT_LOCAL) {
> +		ASSERT(dp->i_afp->if_flags & XFS_IFINLINE);
> +		error = xfs_shortform_has_attr(args);
> +	} else if (xfs_bmap_one_block(dp, XFS_ATTR_FORK))
> +		error = xfs_leaf_has_attr(args);
> +	else
> +		error = xfs_attr_node_hasname(args);

I think it's usually expected to keep the {} braces around each branch
of a multi-branch if/else when at least one branch has multiple lines.
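
For illustration, a self-contained model of the fully braced form. The
enum and the lookup_*() stubs are hypothetical stand-ins for the fork
format checks and the per-format helpers in the patch, and ENODATA is used
because the kernel's ENOATTR is defined as ENODATA.

```c
#include <errno.h>

/* Hypothetical stand-ins for the attr fork formats; not the XFS types. */
enum fork_format {
	FORK_NONE,
	FORK_LOCAL,
	FORK_LEAF,
	FORK_NODE,
};

/* Stub lookups: 0 if the attr is present, -ENODATA otherwise. */
static int lookup_shortform(int present) { return present ? 0 : -ENODATA; }
static int lookup_leaf(int present)      { return present ? 0 : -ENODATA; }
static int lookup_node(int present)      { return present ? 0 : -ENODATA; }

static int
has_attr(
	enum fork_format	fmt,
	int			present)
{
	int			error;

	if (fmt == FORK_NONE) {
		error = -ENODATA;
	} else if (fmt == FORK_LOCAL) {
		/*
		 * This branch spans more than one line, so every branch in
		 * the chain gets braces.
		 */
		error = lookup_shortform(present);
	} else if (fmt == FORK_LEAF) {
		error = lookup_leaf(present);
	} else {
		error = lookup_node(present);
	}

	return error;
}
```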

Also, I see that at least some of this code is pulled from existing
xattr functions. For example, the xfs_shortform_has_attr() code
currently exists in xfs_attr_shortform_remove(),
xfs_attr_shortform_add(), etc. Similar for xfs_leaf_has_attr() and
xfs_attr_leaf_removename(), etc. 

Rather than just adding new, not yet used functions, can we turn this
patch more into a refactor where these new functions are reused by
existing code where applicable? That reduces duplication and also
facilitates review.
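
Something along these lines, sketched in userspace with a simplified
shortform list. The struct layout and the sf_*() names are made up for the
example and are not the on-disk format or existing functions; the point is
only that the existence check and the remove path share one lookup.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified shortform entry; not the on-disk XFS layout. */
struct sf_entry {
	const char	*name;
	size_t		namelen;
};

struct sf_list {
	struct sf_entry	ents[8];
	int		count;
};

/* Shared lookup: index of the matching entry, or -ENODATA if absent. */
static int
sf_lookup(
	const struct sf_list	*sf,
	const char		*name,
	size_t			namelen)
{
	int			i;

	for (i = 0; i < sf->count; i++) {
		if (sf->ents[i].namelen != namelen)
			continue;
		if (memcmp(sf->ents[i].name, name, namelen) != 0)
			continue;
		return i;
	}
	return -ENODATA;
}

/* The existence check is just the lookup with the index discarded. */
static int
sf_has_attr(const struct sf_list *sf, const char *name, size_t namelen)
{
	int	idx = sf_lookup(sf, name, namelen);

	return idx < 0 ? idx : 0;
}

/* Remove reuses the same lookup instead of open-coding the scan. */
static int
sf_remove(struct sf_list *sf, const char *name, size_t namelen)
{
	int	idx = sf_lookup(sf, name, namelen);

	if (idx < 0)
		return idx;

	/* Close the gap left by the removed entry. */
	memmove(&sf->ents[idx], &sf->ents[idx + 1],
		(sf->count - idx - 1) * sizeof(sf->ents[0]));
	sf->count--;
	return 0;
}
```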

Brian

> +
> +	return error;
> +}
> +
> +/*
>   * Remove the attribute specified in @args.
>   */
>  int
> @@ -836,6 +861,29 @@ xfs_attr_leaf_addname(
>  }
>  
>  /*
> + * Return successful if attr is found, or ENOATTR if not
> + */
> +STATIC int
> +xfs_leaf_has_attr(
> +	struct xfs_da_args      *args)
> +{
> +	struct xfs_buf          *bp;
> +	int                     error = 0;
> +
> +	args->blkno = 0;
> +	error = xfs_attr3_leaf_read(args->trans, args->dp,
> +			args->blkno, -1, &bp);
> +	if (error)
> +		return error;
> +
> +	error = xfs_attr3_leaf_lookup_int(bp, args);
> +	error = (error == -ENOATTR) ? -ENOATTR : 0;
> +	xfs_trans_brelse(args->trans, bp);
> +
> +	return error;
> +}
> +
> +/*
>   * Remove a name from the leaf attribute list structure
>   *
>   * This leaf block cannot have a "remote" value, we only call this routine
> @@ -1166,6 +1214,36 @@ xfs_attr_node_addname(
>  }
>  
>  /*
> + * Return successful if attr is found, or ENOATTR if not
> + */
> +STATIC int
> +xfs_attr_node_hasname(
> +	struct xfs_da_args	*args)
> +{
> +	struct xfs_da_state	*state;
> +	struct xfs_inode	*dp;
> +	int			retval, error;
> +
> +	/*
> +	 * Tie a string around our finger to remind us where we are.
> +	 */
> +	dp = args->dp;
> +	state = xfs_da_state_alloc();
> +	state->args = args;
> +	state->mp = dp->i_mount;
> +
> +	/*
> +	 * Search to see if name exists, and get back a pointer to it.
> +	 */
> +	error = xfs_da3_node_lookup_int(state, &retval);
> +	if (error || (retval != -EEXIST)) {
> +		if (error == 0)
> +			error = retval;
> +	}
> +	return error;
> +}
> +
> +/*
>   * Remove a name from a B-tree attribute list.
>   *
>   * This will involve walking down the Btree, and may involve joining
> diff --git a/fs/xfs/libxfs/xfs_attr.h b/fs/xfs/libxfs/xfs_attr.h
> index 83b3621..974c963 100644
> --- a/fs/xfs/libxfs/xfs_attr.h
> +++ b/fs/xfs/libxfs/xfs_attr.h
> @@ -168,6 +168,7 @@ int xfs_attr_set_args(struct xfs_da_args *args, struct xfs_buf **leaf_bp,
>  		 bool roll_trans);
>  int xfs_attr_remove(struct xfs_inode *dp, const unsigned char *name,
>  		    size_t namelen, int flags);
> +int xfs_has_attr(struct xfs_da_args *args);
>  int xfs_attr_remove_args(struct xfs_da_args *args, bool roll_trans);
>  int xfs_attr_list(struct xfs_inode *dp, char *buffer, int bufsize,
>  		  int flags, struct attrlist_cursor_kern *cursor);
> diff --git a/fs/xfs/libxfs/xfs_attr_leaf.c b/fs/xfs/libxfs/xfs_attr_leaf.c
> index 128bfe9..e9f2f53 100644
> --- a/fs/xfs/libxfs/xfs_attr_leaf.c
> +++ b/fs/xfs/libxfs/xfs_attr_leaf.c
> @@ -622,6 +622,39 @@ xfs_attr_fork_remove(
>  }
>  
>  /*
> + * Return successful if attr is found, or ENOATTR if not
> + */
> +int
> +xfs_shortform_has_attr(
> +	struct xfs_da_args	 *args)
> +{
> +	struct xfs_attr_shortform *sf;
> +	struct xfs_attr_sf_entry *sfe;
> +	int			base = sizeof(struct xfs_attr_sf_hdr);
> +	int			size = 0;
> +	int			end;
> +	int			i;
> +
> +	sf = (struct xfs_attr_shortform *)args->dp->i_afp->if_u1.if_data;
> +	sfe = &sf->list[0];
> +	end = sf->hdr.count;
> +	for (i = 0; i < end; sfe = XFS_ATTR_SF_NEXTENTRY(sfe),
> +			base += size, i++) {
> +		size = XFS_ATTR_SF_ENTSIZE(sfe);
> +		if (sfe->namelen != args->namelen)
> +			continue;
> +		if (memcmp(sfe->nameval, args->name, args->namelen) != 0)
> +			continue;
> +		if (!xfs_attr_namesp_match(args->flags, sfe->flags))
> +			continue;
> +		break;
> +	}
> +	if (i == end)
> +		return -ENOATTR;
> +	return 0;
> +}
> +
> +/*
>   * Remove an attribute from the shortform attribute list structure.
>   */
>  int
> diff --git a/fs/xfs/libxfs/xfs_attr_leaf.h b/fs/xfs/libxfs/xfs_attr_leaf.h
> index 9d830ec..98dd169 100644
> --- a/fs/xfs/libxfs/xfs_attr_leaf.h
> +++ b/fs/xfs/libxfs/xfs_attr_leaf.h
> @@ -39,6 +39,7 @@ int	xfs_attr_shortform_getvalue(struct xfs_da_args *args);
>  int	xfs_attr_shortform_to_leaf(struct xfs_da_args *args,
>  			struct xfs_buf **leaf_bp);
>  int	xfs_attr_shortform_remove(struct xfs_da_args *args);
> +int	xfs_shortform_has_attr(struct xfs_da_args *args);
>  int	xfs_attr_shortform_allfit(struct xfs_buf *bp, struct xfs_inode *dp);
>  int	xfs_attr_shortform_bytesfit(struct xfs_inode *dp, int bytes);
>  xfs_failaddr_t xfs_attr_shortform_verify(struct xfs_inode *ip);
> -- 
> 2.7.4
> 


* Re: [PATCH 7/9] xfs: Add attr context to log item
  2019-04-12 22:50 ` [PATCH 7/9] xfs: Add attr context to log item Allison Henderson
  2019-04-15 22:50   ` Darrick J. Wong
@ 2019-04-22 13:03   ` Brian Foster
  2019-04-22 22:01     ` Allison Henderson
  1 sibling, 1 reply; 48+ messages in thread
From: Brian Foster @ 2019-04-22 13:03 UTC (permalink / raw)
  To: Allison Henderson; +Cc: linux-xfs

On Fri, Apr 12, 2019 at 03:50:34PM -0700, Allison Henderson wrote:
> This patch modifies xfs_attr_item to store an xfs_da_args, an xfs_buf pointer
> and a new state type. We will use these in the next patch when
> we modify xfs_attr_set_args to roll transactions by returning EAGAIN.
> Because the subroutines of this function modify the contents of these
> structures, we need to find a place to store them where they remain
> instantiated across multiple calls to xfs_attr_set_args.
> 
> Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
> ---

I see Darrick has already commented on the whole state thing. I'll
probably have to grok the next patch to comment further, but just a
couple initial thoughts:

First, I hit a build failure with this patch. It looks like there's a
missed include in the scrub code:

  ...
  CC [M]  fs/xfs/scrub/repair.o
In file included from fs/xfs/scrub/repair.c:32:
fs/xfs/libxfs/xfs_attr.h:105:21: error: field ‘xattri_args’ has incomplete type
  struct xfs_da_args xattri_args;   /* args context */
  ...
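
For reference, the general C rule behind that error, as a standalone
sketch. The struct names are invented for the example. Since xfs_attr.h
now embeds a struct xfs_da_args by value, any file that includes it
presumably needs the header defining that struct first, which is what the
other include hunks in this patch add.

```c
#include <stddef.h>

/* Forward declaration only: the type is incomplete at this point. */
struct da_args;

/* A pointer member is fine with an incomplete type. */
struct item_by_pointer {
	struct da_args	*args;
};

/*
 * Embedding by value needs the full definition first. Moving
 * struct item_by_value above this definition would produce the
 * "field has incomplete type" error.
 */
struct da_args {
	int		namelen;
	int		flags;
};

struct item_by_value {
	struct da_args	args;	/* OK only because da_args is now complete */
};
```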

Second, the commit log suggests that the states will reflect the current
transaction roll points (i.e., establishing re-entry points down in
xfs_attr_set_args()). I'm kind of wondering if we should break these
xattr set sub-sequences down into smaller helper functions (refactoring
the existing code as we go) such that the mechanism could technically be
used deferred or not. Re: the previous thought on whether to defer xattr
removes or not, there might also be cases where there's not a need to
defer xattr sets.

E.g., taking a quick peek into the next patch, the state 1 case in
xfs_attr_try_sf_addname() is actually a transaction commit, which I
think means we're done. We'd have done an attr memory allocation,
deferred op and transaction roll where none was necessary so it might
not be worth it to defer in that scenario. Hmm, it also looks like we
return -EAGAIN in places where we've not actually done any work, like if
a shortform add attempt returns -ENOSPC (or the -EAGAIN return before we
even attempt the sf add). That kind of looks like a waste of transaction
rolls and further suggests it might be cleaner to break this whole path
down into helpers and put it back together in a way more conducive to
deferred operations.
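
To make the shape concrete, a minimal userspace model of what I mean. The
step names, struct set_ctx and both functions are hypothetical (the series
itself uses XFS_ATTR_STATE1..3), and the "roll" is just a counter here. A
step only returns -EAGAIN when it has done real work and more remains, and
the same step function can be driven from a deferred finish routine or
called in a plain loop.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical step names; not the states used in the series. */
enum set_step {
	STEP_TRY_SHORTFORM,
	STEP_CONVERT_TO_LEAF,
	STEP_ADD_TO_LEAF,
	STEP_DONE,
};

struct set_ctx {
	enum set_step	step;
	bool		fits_shortform;
	int		rolls;		/* transaction rolls requested */
};

/*
 * Run one step. Return -EAGAIN only when the step did work and more
 * remains; otherwise continue straight into the next step.
 */
static int
set_one_step(struct set_ctx *ctx)
{
	switch (ctx->step) {
	case STEP_TRY_SHORTFORM:
		if (ctx->fits_shortform) {
			ctx->step = STEP_DONE;
			return 0;	/* finished, no roll needed */
		}
		/* Nothing was modified, so do not ask for a roll. */
		ctx->step = STEP_CONVERT_TO_LEAF;
		/* fall through */
	case STEP_CONVERT_TO_LEAF:
		ctx->step = STEP_ADD_TO_LEAF;
		return -EAGAIN;		/* real work done: roll before the add */
	case STEP_ADD_TO_LEAF:
		ctx->step = STEP_DONE;
		return 0;
	case STEP_DONE:
		return 0;
	}
	return -EINVAL;
}

/* Driver usable by a deferred finish routine or by a direct caller. */
static int
set_attr(struct set_ctx *ctx)
{
	int	error;

	while ((error = set_one_step(ctx)) == -EAGAIN)
		ctx->rolls++;	/* stand-in for rolling the transaction */
	return error;
}
```

In this model a set that fits in shortform completes with zero rolls, and
one that needs the leaf conversion completes with exactly one.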

Brian


>  fs/xfs/libxfs/xfs_attr.h | 18 +++++++++++++++++-
>  fs/xfs/scrub/common.c    |  2 ++
>  fs/xfs/xfs_acl.c         |  2 ++
>  fs/xfs/xfs_attr_item.c   |  2 +-
>  fs/xfs/xfs_ioctl.c       |  2 ++
>  fs/xfs/xfs_ioctl32.c     |  2 ++
>  fs/xfs/xfs_iops.c        |  1 +
>  fs/xfs/xfs_xattr.c       |  1 +
>  8 files changed, 28 insertions(+), 2 deletions(-)
> 
> diff --git a/fs/xfs/libxfs/xfs_attr.h b/fs/xfs/libxfs/xfs_attr.h
> index 974c963..4ce3b0a 100644
> --- a/fs/xfs/libxfs/xfs_attr.h
> +++ b/fs/xfs/libxfs/xfs_attr.h
> @@ -77,6 +77,13 @@ typedef struct attrlist_ent {	/* data from attr_list() */
>  	char	a_name[1];	/* attr name (NULL terminated) */
>  } attrlist_ent_t;
>  
> +/* Attr state machine types */
> +enum xfs_attr_state {
> +	XFS_ATTR_STATE1 = 1,
> +	XFS_ATTR_STATE2 = 2,
> +	XFS_ATTR_STATE3 = 3,
> +};
> +
>  /*
>   * List of attrs to commit later.
>   */
> @@ -88,7 +95,16 @@ struct xfs_attr_item {
>  	void		  *xattri_name;	      /* attr name */
>  	uint32_t	  xattri_name_len;    /* length of name */
>  	uint32_t	  xattri_flags;       /* attr flags */
> -	struct list_head  xattri_list;
> +
> +	/*
> +	 * Delayed attr parameters that need to remain instantiated
> +	 * across transaction rolls during the defer finish
> +	 */
> +	struct xfs_buf		*xattri_leaf_bp;  /* Leaf buf to release */
> +	enum xfs_attr_state	xattri_state;	  /* state machine marker */
> +	struct xfs_da_args	xattri_args;	  /* args context */
> +
> +	struct list_head	xattri_list;
>  
>  	/*
>  	 * A byte array follows the header containing the file name and
> diff --git a/fs/xfs/scrub/common.c b/fs/xfs/scrub/common.c
> index 0c54ff5..270c32e 100644
> --- a/fs/xfs/scrub/common.c
> +++ b/fs/xfs/scrub/common.c
> @@ -30,6 +30,8 @@
>  #include "xfs_rmap_btree.h"
>  #include "xfs_log.h"
>  #include "xfs_trans_priv.h"
> +#include "xfs_da_format.h"
> +#include "xfs_da_btree.h"
>  #include "xfs_attr.h"
>  #include "xfs_reflink.h"
>  #include "scrub/xfs_scrub.h"
> diff --git a/fs/xfs/xfs_acl.c b/fs/xfs/xfs_acl.c
> index 142de8d..9b1b93e 100644
> --- a/fs/xfs/xfs_acl.c
> +++ b/fs/xfs/xfs_acl.c
> @@ -10,6 +10,8 @@
>  #include "xfs_mount.h"
>  #include "xfs_inode.h"
>  #include "xfs_acl.h"
> +#include "xfs_da_format.h"
> +#include "xfs_da_btree.h"
>  #include "xfs_attr.h"
>  #include "xfs_trace.h"
>  #include <linux/slab.h>
> diff --git a/fs/xfs/xfs_attr_item.c b/fs/xfs/xfs_attr_item.c
> index 0ea19b4..36e6d1e 100644
> --- a/fs/xfs/xfs_attr_item.c
> +++ b/fs/xfs/xfs_attr_item.c
> @@ -19,10 +19,10 @@
>  #include "xfs_rmap.h"
>  #include "xfs_inode.h"
>  #include "xfs_icache.h"
> -#include "xfs_attr.h"
>  #include "xfs_shared.h"
>  #include "xfs_da_format.h"
>  #include "xfs_da_btree.h"
> +#include "xfs_attr.h"
>  
>  static inline struct xfs_attri_log_item *ATTRI_ITEM(struct xfs_log_item *lip)
>  {
> diff --git a/fs/xfs/xfs_ioctl.c b/fs/xfs/xfs_ioctl.c
> index ab341d6..c8728ca 100644
> --- a/fs/xfs/xfs_ioctl.c
> +++ b/fs/xfs/xfs_ioctl.c
> @@ -16,6 +16,8 @@
>  #include "xfs_rtalloc.h"
>  #include "xfs_itable.h"
>  #include "xfs_error.h"
> +#include "xfs_da_format.h"
> +#include "xfs_da_btree.h"
>  #include "xfs_attr.h"
>  #include "xfs_bmap.h"
>  #include "xfs_bmap_util.h"
> diff --git a/fs/xfs/xfs_ioctl32.c b/fs/xfs/xfs_ioctl32.c
> index 5001dca..23f6990 100644
> --- a/fs/xfs/xfs_ioctl32.c
> +++ b/fs/xfs/xfs_ioctl32.c
> @@ -21,6 +21,8 @@
>  #include "xfs_fsops.h"
>  #include "xfs_alloc.h"
>  #include "xfs_rtalloc.h"
> +#include "xfs_da_format.h"
> +#include "xfs_da_btree.h"
>  #include "xfs_attr.h"
>  #include "xfs_ioctl.h"
>  #include "xfs_ioctl32.h"
> diff --git a/fs/xfs/xfs_iops.c b/fs/xfs/xfs_iops.c
> index e73c21a..561c467 100644
> --- a/fs/xfs/xfs_iops.c
> +++ b/fs/xfs/xfs_iops.c
> @@ -17,6 +17,7 @@
>  #include "xfs_acl.h"
>  #include "xfs_quota.h"
>  #include "xfs_error.h"
> +#include "xfs_da_btree.h"
>  #include "xfs_attr.h"
>  #include "xfs_trans.h"
>  #include "xfs_trace.h"
> diff --git a/fs/xfs/xfs_xattr.c b/fs/xfs/xfs_xattr.c
> index 3013746..938e81d 100644
> --- a/fs/xfs/xfs_xattr.c
> +++ b/fs/xfs/xfs_xattr.c
> @@ -11,6 +11,7 @@
>  #include "xfs_mount.h"
>  #include "xfs_da_format.h"
>  #include "xfs_inode.h"
> +#include "xfs_da_btree.h"
>  #include "xfs_attr.h"
>  #include "xfs_attr_leaf.h"
>  #include "xfs_acl.h"
> -- 
> 2.7.4
> 


* Re: [PATCH 4/9] xfs: Set up infastructure for deferred attribute operations
  2019-04-22 11:00       ` Brian Foster
@ 2019-04-22 22:00         ` Allison Henderson
  0 siblings, 0 replies; 48+ messages in thread
From: Allison Henderson @ 2019-04-22 22:00 UTC (permalink / raw)
  To: Brian Foster; +Cc: linux-xfs



On 4/22/19 4:00 AM, Brian Foster wrote:
> On Thu, Apr 18, 2019 at 02:27:15PM -0700, Allison Henderson wrote:
>> On 4/18/19 8:48 AM, Brian Foster wrote:
>>> On Fri, Apr 12, 2019 at 03:50:31PM -0700, Allison Henderson wrote:
>>>> This patch adds two new log item types for setting or
>>>> removing attributes as deferred operations.  The
>>>> xfs_attri_log_item logs an intent to set or remove an
>>>> attribute.  The corresponding xfs_attrd_log_item holds
>>>> a reference to the xfs_attri_log_item and is freed once
>>>> the transaction is done.  Both log items use a generic
>>>> xfs_attr_log_format structure that contains the attribute
>>>> name, value, flags, inode, and an op_flag that indicates
>>>> if the operation is a set or remove.
>>>>
>>>> Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
>>>> ---
>>>
>>> This mostly looks sane to me on a first high level pass. We're adding
>>> the intent/done log item infrastructure for attrs, associated dfops
>>> processing code and log recovery hooks. I'll probably have to go back
>>> through this once I get further through the series and have grokked more
>>> context, but so far I think I just have some various nits and aesthetic
>>> comments.
>>>
>>> Firstly, note that git complained about an extra blank line at EOF of
>>> xfs_trans_attr.c when I applied this patch. Also, the commit log above
>>> looks like it could be widened (I think 68 chars is the standard) and
>>> could probably include a bit more context on the big picture changes
>>> associated with this work. In general, I think the commit log should
>>> (briefly) explain 1.) how attrs currently work 2.) how things are
>>> expected to work based on this infrastructure and 3.) the advantage(s)
>>> of doing so.
>>
>> Sure, I will get these suggestions added in the next update
>>
>>
>>>
>>> For example, one thing that is glossed over is that this implies we'll
>>> be logging xattr values even in remote attribute block cases. BTW, do we
>>> need to update the transaction reservation to account for that? I didn't
>>> notice that being changed anywhere (yet)..
>>
>> Hmm, the pptr set does some accounting for the extra attrs in create, move
>> and remove operations, but I don't think there's any new adjustments for
>> remote attribute blocks.  I will look into that.  Thx!
>>
>>>
>>>>    fs/xfs/Makefile                |   2 +
>>>>    fs/xfs/libxfs/xfs_attr.c       |   5 +-
>>>>    fs/xfs/libxfs/xfs_attr.h       |  25 ++
>>>>    fs/xfs/libxfs/xfs_defer.c      |   1 +
>>>>    fs/xfs/libxfs/xfs_defer.h      |   3 +
>>>>    fs/xfs/libxfs/xfs_log_format.h |  44 +++-
>>>>    fs/xfs/libxfs/xfs_types.h      |   1 +
>>>>    fs/xfs/xfs_attr_item.c         | 558 +++++++++++++++++++++++++++++++++++++++++
>>>>    fs/xfs/xfs_attr_item.h         | 103 ++++++++
>>>>    fs/xfs/xfs_log_recover.c       | 172 +++++++++++++
>>>>    fs/xfs/xfs_ondisk.h            |   2 +
>>>>    fs/xfs/xfs_trans.h             |  10 +
>>>>    fs/xfs/xfs_trans_attr.c        | 240 ++++++++++++++++++
>>>>    13 files changed, 1162 insertions(+), 4 deletions(-)
>>>>
>>> ...
>>>> diff --git a/fs/xfs/xfs_attr_item.c b/fs/xfs/xfs_attr_item.c
>>>> new file mode 100644
>>>> index 0000000..0ea19b4
>>>> --- /dev/null
>>>> +++ b/fs/xfs/xfs_attr_item.c
>>>> @@ -0,0 +1,558 @@
>>>> +// SPDX-License-Identifier: GPL-2.0+
>>>> +/*
>>>> + * Copyright (C) 2019 Oracle.  All Rights Reserved.
>>>> + * Author: Allison Henderson <allison.henderson@oracle.com>
>>>> + */
>>>> +#include "xfs.h"
>>>> +#include "xfs_fs.h"
>>>> +#include "xfs_format.h"
>>>> +#include "xfs_log_format.h"
>>>> +#include "xfs_trans_resv.h"
>>>> +#include "xfs_bit.h"
>>>> +#include "xfs_mount.h"
>>>> +#include "xfs_trans.h"
>>>> +#include "xfs_trans_priv.h"
>>>> +#include "xfs_buf_item.h"
>>>> +#include "xfs_attr_item.h"
>>>> +#include "xfs_log.h"
>>>> +#include "xfs_btree.h"
>>>> +#include "xfs_rmap.h"
>>>> +#include "xfs_inode.h"
>>>> +#include "xfs_icache.h"
>>>> +#include "xfs_attr.h"
>>>> +#include "xfs_shared.h"
>>>> +#include "xfs_da_format.h"
>>>> +#include "xfs_da_btree.h"
>>>> +
>>>> +static inline struct xfs_attri_log_item *ATTRI_ITEM(struct xfs_log_item *lip)
>>>> +{
>>>> +	return container_of(lip, struct xfs_attri_log_item, item);
>>>> +}
>>>> +
>>>> +void
>>>> +xfs_attri_item_free(
>>>> +	struct xfs_attri_log_item	*attrip)
>>>> +{
>>>> +	kmem_free(attrip->item.li_lv_shadow);
>>>> +	kmem_free(attrip);
>>>> +}
>>>> +
>>>> +/*
>>>> + * This returns the number of iovecs needed to log the given attri item.
>>>> + * We only need 1 iovec for an attri item.  It just logs the attr_log_format
>>>> + * structure.
>>>> + */
>>>> +static inline int
>>>> +xfs_attri_item_sizeof(
>>>> +	struct xfs_attri_log_item *attrip)
>>>> +{
>>>> +	return sizeof(struct xfs_attri_log_format);
>>>> +}
>>>> +
>>>> +STATIC void
>>>> +xfs_attri_item_size(
>>>> +	struct xfs_log_item	*lip,
>>>> +	int			*nvecs,
>>>> +	int			*nbytes)
>>>> +{
>>>> +	struct xfs_attri_log_item       *attrip = ATTRI_ITEM(lip);
>>>> +
>>>> +	*nvecs += 1;
>>>> +	*nbytes += xfs_attri_item_sizeof(attrip);
>>>> +
>>>> +	if (attrip->name_len > 0) {
>>>> +		*nvecs += 1;
>>>> +		*nbytes += ATTR_NVEC_SIZE(attrip->name_len);
>>>> +	}
>>>> +
>>>> +	if (attrip->value_len > 0) {
>>>> +		*nvecs += 1;
>>>> +		*nbytes += ATTR_NVEC_SIZE(attrip->value_len);
>>>> +	}
>>>> +}
>>>> +
>>>> +/*
>>>> + * This is called to fill in the vector of log iovecs for the
>>>> + * given attri log item. We use only 1 iovec, and we point that
>>>> + * at the attri_log_format structure embedded in the attri item.
>>>> + * It is at this point that we assert that all of the attr
>>>> + * slots in the attri item have been filled.
>>>> + */
>>>
>>> I see a bunch of places throughout this patch such as above where the
>>> line length formatting looks inconsistent. The above comment should be
>>> widened to 80 chars. I'm sure much of this code was boilerplate brought
>>> over from other log items and such, but we should take the opportunity
>>> to properly format the new code we're adding.
>> Yes, I loosely modeled it on the EFI code at the time.  I will go through
>> and do some cleanup of the line lengths.
>>
>>>
>>>> +STATIC void
>>>> +xfs_attri_item_format(
>>>> +	struct xfs_log_item	*lip,
>>>> +	struct xfs_log_vec	*lv)
>>>> +{
>>>> +	struct xfs_attri_log_item	*attrip = ATTRI_ITEM(lip);
>>>> +	struct xfs_log_iovec	*vecp = NULL;
>>>> +
>>>> +	attrip->format.alfi_type = XFS_LI_ATTRI;
>>>> +	attrip->format.alfi_size = 1;
>>>> +	if (attrip->name_len > 0)
>>>> +		attrip->format.alfi_size++;
>>>> +	if (attrip->value_len > 0)
>>>> +		attrip->format.alfi_size++;
>>>> +
>>>
>>> I'd move these alfi_size updates to the equivalent if checks below.
>> Alrighty, will do
>>
>>>
>>>> +	xlog_copy_iovec(lv, &vecp, XLOG_REG_TYPE_ATTRI_FORMAT,
>>>> +			&attrip->format,
>>>> +			xfs_attri_item_sizeof(attrip));
>>>> +	if (attrip->name_len > 0)
>>>> +		xlog_copy_iovec(lv, &vecp, XLOG_REG_TYPE_ATTR_NAME,
>>>> +				attrip->name, ATTR_NVEC_SIZE(attrip->name_len));
>>>> +
>>>> +	if (attrip->value_len > 0)
>>>> +		xlog_copy_iovec(lv, &vecp, XLOG_REG_TYPE_ATTR_VALUE,
>>>> +				attrip->value,
>>>> +				ATTR_NVEC_SIZE(attrip->value_len));
>>>> +}
>>>> +
>>>> +
>>>> +/*
>>>> + * Pinning has no meaning for an attri item, so just return.
>>>> + */
>>>> +STATIC void
>>>> +xfs_attri_item_pin(
>>>> +	struct xfs_log_item	*lip)
>>>> +{
>>>> +}
>>>> +
>>>> +/*
>>>> + * The unpin operation is the last place an ATTRI is manipulated in the log. It
>>>> + * is either inserted in the AIL or aborted in the event of a log I/O error. In
>>>> + * either case, the ATTRI transaction has been successfully committed to make it
>>>> + * this far. Therefore, we expect whoever committed the ATTRI to either
>>>> + * construct and commit the ATTRD or drop the ATTRD's reference in the event of
>>>> + * error. Simply drop the log's ATTRI reference now that the log is done with
>>>> + * it.
>>>> + */
>>>> +STATIC void
>>>> +xfs_attri_item_unpin(
>>>> +	struct xfs_log_item	*lip,
>>>> +	int			remove)
>>>> +{
>>>> +	struct xfs_attri_log_item	*attrip = ATTRI_ITEM(lip);
>>>> +
>>>> +	xfs_attri_release(attrip);
>>>> +}
>>>> +
>>>> +/*
>>>> + * attri items have no locking or pushing.  However, since ATTRIs are pulled
>>>> + * from the AIL when their corresponding ATTRDs are committed to disk, their
>>>> + * situation is very similar to being pinned.  Return XFS_ITEM_PINNED so that
>>>> + * the caller will eventually flush the log.  This should help in getting the
>>>> + * ATTRI out of the AIL.
>>>> + */
>>>> +STATIC uint
>>>> +xfs_attri_item_push(
>>>> +	struct xfs_log_item	*lip,
>>>> +	struct list_head	*buffer_list)
>>>> +{
>>>> +	return XFS_ITEM_PINNED;
>>>> +}
>>>> +
>>>> +/*
>>>> + * The ATTRI has been either committed or aborted if the transaction has been
>>>> + * cancelled. If the transaction was cancelled, an ATTRD isn't going to be
>>>> + * constructed and thus we free the ATTRI here directly.
>>>> + */
>>>> +STATIC void
>>>> +xfs_attri_item_unlock(
>>>> +	struct xfs_log_item	*lip)
>>>> +{
>>>> +	if (test_bit(XFS_LI_ABORTED, &lip->li_flags))
>>>> +		xfs_attri_release(ATTRI_ITEM(lip));
>>>> +}
>>>> +
>>>> +/*
>>>> + * The ATTRI is logged only once and cannot be moved in the log, so simply
>>>> + * return the lsn at which it's been logged.
>>>> + */
>>>> +STATIC xfs_lsn_t
>>>> +xfs_attri_item_committed(
>>>> +	struct xfs_log_item	*lip,
>>>> +	xfs_lsn_t		lsn)
>>>> +{
>>>> +	return lsn;
>>>> +}
>>>> +
>>>> +STATIC void
>>>> +xfs_attri_item_committing(
>>>> +	struct xfs_log_item	*lip,
>>>> +	xfs_lsn_t		lsn)
>>>> +{
>>>> +}
>>>> +
>>>> +/*
>>>> + * This is the ops vector shared by all attri log items.
>>>> + */
>>>> +static const struct xfs_item_ops xfs_attri_item_ops = {
>>>> +	.iop_size	= xfs_attri_item_size,
>>>> +	.iop_format	= xfs_attri_item_format,
>>>> +	.iop_pin	= xfs_attri_item_pin,
>>>> +	.iop_unpin	= xfs_attri_item_unpin,
>>>> +	.iop_unlock	= xfs_attri_item_unlock,
>>>> +	.iop_committed	= xfs_attri_item_committed,
>>>> +	.iop_push	= xfs_attri_item_push,
>>>> +	.iop_committing = xfs_attri_item_committing
>>>> +};
>>>> +
>>>> +
>>>> +/*
>>>> + * Allocate and initialize an attri item
>>>> + */
>>>> +struct xfs_attri_log_item *
>>>> +xfs_attri_init(
>>>> +	struct xfs_mount	*mp)
>>>> +
>>>> +{
>>>> +	struct xfs_attri_log_item	*attrip;
>>>> +	uint			size;
>>>> +
>>>> +	size = (uint)(sizeof(struct xfs_attri_log_item));
>>>> +	attrip = kmem_zalloc(size, KM_SLEEP);
>>>> +
>>>> +	xfs_log_item_init(mp, &(attrip->item), XFS_LI_ATTRI,
>>>> +			  &xfs_attri_item_ops);
>>>
>>> No need for those parentheses around attrip->item, and with those removed we
>>> can reduce this to a single line.
>>>
>>>> +	attrip->format.alfi_id = (uintptr_t)(void *)attrip;
>>>> +	atomic_set(&attrip->refcount, 2);
>>>> +
>>>> +	return attrip;
>>>> +}
>>>> +
>>>> +/*
>>>> + * Copy an attr format buffer from the given buf, and into the destination
>>>> + * attr format structure.
>>>> + */
>>>> +int
>>>> +xfs_attri_copy_format(struct xfs_log_iovec *buf,
>>>> +		      struct xfs_attri_log_format *dst_attr_fmt)
>>>> +{
>>>> +	struct xfs_attri_log_format *src_attr_fmt = buf->i_addr;
>>>> +	uint len = sizeof(struct xfs_attri_log_format);
>>>> +
>>>> +	if (buf->i_len == len) {
>>>> +		memcpy((char *)dst_attr_fmt, (char *)src_attr_fmt, len);
>>>> +		return 0;
>>>> +	}
>>>> +	return -EFSCORRUPTED;
>>>
>>> Can we invert the logic flow here (and below)? I.e.,
>>>
>>> 	...
>>> 	if (buf->i_len != len)
>>> 		return -EFSCORRUPTED;
>>> 	memcpy(...);
>>> 	return 0;
>> Sure, I think that looks simpler too.
>>
>>>
>>>> +}
>>>> +
>>>> +/*
>>>> + * Copy an attr format buffer from the given buf, and into the destination
>>>> + * attr format structure.
>>>> + */
>>>> +int
>>>> +xfs_attrd_copy_format(struct xfs_log_iovec *buf,
>>>> +		      struct xfs_attrd_log_format *dst_attr_fmt)
>>>> +{
>>>> +	struct xfs_attrd_log_format *src_attr_fmt = buf->i_addr;
>>>> +	uint len = sizeof(struct xfs_attrd_log_format);
>>>> +
>>>> +	if (buf->i_len == len) {
>>>> +		memcpy((char *)dst_attr_fmt, (char *)src_attr_fmt, len);
>>>> +		return 0;
>>>> +	}
>>>> +	return -EFSCORRUPTED;
>>>> +}
>>>> +
>>>
>>> This function appears to be unused. The recover code looks like it just
>>> casts the iovec buffer directly to an attrd_log_format to determine the
>>> id.
>> Ok, I will see if I can take it out then.
>>
>>>
>>>> +/*
>>>> + * Freeing the attrip requires that we remove it from the AIL if it has already
>>>> + * been placed there. However, the ATTRI may not yet have been placed in the
>>>> + * AIL when called by xfs_attri_release() from ATTRD processing due to the
>>>> + * ordering of committed vs unpin operations in bulk insert operations. Hence
>>>> + * the reference count to ensure only the last caller frees the ATTRI.
>>>> + */
>>>> +void
>>>> +xfs_attri_release(
>>>> +	struct xfs_attri_log_item	*attrip)
>>>> +{
>>>> +	ASSERT(atomic_read(&attrip->refcount) > 0);
>>>> +	if (atomic_dec_and_test(&attrip->refcount)) {
>>>> +		xfs_trans_ail_remove(&attrip->item, SHUTDOWN_LOG_IO_ERROR);
>>>> +		xfs_attri_item_free(attrip);
>>>> +	}
>>>> +}
>>>> +
>>>> +static inline struct xfs_attrd_log_item *ATTRD_ITEM(struct xfs_log_item *lip)
>>>> +{
>>>> +	return container_of(lip, struct xfs_attrd_log_item, item);
>>>> +}
>>>> +
>>>> +STATIC void
>>>> +xfs_attrd_item_free(struct xfs_attrd_log_item *attrdp)
>>>> +{
>>>> +	kmem_free(attrdp->item.li_lv_shadow);
>>>> +	kmem_free(attrdp);
>>>> +}
>>>> +
>>>> +/*
>>>> + * This returns the number of iovecs needed to log the given attrd item.
>>>> + * We only need 1 iovec for an attrd item.  It just logs the attr_log_format
>>>> + * structure.
>>>> + */
>>>> +static inline int
>>>> +xfs_attrd_item_sizeof(
>>>> +	struct xfs_attrd_log_item *attrdp)
>>>> +{
>>>> +	return sizeof(struct xfs_attrd_log_format);
>>>> +}
>>>> +
>>>> +STATIC void
>>>> +xfs_attrd_item_size(
>>>> +	struct xfs_log_item	*lip,
>>>> +	int			*nvecs,
>>>> +	int			*nbytes)
>>>> +{
>>>> +	struct xfs_attrd_log_item	*attrdp = ATTRD_ITEM(lip);
>>>> +	*nvecs += 1;
>>>> +	*nbytes += xfs_attrd_item_sizeof(attrdp);
>>>> +}
>>>> +
>>>> +/*
>>>> + * This is called to fill in the vector of log iovecs for the
>>>> + * given attrd log item. We use only 1 iovec, and we point that
>>>> + * at the attr_log_format structure embedded in the attrd item.
>>>> + * It is at this point that we assert that all of the attr
>>>> + * slots in the attrd item have been filled.
>>>> + */
>>>> +STATIC void
>>>> +xfs_attrd_item_format(
>>>> +	struct xfs_log_item	*lip,
>>>> +	struct xfs_log_vec	*lv)
>>>> +{
>>>> +	struct xfs_attrd_log_item	*attrdp = ATTRD_ITEM(lip);
>>>> +	struct xfs_log_iovec	*vecp = NULL;
>>>> +
>>>> +	attrdp->format.alfd_type = XFS_LI_ATTRD;
>>>> +	attrdp->format.alfd_size = 1;
>>>> +
>>>> +	xlog_copy_iovec(lv, &vecp, XLOG_REG_TYPE_ATTRD_FORMAT,
>>>> +			&attrdp->format,
>>>> +			xfs_attrd_item_sizeof(attrdp));
>>>
>>> The above looks like it could be shrunk to 2 lines as well after 80 char
>>> widening. Note that I'm sure I haven't caught all of these, just
>>> pointing out some examples as I notice them.
>>>
>>> FWIW, if you happen to use vim, I sometimes use ':set cc=80' to draw an
>>> 80 char line in the viewer that helps to quickly eyeball new code for
>>> this kind of thing.
>> I do use vim, so this is very helpful!  I will add that to my config.  Thx!
>>
> 
> I mentioned it on IRC, but FYI see the following link for how to easily
> rewrap text in vim as well:
> 
> https://urldefense.proofpoint.com/v2/url?u=https-3A__thoughtbot.com_blog_wrap-2Dexisting-2Dtext-2Dat-2D80-2Dcharacters-2Din-2Dvim&d=DwIBAg&c=RoP1YumCXCgaWHvlZYR8PZh8Bv7qIrMUB65eapI_JnE&r=LHZQ8fHvy6wDKXGTWcm97burZH5sQKHRDMaY1UthQxc&m=IadNbc3hL6EMllQKwLzDsimX5eVtBHXJ2FYI_fgXxVE&s=wi7qV5gfBp2yjMK_3nFvgl8keiq8ZrGpoei9bBc4Sho&e=

Great, I will take a look.  Thx!

> 
>>>
>>>> +}
>>>> +
>>>> +/*
>>>> + * Pinning has no meaning for an attrd item, so just return.
>>>> + */
>>>> +STATIC void
>>>> +xfs_attrd_item_pin(
>>>> +	struct xfs_log_item	*lip)
>>>> +{
>>>> +}
>>>> +
>>>> +/*
>>>> + * Since pinning has no meaning for an attrd item, unpinning does
>>>> + * not either.
>>>> + */
>>>> +STATIC void
>>>> +xfs_attrd_item_unpin(
>>>> +	struct xfs_log_item	*lip,
>>>> +	int			remove)
>>>> +{
>>>> +}
>>>> +
>>>> +/*
>>>> + * There isn't much you can do to push on an attrd item.  It is simply stuck
>>>> + * waiting for the log to be flushed to disk.
>>>> + */
>>>> +STATIC uint
>>>> +xfs_attrd_item_push(
>>>> +	struct xfs_log_item	*lip,
>>>> +	struct list_head	*buffer_list)
>>>> +{
>>>> +	return XFS_ITEM_PINNED;
>>>> +}
>>>> +
>>>> +/*
>>>> + * The ATTRD is either committed or aborted if the transaction is cancelled. If
>>>> + * the transaction is cancelled, drop our reference to the ATTRI and free the
>>>> + * ATTRD.
>>>> + */
>>>> +STATIC void
>>>> +xfs_attrd_item_unlock(
>>>> +	struct xfs_log_item	*lip)
>>>> +{
>>>> +	struct xfs_attrd_log_item	*attrdp = ATTRD_ITEM(lip);
>>>> +
>>>> +	if (test_bit(XFS_LI_ABORTED, &lip->li_flags)) {
>>>> +		xfs_attri_release(attrdp->attrip);
>>>> +		xfs_attrd_item_free(attrdp);
>>>> +	}
>>>> +}
>>>> +
>>>> +/*
>>>> + * When the attrd item is committed to disk, all we need to do is delete our
>>>> + * reference to our partner attri item and then free ourselves. Since we're
>>>> + * freeing ourselves we must return -1 to keep the transaction code from
>>>> + * further referencing this item.
>>>> + */
>>>> +STATIC xfs_lsn_t
>>>> +xfs_attrd_item_committed(
>>>> +	struct xfs_log_item	*lip,
>>>> +	xfs_lsn_t		lsn)
>>>> +{
>>>> +	struct xfs_attrd_log_item	*attrdp = ATTRD_ITEM(lip);
>>>> +
>>>> +	/*
>>>> +	 * Drop the ATTRI reference regardless of whether the ATTRD has been
>>>> +	 * aborted. Once the ATTRD transaction is constructed, it is the sole
>>>> +	 * responsibility of the ATTRD to release the ATTRI (even if the ATTRI
>>>> +	 * is aborted due to log I/O error).
>>>> +	 */
>>>> +	xfs_attri_release(attrdp->attrip);
>>>> +	xfs_attrd_item_free(attrdp);
>>>> +
>>>> +	return (xfs_lsn_t)-1;
>>>> +}
>>>> +
>>>> +STATIC void
>>>> +xfs_attrd_item_committing(
>>>> +	struct xfs_log_item	*lip,
>>>> +	xfs_lsn_t		lsn)
>>>> +{
>>>> +}
>>>> +
>>>> +/*
>>>> + * This is the ops vector shared by all attrd log items.
>>>> + */
>>>> +static const struct xfs_item_ops xfs_attrd_item_ops = {
>>>> +	.iop_size	= xfs_attrd_item_size,
>>>> +	.iop_format	= xfs_attrd_item_format,
>>>> +	.iop_pin	= xfs_attrd_item_pin,
>>>> +	.iop_unpin	= xfs_attrd_item_unpin,
>>>> +	.iop_unlock	= xfs_attrd_item_unlock,
>>>> +	.iop_committed	= xfs_attrd_item_committed,
>>>> +	.iop_push	= xfs_attrd_item_push,
>>>> +	.iop_committing = xfs_attrd_item_committing
>>>> +};
>>>> +
>>>> +/*
>>>> + * Allocate and initialize an attrd item
>>>> + */
>>>> +struct xfs_attrd_log_item *
>>>> +xfs_attrd_init(
>>>> +	struct xfs_mount	*mp,
>>>> +	struct xfs_attri_log_item	*attrip)
>>>> +
>>>> +{
>>>> +	struct xfs_attrd_log_item	*attrdp;
>>>> +	uint			size;
>>>> +
>>>> +	size = (uint)(sizeof(struct xfs_attrd_log_item));
>>>> +	attrdp = kmem_zalloc(size, KM_SLEEP);
>>>> +
>>>> +	xfs_log_item_init(mp, &attrdp->item, XFS_LI_ATTRD,
>>>> +			  &xfs_attrd_item_ops);
>>>> +	attrdp->attrip = attrip;
>>>> +	attrdp->format.alfd_alf_id = attrip->format.alfi_id;
>>>> +
>>>> +	return attrdp;
>>>> +}
>>>> +
>>>> +/*
>>>> + * Process an attr intent item that was recovered from
>>>> + * the log.  We need to delete the attr that it describes.
>>>> + */
>>>
>>> ^^^ :)
>>>
>>>> +int
>>>> +xfs_attri_recover(
>>>> +	struct xfs_mount		*mp,
>>>> +	struct xfs_attri_log_item	*attrip)
>>>> +{
>>>> +	struct xfs_inode		*ip;
>>>> +	struct xfs_attrd_log_item	*attrdp;
>>>> +	struct xfs_da_args		args;
>>>> +	struct xfs_attri_log_format	*attrp;
>>>> +	struct xfs_trans_res		tres;
>>>> +	int				local;
>>>> +	int				error = 0;
>>>> +	int				rsvd = 0;
>>>> +
>>>> +	ASSERT(!test_bit(XFS_ATTRI_RECOVERED, &attrip->flags));
>>>> +
>>>> +	/*
>>>> +	 * First check the validity of the attr described by the
>>>> +	 * ATTRI.  If any are bad, then assume that all are bad and
>>>> +	 * just toss the ATTRI.
>>>> +	 */
>>>> +	attrp = &attrip->format;
>>>> +	if (
>>>> +	    /*
>>>> +	     * Must have either XFS_ATTR_OP_FLAGS_SET or
>>>> +	     * XFS_ATTR_OP_FLAGS_REMOVE set
>>>> +	     */
>>>> +	    !(attrp->alfi_op_flags == XFS_ATTR_OP_FLAGS_SET ||
>>>> +		attrp->alfi_op_flags == XFS_ATTR_OP_FLAGS_REMOVE) ||
>>>> +
>>>> +	    /* Check size of value and name lengths */
>>>> +	    (attrp->alfi_value_len > XATTR_SIZE_MAX ||
>>>> +		attrp->alfi_name_len > XATTR_NAME_MAX) ||
>>>> +
>>>> +	    /*
>>>> +	     * If the XFS_ATTR_OP_FLAGS_SET flag is set,
>>>> +	     * there must also be a name and value
>>>> +	     */
>>>> +	    (attrp->alfi_op_flags == XFS_ATTR_OP_FLAGS_SET &&
>>>> +		(attrp->alfi_value_len == 0 || attrp->alfi_name_len == 0)) ||
>>>
>>> It's been a while since I've played with any attribute stuff, but is
>>> this always the case or can we not have an empty attribute?
>>
>> I remember us having some discussion about this in an older review, wherein
>> we thought all set operations have to have a value.  But after digging
>> around a bit, I think generic 062 does expect that you can set an attribute
>> to nothing.
>>
>> Since the test does not force a recovery, we probably have never encountered
>> the scenario of recovering an attribute with no value. So I think we got
>> away with the alfi_value_len == 0 check even though we should not have.
>>
>> I will adjust the logic here.  Maybe when we get this set finished out, it
>> might be a good idea to have a test case that checks for that?
>>
> 
> Indeed, this is probably a good opportunity to audit our xattr test
> coverage for this kind of thing. In particular, I think we should make
> sure we have good xattr log recovery coverage. I know we have a few
> general log recovery tests, but I'm not sure off the top of my head if
> they perform xattr ops, and if so, whether they'll introduce xattrs
> large enough for remote blocks and whatnot (where this series is most
> notably changing behavior). A new test might be useful to fill any gaps.

Alrighty then, I will see if I can come up with a test case to exercise
some of the new code paths we're adding in here.
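
For reference, a condensed form of the validity check discussed above might look roughly like the following. This is only a userspace sketch: struct attri_fmt, attri_fmt_valid(), the SKETCH_* names and the numeric values given to the op flags are stand-ins made up for illustration, not the patch's definitions. The two limits use the same values as XATTR_NAME_MAX (255) and XATTR_SIZE_MAX (65536) in linux/limits.h. The point is that a set with a zero-length value passes, while a zero-length name is rejected for both operations.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins; the real flag values are defined by the patch. */
#define SKETCH_ATTR_OP_SET	1
#define SKETCH_ATTR_OP_REMOVE	2

/* Same values as XATTR_NAME_MAX / XATTR_SIZE_MAX in linux/limits.h. */
#define SKETCH_NAME_MAX		255
#define SKETCH_VALUE_MAX	65536

/* Hypothetical cut-down version of the logged intent format. */
struct attri_fmt {
	uint32_t	op_flags;
	uint32_t	name_len;
	uint32_t	value_len;
};

/*
 * The op must be exactly set or remove, the name must be non-empty, and
 * both lengths must be within limits.  An empty value is allowed.
 */
static bool attri_fmt_valid(const struct attri_fmt *fmt)
{
	assert(fmt);

	if (fmt->op_flags != SKETCH_ATTR_OP_SET &&
	    fmt->op_flags != SKETCH_ATTR_OP_REMOVE)
		return false;
	if (fmt->name_len == 0 || fmt->name_len > SKETCH_NAME_MAX)
		return false;
	return fmt->value_len <= SKETCH_VALUE_MAX;
}
```

With the check in this shape, the single explanatory comment can sit above the function instead of being interleaved with the conditions.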

> 
>> Thx for the catch!
>>>
>>>> +
>>>> +	    /*
>>>> +	     * If the XFS_ATTR_OP_FLAGS_REMOVE flag is set,
>>>> +	     * there must also be a name
>>>> +	     */
>>>> +	    (attrp->alfi_op_flags == XFS_ATTR_OP_FLAGS_REMOVE &&
>>>> +		(attrp->alfi_name_len == 0))
>>>> +	) {
>>>
>>> Comments are always nice of course, but interspersed with logic like
>>> this makes the whole thing hard to read. I'd suggest to just generalize
>>> the comment to include whatever things are non-obvious, condense the if
>>> logic and leave the comment above it.
>> Ok, I think probably we only need to check namelen anyway based off the
>> above observation too.
>>
>>>
>>>> +		/*
>>>> +		 * This will pull the ATTRI from the AIL and
>>>> +		 * free the memory associated with it.
>>>> +		 */
>>>> +		set_bit(XFS_ATTRI_RECOVERED, &attrip->flags);
>>>> +		xfs_attri_release(attrip);
>>>> +		return -EIO;
>>>> +	}
>>>> +
>>>> +	attrp = &attrip->format;
>>>> +	error = xfs_iget(mp, 0, attrp->alfi_ino, 0, 0, &ip);
>>>> +	if (error)
>>>> +		return error;
>>>> +
>>>> +	error = xfs_attr_args_init(&args, ip, attrip->name,
>>>> +			attrp->alfi_name_len, attrp->alfi_attr_flags);
>>>> +	if (error)
>>>> +		return error;
>>>> +
>>>> +	args.hashval = xfs_da_hashname(args.name, args.namelen);
>>>> +	args.value = attrip->value;
>>>> +	args.valuelen = attrp->alfi_value_len;
>>>> +	args.op_flags = XFS_DA_OP_OKNOENT;
>>>> +	args.total = xfs_attr_calc_size(&args, &local);
>>>> +
>>>> +	tres.tr_logres = M_RES(mp)->tr_attrsetm.tr_logres +
>>>> +			M_RES(mp)->tr_attrsetrt.tr_logres * args.total;
>>>> +	tres.tr_logcount = XFS_ATTRSET_LOG_COUNT;
>>>> +	tres.tr_logflags = XFS_TRANS_PERM_LOG_RES;
>>>> +
>>>> +	error = xfs_trans_alloc(mp, &tres, args.total,  0,
>>>> +				rsvd ? XFS_TRANS_RESERVE : 0, &args.trans);
>>>> +	if (error)
>>>> +		return error;
>>>> +	attrdp = xfs_trans_get_attrd(args.trans, attrip);
>>>> +
>>>> +	xfs_ilock(ip, XFS_ILOCK_EXCL);
>>>> +
>>>> +	xfs_trans_ijoin(args.trans, ip, 0);
>>>> +	error = xfs_trans_attr(&args, attrdp, attrp->alfi_op_flags);
>>>> +	if (error)
>>>> +		goto abort_error;
>>>> +
>>>> +
>>>> +	set_bit(XFS_ATTRI_RECOVERED, &attrip->flags);
>>>> +	xfs_trans_log_inode(args.trans, ip, XFS_ILOG_CORE | XFS_ILOG_ADATA);
>>>> +	error = xfs_trans_commit(args.trans);
>>>> +	xfs_iunlock(ip, XFS_ILOCK_EXCL);
>>>> +	return error;
>>>> +
>>>> +abort_error:
>>>> +	xfs_trans_cancel(args.trans);
>>>> +	xfs_iunlock(ip, XFS_ILOCK_EXCL);
>>>> +	return error;
>>>> +}
>>>> diff --git a/fs/xfs/xfs_attr_item.h b/fs/xfs/xfs_attr_item.h
>>>> new file mode 100644
>>>> index 0000000..fce7515
>>>> --- /dev/null
>>>> +++ b/fs/xfs/xfs_attr_item.h
>>>> @@ -0,0 +1,103 @@
>>>> +// SPDX-License-Identifier: GPL-2.0+
>>>> +/*
>>>> + * Copyright (C) 2019 Oracle.  All Rights Reserved.
>>>> + * Author: Allison Henderson <allison.henderson@oracle.com>
>>>> + */
>>>> +#ifndef	__XFS_ATTR_ITEM_H__
>>>> +#define	__XFS_ATTR_ITEM_H__
>>>> +
>>>> +/* kernel only ATTRI/ATTRD definitions */
>>>> +
>>>> +struct xfs_mount;
>>>> +struct kmem_zone;
>>>> +
>>>> +/*
>>>> + * Max number of attrs in fast allocation path.
>>>> + */
>>>> +#define XFS_ATTRI_MAX_FAST_ATTRS        1
>>>> +
>>>> +
>>>> +/*
>>>> + * Define ATTR flag bits. Manipulated by set/clear/test_bit operators.
>>>> + */
>>>> +#define	XFS_ATTRI_RECOVERED	1
>>>> +
>>>> +
>>>> +/* nvecs must be in multiples of 4 */
>>>> +#define ATTR_NVEC_SIZE(size) (size == sizeof(int32_t) ? sizeof(int32_t) : \
>>>> +				size + sizeof(int32_t) - \
>>>> +				(size % sizeof(int32_t)))
>>>> +
>>>
>>> Why? Also, any reason we couldn't use round_up() or some such here?
>> There's an assertion that checks for this in the recovery.  Without this
>> padding I can quickly recreate it:
>>
>> Assertion failed: reg->i_len % sizeof(int32_t) == 0, file: fs/xfs/xfs_log.c,
>> line: 2484
>>
>> It wasn't entirely clear from the context as to why, I assumed it must be
>> something to do with not wanting log items falling onto oddball byte
>> alignments?
>>
> 
> Yeah, probably something like that :P. I figured it was here for a
> reason, that reason just wasn't clear to me. TBH I'm still not
> explicitly sure without digging around further, but that assert at least
> documents that the log writing infrastructure expects 32-bit aligned
> iovec lengths. I guess it makes sense that we'd need to do that here
> where name/value lengths are byte aligned (and user defined) and most
> other log item sizes are based on (presumably) properly sized data
> structures.
> 
> Could we update the comment a bit? For example:
> 
> 	/* iovec length must be 32-bit aligned */
Sure, will do :-)
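
As a quick illustration of the round_up() question, here is a small userspace sketch comparing the macro as posted with a plain round-up to a multiple of four. attr_iovec_len() is a hypothetical helper name, and it is written by hand here rather than with the kernel's round_up() so that it builds outside the kernel. Both forms always produce a length that satisfies the "i_len % sizeof(int32_t) == 0" assertion. The difference is that the posted macro adds a full extra four bytes to sizes that are already aligned (other than exactly 4), so 8 becomes 12, whereas the plain round-up leaves 8 as 8.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* The macro as posted in the patch, with the argument parenthesized. */
#define ATTR_NVEC_SIZE(size) \
	((size) == sizeof(int32_t) ? sizeof(int32_t) : \
	 (size) + sizeof(int32_t) - ((size) % sizeof(int32_t)))

/* Hypothetical helper: round len up to the next multiple of 4. */
static inline size_t attr_iovec_len(size_t len)
{
	const size_t align = sizeof(int32_t);

	/* Valid because align is a power of two. */
	return (len + align - 1) & ~(align - 1);
}
```

Whether the tighter rounding can be dropped in depends on the recovery side computing the same lengths, so treat this as a comparison of the arithmetic only.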

> 
>>
>>>
>>>> +/*
>>>> + * This is the "attr intention" log item.  It is used to log the fact
>>>> + * that some attrs need to be processed.  It is used in conjunction with the
>>>> + * "attr done" log item described below.
>>>> + *
>>>> + * The ATTRI is reference counted so that it is not freed prior to both the
>>>> + * ATTRI and ATTRD being committed and unpinned. This ensures the ATTRI is
>>>> + * inserted into the AIL even in the event of out of order ATTRI/ATTRD
>>>> + * processing. In other words, an ATTRI is born with two references:
>>>> + *
>>>> + *      1.) an ATTRI held reference to track ATTRI AIL insertion
>>>> + *      2.) an ATTRD held reference to track ATTRD commit
>>>> + *
>>>> + * On allocation, both references are the responsibility of the caller. Once
>>>> + * the ATTRI is added to and dirtied in a transaction, ownership of reference
>>>> + * one transfers to the transaction. The reference is dropped once the ATTRI is
>>>> + * inserted to the AIL or in the event of failure along the way (e.g., commit
>>>> + * failure, log I/O error, etc.). Note that the caller remains responsible for
>>>> + * the ATTRD reference under all circumstances to this point. The caller has no
>>>> + * means to detect failure once the transaction is committed, however.
>>>> + * Therefore, an ATTRD is required after this point, even in the event of
>>>> + * unrelated failure.
>>>> + *
>>>> + * Once an ATTRD is allocated and dirtied in a transaction, reference two
>>>> + * transfers to the transaction. The ATTRD reference is dropped once it reaches
>>>> + * the unpin handler. Similar to the ATTRI, the reference also drops in the
>>>> + * event of commit failure or log I/O errors. Note that the ATTRD is not
>>>> + * inserted in the AIL, so at this point both the ATTRI and ATTRD are freed.
>>>> + */
>>>> +struct xfs_attri_log_item {
>>>> +	xfs_log_item_t			item;
>>>> +	atomic_t			refcount;
>>>> +	unsigned long			flags;	/* misc flags */
>>>> +	int				name_len;
>>>> +	void				*name;
>>>> +	int				value_len;
>>>> +	void				*value;
>>>> +	struct xfs_attri_log_format	format;
>>>> +};
>>>
>>> I think we usually try to use field prefix names in these various
>>> structures (as you've done in other places). I.e., attri_item,
>>> attrd_item, etc. would probably be consistent with similar structures
>>> like the efi/efd log items.
>> Sure, I can tack on the attri_* prefix here
>>
>>>
>>>> +
>>>> +/*
>>>> + * This is the "attr done" log item.  It is used to log
>>>> + * the fact that some attrs earlier mentioned in an attri item
>>>> + * have been freed.
>>>> + */
>>>> +struct xfs_attrd_log_item {
>>>> +	struct xfs_log_item		item;
>>>> +	struct xfs_attri_log_item	*attrip;
>>>> +	struct xfs_attrd_log_format	format;
>>>> +};
>>>> +
>>>> +/*
>>>> + * Max number of attrs in fast allocation path.
>>>> + */
>>>> +#define	XFS_ATTRD_MAX_FAST_ATTRS	1
>>>> +
>>>> +extern struct kmem_zone	*xfs_attri_zone;
>>>> +extern struct kmem_zone	*xfs_attrd_zone;
>>>> +
>>>> +struct xfs_attri_log_item	*xfs_attri_init(struct xfs_mount *mp);
>>>> +struct xfs_attrd_log_item	*xfs_attrd_init(struct xfs_mount *mp,
>>>> +					struct xfs_attri_log_item *attrip);
>>>> +int xfs_attri_copy_format(struct xfs_log_iovec *buf,
>>>> +			   struct xfs_attri_log_format *dst_attri_fmt);
>>>> +int xfs_attrd_copy_format(struct xfs_log_iovec *buf,
>>>> +			   struct xfs_attrd_log_format *dst_attrd_fmt);
>>>> +void			xfs_attri_item_free(struct xfs_attri_log_item *attrip);
>>>> +void			xfs_attri_release(struct xfs_attri_log_item *attrip);
>>>> +
>>>> +int			xfs_attri_recover(struct xfs_mount *mp,
>>>> +					struct xfs_attri_log_item *attrip);
>>>> +
>>>> +#endif	/* __XFS_ATTR_ITEM_H__ */
>>> ...
>>>> diff --git a/fs/xfs/xfs_trans_attr.c b/fs/xfs/xfs_trans_attr.c
>>>> new file mode 100644
>>>> index 0000000..3679348
>>>> --- /dev/null
>>>> +++ b/fs/xfs/xfs_trans_attr.c
>>>> @@ -0,0 +1,240 @@
>>>> +// SPDX-License-Identifier: GPL-2.0+
>>>> +/*
>>>> + * Copyright (C) 2019 Oracle.  All Rights Reserved.
>>>> + * Author: Allison Henderson <allison.henderson@oracle.com>
>>>> + */
>>>> +#include "xfs.h"
>>>> +#include "xfs_fs.h"
>>>> +#include "xfs_shared.h"
>>>> +#include "xfs_format.h"
>>>> +#include "xfs_log_format.h"
>>>> +#include "xfs_trans_resv.h"
>>>> +#include "xfs_bit.h"
>>>> +#include "xfs_mount.h"
>>>> +#include "xfs_defer.h"
>>>> +#include "xfs_trans.h"
>>>> +#include "xfs_trans_priv.h"
>>>> +#include "xfs_attr_item.h"
>>>> +#include "xfs_alloc.h"
>>>> +#include "xfs_bmap.h"
>>>> +#include "xfs_trace.h"
>>>> +#include "libxfs/xfs_da_format.h"
>>>> +#include "xfs_da_btree.h"
>>>> +#include "xfs_attr.h"
>>>> +#include "xfs_inode.h"
>>>> +#include "xfs_icache.h"
>>>> +#include "xfs_quota.h"
>>>> +
>>>> +/*
>>>> + * This routine is called to allocate an "attr free done"
>>>> + * log item.
>>>> + */
>>>> +struct xfs_attrd_log_item *
>>>> +xfs_trans_get_attrd(struct xfs_trans		*tp,
>>>> +		  struct xfs_attri_log_item	*attrip)
>>>> +{
>>>> +	struct xfs_attrd_log_item			*attrdp;
>>>> +
>>>> +	ASSERT(tp != NULL);
>>>> +
>>>> +	attrdp = xfs_attrd_init(tp->t_mountp, attrip);
>>>> +	ASSERT(attrdp != NULL);
>>>> +
>>>> +	/*
>>>> +	 * Get a log_item_desc to point at the new item.
>>>> +	 */
>>>> +	xfs_trans_add_item(tp, &attrdp->item);
>>>> +	return attrdp;
>>>> +}
>>>> +
>>>> +/*
>>>> + * Delete an attr and log it to the ATTRD. Note that the transaction is marked
>>>> + * dirty regardless of whether the attr delete succeeds or fails to support the
>>>> + * ATTRI/ATTRD lifecycle rules.
>>>> + */
>>>> +int
>>>> +xfs_trans_attr(
>>>> +	struct xfs_da_args		*args,
>>>> +	struct xfs_attrd_log_item	*attrdp,
>>>> +	uint32_t			op_flags)
>>>> +{
>>>> +	int				error;
>>>> +	struct xfs_buf			*leaf_bp = NULL;
>>>> +
>>>> +	error = xfs_qm_dqattach_locked(args->dp, 0);
>>>> +	if (error)
>>>> +		return error;
>>>> +
>>>> +	switch (op_flags) {
>>>> +	case XFS_ATTR_OP_FLAGS_SET:
>>>> +		args->op_flags |= XFS_DA_OP_ADDNAME;
>>>> +		error = xfs_attr_set_args(args, &leaf_bp, false);
>>>> +		break;
>>>> +	case XFS_ATTR_OP_FLAGS_REMOVE:
>>>> +		ASSERT(XFS_IFORK_Q((args->dp)));
>>>> +		error = xfs_attr_remove_args(args, false);
>>>> +		break;
>>>> +	default:
>>>> +		error = -EFSCORRUPTED;
>>>> +	}
>>>> +
>>>> +	if (error) {
>>>> +		if (leaf_bp)
>>>> +			xfs_trans_brelse(args->trans, leaf_bp);
>>>> +	}
>>>> +
>>>> +	/*
>>>> +	 * Mark the transaction dirty, even on error. This ensures the
>>>> +	 * transaction is aborted, which:
>>>> +	 *
>>>> +	 * 1.) releases the ATTRI and frees the ATTRD
>>>> +	 * 2.) shuts down the filesystem
>>>> +	 */
>>>> +	args->trans->t_flags |= XFS_TRANS_DIRTY;
>>>> +	set_bit(XFS_LI_DIRTY, &attrdp->item.li_flags);
>>>> +
>>>> +	attrdp->attrip->name = (void *)args->name;
>>>> +	attrdp->attrip->value = (void *)args->value;
>>>> +	attrdp->attrip->name_len = args->namelen;
>>>> +	attrdp->attrip->value_len = args->valuelen;
>>>> +
>>>
>>> What's the reason for updating the attri here? It's already been
>>> committed by the time we get around to the attrd. Is this used again
>>> somewhere?
>> I think I may have observed it in other code I was using as a model at the
>> time. It seems to be able to get along without it though, so I don't think
>> it's used again.  I will go ahead and take it out.
>>
>>>
>>>> +	return error;
>>>> +}
>>>> +
>>>> +static int
>>>> +xfs_attr_diff_items(
>>>> +	void				*priv,
>>>> +	struct list_head		*a,
>>>> +	struct list_head		*b)
>>>> +{
>>>> +	return 0;
>>>> +}
>>>> +
>>>> +/* Get an ATTRI. */
>>>> +STATIC void *
>>>> +xfs_attr_create_intent(
>>>> +	struct xfs_trans		*tp,
>>>> +	unsigned int			count)
>>>> +{
>>>> +	struct xfs_attri_log_item		*attrip;
>>>> +
>>>> +	ASSERT(tp != NULL);
>>>> +	ASSERT(count == 1);
>>>> +
>>>> +	attrip = xfs_attri_init(tp->t_mountp);
>>>> +	ASSERT(attrip != NULL);
>>>> +
>>>> +	/*
>>>> +	 * Get a log_item_desc to point at the new item.
>>>> +	 */
>>>> +	xfs_trans_add_item(tp, &attrip->item);
>>>> +	return attrip;
>>>> +}
>>>> +
>>>> +/* Log an attr to the intent item. */
>>>> +STATIC void
>>>> +xfs_attr_log_item(
>>>> +	struct xfs_trans		*tp,
>>>> +	void				*intent,
>>>> +	struct list_head		*item)
>>>> +{
>>>> +	struct xfs_attri_log_item	*attrip = intent;
>>>> +	struct xfs_attr_item		*attr;
>>>> +	struct xfs_attri_log_format	*attrp;
>>>> +	char				*name_value;
>>>> +
>>>> +	attr = container_of(item, struct xfs_attr_item, xattri_list);
>>>> +	name_value = ((char *)attr) + sizeof(struct xfs_attr_item);
>>>> +
>>>> +	tp->t_flags |= XFS_TRANS_DIRTY;
>>>> +	set_bit(XFS_LI_DIRTY, &attrip->item.li_flags);
>>>> +
>>>> +	attrp = &attrip->format;
>>>> +	attrp->alfi_ino = attr->xattri_ip->i_ino;
>>>> +	attrp->alfi_op_flags = attr->xattri_op_flags;
>>>> +	attrp->alfi_value_len = attr->xattri_value_len;
>>>> +	attrp->alfi_name_len = attr->xattri_name_len;
>>>> +	attrp->alfi_attr_flags = attr->xattri_flags;
>>>> +
>>>> +	attrip->name = name_value;
>>>> +	attrip->value = &name_value[attr->xattri_name_len];
>>>> +	attrip->name_len = attr->xattri_name_len;
>>>> +	attrip->value_len = attr->xattri_value_len;
>>>
>>> So once we're at this point, we've constructed an xfs_attr_item to
>>> describe the high level deferred operation, created an intent log item
>>> and we're now logging that xfs_attri_log_item. We fill in the log format
>>> structure based on the xfs_attr_item and point the xfs_attri_log_item
>>> name/value pointers at the xfs_attr_item memory. It's thus important to
>>> note we've established a subtle relationship between these two data
>>> structures because they may have different lifecycles.
>>
>> Right, I can add some comments if you like?  I guess I assume people have
>> seen these patterns enough to not need them, but the extra explaining never
>> hurts I suppose :-)
>>
> 
> No need to re-document the common patterns, but I think that the log
> item pointing at the attr item memory is fairly unique and subtle. It's
> not self-documenting because the attr structure isn't reference counted
> or anything, it's just allocated and supplied into the dfops
> infrastructure for consumption. A comment somewhere to document that
> dependency would definitely be useful, either in the header or where
> those pointers are set/cleared.
> 
> Brian
> 

Sure, that makes sense.  I will add in some commentary to help point that
out then.  Thanks!
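
To make the dependency concrete, here is a rough userspace sketch of the pattern being described: the name and value live in the same allocation as the attr item, the intent item only borrows pointers into that memory, and one possible cleanup is to clear the borrowed pointers before the backing memory is freed. All of the struct and function names below are hypothetical stand-ins, not the kernel structures, and malloc-style allocation is used in place of kmem_alloc().

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for xfs_attr_item: header, then name, then value. */
struct attr_item {
	size_t		name_len;
	size_t		value_len;
};

/* Hypothetical stand-in for the intent item's borrowed pointers. */
struct attri_item {
	const void	*name;
	const void	*value;
	size_t		name_len;
	size_t		value_len;
};

/* The name/value bytes start right after the header. */
static char *attr_item_payload(struct attr_item *attr)
{
	return (char *)attr + sizeof(*attr);
}

/* One allocation holds the header plus a copy of the name and value. */
static struct attr_item *attr_item_alloc(const void *name, size_t name_len,
					 const void *value, size_t value_len)
{
	struct attr_item *attr;

	attr = calloc(1, sizeof(*attr) + name_len + value_len);
	if (!attr)
		return NULL;
	attr->name_len = name_len;
	attr->value_len = value_len;
	if (name_len)
		memcpy(attr_item_payload(attr), name, name_len);
	if (value_len)
		memcpy(attr_item_payload(attr) + name_len, value, value_len);
	return attr;
}

/* Point the intent at the attr item's memory: no copy, no reference. */
static void attri_borrow(struct attri_item *attri, struct attr_item *attr)
{
	attri->name = attr_item_payload(attr);
	attri->value = attr_item_payload(attr) + attr->name_len;
	attri->name_len = attr->name_len;
	attri->value_len = attr->value_len;
}

/* Clear the borrowed pointers before the backing memory goes away. */
static void attr_item_free(struct attr_item *attr, struct attri_item *attri)
{
	attri->name = NULL;
	attri->value = NULL;
	attri->name_len = 0;
	attri->value_len = 0;
	free(attr);
}
```

The alternative raised in the review, allocating the name/value memory separately and handing ownership to the intent item, would trade the extra allocation for not needing the clearing step.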

Allison

>>>
>>>> +}
>>>> +
>>>> +/* Get an ATTRD so we can process all the attrs. */
>>>> +STATIC void *
>>>> +xfs_attr_create_done(
>>>> +	struct xfs_trans		*tp,
>>>> +	void				*intent,
>>>> +	unsigned int			count)
>>>> +{
>>>> +	return xfs_trans_get_attrd(tp, intent);
>>>> +}
>>>> +
>>>> +/* Process an attr. */
>>>> +STATIC int
>>>> +xfs_attr_finish_item(
>>>> +	struct xfs_trans		*tp,
>>>> +	struct list_head		*item,
>>>> +	void				*done_item,
>>>> +	void				**state)
>>>> +{
>>>> +	struct xfs_attr_item		*attr;
>>>> +	char				*name_value;
>>>> +	int				error;
>>>> +	int				local;
>>>> +	struct xfs_da_args		args;
>>>> +
>>>> +	attr = container_of(item, struct xfs_attr_item, xattri_list);
>>>> +	name_value = ((char *)attr) + sizeof(struct xfs_attr_item);
>>>> +
>>>> +	error = xfs_attr_args_init(&args, attr->xattri_ip, name_value,
>>>> +				   attr->xattri_name_len, attr->xattri_flags);
>>>> +	if (error)
>>>> +		goto out;
>>>> +
>>>> +	args.hashval = xfs_da_hashname(args.name, args.namelen);
>>>> +	args.value = &name_value[attr->xattri_name_len];
>>>> +	args.valuelen = attr->xattri_value_len;
>>>> +	args.op_flags = XFS_DA_OP_OKNOENT;
>>>> +	args.total = xfs_attr_calc_size(&args, &local);
>>>> +	args.trans = tp;
>>>> +
>>>> +	error = xfs_trans_attr(&args, done_item,
>>>> +			attr->xattri_op_flags);
>>>
>>> So now we've committed/rolled our xfs_attri_log_item intent and
>>> created/attached the xfs_attrd_log_item and thus we're free to perform
>>> the operation...
>>>
>>>> +out:
>>>> +	kmem_free(attr);
>>>
>>> ... and here is where we end up freeing the xfs_attr_item created for
>>> the dfops infrastructure that holds our name and value memory.
>>>
>>> Hmm.. I think this means our name/value memory accesses are safe because
>>> the xfs_attri_log_item only accesses them in the ->iop_format()
>>> callback, which occurs during transaction commit of the intent and we're
>>> long past that.
>>>
>>> That said, the attri/attrd log items themselves outlive the current
>>> transaction commit sequence (i.e. until the attrd is physically
>>> logged/committed and we free both). That means that once we free the
>>> attr above we technically have an attri passing through the log
>>> infrastructure with a couple invalid pointers, they just don't happen to
>>> be used. It might be worth thinking about how we can clean that up,
>>> whether it be clearing those pointers here, or allocating the name/val
>>> memory separately and transferring it to the attri, etc. Whatever we end
>>> up doing, we should probably add a comment somewhere to explain exactly
>>> what's going on as well.
>>>
>>> Brian
>>
>> I see, that's a good observation.  I'll see if I can work in some clean up
>> code and be sure to add some commentary to point it out.  Thanks for the
>> thorough review!!  Much appreciated!!
>>
>> Allison
>>
>>>
>>>> +	return error;
>>>> +}
>>>> +
>>>> +/* Abort all pending ATTRs. */
>>>> +STATIC void
>>>> +xfs_attr_abort_intent(
>>>> +	void				*intent)
>>>> +{
>>>> +	xfs_attri_release(intent);
>>>> +}
>>>> +
>>>> +/* Cancel an attr */
>>>> +STATIC void
>>>> +xfs_attr_cancel_item(
>>>> +	struct list_head		*item)
>>>> +{
>>>> +	struct xfs_attr_item	*attr;
>>>> +
>>>> +	attr = container_of(item, struct xfs_attr_item, xattri_list);
>>>> +	kmem_free(attr);
>>>> +}
>>>> +
>>>> +const struct xfs_defer_op_type xfs_attr_defer_type = {
>>>> +	.max_items	= XFS_ATTRI_MAX_FAST_ATTRS,
>>>> +	.diff_items	= xfs_attr_diff_items,
>>>> +	.create_intent	= xfs_attr_create_intent,
>>>> +	.abort_intent	= xfs_attr_abort_intent,
>>>> +	.log_item	= xfs_attr_log_item,
>>>> +	.create_done	= xfs_attr_create_done,
>>>> +	.finish_item	= xfs_attr_finish_item,
>>>> +	.cancel_item	= xfs_attr_cancel_item,
>>>> +};
>>>> +
>>>> -- 
>>>> 2.7.4
>>>>


* Re: [PATCH 5/9] xfs: Add xfs_attr_set_deferred and xfs_attr_remove_deferred
  2019-04-22 11:01       ` Brian Foster
@ 2019-04-22 22:01         ` Allison Henderson
  2019-04-23 13:00           ` Brian Foster
  0 siblings, 1 reply; 48+ messages in thread
From: Allison Henderson @ 2019-04-22 22:01 UTC (permalink / raw)
  To: Brian Foster; +Cc: linux-xfs



On 4/22/19 4:01 AM, Brian Foster wrote:
> On Thu, Apr 18, 2019 at 02:28:00PM -0700, Allison Henderson wrote:
>> On 4/18/19 8:49 AM, Brian Foster wrote:
>>> On Fri, Apr 12, 2019 at 03:50:32PM -0700, Allison Henderson wrote:
>>>> These routines set up and start a new deferred attribute
>>>> operation.  These functions are meant to be called by other
>>>> code needing to initiate a deferred attribute operation.  We
>>>> will use these routines later in the parent pointer patches.
>>>>
>>>
>>> We probably don't need to reference the parent pointer stuff any more
>>> for this, right? I'm assuming we'll be converting generic attr
>>> infrastructure over to this mechanism in subsequent patches..?
>>
>> Right, some of these comments are a little stale.  I will clean them up a
>> bit.
>>
>>>
>>>> Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
>>>> ---
>>>>    fs/xfs/libxfs/xfs_attr.c | 80 ++++++++++++++++++++++++++++++++++++++++++++++++
>>>>    fs/xfs/libxfs/xfs_attr.h |  7 +++++
>>>>    2 files changed, 87 insertions(+)
>>>>
>>>> diff --git a/fs/xfs/libxfs/xfs_attr.c b/fs/xfs/libxfs/xfs_attr.c
>>>> index fadd485..c3477fa7 100644
>>>> --- a/fs/xfs/libxfs/xfs_attr.c
>>>> +++ b/fs/xfs/libxfs/xfs_attr.c
> ...
>>>> @@ -513,6 +560,39 @@ xfs_attr_remove(
>>>>    	return error;
>>>>    }
>>>> +/* Removes an attribute for an inode as a deferred operation */
>>>> +int
>>>> +xfs_attr_remove_deferred(
>>>
>>> Hmm.. I'm kind of wondering if we actually need to defer attr removes.
>>> Do we have the same kind of challenges for attr removal as for attr
>>> creation, or is there some future scenario where this is needed?
>>
>> I suppose we don't have to have it?  The motivation was to help break up the
>> amount of transaction activity that happens on inode create/rename/remove
>> operations once pptrs go in.  Attr remove does not look as complex as attr
>> set, but I suppose it helps to some degree?
>>
> 
> Ok, this probably needs more thought. On one hand, I'm not a huge fan of
> using complex infrastructure where not required just because it's there.
> On the other, it could just be more simple to have consistency between
> xattr ops. As you note above, perhaps we do want the ability to defer
> xattr removes so we can use it in particular contexts (parent pointer
> updates) and not others (direct xattr remove requests from userspace).
> Perhaps the right thing to do for the time being is to continue on with
> the support for deferred xattr remove but don't invoke it from the
> direct xattr remove codepath..?

We can do this, but it means we need to keep the "roll_trans" boolean 
for all code paths that want to retain their original functionality, and 
also still be able to function as a delayed operation too.

It's not a big deal I suppose.  The remove code path does not have as 
many uses of the boolean.  But I seem to recall people thinking that the 
boolean was not particularly elegant, so I was careful to point out that 
it was going away at the end of the set :-)

> 
> Note that if we took that approach, we could add a DEBUG option and/or
> an errortag to (randomly) defer xattr removes in the common path for
> test coverage purposes.

Sure, that would be an easy thing to stitch in.  Once parent pointers go 
in, delayed attrs will get a lot more exercise since they will be a part
of inode create/move/remove too.

Allison

> 
> Brian
> 
>>>
>>>> +	struct xfs_inode        *dp,
>>>> +	struct xfs_trans	*tp,
>>>> +	const unsigned char	*name,
>>>> +	unsigned int		namelen,
>>>> +	int                     flags)
>>>> +{
>>>> +
>>>> +	struct xfs_attr_item	*new;
>>>> +	char			*name_value;
>>>> +
>>>> +	if (!namelen) {
>>>> +		ASSERT(0);
>>>> +		return -EFSCORRUPTED;
>>>
>>> Similar comment around -EFSCORRUPTED vs. -EINVAL (or something else..).
>> Ok, I will change to EINVAL here too.
>>
>> Thanks again for the reviews!!  They are very helpful!
>>
>> Allison
>>>
>>> Brian
>>>
>>>> +	}
>>>> +
>>>> +	new = kmem_alloc(XFS_ATTR_ITEM_SIZEOF(namelen, 0), KM_SLEEP|KM_NOFS);
>>>> +	name_value = ((char *)new) + sizeof(struct xfs_attr_item);
>>>> +	memset(new, 0, XFS_ATTR_ITEM_SIZEOF(namelen, 0));
>>>> +	new->xattri_ip = dp;
>>>> +	new->xattri_op_flags = XFS_ATTR_OP_FLAGS_REMOVE;
>>>> +	new->xattri_name_len = namelen;
>>>> +	new->xattri_value_len = 0;
>>>> +	new->xattri_flags = flags;
>>>> +	memcpy(name_value, name, namelen);
>>>> +
>>>> +	xfs_defer_add(tp, XFS_DEFER_OPS_TYPE_ATTR, &new->xattri_list);
>>>> +
>>>> +	return 0;
>>>> +}
>>>> +
>>>>    /*========================================================================
>>>>     * External routines when attribute list is inside the inode
>>>>     *========================================================================*/
>>>> diff --git a/fs/xfs/libxfs/xfs_attr.h b/fs/xfs/libxfs/xfs_attr.h
>>>> index 92d9a15..83b3621 100644
>>>> --- a/fs/xfs/libxfs/xfs_attr.h
>>>> +++ b/fs/xfs/libxfs/xfs_attr.h
>>>> @@ -175,5 +175,12 @@ bool xfs_attr_namecheck(const void *name, size_t length);
>>>>    int xfs_attr_args_init(struct xfs_da_args *args, struct xfs_inode *dp,
>>>>    			const unsigned char *name, size_t namelen, int flags);
>>>>    int xfs_attr_calc_size(struct xfs_da_args *args, int *local);
>>>> +int xfs_attr_set_deferred(struct xfs_inode *dp, struct xfs_trans *tp,
>>>> +			  const unsigned char *name, unsigned int name_len,
>>>> +			  const unsigned char *value, unsigned int valuelen,
>>>> +			  int flags);
>>>> +int xfs_attr_remove_deferred(struct xfs_inode *dp, struct xfs_trans *tp,
>>>> +			    const unsigned char *name, unsigned int namelen,
>>>> +			    int flags);
>>>>    #endif	/* __XFS_ATTR_H__ */
>>>> -- 
>>>> 2.7.4
>>>>


* Re: [PATCH 6/9] xfs: Add xfs_has_attr and subroutines
  2019-04-22 13:00   ` Brian Foster
@ 2019-04-22 22:01     ` Allison Henderson
  0 siblings, 0 replies; 48+ messages in thread
From: Allison Henderson @ 2019-04-22 22:01 UTC (permalink / raw)
  To: Brian Foster; +Cc: linux-xfs

On 4/22/19 6:00 AM, Brian Foster wrote:
> On Fri, Apr 12, 2019 at 03:50:33PM -0700, Allison Henderson wrote:
>> This patch adds a new function to check for the existence of
>> an attribute.  Subroutines are also added to handle the cases
>> of leaf blocks, nodes or shortform.  We will need this later
>> for delayed attributes since delayed operations cannot return
>> error codes.
>>
>> Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
>> ---
>>   fs/xfs/libxfs/xfs_attr.c      | 78 +++++++++++++++++++++++++++++++++++++++++++
>>   fs/xfs/libxfs/xfs_attr.h      |  1 +
>>   fs/xfs/libxfs/xfs_attr_leaf.c | 33 ++++++++++++++++++
>>   fs/xfs/libxfs/xfs_attr_leaf.h |  1 +
>>   4 files changed, 113 insertions(+)
>>
>> diff --git a/fs/xfs/libxfs/xfs_attr.c b/fs/xfs/libxfs/xfs_attr.c
>> index c3477fa7..0042708 100644
>> --- a/fs/xfs/libxfs/xfs_attr.c
>> +++ b/fs/xfs/libxfs/xfs_attr.c
>> @@ -53,6 +53,7 @@ STATIC int xfs_attr_shortform_addname(xfs_da_args_t *args);
>>   STATIC int xfs_attr_leaf_get(xfs_da_args_t *args);
>>   STATIC int xfs_attr_leaf_addname(xfs_da_args_t *args, bool roll_trans);
>>   STATIC int xfs_attr_leaf_removename(xfs_da_args_t *args, bool roll_trans);
>> +STATIC int xfs_leaf_has_attr(xfs_da_args_t *args);
>>   
>>   /*
>>    * Internal routines when attribute list is more than one block.
>> @@ -60,6 +61,7 @@ STATIC int xfs_attr_leaf_removename(xfs_da_args_t *args, bool roll_trans);
>>   STATIC int xfs_attr_node_get(xfs_da_args_t *args);
>>   STATIC int xfs_attr_node_addname(xfs_da_args_t *args, bool roll_trans);
>>   STATIC int xfs_attr_node_removename(xfs_da_args_t *args, bool roll_trans);
>> +STATIC int xfs_attr_node_hasname(xfs_da_args_t *args);
>>   STATIC int xfs_attr_fillstate(xfs_da_state_t *state);
>>   STATIC int xfs_attr_refillstate(xfs_da_state_t *state);
>>   
>> @@ -301,6 +303,29 @@ xfs_attr_set_args(
>>   }
>>   
>>   /*
>> + * Return successful if attr is found, or ENOATTR if not
>> + */
>> +int
>> +xfs_has_attr(
>> +	struct xfs_da_args      *args)
>> +{
>> +	struct xfs_inode        *dp = args->dp;
>> +	int                     error;
>> +
>> +	if (!xfs_inode_hasattr(dp))
>> +		error = -ENOATTR;
>> +	else if (dp->i_d.di_aformat == XFS_DINODE_FMT_LOCAL) {
>> +		ASSERT(dp->i_afp->if_flags & XFS_IFINLINE);
>> +		error = xfs_shortform_has_attr(args);
>> +	} else if (xfs_bmap_one_block(dp, XFS_ATTR_FORK))
>> +		error = xfs_leaf_has_attr(args);
>> +	else
>> +		error = xfs_attr_node_hasname(args);
> 
> I think it's usually expected to keep the {} braces around each branch
> of a multi-branch if/else when at least one branch has multiple lines.
> 
> Also, I see that at least some of this code is pulled from existing
> xattr functions. For example, the xfs_shortform_has_attr() code
> currently exists in xfs_attr_shortform_remove(),
> xfs_attr_shortform_add(), etc. Similar for xfs_leaf_has_attr() and
> xfs_attr_leaf_removename(), etc.
> 
> Rather than just adding new, not yet used functions, can we turn this
> patch more into a refactor where these new functions are reused by
> existing code where applicable? That reduces duplication and also
> facilitates review.
> 
> Brian

Sure, I will see if I can factor out the common code and consolidate 
things a bit.
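
As a rough user-space illustration of the refactor being discussed (this
is not code from the patch; the list model and every name below are
hypothetical), one lookup helper can back both the "has attr" check and
the remove path, so the name-matching loop exists only once:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

#ifndef ENOATTR
#define ENOATTR ENODATA	/* XFS on Linux maps ENOATTR to ENODATA */
#endif

#define SF_MAX 8	/* hypothetical capacity of the toy list */

/* Toy stand-in for a shortform attr entry: just a name and its length. */
struct sf_entry {
	const char	*name;
	size_t		namelen;
};

struct sf_list {
	struct sf_entry	ent[SF_MAX];
	int		count;
};

/* Shared lookup: index of the matching entry, or -ENOATTR if absent. */
int sf_lookup(const struct sf_list *sf, const char *name, size_t namelen)
{
	int i;

	for (i = 0; i < sf->count; i++) {
		const struct sf_entry *e = &sf->ent[i];

		if (e->namelen != namelen)
			continue;
		if (memcmp(e->name, name, namelen) != 0)
			continue;
		return i;
	}
	return -ENOATTR;
}

/* Existence check built on the shared lookup: 0 if found. */
int sf_has_attr(const struct sf_list *sf, const char *name, size_t namelen)
{
	int idx = sf_lookup(sf, name, namelen);

	return idx < 0 ? idx : 0;
}

/* Remove reuses the same lookup instead of open-coding the loop again. */
int sf_remove(struct sf_list *sf, const char *name, size_t namelen)
{
	int idx = sf_lookup(sf, name, namelen);

	if (idx < 0)
		return idx;
	assert(sf->count > 0 && sf->count <= SF_MAX);
	/* Close the gap left by the removed entry. */
	memmove(&sf->ent[idx], &sf->ent[idx + 1],
		(size_t)(sf->count - idx - 1) * sizeof(sf->ent[0]));
	sf->count--;
	return 0;
}
```

The real shortform code also compares namespace flags and walks
variable-sized entries; the sketch leaves both out to keep the shape of
the refactor visible.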

Allison

> 
>> +
>> +	return error;
>> +}
>> +
>> +/*
>>    * Remove the attribute specified in @args.
>>    */
>>   int
>> @@ -836,6 +861,29 @@ xfs_attr_leaf_addname(
>>   }
>>   
>>   /*
>> + * Return successful if attr is found, or ENOATTR if not
>> + */
>> +STATIC int
>> +xfs_leaf_has_attr(
>> +	struct xfs_da_args      *args)
>> +{
>> +	struct xfs_buf          *bp;
>> +	int                     error = 0;
>> +
>> +	args->blkno = 0;
>> +	error = xfs_attr3_leaf_read(args->trans, args->dp,
>> +			args->blkno, -1, &bp);
>> +	if (error)
>> +		return error;
>> +
>> +	error = xfs_attr3_leaf_lookup_int(bp, args);
>> +	error = (error == -ENOATTR) ? -ENOATTR : 0;
>> +	xfs_trans_brelse(args->trans, bp);
>> +
>> +	return error;
>> +}
>> +
>> +/*
>>    * Remove a name from the leaf attribute list structure
>>    *
>>    * This leaf block cannot have a "remote" value, we only call this routine
>> @@ -1166,6 +1214,36 @@ xfs_attr_node_addname(
>>   }
>>   
>>   /*
>> + * Return successful if attr is found, or ENOATTR if not
>> + */
>> +STATIC int
>> +xfs_attr_node_hasname(
>> +	struct xfs_da_args	*args)
>> +{
>> +	struct xfs_da_state	*state;
>> +	struct xfs_inode	*dp;
>> +	int			retval, error;
>> +
>> +	/*
>> +	 * Tie a string around our finger to remind us where we are.
>> +	 */
>> +	dp = args->dp;
>> +	state = xfs_da_state_alloc();
>> +	state->args = args;
>> +	state->mp = dp->i_mount;
>> +
>> +	/*
>> +	 * Search to see if name exists, and get back a pointer to it.
>> +	 */
>> +	error = xfs_da3_node_lookup_int(state, &retval);
>> +	if (error || (retval != -EEXIST)) {
>> +		if (error == 0)
>> +			error = retval;
>> +	}
>> +	return error;
>> +}
>> +
>> +/*
>>    * Remove a name from a B-tree attribute list.
>>    *
>>    * This will involve walking down the Btree, and may involve joining
>> diff --git a/fs/xfs/libxfs/xfs_attr.h b/fs/xfs/libxfs/xfs_attr.h
>> index 83b3621..974c963 100644
>> --- a/fs/xfs/libxfs/xfs_attr.h
>> +++ b/fs/xfs/libxfs/xfs_attr.h
>> @@ -168,6 +168,7 @@ int xfs_attr_set_args(struct xfs_da_args *args, struct xfs_buf **leaf_bp,
>>   		 bool roll_trans);
>>   int xfs_attr_remove(struct xfs_inode *dp, const unsigned char *name,
>>   		    size_t namelen, int flags);
>> +int xfs_has_attr(struct xfs_da_args *args);
>>   int xfs_attr_remove_args(struct xfs_da_args *args, bool roll_trans);
>>   int xfs_attr_list(struct xfs_inode *dp, char *buffer, int bufsize,
>>   		  int flags, struct attrlist_cursor_kern *cursor);
>> diff --git a/fs/xfs/libxfs/xfs_attr_leaf.c b/fs/xfs/libxfs/xfs_attr_leaf.c
>> index 128bfe9..e9f2f53 100644
>> --- a/fs/xfs/libxfs/xfs_attr_leaf.c
>> +++ b/fs/xfs/libxfs/xfs_attr_leaf.c
>> @@ -622,6 +622,39 @@ xfs_attr_fork_remove(
>>   }
>>   
>>   /*
>> + * Return successful if attr is found, or ENOATTR if not
>> + */
>> +int
>> +xfs_shortform_has_attr(
>> +	struct xfs_da_args	 *args)
>> +{
>> +	struct xfs_attr_shortform *sf;
>> +	struct xfs_attr_sf_entry *sfe;
>> +	int			base = sizeof(struct xfs_attr_sf_hdr);
>> +	int			size = 0;
>> +	int			end;
>> +	int			i;
>> +
>> +	sf = (struct xfs_attr_shortform *)args->dp->i_afp->if_u1.if_data;
>> +	sfe = &sf->list[0];
>> +	end = sf->hdr.count;
>> +	for (i = 0; i < end; sfe = XFS_ATTR_SF_NEXTENTRY(sfe),
>> +			base += size, i++) {
>> +		size = XFS_ATTR_SF_ENTSIZE(sfe);
>> +		if (sfe->namelen != args->namelen)
>> +			continue;
>> +		if (memcmp(sfe->nameval, args->name, args->namelen) != 0)
>> +			continue;
>> +		if (!xfs_attr_namesp_match(args->flags, sfe->flags))
>> +			continue;
>> +		break;
>> +	}
>> +	if (i == end)
>> +		return -ENOATTR;
>> +	return 0;
>> +}
>> +
>> +/*
>>    * Remove an attribute from the shortform attribute list structure.
>>    */
>>   int
>> diff --git a/fs/xfs/libxfs/xfs_attr_leaf.h b/fs/xfs/libxfs/xfs_attr_leaf.h
>> index 9d830ec..98dd169 100644
>> --- a/fs/xfs/libxfs/xfs_attr_leaf.h
>> +++ b/fs/xfs/libxfs/xfs_attr_leaf.h
>> @@ -39,6 +39,7 @@ int	xfs_attr_shortform_getvalue(struct xfs_da_args *args);
>>   int	xfs_attr_shortform_to_leaf(struct xfs_da_args *args,
>>   			struct xfs_buf **leaf_bp);
>>   int	xfs_attr_shortform_remove(struct xfs_da_args *args);
>> +int	xfs_shortform_has_attr(struct xfs_da_args *args);
>>   int	xfs_attr_shortform_allfit(struct xfs_buf *bp, struct xfs_inode *dp);
>>   int	xfs_attr_shortform_bytesfit(struct xfs_inode *dp, int bytes);
>>   xfs_failaddr_t xfs_attr_shortform_verify(struct xfs_inode *ip);
>> -- 
>> 2.7.4
>>

* Re: [PATCH 7/9] xfs: Add attr context to log item
  2019-04-22 13:03   ` Brian Foster
@ 2019-04-22 22:01     ` Allison Henderson
  2019-04-23 13:20       ` Brian Foster
  0 siblings, 1 reply; 48+ messages in thread
From: Allison Henderson @ 2019-04-22 22:01 UTC (permalink / raw)
  To: Brian Foster; +Cc: linux-xfs



On 4/22/19 6:03 AM, Brian Foster wrote:
> On Fri, Apr 12, 2019 at 03:50:34PM -0700, Allison Henderson wrote:
>> This patch modifies xfs_attr_item to store an xfs_da_args, an xfs_buf pointer
>> and a new state type. We will use these in the next patch when
>> we modify xfs_attr_set_args to roll transactions by returning EAGAIN.
>> Because the subroutines of this function modify the contents of these
>> structures, we need to find a place to store them where they remain
>> instantiated across multiple calls to xfs_attr_set_args.
>>
>> Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
>> ---
> 
> I see Darrick has already commented on the whole state thing. I'll
> probably have to grok the next patch to comment further, but just a
> couple initial thoughts:
> 
> First, I hit a build failure with this patch. It looks like there's a
> missed include in the scrub code:
> 
>    ...
>    CC [M]  fs/xfs/scrub/repair.o
> In file included from fs/xfs/scrub/repair.c:32:
> fs/xfs/libxfs/xfs_attr.h:105:21: error: field ‘xattri_args’ has incomplete type
>    struct xfs_da_args xattri_args;   /* args context */
Hmm, ok.  I'll get that corrected.  I probably need to clean out my
workspace and build from scratch.

>    ...
> 
> Second, the commit log suggests that the states will reflect the current
> transaction roll points (i.e., establishing re-entry points down in
> xfs_attr_set_args()). I'm kind of wondering if we should break these
> xattr set sub-sequences down into smaller helper functions (refactoring
> the existing code as we go) such that the mechanism could technically be
> used deferred or not. Re: the previous thought on whether to defer xattr
> removes or not, there might also be cases where there's not a need to
> defer xattr sets.
> 
> E.g., taking a quick peek into the next patch, the state 1 case in
> xfs_attr_try_sf_addname() is actually a transaction commit, which I
> think means we're done. We'd have done an attr memory allocation,
> deferred op and transaction roll where none was necessary so it might
> not be worth it to defer in that scenario. Hmm, it also looks like we
> return -EAGAIN in places where we've not actually done any work, like if
> a shortform add attempt returns -ENOSPC (or the -EAGAIN return before we
> even attempt the sf add). That kind of looks like a waste of transaction
> rolls and further suggests it might be cleaner to break this whole path
> down into helpers and put it back together in a way more conducive to
> deferred operations.

Yes, this area is a bit of a wart the way it is right now.  I think 
you're right in that ultimately we may end up having to do a lot of 
refactoring in order to have more efficient "re-entry points".  The 
state machine is hard to push down into subroutines, so its use is
limited to the top-level function.

I was also starting to wonder if maybe I could do some refactoring in 
xfs_defer_finish_noroll to capture the common code associated with the 
-EAGAIN handling.  Then maybe we could make a function pointer that we 
can pass through the finish_item interface.  The idea being that 
subroutines could use the function pointer to cycle out the transaction 
when needed instead of having to record states and back out like this. 
It'd be a new parameter to pipe around, but it'd be more efficient than 
the state machine, and less surgery in the refactor.  And maybe a 
blessing to any other operations that might need to go through this 
transition in the future.  Thoughts?
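
For readers following along, the state-machine approach under discussion
can be sketched in user space roughly as below. This is only a model:
the state names mirror the XFS_ATTR_STATE* enum from the patch, but what
each state does, and the driver loop standing in for the defer finish
code, are assumptions made for illustration.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical mirror of the XFS_ATTR_STATE1..3 markers in the patch. */
enum attr_state {
	ATTR_STATE1 = 1,
	ATTR_STATE2 = 2,
	ATTR_STATE3 = 3,
};

/* Context that must survive between calls, like xfs_attr_item. */
struct attr_item {
	enum attr_state	state;
	int		steps_done;
};

/*
 * Do one unit of work, then return -EAGAIN so the caller can roll the
 * transaction and call back in. The saved state is the re-entry point.
 */
int attr_set_step(struct attr_item *it)
{
	switch (it->state) {
	case ATTR_STATE1:
		it->steps_done++;		/* first chunk of work */
		it->state = ATTR_STATE2;
		return -EAGAIN;
	case ATTR_STATE2:
		it->steps_done++;		/* second chunk of work */
		it->state = ATTR_STATE3;
		return -EAGAIN;
	case ATTR_STATE3:
		it->steps_done++;		/* final chunk, nothing left */
		return 0;
	}
	return -EINVAL;
}

/* Stand-in for the finish loop: re-call the step until it stops asking. */
int attr_finish(struct attr_item *it, int *rolls)
{
	int error;

	assert(it && rolls);
	*rolls = 0;
	while ((error = attr_set_step(it)) == -EAGAIN)
		(*rolls)++;	/* a real caller would roll the tx here */
	return error;
}
```

The cost raised in the review is visible even in the toy: every -EAGAIN
is a transaction roll, so returning it from a state that did no real
work would be a wasted roll.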

Thanks again for the reviews!

Allison

> 
> Brian
> 
> 
>>   fs/xfs/libxfs/xfs_attr.h | 18 +++++++++++++++++-
>>   fs/xfs/scrub/common.c    |  2 ++
>>   fs/xfs/xfs_acl.c         |  2 ++
>>   fs/xfs/xfs_attr_item.c   |  2 +-
>>   fs/xfs/xfs_ioctl.c       |  2 ++
>>   fs/xfs/xfs_ioctl32.c     |  2 ++
>>   fs/xfs/xfs_iops.c        |  1 +
>>   fs/xfs/xfs_xattr.c       |  1 +
>>   8 files changed, 28 insertions(+), 2 deletions(-)
>>
>> diff --git a/fs/xfs/libxfs/xfs_attr.h b/fs/xfs/libxfs/xfs_attr.h
>> index 974c963..4ce3b0a 100644
>> --- a/fs/xfs/libxfs/xfs_attr.h
>> +++ b/fs/xfs/libxfs/xfs_attr.h
>> @@ -77,6 +77,13 @@ typedef struct attrlist_ent {	/* data from attr_list() */
>>   	char	a_name[1];	/* attr name (NULL terminated) */
>>   } attrlist_ent_t;
>>   
>> +/* Attr state machine types */
>> +enum xfs_attr_state {
>> +	XFS_ATTR_STATE1 = 1,
>> +	XFS_ATTR_STATE2 = 2,
>> +	XFS_ATTR_STATE3 = 3,
>> +};
>> +
>>   /*
>>    * List of attrs to commit later.
>>    */
>> @@ -88,7 +95,16 @@ struct xfs_attr_item {
>>   	void		  *xattri_name;	      /* attr name */
>>   	uint32_t	  xattri_name_len;    /* length of name */
>>   	uint32_t	  xattri_flags;       /* attr flags */
>> -	struct list_head  xattri_list;
>> +
>> +	/*
>> +	 * Delayed attr parameters that need to remain instantiated
>> +	 * across transaction rolls during the defer finish
>> +	 */
>> +	struct xfs_buf		*xattri_leaf_bp;  /* Leaf buf to release */
>> +	enum xfs_attr_state	xattri_state;	  /* state machine marker */
>> +	struct xfs_da_args	xattri_args;	  /* args context */
>> +
>> +	struct list_head	xattri_list;
>>   
>>   	/*
>>   	 * A byte array follows the header containing the file name and
>> diff --git a/fs/xfs/scrub/common.c b/fs/xfs/scrub/common.c
>> index 0c54ff5..270c32e 100644
>> --- a/fs/xfs/scrub/common.c
>> +++ b/fs/xfs/scrub/common.c
>> @@ -30,6 +30,8 @@
>>   #include "xfs_rmap_btree.h"
>>   #include "xfs_log.h"
>>   #include "xfs_trans_priv.h"
>> +#include "xfs_da_format.h"
>> +#include "xfs_da_btree.h"
>>   #include "xfs_attr.h"
>>   #include "xfs_reflink.h"
>>   #include "scrub/xfs_scrub.h"
>> diff --git a/fs/xfs/xfs_acl.c b/fs/xfs/xfs_acl.c
>> index 142de8d..9b1b93e 100644
>> --- a/fs/xfs/xfs_acl.c
>> +++ b/fs/xfs/xfs_acl.c
>> @@ -10,6 +10,8 @@
>>   #include "xfs_mount.h"
>>   #include "xfs_inode.h"
>>   #include "xfs_acl.h"
>> +#include "xfs_da_format.h"
>> +#include "xfs_da_btree.h"
>>   #include "xfs_attr.h"
>>   #include "xfs_trace.h"
>>   #include <linux/slab.h>
>> diff --git a/fs/xfs/xfs_attr_item.c b/fs/xfs/xfs_attr_item.c
>> index 0ea19b4..36e6d1e 100644
>> --- a/fs/xfs/xfs_attr_item.c
>> +++ b/fs/xfs/xfs_attr_item.c
>> @@ -19,10 +19,10 @@
>>   #include "xfs_rmap.h"
>>   #include "xfs_inode.h"
>>   #include "xfs_icache.h"
>> -#include "xfs_attr.h"
>>   #include "xfs_shared.h"
>>   #include "xfs_da_format.h"
>>   #include "xfs_da_btree.h"
>> +#include "xfs_attr.h"
>>   
>>   static inline struct xfs_attri_log_item *ATTRI_ITEM(struct xfs_log_item *lip)
>>   {
>> diff --git a/fs/xfs/xfs_ioctl.c b/fs/xfs/xfs_ioctl.c
>> index ab341d6..c8728ca 100644
>> --- a/fs/xfs/xfs_ioctl.c
>> +++ b/fs/xfs/xfs_ioctl.c
>> @@ -16,6 +16,8 @@
>>   #include "xfs_rtalloc.h"
>>   #include "xfs_itable.h"
>>   #include "xfs_error.h"
>> +#include "xfs_da_format.h"
>> +#include "xfs_da_btree.h"
>>   #include "xfs_attr.h"
>>   #include "xfs_bmap.h"
>>   #include "xfs_bmap_util.h"
>> diff --git a/fs/xfs/xfs_ioctl32.c b/fs/xfs/xfs_ioctl32.c
>> index 5001dca..23f6990 100644
>> --- a/fs/xfs/xfs_ioctl32.c
>> +++ b/fs/xfs/xfs_ioctl32.c
>> @@ -21,6 +21,8 @@
>>   #include "xfs_fsops.h"
>>   #include "xfs_alloc.h"
>>   #include "xfs_rtalloc.h"
>> +#include "xfs_da_format.h"
>> +#include "xfs_da_btree.h"
>>   #include "xfs_attr.h"
>>   #include "xfs_ioctl.h"
>>   #include "xfs_ioctl32.h"
>> diff --git a/fs/xfs/xfs_iops.c b/fs/xfs/xfs_iops.c
>> index e73c21a..561c467 100644
>> --- a/fs/xfs/xfs_iops.c
>> +++ b/fs/xfs/xfs_iops.c
>> @@ -17,6 +17,7 @@
>>   #include "xfs_acl.h"
>>   #include "xfs_quota.h"
>>   #include "xfs_error.h"
>> +#include "xfs_da_btree.h"
>>   #include "xfs_attr.h"
>>   #include "xfs_trans.h"
>>   #include "xfs_trace.h"
>> diff --git a/fs/xfs/xfs_xattr.c b/fs/xfs/xfs_xattr.c
>> index 3013746..938e81d 100644
>> --- a/fs/xfs/xfs_xattr.c
>> +++ b/fs/xfs/xfs_xattr.c
>> @@ -11,6 +11,7 @@
>>   #include "xfs_mount.h"
>>   #include "xfs_da_format.h"
>>   #include "xfs_inode.h"
>> +#include "xfs_da_btree.h"
>>   #include "xfs_attr.h"
>>   #include "xfs_attr_leaf.h"
>>   #include "xfs_acl.h"
>> -- 
>> 2.7.4
>>

* Re: [PATCH 5/9] xfs: Add xfs_attr_set_deferred and xfs_attr_remove_deferred
  2019-04-22 22:01         ` Allison Henderson
@ 2019-04-23 13:00           ` Brian Foster
  2019-04-24  2:24             ` Allison Henderson
  0 siblings, 1 reply; 48+ messages in thread
From: Brian Foster @ 2019-04-23 13:00 UTC (permalink / raw)
  To: Allison Henderson; +Cc: linux-xfs

On Mon, Apr 22, 2019 at 03:01:14PM -0700, Allison Henderson wrote:
> 
> 
> On 4/22/19 4:01 AM, Brian Foster wrote:
> > On Thu, Apr 18, 2019 at 02:28:00PM -0700, Allison Henderson wrote:
> > > On 4/18/19 8:49 AM, Brian Foster wrote:
> > > > On Fri, Apr 12, 2019 at 03:50:32PM -0700, Allison Henderson wrote:
> > > > > > These routines set up and start a new deferred attribute
> > > > > operation.  These functions are meant to be called by other
> > > > > code needing to initiate a deferred attribute operation.  We
> > > > > will use these routines later in the parent pointer patches.
> > > > > 
> > > > 
> > > > We probably don't need to reference the parent pointer stuff any more
> > > > for this, right? I'm assuming we'll be converting generic attr
> > > > infrastructure over to this mechanism in subsequent patches..?
> > > 
> > > > Right, some of these comments are a little stale.  I will clean them up a
> > > bit.
> > > 
> > > > 
> > > > > Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
> > > > > ---
> > > > >    fs/xfs/libxfs/xfs_attr.c | 80 ++++++++++++++++++++++++++++++++++++++++++++++++
> > > > >    fs/xfs/libxfs/xfs_attr.h |  7 +++++
> > > > >    2 files changed, 87 insertions(+)
> > > > > 
> > > > > diff --git a/fs/xfs/libxfs/xfs_attr.c b/fs/xfs/libxfs/xfs_attr.c
> > > > > index fadd485..c3477fa7 100644
> > > > > --- a/fs/xfs/libxfs/xfs_attr.c
> > > > > +++ b/fs/xfs/libxfs/xfs_attr.c
> > ...
> > > > > @@ -513,6 +560,39 @@ xfs_attr_remove(
> > > > >    	return error;
> > > > >    }
> > > > > +/* Removes an attribute for an inode as a deferred operation */
> > > > > +int
> > > > > +xfs_attr_remove_deferred(
> > > > 
> > > > Hmm.. I'm kind of wondering if we actually need to defer attr removes.
> > > > Do we have the same kind of challenges for attr removal as for attr
> > > > creation, or is there some future scenario where this is needed?
> > > 
> > > I suppose we don't have to have it?  The motivation was to help break up the
> > > amount of transaction activity that happens on inode create/rename/remove
> > > operations once pptrs go in.  Attr remove does not look as complex as attr
> > > set, but I suppose it helps to some degree?
> > > 
> > 
> > Ok, this probably needs more thought. On one hand, I'm not a huge fan of
> > using complex infrastructure where not required just because it's there.
> > On the other, it could just be more simple to have consistency between
> > xattr ops. As you note above, perhaps we do want the ability to defer
> > xattr removes so we can use it in particular contexts (parent pointer
> > updates) and not others (direct xattr remove requests from userspace).
> > Perhaps the right thing to do for the time being is to continue on with
> > the support for deferred xattr remove but don't invoke it from the
> > direct xattr remove codepath..?
> 
> We can do this, but it means we need to keep the "roll_trans" boolean for
> all code paths that want to retain their original functionality, and also
> still be able to function as a delayed operation too.
> 
> It's not a big deal I suppose.  The remove code path does not have as many
> uses of the boolean.  But I seem to recall people thinking that the boolean
> was not particularly elegant, so I was careful to point out that it was
> going away at the end of the set :-)
> 

Hmm, I was hoping we could refactor the existing code in a way that
supports both without spreading the boolean all over the place (by
breaking things down into smaller functional components), but poking
deeper into the xattr codepath suggests that could get quite hairy and
might not be worth it. I think it might be reasonable to just leave
around enough direct functionality for operations that don't require a
transaction roll. For example, a shortform xattr set just commits the
transaction if it succeeds. If it fails, we could make the decision to
defer the operation as we know we're now going to require a tx roll
anyways. That way a direct xattr set doesn't need to be deferred for no
reason if it wouldn't otherwise roll, while we still have the ability to
defer an arbitrary xattr set (even if shortform) for internal things
like parent pointers where we don't necessarily have an xattr
transaction.

Same goes for the shortform remove operation (and perhaps others), which
could be reused in both direct and deferred contexts because it doesn't
appear to roll the tx. Note that we don't necessarily have to share the
exact same xfs_attr_[set|remove]_args() function between direct and
deferred context. A separate function in the direct path to attempt a
direct op and then defer and another in the deferred path that covers
pretty much everything (with fixed up -EAGAIN magic) might be easier to
manage.

All that said, if you'd rather just defer everything for now and
potentially revisit pulling more things into the direct path later on
then I think that's perfectly reasonable too. The existing code is
really kind of a jumbled mess and we stand to benefit just by
simplifying/organizing it, IMO. I think there's a reasonable argument to
be made that we're better off working through all of the -EAGAIN stuff
and working the direct case as an optimization from there.
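
A minimal user-space sketch of the "try direct, then defer" fallback
described above, assuming a toy inode with a fixed shortform capacity.
None of these names exist in XFS; SF_CAPACITY, inode_model and the
force_defer flag (modelling internal callers such as parent pointers)
are all hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

#define SF_CAPACITY 4	/* hypothetical shortform space, in arbitrary units */

struct inode_model {
	int	sf_used;	/* shortform space consumed so far */
	int	deferred;	/* number of ops queued as deferred */
};

/* Direct shortform set: fits or fails, and never needs a tx roll. */
int sf_try_set(struct inode_model *ip, int size)
{
	assert(size > 0);
	if (ip->sf_used + size > SF_CAPACITY)
		return -ENOSPC;
	ip->sf_used += size;
	return 0;
}

/* Stand-in for queueing a deferred attr op. */
int attr_set_deferred(struct inode_model *ip)
{
	ip->deferred++;
	return 0;
}

/*
 * Try the cheap direct path first; only defer once we know a roll is
 * unavoidable. Internal callers can still force the deferred path.
 */
int attr_set(struct inode_model *ip, int size, bool force_defer)
{
	int error;

	if (force_defer)
		return attr_set_deferred(ip);

	error = sf_try_set(ip, size);
	if (error != -ENOSPC)
		return error;	/* success, or an error we don't handle */

	return attr_set_deferred(ip);
}
```

The alternative mentioned in the same message, deferring everything and
treating the direct case as a later optimization, corresponds to always
taking the force_defer branch.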

> > 
> > Note that if we took that approach, we could add a DEBUG option and/or
> > an errortag to (randomly) defer xattr removes in the common path for
> > test coverage purposes.
> 
> Sure, that would be an easy thing to stitch in.  Once parent pointers go in,
> delayed attrs will get a lot more exercise since they will be a part of
> inode create/move/remove too.
> 

Note that I think this would only be warranted if there was no other way
to invoke the deferred path directly from userspace (for testing). If we
did a deferred fallback approach like the above or just resort to
deferring everything, then we'll defer plenty (or all) of traditional
xattr ops and this is probably not necessary.

Brian

> Allison
> 
> > 
> > Brian
> > 
> > > > 
> > > > > +	struct xfs_inode        *dp,
> > > > > +	struct xfs_trans	*tp,
> > > > > +	const unsigned char	*name,
> > > > > +	unsigned int		namelen,
> > > > > +	int                     flags)
> > > > > +{
> > > > > +
> > > > > +	struct xfs_attr_item	*new;
> > > > > +	char			*name_value;
> > > > > +
> > > > > +	if (!namelen) {
> > > > > +		ASSERT(0);
> > > > > +		return -EFSCORRUPTED;
> > > > 
> > > > Similar comment around -EFSCORRUPTED vs. -EINVAL (or something else..).
> > > Ok, I will change to EINVAL here too.
> > > 
> > > Thanks again for the reviews!!  They are very helpful!
> > > 
> > > Allison
> > > > 
> > > > Brian
> > > > 
> > > > > +	}
> > > > > +
> > > > > +	new = kmem_alloc(XFS_ATTR_ITEM_SIZEOF(namelen, 0), KM_SLEEP|KM_NOFS);
> > > > > +	name_value = ((char *)new) + sizeof(struct xfs_attr_item);
> > > > > +	memset(new, 0, XFS_ATTR_ITEM_SIZEOF(namelen, 0));
> > > > > +	new->xattri_ip = dp;
> > > > > +	new->xattri_op_flags = XFS_ATTR_OP_FLAGS_REMOVE;
> > > > > +	new->xattri_name_len = namelen;
> > > > > +	new->xattri_value_len = 0;
> > > > > +	new->xattri_flags = flags;
> > > > > +	memcpy(name_value, name, namelen);
> > > > > +
> > > > > +	xfs_defer_add(tp, XFS_DEFER_OPS_TYPE_ATTR, &new->xattri_list);
> > > > > +
> > > > > +	return 0;
> > > > > +}
> > > > > +
> > > > >    /*========================================================================
> > > > >     * External routines when attribute list is inside the inode
> > > > >     *========================================================================*/
> > > > > diff --git a/fs/xfs/libxfs/xfs_attr.h b/fs/xfs/libxfs/xfs_attr.h
> > > > > index 92d9a15..83b3621 100644
> > > > > --- a/fs/xfs/libxfs/xfs_attr.h
> > > > > +++ b/fs/xfs/libxfs/xfs_attr.h
> > > > > @@ -175,5 +175,12 @@ bool xfs_attr_namecheck(const void *name, size_t length);
> > > > >    int xfs_attr_args_init(struct xfs_da_args *args, struct xfs_inode *dp,
> > > > >    			const unsigned char *name, size_t namelen, int flags);
> > > > >    int xfs_attr_calc_size(struct xfs_da_args *args, int *local);
> > > > > +int xfs_attr_set_deferred(struct xfs_inode *dp, struct xfs_trans *tp,
> > > > > +			  const unsigned char *name, unsigned int name_len,
> > > > > +			  const unsigned char *value, unsigned int valuelen,
> > > > > +			  int flags);
> > > > > +int xfs_attr_remove_deferred(struct xfs_inode *dp, struct xfs_trans *tp,
> > > > > +			    const unsigned char *name, unsigned int namelen,
> > > > > +			    int flags);
> > > > >    #endif	/* __XFS_ATTR_H__ */
> > > > > -- 
> > > > > 2.7.4
> > > > > 

* Re: [PATCH 7/9] xfs: Add attr context to log item
  2019-04-22 22:01     ` Allison Henderson
@ 2019-04-23 13:20       ` Brian Foster
  2019-04-24  2:24         ` Allison Henderson
  0 siblings, 1 reply; 48+ messages in thread
From: Brian Foster @ 2019-04-23 13:20 UTC (permalink / raw)
  To: Allison Henderson; +Cc: linux-xfs

On Mon, Apr 22, 2019 at 03:01:27PM -0700, Allison Henderson wrote:
> 
> 
> On 4/22/19 6:03 AM, Brian Foster wrote:
> > On Fri, Apr 12, 2019 at 03:50:34PM -0700, Allison Henderson wrote:
> > > This patch modifies xfs_attr_item to store an xfs_da_args, an xfs_buf pointer
> > > and a new state type. We will use these in the next patch when
> > > we modify xfs_attr_set_args to roll transactions by returning EAGAIN.
> > > Because the subroutines of this function modify the contents of these
> > > structures, we need to find a place to store them where they remain
> > > instantiated across multiple calls to xfs_attr_set_args.
> > > 
> > > Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
> > > ---
> > 
> > I see Darrick has already commented on the whole state thing. I'll
> > probably have to grok the next patch to comment further, but just a
> > couple initial thoughts:
> > 
> > First, I hit a build failure with this patch. It looks like there's a
> > missed include in the scrub code:
> > 
> >    ...
> >    CC [M]  fs/xfs/scrub/repair.o
> > In file included from fs/xfs/scrub/repair.c:32:
> > fs/xfs/libxfs/xfs_attr.h:105:21: error: field ‘xattri_args’ has incomplete type
> >    struct xfs_da_args xattri_args;   /* args context */
> Hmm, ok.  I'll get that corrected, I probably need to clean out my workspace
> and build from scratch.
> 
> >    ...
> > 
> > Second, the commit log suggests that the states will reflect the current
> > transaction roll points (i.e., establishing re-entry points down in
> > > xfs_attr_set_args()). I'm kind of wondering if we should break these
> > xattr set sub-sequences down into smaller helper functions (refactoring
> > the existing code as we go) such that the mechanism could technically be
> > used deferred or not. Re: the previous thought on whether to defer xattr
> > removes or not, there might also be cases where there's not a need to
> > defer xattr sets.
> > 
> > E.g., taking a quick peek into the next patch, the state 1 case in
> > xfs_attr_try_sf_addname() is actually a transaction commit, which I
> > think means we're done. We'd have done an attr memory allocation,
> > deferred op and transaction roll where none was necessary so it might
> > not be worth it to defer in that scenario. Hmm, it also looks like we
> > return -EAGAIN in places where we've not actually done any work, like if
> > a shortform add attempt returns -ENOSPC (or the -EAGAIN return before we
> > even attempt the sf add). That kind of looks like a waste of transaction
> > rolls and further suggests it might be cleaner to break this whole path
> > down into helpers and put it back together in a way more conducive to
> > deferred operations.
> 
> Yes, this area is a bit of a wart the way it is right now.  I think you're
> right in that ultimately we may end up having to do a lot of refactoring in
> order to have more efficient "re-entry points".  The state machine is hard
> to get into subroutines, so it's limited in use in the top level function.
> 
> I was also starting to wonder if maybe I could do some refactoring in
> xfs_defer_finish_noroll to capture the common code associated with the
> -EAGAIN handling.  Then maybe we could make a function pointer that we can
> pass through the finish_item interface.  The idea being that subroutines
> could use the function pointer to cycle out the transaction when needed
> instead of having to record states and back out like this. It'd be a new
> parameter to pipe around, but it'd be more efficient than the state machine,
> and less surgery in the refactor.  And maybe a blessing to any other
> operations that might need to go through this transition in the future.
> Thoughts?
> 

That's an interesting idea. It still strikes me as a bit of a
fallback/hack as opposed to organizing the code to properly fit into the
dfops infrastructure, but it could be useful as a transient solution.
From a high level, it looks like we'd have to create a new intent, relog
this item and all remaining items associated with the dfp to it, roll
the tx, and finally create a done item associated with the intent in the
new tx. You'd need access to the dfp for some of that, so it's not
immediately clear to me that this ends up much easier than fixing up
the xattr code.

BTW, if we did end up with something like that I'd probably prefer to
see it as an exported dfops helper function as opposed to a function
pointer being passed around, if possible.
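
To make the ordering above concrete, here is a toy model of what such an
exported helper would have to do, in the sequence listed: new intent,
relog the remaining items, roll, then log a done item in the new
transaction. It is a sketch only; dfops_model and dfops_roll_in_place
are hypothetical names, and nothing here reflects the real dfops data
structures.

```c
#include <assert.h>
#include <stddef.h>

/* The four steps named in the message, in the order they must happen. */
enum step {
	STEP_NEW_INTENT,
	STEP_RELOG_ITEMS,
	STEP_ROLL_TX,
	STEP_DONE_ITEM,
};

#define TRACE_MAX 8

/* Toy stand-in for the pending deferred-op state (the "dfp"). */
struct dfops_model {
	int		nitems;		/* items still pending */
	int		relogged;	/* items attached to the new intent */
	int		intents;	/* intents created */
	int		dones;		/* done items created */
	int		tx_id;		/* bumped on every roll */
	enum step	trace[TRACE_MAX];
	size_t		ntrace;
};

static void record(struct dfops_model *d, enum step s)
{
	assert(d->ntrace < TRACE_MAX);
	d->trace[d->ntrace++] = s;
}

/* Hypothetical helper: cycle the transaction from inside an operation. */
int dfops_roll_in_place(struct dfops_model *d)
{
	d->intents++;			/* 1. create a new intent */
	record(d, STEP_NEW_INTENT);

	d->relogged = d->nitems;	/* 2. relog this and remaining items */
	record(d, STEP_RELOG_ITEMS);

	d->tx_id++;			/* 3. roll the transaction */
	record(d, STEP_ROLL_TX);

	d->dones++;			/* 4. done item in the new tx */
	record(d, STEP_DONE_ITEM);
	return 0;
}
```

Written as a plain exported function, the helper needs the pending-op
state as an argument, which is the access problem noted above; a
function pointer would have to carry that same context along with it.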

Brian

> Thanks again for the reviews!
> 
> Allison
> 
> > 
> > Brian
> > 
> > 
> > >   fs/xfs/libxfs/xfs_attr.h | 18 +++++++++++++++++-
> > >   fs/xfs/scrub/common.c    |  2 ++
> > >   fs/xfs/xfs_acl.c         |  2 ++
> > >   fs/xfs/xfs_attr_item.c   |  2 +-
> > >   fs/xfs/xfs_ioctl.c       |  2 ++
> > >   fs/xfs/xfs_ioctl32.c     |  2 ++
> > >   fs/xfs/xfs_iops.c        |  1 +
> > >   fs/xfs/xfs_xattr.c       |  1 +
> > >   8 files changed, 28 insertions(+), 2 deletions(-)
> > > 
> > > diff --git a/fs/xfs/libxfs/xfs_attr.h b/fs/xfs/libxfs/xfs_attr.h
> > > index 974c963..4ce3b0a 100644
> > > --- a/fs/xfs/libxfs/xfs_attr.h
> > > +++ b/fs/xfs/libxfs/xfs_attr.h
> > > @@ -77,6 +77,13 @@ typedef struct attrlist_ent {	/* data from attr_list() */
> > >   	char	a_name[1];	/* attr name (NULL terminated) */
> > >   } attrlist_ent_t;
> > > +/* Attr state machine types */
> > > +enum xfs_attr_state {
> > > +	XFS_ATTR_STATE1 = 1,
> > > +	XFS_ATTR_STATE2 = 2,
> > > +	XFS_ATTR_STATE3 = 3,
> > > +};
> > > +
> > >   /*
> > >    * List of attrs to commit later.
> > >    */
> > > @@ -88,7 +95,16 @@ struct xfs_attr_item {
> > >   	void		  *xattri_name;	      /* attr name */
> > >   	uint32_t	  xattri_name_len;    /* length of name */
> > >   	uint32_t	  xattri_flags;       /* attr flags */
> > > -	struct list_head  xattri_list;
> > > +
> > > +	/*
> > > +	 * Delayed attr parameters that need to remain instantiated
> > > +	 * across transaction rolls during the defer finish
> > > +	 */
> > > +	struct xfs_buf		*xattri_leaf_bp;  /* Leaf buf to release */
> > > +	enum xfs_attr_state	xattri_state;	  /* state machine marker */
> > > +	struct xfs_da_args	xattri_args;	  /* args context */
> > > +
> > > +	struct list_head	xattri_list;
> > >   	/*
> > >   	 * A byte array follows the header containing the file name and
> > > diff --git a/fs/xfs/scrub/common.c b/fs/xfs/scrub/common.c
> > > index 0c54ff5..270c32e 100644
> > > --- a/fs/xfs/scrub/common.c
> > > +++ b/fs/xfs/scrub/common.c
> > > @@ -30,6 +30,8 @@
> > >   #include "xfs_rmap_btree.h"
> > >   #include "xfs_log.h"
> > >   #include "xfs_trans_priv.h"
> > > +#include "xfs_da_format.h"
> > > +#include "xfs_da_btree.h"
> > >   #include "xfs_attr.h"
> > >   #include "xfs_reflink.h"
> > >   #include "scrub/xfs_scrub.h"
> > > diff --git a/fs/xfs/xfs_acl.c b/fs/xfs/xfs_acl.c
> > > index 142de8d..9b1b93e 100644
> > > --- a/fs/xfs/xfs_acl.c
> > > +++ b/fs/xfs/xfs_acl.c
> > > @@ -10,6 +10,8 @@
> > >   #include "xfs_mount.h"
> > >   #include "xfs_inode.h"
> > >   #include "xfs_acl.h"
> > > +#include "xfs_da_format.h"
> > > +#include "xfs_da_btree.h"
> > >   #include "xfs_attr.h"
> > >   #include "xfs_trace.h"
> > >   #include <linux/slab.h>
> > > diff --git a/fs/xfs/xfs_attr_item.c b/fs/xfs/xfs_attr_item.c
> > > index 0ea19b4..36e6d1e 100644
> > > --- a/fs/xfs/xfs_attr_item.c
> > > +++ b/fs/xfs/xfs_attr_item.c
> > > @@ -19,10 +19,10 @@
> > >   #include "xfs_rmap.h"
> > >   #include "xfs_inode.h"
> > >   #include "xfs_icache.h"
> > > -#include "xfs_attr.h"
> > >   #include "xfs_shared.h"
> > >   #include "xfs_da_format.h"
> > >   #include "xfs_da_btree.h"
> > > +#include "xfs_attr.h"
> > >   static inline struct xfs_attri_log_item *ATTRI_ITEM(struct xfs_log_item *lip)
> > >   {
> > > diff --git a/fs/xfs/xfs_ioctl.c b/fs/xfs/xfs_ioctl.c
> > > index ab341d6..c8728ca 100644
> > > --- a/fs/xfs/xfs_ioctl.c
> > > +++ b/fs/xfs/xfs_ioctl.c
> > > @@ -16,6 +16,8 @@
> > >   #include "xfs_rtalloc.h"
> > >   #include "xfs_itable.h"
> > >   #include "xfs_error.h"
> > > +#include "xfs_da_format.h"
> > > +#include "xfs_da_btree.h"
> > >   #include "xfs_attr.h"
> > >   #include "xfs_bmap.h"
> > >   #include "xfs_bmap_util.h"
> > > diff --git a/fs/xfs/xfs_ioctl32.c b/fs/xfs/xfs_ioctl32.c
> > > index 5001dca..23f6990 100644
> > > --- a/fs/xfs/xfs_ioctl32.c
> > > +++ b/fs/xfs/xfs_ioctl32.c
> > > @@ -21,6 +21,8 @@
> > >   #include "xfs_fsops.h"
> > >   #include "xfs_alloc.h"
> > >   #include "xfs_rtalloc.h"
> > > +#include "xfs_da_format.h"
> > > +#include "xfs_da_btree.h"
> > >   #include "xfs_attr.h"
> > >   #include "xfs_ioctl.h"
> > >   #include "xfs_ioctl32.h"
> > > diff --git a/fs/xfs/xfs_iops.c b/fs/xfs/xfs_iops.c
> > > index e73c21a..561c467 100644
> > > --- a/fs/xfs/xfs_iops.c
> > > +++ b/fs/xfs/xfs_iops.c
> > > @@ -17,6 +17,7 @@
> > >   #include "xfs_acl.h"
> > >   #include "xfs_quota.h"
> > >   #include "xfs_error.h"
> > > +#include "xfs_da_btree.h"
> > >   #include "xfs_attr.h"
> > >   #include "xfs_trans.h"
> > >   #include "xfs_trace.h"
> > > diff --git a/fs/xfs/xfs_xattr.c b/fs/xfs/xfs_xattr.c
> > > index 3013746..938e81d 100644
> > > --- a/fs/xfs/xfs_xattr.c
> > > +++ b/fs/xfs/xfs_xattr.c
> > > @@ -11,6 +11,7 @@
> > >   #include "xfs_mount.h"
> > >   #include "xfs_da_format.h"
> > >   #include "xfs_inode.h"
> > > +#include "xfs_da_btree.h"
> > >   #include "xfs_attr.h"
> > >   #include "xfs_attr_leaf.h"
> > >   #include "xfs_acl.h"
> > > -- 
> > > 2.7.4
> > > 


* Re: [PATCH 8/9] xfs: Roll delayed attr operations by returning EAGAIN
  2019-04-12 22:50 ` [PATCH 8/9] xfs: Roll delayed attr operations by returning EAGAIN Allison Henderson
  2019-04-15 23:31   ` Darrick J. Wong
@ 2019-04-23 14:19   ` Brian Foster
  2019-04-24  2:24     ` Allison Henderson
  1 sibling, 1 reply; 48+ messages in thread
From: Brian Foster @ 2019-04-23 14:19 UTC (permalink / raw)
  To: Allison Henderson; +Cc: linux-xfs

On Fri, Apr 12, 2019 at 03:50:35PM -0700, Allison Henderson wrote:
> This patch modifies xfs_attr_set_args to return -EAGAIN
> when a transaction needs to be rolled.  All functions
> currently calling xfs_attr_set_args are modified to use
> the deferred attr operation, or handle the -EAGAIN return
> code
> 
> Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
> ---
>  fs/xfs/libxfs/xfs_attr.c | 62 ++++++++++++++++++++++++++++++++++++++++--------
>  fs/xfs/libxfs/xfs_attr.h |  2 +-
>  fs/xfs/xfs_attr_item.c   | 41 +++++++++++++++++++++++++++-----
>  fs/xfs/xfs_trans.h       |  2 ++
>  fs/xfs/xfs_trans_attr.c  | 56 +++++++++++++++++++++++++------------------
>  5 files changed, 123 insertions(+), 40 deletions(-)
> 
> diff --git a/fs/xfs/libxfs/xfs_attr.c b/fs/xfs/libxfs/xfs_attr.c
> index 0042708..4ddd86b 100644
> --- a/fs/xfs/libxfs/xfs_attr.c
> +++ b/fs/xfs/libxfs/xfs_attr.c
> @@ -236,10 +236,37 @@ int
>  xfs_attr_set_args(
>  	struct xfs_da_args	*args,
>  	struct xfs_buf          **leaf_bp,
> +	enum xfs_attr_state	*state,
>  	bool			roll_trans)
>  {
>  	struct xfs_inode	*dp = args->dp;
>  	int			error = 0;
> +	int			sf_size;
> +
> +	switch (*state) {
> +	case (XFS_ATTR_STATE1):
> +		goto state1;
> +	case (XFS_ATTR_STATE2):
> +		goto state2;
> +	case (XFS_ATTR_STATE3):
> +		goto state3;
> +	}
> +
> +	/*
> +	 * New inodes may not have an attribute fork yet. So set the attribute
> +	 * fork appropriately
> +	 */
> +	if (XFS_IFORK_Q((args->dp)) == 0) {
> +		sf_size = sizeof(struct xfs_attr_sf_hdr) +
> +		     XFS_ATTR_SF_ENTSIZE_BYNAME(args->namelen, args->valuelen);
> +		xfs_bmap_set_attrforkoff(args->dp, sf_size, NULL);
> +		args->dp->i_afp = kmem_zone_zalloc(xfs_ifork_zone, KM_SLEEP);
> +		args->dp->i_afp->if_flags = XFS_IFEXTENTS;
> +	}
> +
> +	*state = XFS_ATTR_STATE1;
> +	return -EAGAIN;

As noted previously, this return seems unnecessary since we've not done
anything in the transaction to this point.

> +state1:
>  
>  	/*
>  	 * If the attribute list is non-existent or a shortform list,
> @@ -248,7 +275,6 @@ xfs_attr_set_args(
>  	if (dp->i_d.di_aformat == XFS_DINODE_FMT_LOCAL ||
>  	    (dp->i_d.di_aformat == XFS_DINODE_FMT_EXTENTS &&
>  	     dp->i_d.di_anextents == 0)) {
> -
>  		/*
>  		 * Build initial attribute list (if required).
>  		 */
> @@ -262,6 +288,9 @@ xfs_attr_set_args(
>  		if (error != -ENOSPC)
>  			return error;
>  
> +		*state = XFS_ATTR_STATE2;
> +		return -EAGAIN;
> +state2:

Here we've failed the sf add but not yet done the conversion, which
means we've still not done anything in the transaction. I suspect we
should probably convert to leaf and then return -EAGAIN.

>  		/*
>  		 * It won't fit in the shortform, transform to a leaf block.
>  		 * GROT: another possible req'mt for a double-split btree op.
> @@ -270,14 +299,14 @@ xfs_attr_set_args(
>  		if (error)
>  			return error;
>  
> -		if (roll_trans) {
> -			/*
> -			 * Prevent the leaf buffer from being unlocked so that a
> -			 * concurrent AIL push cannot grab the half-baked leaf
> -			 * buffer and run into problems with the write verifier.
> -			 */
> -			xfs_trans_bhold(args->trans, *leaf_bp);
> +		/*
> +		 * Prevent the leaf buffer from being unlocked so that a
> +		 * concurrent AIL push cannot grab the half-baked leaf
> +		 * buffer and run into problems with the write verifier.
> +		 */
> +		xfs_trans_bhold(args->trans, *leaf_bp);
>  
> +		if (roll_trans) {
>  			error = xfs_defer_finish(&args->trans);
>  			if (error)
>  				return error;
> @@ -293,6 +322,12 @@ xfs_attr_set_args(
>  			xfs_trans_bjoin(args->trans, *leaf_bp);
>  			*leaf_bp = NULL;
>  		}
> +
> +		*state = XFS_ATTR_STATE3;
> +		return -EAGAIN;
> +state3:

Hmm, and this appears to be the last place we return -EAGAIN from the
set code. Am I following this correctly that we basically expect any of
the other rolls down in xfs_attr_[leaf|node]_addname() to go away in
deferred context? If so, why is that?

That aside, I'm wondering whether we need the whole state thing to track
this. For example, why not have a high level flow as something like the
following?

xfs_attr_set_args()
{
	...
	if (local format) {
		error = xfs_attr_try_sf_addname(dp, args);
		if (error != -ENOSPC)
			return error;

		/* Doesn't fit: convert to leaf, then have the caller roll. */
		error = xfs_attr_shortform_to_leaf(args, leaf_bp);
		if (error)
			return error;
		return -EAGAIN;
	} else if (xfs_bmap_one_block(dp, XFS_ATTR_FORK)) {
		error = xfs_attr_leaf_addname(args);
	} else {
		error = xfs_attr_node_addname(args);
	}
	return error;
}

Of course, this may need further changes if we do end up incorporating
the rolls down in the leaf/node functions. Perhaps we could pull apart
those functions such that we -EAGAIN on the conversions required to
address -ENOSPC returns. That might provide a natural boundary to
re-enter the top-level function without the need for a state machine, at
least for any rolls that occur before we actually do an attr op
(post-op rolls may very well require more state to incorporate).
Thoughts?
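
To make the re-entry idea concrete, here is a minimal userspace sketch,
not XFS code. Everything named toy_* is hypothetical, and it assumes the
attr fork format alone is enough to tell a re-entered call where to
resume. Each pass either finishes the add or performs one conversion and
returns -EAGAIN, so no separate state enum is carried between passes.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for the attr fork formats. */
enum toy_fmt { TOY_FMT_LOCAL, TOY_FMT_LEAF, TOY_FMT_NODE };

struct toy_inode {
	enum toy_fmt	fmt;		/* current attr fork format */
	size_t		sf_used;	/* bytes used in the shortform area */
	size_t		sf_max;		/* shortform capacity in bytes */
	int		nattrs;		/* attrs stored so far */
};

/* Try the shortform add; -ENOSPC means the attr does not fit. */
static int toy_try_sf_add(struct toy_inode *ip, size_t len)
{
	if (ip->sf_used + len > ip->sf_max)
		return -ENOSPC;
	ip->sf_used += len;
	ip->nattrs++;
	return 0;
}

/*
 * One pass of the set operation.  The fork format picks the branch, so
 * a call made after a roll lands in the right place by itself.
 */
static int toy_attr_set_pass(struct toy_inode *ip, size_t len)
{
	int error;

	if (ip->fmt == TOY_FMT_LOCAL) {
		error = toy_try_sf_add(ip, len);
		if (error != -ENOSPC)
			return error;
		/* Convert first, then ask the caller to roll. */
		ip->fmt = TOY_FMT_LEAF;
		return -EAGAIN;
	}

	/* Leaf or node format: add directly (splits are not modelled). */
	ip->nattrs++;
	return 0;
}

/* Caller: re-enter until the pass stops returning -EAGAIN. */
static int toy_attr_set(struct toy_inode *ip, size_t len, int *rolls)
{
	int error;

	*rolls = 0;
	while ((error = toy_attr_set_pass(ip, len)) == -EAGAIN)
		(*rolls)++;	/* a real caller would roll the transaction */
	return error;
}
```

With a nearly full shortform fork the add takes one roll and lands in
leaf format; with room to spare it completes in the first pass. Post-op
rolls are not modelled here and may still need extra state.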

Brian

> +		if (*leaf_bp != NULL)
> +			xfs_trans_brelse(args->trans, *leaf_bp);
>  	}
>  
>  	if (xfs_bmap_one_block(dp, XFS_ATTR_FORK))
> @@ -419,7 +454,9 @@ xfs_attr_set(
>  		goto out_trans_cancel;
>  
>  	xfs_trans_ijoin(args.trans, dp, 0);
> -	error = xfs_attr_set_args(&args, &leaf_bp, true);
> +
> +	error = xfs_attr_set_deferred(dp, args.trans, name, namelen,
> +			value, valuelen, flags);
>  	if (error)
>  		goto out_release_leaf;
>  	if (!args.trans) {
> @@ -554,8 +591,13 @@ xfs_attr_remove(
>  	 */
>  	xfs_trans_ijoin(args.trans, dp, 0);
>  
> -	error = xfs_attr_remove_args(&args, true);
> +	error = xfs_has_attr(&args);
> +	if (error)
> +		goto out;
> +
>  
> +	error = xfs_attr_remove_deferred(dp, args.trans,
> +			name, namelen, flags);
>  	if (error)
>  		goto out;
>  
> diff --git a/fs/xfs/libxfs/xfs_attr.h b/fs/xfs/libxfs/xfs_attr.h
> index 4ce3b0a..da95e69 100644
> --- a/fs/xfs/libxfs/xfs_attr.h
> +++ b/fs/xfs/libxfs/xfs_attr.h
> @@ -181,7 +181,7 @@ int xfs_attr_set(struct xfs_inode *dp, const unsigned char *name,
>  		 size_t namelen, unsigned char *value, int valuelen,
>  		 int flags);
>  int xfs_attr_set_args(struct xfs_da_args *args, struct xfs_buf **leaf_bp,
> -		 bool roll_trans);
> +		 enum xfs_attr_state *state, bool roll_trans);
>  int xfs_attr_remove(struct xfs_inode *dp, const unsigned char *name,
>  		    size_t namelen, int flags);
>  int xfs_has_attr(struct xfs_da_args *args);
> diff --git a/fs/xfs/xfs_attr_item.c b/fs/xfs/xfs_attr_item.c
> index 36e6d1e..292d608 100644
> --- a/fs/xfs/xfs_attr_item.c
> +++ b/fs/xfs/xfs_attr_item.c
> @@ -464,8 +464,11 @@ xfs_attri_recover(
>  	struct xfs_attri_log_format	*attrp;
>  	struct xfs_trans_res		tres;
>  	int				local;
> -	int				error = 0;
> +	int				error, err2 = 0;
>  	int				rsvd = 0;
> +	enum xfs_attr_state		state = 0;
> +	struct xfs_buf			*leaf_bp = NULL;
> +
>  
>  	ASSERT(!test_bit(XFS_ATTRI_RECOVERED, &attrip->flags));
>  
> @@ -540,14 +543,40 @@ xfs_attri_recover(
>  	xfs_ilock(ip, XFS_ILOCK_EXCL);
>  
>  	xfs_trans_ijoin(args.trans, ip, 0);
> -	error = xfs_trans_attr(&args, attrdp, attrp->alfi_op_flags);
> -	if (error)
> -		goto abort_error;
>  
> +	do {
> +		leaf_bp = NULL;
> +
> +		error = xfs_trans_attr(&args, attrdp, &leaf_bp, &state,
> +				attrp->alfi_op_flags);
> +		if (error && error != -EAGAIN)
> +			goto abort_error;
> +
> +		xfs_trans_log_inode(args.trans, ip,
> +				XFS_ILOG_CORE | XFS_ILOG_ADATA);
> +
> +		err2 = xfs_trans_commit(args.trans);
> +		if (err2) {
> +			error = err2;
> +			goto abort_error;
> +		}
> +
> +		if (error == -EAGAIN) {
> +			err2 = xfs_trans_alloc(mp, &tres, args.total, 0,
> +				XFS_TRANS_PERM_LOG_RES, &args.trans);
> +			if (err2) {
> +				error = err2;
> +				goto abort_error;
> +			}
> +			xfs_trans_ijoin(args.trans, ip, 0);
> +		}
> +
> +	} while (error == -EAGAIN);
> +
> +	if (leaf_bp)
> +		xfs_trans_brelse(args.trans, leaf_bp);
>  
>  	set_bit(XFS_ATTRI_RECOVERED, &attrip->flags);
> -	xfs_trans_log_inode(args.trans, ip, XFS_ILOG_CORE | XFS_ILOG_ADATA);
> -	error = xfs_trans_commit(args.trans);
>  	xfs_iunlock(ip, XFS_ILOCK_EXCL);
>  	return error;
>  
> diff --git a/fs/xfs/xfs_trans.h b/fs/xfs/xfs_trans.h
> index 7bb9d8e..c785cd7 100644
> --- a/fs/xfs/xfs_trans.h
> +++ b/fs/xfs/xfs_trans.h
> @@ -239,6 +239,8 @@ xfs_trans_get_attrd(struct xfs_trans *tp,
>  		    struct xfs_attri_log_item *attrip);
>  int xfs_trans_attr(struct xfs_da_args *args,
>  		   struct xfs_attrd_log_item *attrdp,
> +		   struct xfs_buf **leaf_bp,
> +		   void *state,
>  		   uint32_t attr_op_flags);
>  
>  int		xfs_trans_commit(struct xfs_trans *);
> diff --git a/fs/xfs/xfs_trans_attr.c b/fs/xfs/xfs_trans_attr.c
> index 3679348..a3339ea 100644
> --- a/fs/xfs/xfs_trans_attr.c
> +++ b/fs/xfs/xfs_trans_attr.c
> @@ -56,10 +56,11 @@ int
>  xfs_trans_attr(
>  	struct xfs_da_args		*args,
>  	struct xfs_attrd_log_item	*attrdp,
> +	struct xfs_buf			**leaf_bp,
> +	void				*state,
>  	uint32_t			op_flags)
>  {
>  	int				error;
> -	struct xfs_buf			*leaf_bp = NULL;
>  
>  	error = xfs_qm_dqattach_locked(args->dp, 0);
>  	if (error)
> @@ -68,7 +69,8 @@ xfs_trans_attr(
>  	switch (op_flags) {
>  	case XFS_ATTR_OP_FLAGS_SET:
>  		args->op_flags |= XFS_DA_OP_ADDNAME;
> -		error = xfs_attr_set_args(args, &leaf_bp, false);
> +		error = xfs_attr_set_args(args, leaf_bp,
> +				(enum xfs_attr_state *)state, false);
>  		break;
>  	case XFS_ATTR_OP_FLAGS_REMOVE:
>  		ASSERT(XFS_IFORK_Q((args->dp)));
> @@ -78,11 +80,6 @@ xfs_trans_attr(
>  		error = -EFSCORRUPTED;
>  	}
>  
> -	if (error) {
> -		if (leaf_bp)
> -			xfs_trans_brelse(args->trans, leaf_bp);
> -	}
> -
>  	/*
>  	 * Mark the transaction dirty, even on error. This ensures the
>  	 * transaction is aborted, which:
> @@ -184,27 +181,40 @@ xfs_attr_finish_item(
>  	char				*name_value;
>  	int				error;
>  	int				local;
> -	struct xfs_da_args		args;
> +	struct xfs_da_args		*args;
>  
>  	attr = container_of(item, struct xfs_attr_item, xattri_list);
> -	name_value = ((char *)attr) + sizeof(struct xfs_attr_item);
> -
> -	error = xfs_attr_args_init(&args, attr->xattri_ip, name_value,
> -				   attr->xattri_name_len, attr->xattri_flags);
> -	if (error)
> -		goto out;
> +	args = &attr->xattri_args;
> +
> +	if (attr->xattri_state == 0) {
> +		/* Only need to initialize args context once */
> +		name_value = ((char *)attr) + sizeof(struct xfs_attr_item);
> +		error = xfs_attr_args_init(args, attr->xattri_ip, name_value,
> +					   attr->xattri_name_len,
> +					   attr->xattri_flags);
> +		if (error)
> +			goto out;
> +
> +		args->hashval = xfs_da_hashname(args->name, args->namelen);
> +		args->value = &name_value[attr->xattri_name_len];
> +		args->valuelen = attr->xattri_value_len;
> +		args->op_flags = XFS_DA_OP_OKNOENT;
> +		args->total = xfs_attr_calc_size(args, &local);
> +		attr->xattri_leaf_bp = NULL;
> +	}
>  
> -	args.hashval = xfs_da_hashname(args.name, args.namelen);
> -	args.value = &name_value[attr->xattri_name_len];
> -	args.valuelen = attr->xattri_value_len;
> -	args.op_flags = XFS_DA_OP_OKNOENT;
> -	args.total = xfs_attr_calc_size(&args, &local);
> -	args.trans = tp;
> +	/*
> +	 * Always reset trans after EAGAIN cycle
> +	 * since the transaction is new
> +	 */
> +	args->trans = tp;
>  
> -	error = xfs_trans_attr(&args, done_item,
> -			attr->xattri_op_flags);
> +	error = xfs_trans_attr(args, done_item, &attr->xattri_leaf_bp,
> +			&attr->xattri_state, attr->xattri_op_flags);
>  out:
> -	kmem_free(attr);
> +	if (error != -EAGAIN)
> +		kmem_free(attr);
> +
>  	return error;
>  }
>  
> -- 
> 2.7.4
> 


* Re: [PATCH 5/9] xfs: Add xfs_attr_set_deferred and xfs_attr_remove_deferred
  2019-04-23 13:00           ` Brian Foster
@ 2019-04-24  2:24             ` Allison Henderson
  0 siblings, 0 replies; 48+ messages in thread
From: Allison Henderson @ 2019-04-24  2:24 UTC (permalink / raw)
  To: Brian Foster; +Cc: linux-xfs



On 4/23/19 6:00 AM, Brian Foster wrote:
> On Mon, Apr 22, 2019 at 03:01:14PM -0700, Allison Henderson wrote:
>>
>>
>> On 4/22/19 4:01 AM, Brian Foster wrote:
>>> On Thu, Apr 18, 2019 at 02:28:00PM -0700, Allison Henderson wrote:
>>>> On 4/18/19 8:49 AM, Brian Foster wrote:
>>>>> On Fri, Apr 12, 2019 at 03:50:32PM -0700, Allison Henderson wrote:
>>>>>> These routines set up and start a new deferred attribute
>>>>>> operation.  These functions are meant to be called by other
>>>>>> code needing to initiate a deferred attribute operation.  We
>>>>>> will use these routines later in the parent pointer patches.
>>>>>>
>>>>>
>>>>> We probably don't need to reference the parent pointer stuff any more
>>>>> for this, right? I'm assuming we'll be converting generic attr
>>>>> infrastructure over to this mechanism in subsequent patches..?
>>>>
>>>> Right, some of these comments are a little stale.  I will clean them up a
>>>> bit.
>>>>
>>>>>
>>>>>> Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
>>>>>> ---
>>>>>>     fs/xfs/libxfs/xfs_attr.c | 80 ++++++++++++++++++++++++++++++++++++++++++++++++
>>>>>>     fs/xfs/libxfs/xfs_attr.h |  7 +++++
>>>>>>     2 files changed, 87 insertions(+)
>>>>>>
>>>>>> diff --git a/fs/xfs/libxfs/xfs_attr.c b/fs/xfs/libxfs/xfs_attr.c
>>>>>> index fadd485..c3477fa7 100644
>>>>>> --- a/fs/xfs/libxfs/xfs_attr.c
>>>>>> +++ b/fs/xfs/libxfs/xfs_attr.c
>>> ...
>>>>>> @@ -513,6 +560,39 @@ xfs_attr_remove(
>>>>>>     	return error;
>>>>>>     }
>>>>>> +/* Removes an attribute for an inode as a deferred operation */
>>>>>> +int
>>>>>> +xfs_attr_remove_deferred(
>>>>>
>>>>> Hmm.. I'm kind of wondering if we actually need to defer attr removes.
>>>>> Do we have the same kind of challenges for attr removal as for attr
>>>>> creation, or is there some future scenario where this is needed?
>>>>
>>>> I suppose we don't have to have it?  The motivation was to help break up the
>>>> amount of transaction activity that happens on inode create/rename/remove
>>>> operations once pptrs go in.  Attr remove does not look as complex as attr
>>>> set, but I suppose it helps to some degree?
>>>>
>>>
>>> Ok, this probably needs more thought. On one hand, I'm not a huge fan of
>>> using complex infrastructure where not required just because it's there.
>>> On the other, it could just be more simple to have consistency between
>>> xattr ops. As you note above, perhaps we do want the ability to defer
>>> xattr removes so we can use it in particular contexts (parent pointer
>>> updates) and not others (direct xattr remove requests from userspace).
>>> Perhaps the right thing to do for the time being is to continue on with
>>> the support for deferred xattr remove but don't invoke it from the
>>> direct xattr remove codepath..?
>>
>> We can do this, but it means we need to keep the "roll_trans" boolean for
>> all code paths that want to retain their original functionality, and also
>> still be able to function as a delayed operation too.
>>
>> It's not a big deal I suppose.  The remove code path does not have as many
>> uses of the boolean.  But I seem to recall people thinking that the boolean
>> was not particularly elegant, so I was careful to point out that it was
>> going away at the end of the set :-)
>>
> 
> Hmm, I was hoping we could refactor the existing code in a way that
> supports both without spreading the boolean all over the place (by
> breaking things down into smaller functional components), but poking
> deeper into the xattr codepath suggests that could get quite hairy and
> might not be worth it. I think it might be reasonable to just leave
> around enough direct functionality for operations that don't require a
> transaction roll. For example, a shortform xattr set just commits the
> transaction if it succeeds. If it fails, we could make the decision to
> defer the operation as we know we're now going to require a tx roll
> anyways. That way a direct xattr set doesn't need to be deferred for no
> reason if it wouldn't otherwise roll, while we still have the ability to
> defer an arbitrary xattr set (even if shortform) for internal things
> like parent pointers where we don't necessarily have an xattr
> transaction.
> 
> Same goes for the shortform remove operation (and perhaps others), which
> could be reused in both direct and deferred contexts because it doesn't
> appear to roll the tx. Note that we don't necessarily have to share the
> exact same xfs_attr_[set|remove]_args() function between direct and
> deferred context. A separate function in the direct path to attempt a
> direct op and then defer and another in the deferred path that covers
> pretty much everything (with fixed up -EAGAIN magic) might be easier to
> manage.

Ok, I think I understand what you're describing here.  I'll see
if I can separate out the areas that need delayed operation and factor
out more of the common code.  I usually aim to eliminate code that
duplicates function, just because more code volume tends to generate
more maintenance.  But if people feel more comfortable having both
methods, I will see if I can preserve both.
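
As a rough illustration of the "try direct, then defer" split discussed
above, here is a small userspace sketch. It is not XFS code: the
toy_* names are hypothetical, and it assumes a shortform add is the only
operation that can complete without a transaction roll.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical model of an attr fork plus a deferred-op queue counter. */
struct toy_attr_fork {
	size_t	sf_used;	/* bytes used in the shortform area */
	size_t	sf_max;		/* shortform capacity in bytes */
	int	queued;		/* deferred set operations queued */
};

/* Direct shortform set; never needs a roll in this model. */
static int toy_sf_set(struct toy_attr_fork *f, size_t len)
{
	if (f->sf_used + len > f->sf_max)
		return -ENOSPC;
	f->sf_used += len;
	return 0;
}

/* Queue the set as a deferred operation (stand-in for the real queue). */
static int toy_set_deferred(struct toy_attr_fork *f)
{
	f->queued++;
	return 0;
}

/*
 * Direct path: attempt the cheap shortform set first and only fall back
 * to the deferred machinery once we know a roll is unavoidable.
 */
static int toy_attr_set_direct(struct toy_attr_fork *f, size_t len)
{
	int error = toy_sf_set(f, len);

	if (error != -ENOSPC)
		return error;
	return toy_set_deferred(f);
}
```

Internal callers that have no xattr transaction of their own (parent
pointers, for example) would skip the direct attempt and queue the
deferred op straight away.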

> 
> All that said, if you'd rather just defer everything for now and
> potentially revisit pulling more things into the direct path later on
> then I think that's perfectly reasonable too. The existing code is
> really kind of a jumbled mess and we stand to benefit just by
> simplifying/organizing it, IMO. I think there's a reasonable argument to
> be made that we're better off working through all of the -EAGAIN stuff
> and working the direct case as an optimization from there.
> 

Alrighty then, perhaps we should focus more on how we want to reorganize
things for this EAGAIN handling first, since it might change what we
decide here.

>>>
>>> Note that if we took that approach, we could add a DEBUG option and/or
>>> an errortag to (randomly) defer xattr removes in the common path for
>>> test coverage purposes.
>>
>> Sure, that would be an easy thing to stitch in.  Once parent pointers go in,
>> delayed attrs will get a lot more exercise since they will be a part of
>> inode create/move/remove too.
>>
> 
> Note that I think this would only be warranted if there was no other way
> to invoke the deferred path directly from userspace (for testing). If we
> did a deferred fallback approach like the above or just resort to
> deferring everything, then we'll defer plenty (or all) of traditional
> xattr ops and this is probably not necessary.

Sure, I'll find a way to make sure it gets a thorough workout depending
on what we end up with.  We can always take the error tag back out
once we get to pptrs.
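
For reference, the kind of knob being discussed could look roughly like
the sketch below. This is a userspace illustration only: the toy_*
names, the frequency field and the small PRNG are all assumptions, not
the XFS errortag interface.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical error-tag state: defer roughly one op in every 'freq'. */
struct toy_errtag {
	uint32_t	seed;	/* PRNG state */
	uint32_t	freq;	/* 0 disables the tag, 1 always fires */
};

/* Small linear congruential generator, good enough for a test knob. */
static uint32_t toy_rand(struct toy_errtag *t)
{
	t->seed = t->seed * 1664525u + 1013904223u;
	return t->seed >> 16;
}

/*
 * Decide whether an attr op takes the deferred path.  Callers that must
 * defer always do; everyone else defers only when the tag fires.
 */
static bool toy_should_defer(struct toy_errtag *t, bool must_defer)
{
	if (must_defer)
		return true;
	if (t->freq == 0)
		return false;
	return toy_rand(t) % t->freq == 0;
}
```

Setting the frequency to 1 forces every op down the deferred path, which
is handy for a full test run; 0 leaves the direct path untouched.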

Fwiw, I'm trying to keep the extended pptr set stable on top of this set 
as we go along, just to make sure we don't come up with something that 
causes issues later down the road.  ATM, I'm just limiting the reviews 
to a smaller set because I know bandwidth is limited, and if we can keep 
focused here maybe we can get through the bigger picture in smaller 
chunks :-)

Thx for the feedback!
Allison
> 
> Brian
> 
>> Allison
>>
>>>
>>> Brian
>>>
>>>>>
>>>>>> +	struct xfs_inode        *dp,
>>>>>> +	struct xfs_trans	*tp,
>>>>>> +	const unsigned char	*name,
>>>>>> +	unsigned int		namelen,
>>>>>> +	int                     flags)
>>>>>> +{
>>>>>> +
>>>>>> +	struct xfs_attr_item	*new;
>>>>>> +	char			*name_value;
>>>>>> +
>>>>>> +	if (!namelen) {
>>>>>> +		ASSERT(0);
>>>>>> +		return -EFSCORRUPTED;
>>>>>
>>>>> Similar comment around -EFSCORRUPTED vs. -EINVAL (or something else..).
>>>> Ok, I will change to EINVAL here too.
>>>>
>>>> Thanks again for the reviews!!  They are very helpful!
>>>>
>>>> Allison
>>>>>
>>>>> Brian
>>>>>
>>>>>> +	}
>>>>>> +
>>>>>> +	new = kmem_alloc(XFS_ATTR_ITEM_SIZEOF(namelen, 0), KM_SLEEP|KM_NOFS);
>>>>>> +	name_value = ((char *)new) + sizeof(struct xfs_attr_item);
>>>>>> +	memset(new, 0, XFS_ATTR_ITEM_SIZEOF(namelen, 0));
>>>>>> +	new->xattri_ip = dp;
>>>>>> +	new->xattri_op_flags = XFS_ATTR_OP_FLAGS_REMOVE;
>>>>>> +	new->xattri_name_len = namelen;
>>>>>> +	new->xattri_value_len = 0;
>>>>>> +	new->xattri_flags = flags;
>>>>>> +	memcpy(name_value, name, namelen);
>>>>>> +
>>>>>> +	xfs_defer_add(tp, XFS_DEFER_OPS_TYPE_ATTR, &new->xattri_list);
>>>>>> +
>>>>>> +	return 0;
>>>>>> +}
>>>>>> +
>>>>>>     /*========================================================================
>>>>>>      * External routines when attribute list is inside the inode
>>>>>>      *========================================================================*/
>>>>>> diff --git a/fs/xfs/libxfs/xfs_attr.h b/fs/xfs/libxfs/xfs_attr.h
>>>>>> index 92d9a15..83b3621 100644
>>>>>> --- a/fs/xfs/libxfs/xfs_attr.h
>>>>>> +++ b/fs/xfs/libxfs/xfs_attr.h
>>>>>> @@ -175,5 +175,12 @@ bool xfs_attr_namecheck(const void *name, size_t length);
>>>>>>     int xfs_attr_args_init(struct xfs_da_args *args, struct xfs_inode *dp,
>>>>>>     			const unsigned char *name, size_t namelen, int flags);
>>>>>>     int xfs_attr_calc_size(struct xfs_da_args *args, int *local);
>>>>>> +int xfs_attr_set_deferred(struct xfs_inode *dp, struct xfs_trans *tp,
>>>>>> +			  const unsigned char *name, unsigned int name_len,
>>>>>> +			  const unsigned char *value, unsigned int valuelen,
>>>>>> +			  int flags);
>>>>>> +int xfs_attr_remove_deferred(struct xfs_inode *dp, struct xfs_trans *tp,
>>>>>> +			    const unsigned char *name, unsigned int namelen,
>>>>>> +			    int flags);
>>>>>>     #endif	/* __XFS_ATTR_H__ */
>>>>>> -- 
>>>>>> 2.7.4
>>>>>>


* Re: [PATCH 8/9] xfs: Roll delayed attr operations by returning EAGAIN
  2019-04-23 14:19   ` Brian Foster
@ 2019-04-24  2:24     ` Allison Henderson
  0 siblings, 0 replies; 48+ messages in thread
From: Allison Henderson @ 2019-04-24  2:24 UTC (permalink / raw)
  To: Brian Foster; +Cc: linux-xfs



On 4/23/19 7:19 AM, Brian Foster wrote:
> On Fri, Apr 12, 2019 at 03:50:35PM -0700, Allison Henderson wrote:
>> This patch modifies xfs_attr_set_args to return -EAGAIN
>> when a transaction needs to be rolled.  All functions
>> currently calling xfs_attr_set_args are modified to use
>> the deferred attr operation, or handle the -EAGAIN return
>> code
>>
>> Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
>> ---
>>   fs/xfs/libxfs/xfs_attr.c | 62 ++++++++++++++++++++++++++++++++++++++++--------
>>   fs/xfs/libxfs/xfs_attr.h |  2 +-
>>   fs/xfs/xfs_attr_item.c   | 41 +++++++++++++++++++++++++++-----
>>   fs/xfs/xfs_trans.h       |  2 ++
>>   fs/xfs/xfs_trans_attr.c  | 56 +++++++++++++++++++++++++------------------
>>   5 files changed, 123 insertions(+), 40 deletions(-)
>>
>> diff --git a/fs/xfs/libxfs/xfs_attr.c b/fs/xfs/libxfs/xfs_attr.c
>> index 0042708..4ddd86b 100644
>> --- a/fs/xfs/libxfs/xfs_attr.c
>> +++ b/fs/xfs/libxfs/xfs_attr.c
>> @@ -236,10 +236,37 @@ int
>>   xfs_attr_set_args(
>>   	struct xfs_da_args	*args,
>>   	struct xfs_buf          **leaf_bp,
>> +	enum xfs_attr_state	*state,
>>   	bool			roll_trans)
>>   {
>>   	struct xfs_inode	*dp = args->dp;
>>   	int			error = 0;
>> +	int			sf_size;
>> +
>> +	switch (*state) {
>> +	case (XFS_ATTR_STATE1):
>> +		goto state1;
>> +	case (XFS_ATTR_STATE2):
>> +		goto state2;
>> +	case (XFS_ATTR_STATE3):
>> +		goto state3;
>> +	}
>> +
>> +	/*
>> +	 * New inodes may not have an attribute fork yet. So set the attribute
>> +	 * fork appropriately
>> +	 */
>> +	if (XFS_IFORK_Q((args->dp)) == 0) {
>> +		sf_size = sizeof(struct xfs_attr_sf_hdr) +
>> +		     XFS_ATTR_SF_ENTSIZE_BYNAME(args->namelen, args->valuelen);
>> +		xfs_bmap_set_attrforkoff(args->dp, sf_size, NULL);
>> +		args->dp->i_afp = kmem_zone_zalloc(xfs_ifork_zone, KM_SLEEP);
>> +		args->dp->i_afp->if_flags = XFS_IFEXTENTS;
>> +	}
>> +
>> +	*state = XFS_ATTR_STATE1;
>> +	return -EAGAIN;
> 
> As noted previously, this return seems unnecessary since we've not done
> anything in the transaction to this point.
> 
>> +state1:
>>   
>>   	/*
>>   	 * If the attribute list is non-existent or a shortform list,
>> @@ -248,7 +275,6 @@ xfs_attr_set_args(
>>   	if (dp->i_d.di_aformat == XFS_DINODE_FMT_LOCAL ||
>>   	    (dp->i_d.di_aformat == XFS_DINODE_FMT_EXTENTS &&
>>   	     dp->i_d.di_anextents == 0)) {
>> -
>>   		/*
>>   		 * Build initial attribute list (if required).
>>   		 */
>> @@ -262,6 +288,9 @@ xfs_attr_set_args(
>>   		if (error != -ENOSPC)
>>   			return error;
>>   
>> +		*state = XFS_ATTR_STATE2;
>> +		return -EAGAIN;
>> +state2:
> 
> Here we've failed the sf add but not yet done the conversion, which
> means we've still not done anything in the transaction. I suspect we
> should probably convert to leaf and then return -EAGAIN.
> 
>>   		/*
>>   		 * It won't fit in the shortform, transform to a leaf block.
>>   		 * GROT: another possible req'mt for a double-split btree op.
>> @@ -270,14 +299,14 @@ xfs_attr_set_args(
>>   		if (error)
>>   			return error;
>>   
>> -		if (roll_trans) {
>> -			/*
>> -			 * Prevent the leaf buffer from being unlocked so that a
>> -			 * concurrent AIL push cannot grab the half-baked leaf
>> -			 * buffer and run into problems with the write verifier.
>> -			 */
>> -			xfs_trans_bhold(args->trans, *leaf_bp);
>> +		/*
>> +		 * Prevent the leaf buffer from being unlocked so that a
>> +		 * concurrent AIL push cannot grab the half-baked leaf
>> +		 * buffer and run into problems with the write verifier.
>> +		 */
>> +		xfs_trans_bhold(args->trans, *leaf_bp);
>>   
>> +		if (roll_trans) {
>>   			error = xfs_defer_finish(&args->trans);
>>   			if (error)
>>   				return error;
>> @@ -293,6 +322,12 @@ xfs_attr_set_args(
>>   			xfs_trans_bjoin(args->trans, *leaf_bp);
>>   			*leaf_bp = NULL;
>>   		}
>> +
>> +		*state = XFS_ATTR_STATE3;
>> +		return -EAGAIN;
>> +state3:
> 
> Hmm, and this appears to be the last place we return -EAGAIN from the
> set code. Am I following this correctly that we basically expect any of
> the other rolls down in xfs_attr_[leaf|node]_addname() to go away in
> deferred context? If so, why is that?

Well, it looks like the calling function (xfs_defer_finish_noroll) takes
care of logging and rolling the transaction for us, so we must not do
that a second time down here.  The same routines are also used later
during log recovery.
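
Roughly, the division of labour looks like the sketch below. It is a
userspace model, not the dfops code: the toy_* names are hypothetical,
and bumping an id stands in for rolling the transaction. The finish
routine only does one step and reports -EAGAIN; the driver is the single
place that rolls, and the item keeps its context across rolls.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical transaction: each roll hands back a new id. */
struct toy_trans {
	int	id;
};

/* Hypothetical deferred item whose context survives across rolls. */
struct toy_item {
	int	steps_left;	/* work remaining */
	int	last_trans_id;	/* transaction that ran the last step */
};

/* Finish routine: do one step, never roll; -EAGAIN means "call again". */
static int toy_finish_item(struct toy_trans *tp, struct toy_item *item)
{
	item->last_trans_id = tp->id;	/* always pick up the new trans */
	if (item->steps_left > 0)
		item->steps_left--;
	return item->steps_left > 0 ? -EAGAIN : 0;
}

/* Driver: the only place that rolls, so nothing is rolled twice. */
static int toy_defer_finish(struct toy_trans *tp, struct toy_item *item)
{
	int error;

	for (;;) {
		error = toy_finish_item(tp, item);
		if (error != -EAGAIN)
			return error;
		tp->id++;	/* stand-in for rolling the transaction */
	}
}
```

A recovery path can reuse the same finish routine with its own loop, as
long as it re-joins whatever the new transaction needs after each roll.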

> 
> That aside, I'm wondering whether we need the whole state thing to track
> this. For example, why not have a high level flow as something like the
> following?
> 
> xfs_attr_set_args()
> {
> 	...
> 	if (local format) {
> 		error = xfs_attr_try_sf_addname(dp, args);
> 		if (error == -ENOSPC) {
> 			error = xfs_attr_shortform_to_leaf(args, leaf_bp);
> 			return -EAGAIN;
> 		} else
> 			return error;
> 	} else if (xfs_bmap_one_block(dp, XFS_ATTR_FORK)) {
> 		error = xfs_attr_leaf_addname(args);
> 	} else {
> 		error = xfs_attr_node_addname(args);
> 	}
> }
> 
> Of course, this may need further changes if we do end up incorporating
> the rolls down in the leaf/node functions. Perhaps we could pull apart
> those functions such that we -EAGAIN on the conversions required to
> address -ENOSPC returns. That might provide a natural boundary to
> re-enter the top-level function without the need for a state machine, at
> least for any rolls that occur before we actually do an attr op
> (post-op rolls may very well require more state to incorporate).
> Thoughts?
> 

Alrighty, I think that sounds like a good approach.  I'll try to
refactor this area as you suggest and see where I get.  Maybe I can
isolate the areas that handle the transactions as we talked about
earlier too.

Thanks again for the reviews!  They are very helpful!
Allison

> Brian
> 
>> +		if (*leaf_bp != NULL)
>> +			xfs_trans_brelse(args->trans, *leaf_bp);
>>   	}
>>   
>>   	if (xfs_bmap_one_block(dp, XFS_ATTR_FORK))
>> @@ -419,7 +454,9 @@ xfs_attr_set(
>>   		goto out_trans_cancel;
>>   
>>   	xfs_trans_ijoin(args.trans, dp, 0);
>> -	error = xfs_attr_set_args(&args, &leaf_bp, true);
>> +
>> +	error = xfs_attr_set_deferred(dp, args.trans, name, namelen,
>> +			value, valuelen, flags);
>>   	if (error)
>>   		goto out_release_leaf;
>>   	if (!args.trans) {
>> @@ -554,8 +591,13 @@ xfs_attr_remove(
>>   	 */
>>   	xfs_trans_ijoin(args.trans, dp, 0);
>>   
>> -	error = xfs_attr_remove_args(&args, true);
>> +	error = xfs_has_attr(&args);
>> +	if (error)
>> +		goto out;
>> +
>>   
>> +	error = xfs_attr_remove_deferred(dp, args.trans,
>> +			name, namelen, flags);
>>   	if (error)
>>   		goto out;
>>   
>> diff --git a/fs/xfs/libxfs/xfs_attr.h b/fs/xfs/libxfs/xfs_attr.h
>> index 4ce3b0a..da95e69 100644
>> --- a/fs/xfs/libxfs/xfs_attr.h
>> +++ b/fs/xfs/libxfs/xfs_attr.h
>> @@ -181,7 +181,7 @@ int xfs_attr_set(struct xfs_inode *dp, const unsigned char *name,
>>   		 size_t namelen, unsigned char *value, int valuelen,
>>   		 int flags);
>>   int xfs_attr_set_args(struct xfs_da_args *args, struct xfs_buf **leaf_bp,
>> -		 bool roll_trans);
>> +		 enum xfs_attr_state *state, bool roll_trans);
>>   int xfs_attr_remove(struct xfs_inode *dp, const unsigned char *name,
>>   		    size_t namelen, int flags);
>>   int xfs_has_attr(struct xfs_da_args *args);
>> diff --git a/fs/xfs/xfs_attr_item.c b/fs/xfs/xfs_attr_item.c
>> index 36e6d1e..292d608 100644
>> --- a/fs/xfs/xfs_attr_item.c
>> +++ b/fs/xfs/xfs_attr_item.c
>> @@ -464,8 +464,11 @@ xfs_attri_recover(
>>   	struct xfs_attri_log_format	*attrp;
>>   	struct xfs_trans_res		tres;
>>   	int				local;
>> -	int				error = 0;
>> +	int				error, err2 = 0;
>>   	int				rsvd = 0;
>> +	enum xfs_attr_state		state = 0;
>> +	struct xfs_buf			*leaf_bp = NULL;
>> +
>>   
>>   	ASSERT(!test_bit(XFS_ATTRI_RECOVERED, &attrip->flags));
>>   
>> @@ -540,14 +543,40 @@ xfs_attri_recover(
>>   	xfs_ilock(ip, XFS_ILOCK_EXCL);
>>   
>>   	xfs_trans_ijoin(args.trans, ip, 0);
>> -	error = xfs_trans_attr(&args, attrdp, attrp->alfi_op_flags);
>> -	if (error)
>> -		goto abort_error;
>>   
>> +	do {
>> +		leaf_bp = NULL;
>> +
>> +		error = xfs_trans_attr(&args, attrdp, &leaf_bp, &state,
>> +				attrp->alfi_op_flags);
>> +		if (error && error != -EAGAIN)
>> +			goto abort_error;
>> +
>> +		xfs_trans_log_inode(args.trans, ip,
>> +				XFS_ILOG_CORE | XFS_ILOG_ADATA);
>> +
>> +		err2 = xfs_trans_commit(args.trans);
>> +		if (err2) {
>> +			error = err2;
>> +			goto abort_error;
>> +		}
>> +
>> +		if (error == -EAGAIN) {
>> +			err2 = xfs_trans_alloc(mp, &tres, args.total, 0,
>> +				XFS_TRANS_PERM_LOG_RES, &args.trans);
>> +			if (err2) {
>> +				error = err2;
>> +				goto abort_error;
>> +			}
>> +			xfs_trans_ijoin(args.trans, ip, 0);
>> +		}
>> +
>> +	} while (error == -EAGAIN);
>> +
>> +	if (leaf_bp)
>> +		xfs_trans_brelse(args.trans, leaf_bp);
>>   
>>   	set_bit(XFS_ATTRI_RECOVERED, &attrip->flags);
>> -	xfs_trans_log_inode(args.trans, ip, XFS_ILOG_CORE | XFS_ILOG_ADATA);
>> -	error = xfs_trans_commit(args.trans);
>>   	xfs_iunlock(ip, XFS_ILOCK_EXCL);
>>   	return error;
>>   
>> diff --git a/fs/xfs/xfs_trans.h b/fs/xfs/xfs_trans.h
>> index 7bb9d8e..c785cd7 100644
>> --- a/fs/xfs/xfs_trans.h
>> +++ b/fs/xfs/xfs_trans.h
>> @@ -239,6 +239,8 @@ xfs_trans_get_attrd(struct xfs_trans *tp,
>>   		    struct xfs_attri_log_item *attrip);
>>   int xfs_trans_attr(struct xfs_da_args *args,
>>   		   struct xfs_attrd_log_item *attrdp,
>> +		   struct xfs_buf **leaf_bp,
>> +		   void *state,
>>   		   uint32_t attr_op_flags);
>>   
>>   int		xfs_trans_commit(struct xfs_trans *);
>> diff --git a/fs/xfs/xfs_trans_attr.c b/fs/xfs/xfs_trans_attr.c
>> index 3679348..a3339ea 100644
>> --- a/fs/xfs/xfs_trans_attr.c
>> +++ b/fs/xfs/xfs_trans_attr.c
>> @@ -56,10 +56,11 @@ int
>>   xfs_trans_attr(
>>   	struct xfs_da_args		*args,
>>   	struct xfs_attrd_log_item	*attrdp,
>> +	struct xfs_buf			**leaf_bp,
>> +	void				*state,
>>   	uint32_t			op_flags)
>>   {
>>   	int				error;
>> -	struct xfs_buf			*leaf_bp = NULL;
>>   
>>   	error = xfs_qm_dqattach_locked(args->dp, 0);
>>   	if (error)
>> @@ -68,7 +69,8 @@ xfs_trans_attr(
>>   	switch (op_flags) {
>>   	case XFS_ATTR_OP_FLAGS_SET:
>>   		args->op_flags |= XFS_DA_OP_ADDNAME;
>> -		error = xfs_attr_set_args(args, &leaf_bp, false);
>> +		error = xfs_attr_set_args(args, leaf_bp,
>> +				(enum xfs_attr_state *)state, false);
>>   		break;
>>   	case XFS_ATTR_OP_FLAGS_REMOVE:
>>   		ASSERT(XFS_IFORK_Q((args->dp)));
>> @@ -78,11 +80,6 @@ xfs_trans_attr(
>>   		error = -EFSCORRUPTED;
>>   	}
>>   
>> -	if (error) {
>> -		if (leaf_bp)
>> -			xfs_trans_brelse(args->trans, leaf_bp);
>> -	}
>> -
>>   	/*
>>   	 * Mark the transaction dirty, even on error. This ensures the
>>   	 * transaction is aborted, which:
>> @@ -184,27 +181,40 @@ xfs_attr_finish_item(
>>   	char				*name_value;
>>   	int				error;
>>   	int				local;
>> -	struct xfs_da_args		args;
>> +	struct xfs_da_args		*args;
>>   
>>   	attr = container_of(item, struct xfs_attr_item, xattri_list);
>> -	name_value = ((char *)attr) + sizeof(struct xfs_attr_item);
>> -
>> -	error = xfs_attr_args_init(&args, attr->xattri_ip, name_value,
>> -				   attr->xattri_name_len, attr->xattri_flags);
>> -	if (error)
>> -		goto out;
>> +	args = &attr->xattri_args;
>> +
>> +	if (attr->xattri_state == 0) {
>> +		/* Only need to initialize args context once */
>> +		name_value = ((char *)attr) + sizeof(struct xfs_attr_item);
>> +		error = xfs_attr_args_init(args, attr->xattri_ip, name_value,
>> +					   attr->xattri_name_len,
>> +					   attr->xattri_flags);
>> +		if (error)
>> +			goto out;
>> +
>> +		args->hashval = xfs_da_hashname(args->name, args->namelen);
>> +		args->value = &name_value[attr->xattri_name_len];
>> +		args->valuelen = attr->xattri_value_len;
>> +		args->op_flags = XFS_DA_OP_OKNOENT;
>> +		args->total = xfs_attr_calc_size(args, &local);
>> +		attr->xattri_leaf_bp = NULL;
>> +	}
>>   
>> -	args.hashval = xfs_da_hashname(args.name, args.namelen);
>> -	args.value = &name_value[attr->xattri_name_len];
>> -	args.valuelen = attr->xattri_value_len;
>> -	args.op_flags = XFS_DA_OP_OKNOENT;
>> -	args.total = xfs_attr_calc_size(&args, &local);
>> -	args.trans = tp;
>> +	/*
>> +	 * Always reset trans after EAGAIN cycle
>> +	 * since the transaction is new
>> +	 */
>> +	args->trans = tp;
>>   
>> -	error = xfs_trans_attr(&args, done_item,
>> -			attr->xattri_op_flags);
>> +	error = xfs_trans_attr(args, done_item, &attr->xattri_leaf_bp,
>> +			&attr->xattri_state, attr->xattri_op_flags);
>>   out:
>> -	kmem_free(attr);
>> +	if (error != -EAGAIN)
>> +		kmem_free(attr);
>> +
>>   	return error;
>>   }
>>   
>> -- 
>> 2.7.4
>>


* Re: [PATCH 7/9] xfs: Add attr context to log item
  2019-04-23 13:20       ` Brian Foster
@ 2019-04-24  2:24         ` Allison Henderson
  2019-04-24  4:10           ` Darrick J. Wong
  0 siblings, 1 reply; 48+ messages in thread
From: Allison Henderson @ 2019-04-24  2:24 UTC (permalink / raw)
  To: Brian Foster; +Cc: linux-xfs


On 4/23/19 6:20 AM, Brian Foster wrote:
> On Mon, Apr 22, 2019 at 03:01:27PM -0700, Allison Henderson wrote:
>>
>>
>> On 4/22/19 6:03 AM, Brian Foster wrote:
>>> On Fri, Apr 12, 2019 at 03:50:34PM -0700, Allison Henderson wrote:
>>>> This patch modifies xfs_attr_item to store a xfs_da_args, a xfs_buf pointer
>>>> and a new state type. We will use these in the next patch when
>>>> we modify xfs_set_attr_args to roll transactions by returning EAGAIN.
>>>> Because the subroutines of this function modify the contents of these
>>>> structures, we need to find a place to store them where they remain
>>>> instantiated across multiple calls to xfs_set_attr_args.
>>>>
>>>> Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
>>>> ---
>>>
>>> I see Darrick has already commented on the whole state thing. I'll
>>> probably have to grok the next patch to comment further, but just a
>>> couple initial thoughts:
>>>
>>> First, I hit a build failure with this patch. It looks like there's a
>>> missed include in the scrub code:
>>>
>>>     ...
>>>     CC [M]  fs/xfs/scrub/repair.o
>>> In file included from fs/xfs/scrub/repair.c:32:
>>> fs/xfs/libxfs/xfs_attr.h:105:21: error: field ‘xattri_args’ has incomplete type
>>>     struct xfs_da_args xattri_args;   /* args context */
>> Hmm, ok.  I'll get that corrected, I probably need to clean out my workspace
>> and build from scratch.
>>
>>>     ...
>>>
>>> Second, the commit log suggests that the states will reflect the current
>>> transaction roll points (i.e., establishing re-entry points down in
>>> xfs_attr_set_args(). I'm kind of wondering if we should break these
>>> xattr set sub-sequences down into smaller helper functions (refactoring
>>> the existing code as we go) such that the mechanism could technically be
>>> used deferred or not. Re: the previous thought on whether to defer xattr
>>> removes or not, there might also be cases where there's not a need to
>>> defer xattr sets.
>>>
>>> E.g., taking a quick peek into the next patch, the state 1 case in
>>> xfs_attr_try_sf_addname() is actually a transaction commit, which I
>>> think means we're done. We'd have done an attr memory allocation,
>>> deferred op and transaction roll where none was necessary so it might
>>> not be worth it to defer in that scenario. Hmm, it also looks like we
>>> return -EAGAIN in places where we've not actually done any work, like if
>>> a shortform add attempt returns -ENOSPC (or the -EAGAIN return before we
>>> even attempt the sf add). That kind of looks like a waste of transaction
>>> rolls and further suggests it might be cleaner to break this whole path
>>> down into helpers and put it back together in a way more conducive to
>>> deferred operations.
>>
>> Yes, this area is a bit of a wart the way it is right now.  I think you're
>> right in that ultimately we may end up having to do a lot of refactoring in
>> order to have more efficient "re-entry points".  The state machine is hard
>> to get into subroutines, so it's limited in use in the top level function.
>>
>> I was also starting to wonder if maybe I could do some refactoring in
>> xfs_defer_finish_noroll to capture the common code associated with the
>> -EAGAIN handling.  Then maybe we could make a function pointer that we can
>> pass through the finish_item interface.  The idea being that subroutines
>> could use the function pointer to cycle out the transaction when needed
>> instead of having to record states and back out like this. It'd be a new
>> parameter to pipe around, but it'd be more efficient than the state machine,
>> and less surgery in the refactor.  And maybe a blessing to any other
>> operations that might need to go through this transition in the future.
>> Thoughts?
>>
> 
> That's an interesting idea. It still strikes me as a bit of a
> fallback/hack as opposed to organizing the code to properly fit into the
> dfops infrastructure, but it could be useful as a transient solution.
>  From a high level, it looks like we'd have to create a new intent, relog
> this item and all remaining items associated with the dfp to it, roll
> the tx, and finally create a done item associated with the intent in the
> new tx. You'd need access to the dfp for some of that, so it's not
> immediately clear to me that this ends up much easier than fixing up
> the xattr code.
> 
> BTW, if we did end up with something like that I'd probably prefer to
> see it as an exported dfops helper function as opposed to a function
> pointer being passed around, if possible.
> 

Alrighty, I think for now I may try to pursue something more like what 
you proposed in the next patch and see where I get first.  Maybe I'll 
come back to this later if for some reason it doesn't work out, but I 
think what you have there is reasonable.

Thanks again for the reviews!
Allison

> Brian
> 
>> Thanks again for the reviews!
>>
>> Allison
>>
>>>
>>> Brian
>>>
>>>
>>>>    fs/xfs/libxfs/xfs_attr.h | 18 +++++++++++++++++-
>>>>    fs/xfs/scrub/common.c    |  2 ++
>>>>    fs/xfs/xfs_acl.c         |  2 ++
>>>>    fs/xfs/xfs_attr_item.c   |  2 +-
>>>>    fs/xfs/xfs_ioctl.c       |  2 ++
>>>>    fs/xfs/xfs_ioctl32.c     |  2 ++
>>>>    fs/xfs/xfs_iops.c        |  1 +
>>>>    fs/xfs/xfs_xattr.c       |  1 +
>>>>    8 files changed, 28 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/fs/xfs/libxfs/xfs_attr.h b/fs/xfs/libxfs/xfs_attr.h
>>>> index 974c963..4ce3b0a 100644
>>>> --- a/fs/xfs/libxfs/xfs_attr.h
>>>> +++ b/fs/xfs/libxfs/xfs_attr.h
>>>> @@ -77,6 +77,13 @@ typedef struct attrlist_ent {	/* data from attr_list() */
>>>>    	char	a_name[1];	/* attr name (NULL terminated) */
>>>>    } attrlist_ent_t;
>>>> +/* Attr state machine types */
>>>> +enum xfs_attr_state {
>>>> +	XFS_ATTR_STATE1 = 1,
>>>> +	XFS_ATTR_STATE2 = 2,
>>>> +	XFS_ATTR_STATE3 = 3,
>>>> +};
>>>> +
>>>>    /*
>>>>     * List of attrs to commit later.
>>>>     */
>>>> @@ -88,7 +95,16 @@ struct xfs_attr_item {
>>>>    	void		  *xattri_name;	      /* attr name */
>>>>    	uint32_t	  xattri_name_len;    /* length of name */
>>>>    	uint32_t	  xattri_flags;       /* attr flags */
>>>> -	struct list_head  xattri_list;
>>>> +
>>>> +	/*
>>>> +	 * Delayed attr parameters that need to remain instantiated
>>>> +	 * across transaction rolls during the defer finish
>>>> +	 */
>>>> +	struct xfs_buf		*xattri_leaf_bp;  /* Leaf buf to release */
>>>> +	enum xfs_attr_state	xattri_state;	  /* state machine marker */
>>>> +	struct xfs_da_args	xattri_args;	  /* args context */
>>>> +
>>>> +	struct list_head	xattri_list;
>>>>    	/*
>>>>    	 * A byte array follows the header containing the file name and
>>>> diff --git a/fs/xfs/scrub/common.c b/fs/xfs/scrub/common.c
>>>> index 0c54ff5..270c32e 100644
>>>> --- a/fs/xfs/scrub/common.c
>>>> +++ b/fs/xfs/scrub/common.c
>>>> @@ -30,6 +30,8 @@
>>>>    #include "xfs_rmap_btree.h"
>>>>    #include "xfs_log.h"
>>>>    #include "xfs_trans_priv.h"
>>>> +#include "xfs_da_format.h"
>>>> +#include "xfs_da_btree.h"
>>>>    #include "xfs_attr.h"
>>>>    #include "xfs_reflink.h"
>>>>    #include "scrub/xfs_scrub.h"
>>>> diff --git a/fs/xfs/xfs_acl.c b/fs/xfs/xfs_acl.c
>>>> index 142de8d..9b1b93e 100644
>>>> --- a/fs/xfs/xfs_acl.c
>>>> +++ b/fs/xfs/xfs_acl.c
>>>> @@ -10,6 +10,8 @@
>>>>    #include "xfs_mount.h"
>>>>    #include "xfs_inode.h"
>>>>    #include "xfs_acl.h"
>>>> +#include "xfs_da_format.h"
>>>> +#include "xfs_da_btree.h"
>>>>    #include "xfs_attr.h"
>>>>    #include "xfs_trace.h"
>>>>    #include <linux/slab.h>
>>>> diff --git a/fs/xfs/xfs_attr_item.c b/fs/xfs/xfs_attr_item.c
>>>> index 0ea19b4..36e6d1e 100644
>>>> --- a/fs/xfs/xfs_attr_item.c
>>>> +++ b/fs/xfs/xfs_attr_item.c
>>>> @@ -19,10 +19,10 @@
>>>>    #include "xfs_rmap.h"
>>>>    #include "xfs_inode.h"
>>>>    #include "xfs_icache.h"
>>>> -#include "xfs_attr.h"
>>>>    #include "xfs_shared.h"
>>>>    #include "xfs_da_format.h"
>>>>    #include "xfs_da_btree.h"
>>>> +#include "xfs_attr.h"
>>>>    static inline struct xfs_attri_log_item *ATTRI_ITEM(struct xfs_log_item *lip)
>>>>    {
>>>> diff --git a/fs/xfs/xfs_ioctl.c b/fs/xfs/xfs_ioctl.c
>>>> index ab341d6..c8728ca 100644
>>>> --- a/fs/xfs/xfs_ioctl.c
>>>> +++ b/fs/xfs/xfs_ioctl.c
>>>> @@ -16,6 +16,8 @@
>>>>    #include "xfs_rtalloc.h"
>>>>    #include "xfs_itable.h"
>>>>    #include "xfs_error.h"
>>>> +#include "xfs_da_format.h"
>>>> +#include "xfs_da_btree.h"
>>>>    #include "xfs_attr.h"
>>>>    #include "xfs_bmap.h"
>>>>    #include "xfs_bmap_util.h"
>>>> diff --git a/fs/xfs/xfs_ioctl32.c b/fs/xfs/xfs_ioctl32.c
>>>> index 5001dca..23f6990 100644
>>>> --- a/fs/xfs/xfs_ioctl32.c
>>>> +++ b/fs/xfs/xfs_ioctl32.c
>>>> @@ -21,6 +21,8 @@
>>>>    #include "xfs_fsops.h"
>>>>    #include "xfs_alloc.h"
>>>>    #include "xfs_rtalloc.h"
>>>> +#include "xfs_da_format.h"
>>>> +#include "xfs_da_btree.h"
>>>>    #include "xfs_attr.h"
>>>>    #include "xfs_ioctl.h"
>>>>    #include "xfs_ioctl32.h"
>>>> diff --git a/fs/xfs/xfs_iops.c b/fs/xfs/xfs_iops.c
>>>> index e73c21a..561c467 100644
>>>> --- a/fs/xfs/xfs_iops.c
>>>> +++ b/fs/xfs/xfs_iops.c
>>>> @@ -17,6 +17,7 @@
>>>>    #include "xfs_acl.h"
>>>>    #include "xfs_quota.h"
>>>>    #include "xfs_error.h"
>>>> +#include "xfs_da_btree.h"
>>>>    #include "xfs_attr.h"
>>>>    #include "xfs_trans.h"
>>>>    #include "xfs_trace.h"
>>>> diff --git a/fs/xfs/xfs_xattr.c b/fs/xfs/xfs_xattr.c
>>>> index 3013746..938e81d 100644
>>>> --- a/fs/xfs/xfs_xattr.c
>>>> +++ b/fs/xfs/xfs_xattr.c
>>>> @@ -11,6 +11,7 @@
>>>>    #include "xfs_mount.h"
>>>>    #include "xfs_da_format.h"
>>>>    #include "xfs_inode.h"
>>>> +#include "xfs_da_btree.h"
>>>>    #include "xfs_attr.h"
>>>>    #include "xfs_attr_leaf.h"
>>>>    #include "xfs_acl.h"
>>>> -- 
>>>> 2.7.4
>>>>


* Re: [PATCH 7/9] xfs: Add attr context to log item
  2019-04-24  2:24         ` Allison Henderson
@ 2019-04-24  4:10           ` Darrick J. Wong
  2019-04-24 12:17             ` Brian Foster
  0 siblings, 1 reply; 48+ messages in thread
From: Darrick J. Wong @ 2019-04-24  4:10 UTC (permalink / raw)
  To: Allison Henderson; +Cc: Brian Foster, linux-xfs

Sorry I'm late back to the party...

On Tue, Apr 23, 2019 at 07:24:40PM -0700, Allison Henderson wrote:
> 
> On 4/23/19 6:20 AM, Brian Foster wrote:
> > On Mon, Apr 22, 2019 at 03:01:27PM -0700, Allison Henderson wrote:
> > > 
> > > 
> > > On 4/22/19 6:03 AM, Brian Foster wrote:
> > > > On Fri, Apr 12, 2019 at 03:50:34PM -0700, Allison Henderson wrote:
> > > > > This patch modifies xfs_attr_item to store a xfs_da_args, a xfs_buf pointer
> > > > > and a new state type. We will use these in the next patch when
> > > > > we modify xfs_set_attr_args to roll transactions by returning EAGAIN.
> > > > > Because the subroutines of this function modify the contents of these
> > > > > structures, we need to find a place to store them where they remain
> > > > > instantiated across multiple calls to xfs_set_attr_args.
> > > > > 
> > > > > Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
> > > > > ---
> > > > 
> > > > I see Darrick has already commented on the whole state thing. I'll
> > > > probably have to grok the next patch to comment further, but just a
> > > > couple initial thoughts:
> > > > 
> > > > First, I hit a build failure with this patch. It looks like there's a
> > > > missed include in the scrub code:
> > > > 
> > > >     ...
> > > >     CC [M]  fs/xfs/scrub/repair.o
> > > > In file included from fs/xfs/scrub/repair.c:32:
> > > > fs/xfs/libxfs/xfs_attr.h:105:21: error: field ‘xattri_args’ has incomplete type
> > > >     struct xfs_da_args xattri_args;   /* args context */
> > > Hmm, ok.  I'll get that corrected, I probably need to clean out my workspace
> > > and build from scratch.
> > > 
> > > >     ...
> > > > 
> > > > Second, the commit log suggests that the states will reflect the current
> > > > transaction roll points (i.e., establishing re-entry points down in
> > > > xfs_attr_set_args(). I'm kind of wondering if we should break these
> > > > xattr set sub-sequences down into smaller helper functions (refactoring
> > > > the existing code as we go) such that the mechanism could technically be

I had had the thought of "why not just give each step of setting an
attribute its own log item, so we don't have to have this STATE_NNN
business?" but then realized that will generate an insane amount of
boilerplate, and you're already close to a better solution, so I shut up
to think harder. :)

> > > > used deferred or not. Re: the previous thought on whether to defer xattr
> > > > removes or not, there might also be cases where there's not a need to
> > > > defer xattr sets.
> > > > 
> > > > E.g., taking a quick peek into the next patch, the state 1 case in
> > > > xfs_attr_try_sf_addname() is actually a transaction commit, which I
> > > > think means we're done. We'd have done an attr memory allocation,
> > > > deferred op and transaction roll where none was necessary so it might
> > > > not be worth it to defer in that scenario. Hmm, it also looks like we
> > > > return -EAGAIN in places where we've not actually done any work, like if
> > > > a shortform add attempt returns -ENOSPC (or the -EAGAIN return before we
> > > > even attempt the sf add). That kind of looks like a waste of transaction
> > > > rolls and further suggests it might be cleaner to break this whole path
> > > > down into helpers and put it back together in a way more conducive to
> > > > deferred operations.

Er, agreed:

> > > Yes, this area is a bit of a wart the way it is right now.  I think you're
> > > right in that ultimately we may end up having to do a lot of refactoring in
> > > order to have more efficient "re-entry points".  The state machine is hard
> > > to get into subroutines, so it's limited in use in the top level function.

So my current understanding of the problem is that we have this big old
xfs_attr_set_args function that handles several responsibilities, each of
which can require a transaction roll, and we can't roll directly inside a
->finish_item handler:

 1. If no attr fork, add one.
 2. If shortform attr fork, try to put it in the sf area.
 3. If shortform attr fork and out of space, convert to leaf format.
 4. Add attr to leaf/node attr tree.

So how about this: refactor each of these pieces into a separate
function, then add a separate XFS_ATTR_OP_FLAGS_* value for each of
these little pieces.  xfs_trans_attr() can call the appropriate little
function for the OP_FLAG and xfs_attr_finish_item can figure out which
state comes next based on the return value.

By directly mapping distinct OP_FLAGS to each piece of the attr setting
puzzle, you can use the existing "roll and come back" part of the defer
ops machinery.

If _finish_item thinks we're done then we just exit.  Otherwise, store
the new state in the (struct xfs_attr_item *) parameter passed into
_finish_item and return -EAGAIN, which puts the defer item back on the
defer op list, logs a new xattr intent with the new state, rolls the
transaction, and tries to finish the attr again.  I think you've already
done this last part.
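
To make the shape of that concrete, here is a minimal userspace sketch in
C.  It is not XFS code and makes no claim about the real interfaces: the
step names, struct attr_item_sketch, attr_do_step(),
attr_finish_item_sketch() and the driver loop are all hypothetical, and the
"roll" is just a counter.  It only shows one step running per call, the
next step being stored in the item, and -EAGAIN being returned until the
work is done.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical per-step ops, loosely mirroring the four steps listed above. */
enum attr_step {
	ATTR_STEP_ADD_FORK,	/* 1. add an attr fork if there is none */
	ATTR_STEP_TRY_SF,	/* 2. try to fit the attr in shortform */
	ATTR_STEP_SF_TO_LEAF,	/* 3. shortform is full: convert to leaf */
	ATTR_STEP_ADD_LEAF,	/* 4. add the attr to the leaf/node tree */
	ATTR_STEP_DONE,
};

/* Toy stand-in for the inode plus deferred item; not struct xfs_attr_item. */
struct attr_item_sketch {
	enum attr_step	step;		/* which step runs next */
	bool		has_fork;
	bool		sf_has_room;
	int		rolls;		/* how often the caller "rolled" */
};

/* Run exactly one step and return the step that should follow it. */
static enum attr_step attr_do_step(struct attr_item_sketch *it)
{
	switch (it->step) {
	case ATTR_STEP_ADD_FORK:
		it->has_fork = true;
		return ATTR_STEP_TRY_SF;
	case ATTR_STEP_TRY_SF:
		return it->sf_has_room ? ATTR_STEP_DONE : ATTR_STEP_SF_TO_LEAF;
	case ATTR_STEP_SF_TO_LEAF:
		return ATTR_STEP_ADD_LEAF;
	case ATTR_STEP_ADD_LEAF:
	default:
		return ATTR_STEP_DONE;
	}
}

/* finish_item-style handler: 0 when finished, -EAGAIN to be called again. */
static int attr_finish_item_sketch(struct attr_item_sketch *it)
{
	/* A finished item must not be handed back to the handler. */
	assert(it->step != ATTR_STEP_DONE);

	it->step = attr_do_step(it);
	return it->step == ATTR_STEP_DONE ? 0 : -EAGAIN;
}

/* Driver standing in for the defer ops machinery: roll on every -EAGAIN. */
static int attr_run_sketch(struct attr_item_sketch *it)
{
	int error;

	while ((error = attr_finish_item_sketch(it)) == -EAGAIN)
		it->rolls++;	/* real code would relog the intent and roll */
	return error;
}
```

In this toy model a failed shortform attempt costs a roll of its own before
the leaf conversion.  Whether that roll is worth paying is the open
question raised earlier in the thread, and the sketch does not try to
settle it.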

xfs_attri_recover then becomes much simpler -- we're passed in the
reconstructed log item from which we figure out which step we need to
do.  We call xfs_trans_attr() to do that one step, but unlike
_finish_item, we use the new state to construct a *new* attr intent and
attach it to the transaction.  At the end we call xfs_defer_move to
move all the queued defer_ops to the parent_tp, because log recovery
requires us to recover all the incomplete log intent items before
finishing any new ones that were created as part of recovery.
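
The ordering constraint is easier to see in a toy model.  The C sketch
below is purely illustrative and assumes nothing about the real recovery
code: struct intent_sketch, struct parent_queue_sketch and the three
functions are hypothetical, and the queue merely stands in for the work
that would be moved to the parent transaction.  Each recovered intent
performs one step and queues whatever remains, and nothing queued is
finished until every logged intent has been recovered.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_MAX_QUEUED 8

/* Hypothetical stand-in for an attr intent read back from the log. */
struct intent_sketch {
	int	id;
	int	steps_left;	/* steps still needed to finish this attr op */
};

/* Hypothetical stand-in for the work collected on the parent transaction. */
struct parent_queue_sketch {
	struct intent_sketch	items[SKETCH_MAX_QUEUED];
	size_t			count;
	int			recovered;	/* intents handled in phase 1 */
};

/*
 * Phase 1, per recovered intent: do exactly one step, then queue a new
 * intent for whatever remains instead of finishing it on the spot.
 */
static void recover_one_sketch(struct parent_queue_sketch *q,
			       struct intent_sketch in)
{
	assert(in.steps_left > 0);
	in.steps_left--;
	q->recovered++;
	if (in.steps_left > 0) {
		assert(q->count < SKETCH_MAX_QUEUED);
		q->items[q->count++] = in;
	}
}

/* Phase 2: finish the queued continuations; returns the steps it ran. */
static int finish_queued_sketch(struct parent_queue_sketch *q)
{
	int steps = 0;

	for (size_t i = 0; i < q->count; i++)
		steps += q->items[i].steps_left;
	q->count = 0;
	return steps;
}

/* Recover every logged intent first, and only then finish the new work. */
static int recover_all_sketch(struct parent_queue_sketch *q,
			      const struct intent_sketch *logged, size_t n)
{
	for (size_t i = 0; i < n; i++)
		recover_one_sketch(q, logged[i]);
	return finish_queued_sketch(q);
}
```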

This does mean that we end up with dramatically separate code paths for
defer ops attr setting vs. regular attr setting, but as you point out
the parent pointer feature will give the new code paths plenty of exercise.
Tying the new log intent items to a new feature bit is key to preventing
old kernels from stumbling across our new intent items, so we needed to
preserve the old attr set paths anyway.

Anyway, if this all seems confusing, you can track me down, because I
wrote most of this system and therefore have forgotten all of
it^W^W^W^W^Wam available to help. :)

> > > 
> > > I was also starting to wonder if maybe I could do some refactoring in
> > > xfs_defer_finish_noroll to capture the common code associated with the
> > > -EAGAIN handling.  Then maybe we could make a function pointer that we can
> > > pass through the finish_item interface.  The idea being that subroutines
> > > could use the function pointer to cycle out the transaction when needed
> > > instead of having to record states and back out like this. It'd be a new

The state tracking and rolling is already built into xfs_defer.c. :)

> > > parameter to pipe around, but it'd be more efficient than the state machine,
> > > and less surgery in the refactor.  And maybe a blessing to any other
> > > operations that might need to go through this transition in the future.
> > > Thoughts?
> > > 
> > 
> > That's an interesting idea. It still strikes me as a bit of a
> > fallback/hack as opposed to organizing the code to properly fit into the
> > dfops infrastructure, but it could be useful as a transient solution.
> >  From a high level, it looks like we'd have to create a new intent, relog
> > this item and all remaining items associated with the dfp to it, roll
> > the tx, and finally create a done item associated with the intent in the
> > new tx. You'd need access to the dfp for some of that, so it's not
> > immediately clear to me that this ends up much easier than fixing up
> > the xattr code.

(I think the code that handles EAGAIN being returned from finish_item
does this for you....)

> > 
> > BTW, if we did end up with something like that I'd probably prefer to
> > see it as an exported dfops helper function as opposed to a function
> > pointer being passed around, if possible.
> > 
> 
> Alrighty, I think for now I may try to pursue something more like what you
> proposed in the next patch and see where I get first.  Maybe I'll come back
> to this later if for some reason it doesn't work out, but I think what you
> have there is reasonable.

<nod>

--D

> 
> Thanks again for the reviews!
> Allison
> 
> > Brian
> > 
> > > Thanks again for the reviews!
> > > 
> > > Allison
> > > 
> > > > 
> > > > Brian
> > > > 
> > > > 
> > > > >    fs/xfs/libxfs/xfs_attr.h | 18 +++++++++++++++++-
> > > > >    fs/xfs/scrub/common.c    |  2 ++
> > > > >    fs/xfs/xfs_acl.c         |  2 ++
> > > > >    fs/xfs/xfs_attr_item.c   |  2 +-
> > > > >    fs/xfs/xfs_ioctl.c       |  2 ++
> > > > >    fs/xfs/xfs_ioctl32.c     |  2 ++
> > > > >    fs/xfs/xfs_iops.c        |  1 +
> > > > >    fs/xfs/xfs_xattr.c       |  1 +
> > > > >    8 files changed, 28 insertions(+), 2 deletions(-)
> > > > > 
> > > > > diff --git a/fs/xfs/libxfs/xfs_attr.h b/fs/xfs/libxfs/xfs_attr.h
> > > > > index 974c963..4ce3b0a 100644
> > > > > --- a/fs/xfs/libxfs/xfs_attr.h
> > > > > +++ b/fs/xfs/libxfs/xfs_attr.h
> > > > > @@ -77,6 +77,13 @@ typedef struct attrlist_ent {	/* data from attr_list() */
> > > > >    	char	a_name[1];	/* attr name (NULL terminated) */
> > > > >    } attrlist_ent_t;
> > > > > +/* Attr state machine types */
> > > > > +enum xfs_attr_state {
> > > > > +	XFS_ATTR_STATE1 = 1,
> > > > > +	XFS_ATTR_STATE2 = 2,
> > > > > +	XFS_ATTR_STATE3 = 3,
> > > > > +};
> > > > > +
> > > > >    /*
> > > > >     * List of attrs to commit later.
> > > > >     */
> > > > > @@ -88,7 +95,16 @@ struct xfs_attr_item {
> > > > >    	void		  *xattri_name;	      /* attr name */
> > > > >    	uint32_t	  xattri_name_len;    /* length of name */
> > > > >    	uint32_t	  xattri_flags;       /* attr flags */
> > > > > -	struct list_head  xattri_list;
> > > > > +
> > > > > +	/*
> > > > > +	 * Delayed attr parameters that need to remain instantiated
> > > > > +	 * across transaction rolls during the defer finish
> > > > > +	 */
> > > > > +	struct xfs_buf		*xattri_leaf_bp;  /* Leaf buf to release */
> > > > > +	enum xfs_attr_state	xattri_state;	  /* state machine marker */
> > > > > +	struct xfs_da_args	xattri_args;	  /* args context */
> > > > > +
> > > > > +	struct list_head	xattri_list;
> > > > >    	/*
> > > > >    	 * A byte array follows the header containing the file name and
> > > > > diff --git a/fs/xfs/scrub/common.c b/fs/xfs/scrub/common.c
> > > > > index 0c54ff5..270c32e 100644
> > > > > --- a/fs/xfs/scrub/common.c
> > > > > +++ b/fs/xfs/scrub/common.c
> > > > > @@ -30,6 +30,8 @@
> > > > >    #include "xfs_rmap_btree.h"
> > > > >    #include "xfs_log.h"
> > > > >    #include "xfs_trans_priv.h"
> > > > > +#include "xfs_da_format.h"
> > > > > +#include "xfs_da_btree.h"
> > > > >    #include "xfs_attr.h"
> > > > >    #include "xfs_reflink.h"
> > > > >    #include "scrub/xfs_scrub.h"
> > > > > diff --git a/fs/xfs/xfs_acl.c b/fs/xfs/xfs_acl.c
> > > > > index 142de8d..9b1b93e 100644
> > > > > --- a/fs/xfs/xfs_acl.c
> > > > > +++ b/fs/xfs/xfs_acl.c
> > > > > @@ -10,6 +10,8 @@
> > > > >    #include "xfs_mount.h"
> > > > >    #include "xfs_inode.h"
> > > > >    #include "xfs_acl.h"
> > > > > +#include "xfs_da_format.h"
> > > > > +#include "xfs_da_btree.h"
> > > > >    #include "xfs_attr.h"
> > > > >    #include "xfs_trace.h"
> > > > >    #include <linux/slab.h>
> > > > > diff --git a/fs/xfs/xfs_attr_item.c b/fs/xfs/xfs_attr_item.c
> > > > > index 0ea19b4..36e6d1e 100644
> > > > > --- a/fs/xfs/xfs_attr_item.c
> > > > > +++ b/fs/xfs/xfs_attr_item.c
> > > > > @@ -19,10 +19,10 @@
> > > > >    #include "xfs_rmap.h"
> > > > >    #include "xfs_inode.h"
> > > > >    #include "xfs_icache.h"
> > > > > -#include "xfs_attr.h"
> > > > >    #include "xfs_shared.h"
> > > > >    #include "xfs_da_format.h"
> > > > >    #include "xfs_da_btree.h"
> > > > > +#include "xfs_attr.h"
> > > > >    static inline struct xfs_attri_log_item *ATTRI_ITEM(struct xfs_log_item *lip)
> > > > >    {
> > > > > diff --git a/fs/xfs/xfs_ioctl.c b/fs/xfs/xfs_ioctl.c
> > > > > index ab341d6..c8728ca 100644
> > > > > --- a/fs/xfs/xfs_ioctl.c
> > > > > +++ b/fs/xfs/xfs_ioctl.c
> > > > > @@ -16,6 +16,8 @@
> > > > >    #include "xfs_rtalloc.h"
> > > > >    #include "xfs_itable.h"
> > > > >    #include "xfs_error.h"
> > > > > +#include "xfs_da_format.h"
> > > > > +#include "xfs_da_btree.h"
> > > > >    #include "xfs_attr.h"
> > > > >    #include "xfs_bmap.h"
> > > > >    #include "xfs_bmap_util.h"
> > > > > diff --git a/fs/xfs/xfs_ioctl32.c b/fs/xfs/xfs_ioctl32.c
> > > > > index 5001dca..23f6990 100644
> > > > > --- a/fs/xfs/xfs_ioctl32.c
> > > > > +++ b/fs/xfs/xfs_ioctl32.c
> > > > > @@ -21,6 +21,8 @@
> > > > >    #include "xfs_fsops.h"
> > > > >    #include "xfs_alloc.h"
> > > > >    #include "xfs_rtalloc.h"
> > > > > +#include "xfs_da_format.h"
> > > > > +#include "xfs_da_btree.h"
> > > > >    #include "xfs_attr.h"
> > > > >    #include "xfs_ioctl.h"
> > > > >    #include "xfs_ioctl32.h"
> > > > > diff --git a/fs/xfs/xfs_iops.c b/fs/xfs/xfs_iops.c
> > > > > index e73c21a..561c467 100644
> > > > > --- a/fs/xfs/xfs_iops.c
> > > > > +++ b/fs/xfs/xfs_iops.c
> > > > > @@ -17,6 +17,7 @@
> > > > >    #include "xfs_acl.h"
> > > > >    #include "xfs_quota.h"
> > > > >    #include "xfs_error.h"
> > > > > +#include "xfs_da_btree.h"
> > > > >    #include "xfs_attr.h"
> > > > >    #include "xfs_trans.h"
> > > > >    #include "xfs_trace.h"
> > > > > diff --git a/fs/xfs/xfs_xattr.c b/fs/xfs/xfs_xattr.c
> > > > > index 3013746..938e81d 100644
> > > > > --- a/fs/xfs/xfs_xattr.c
> > > > > +++ b/fs/xfs/xfs_xattr.c
> > > > > @@ -11,6 +11,7 @@
> > > > >    #include "xfs_mount.h"
> > > > >    #include "xfs_da_format.h"
> > > > >    #include "xfs_inode.h"
> > > > > +#include "xfs_da_btree.h"
> > > > >    #include "xfs_attr.h"
> > > > >    #include "xfs_attr_leaf.h"
> > > > >    #include "xfs_acl.h"
> > > > > -- 
> > > > > 2.7.4
> > > > > 


* Re: [PATCH 7/9] xfs: Add attr context to log item
  2019-04-24  4:10           ` Darrick J. Wong
@ 2019-04-24 12:17             ` Brian Foster
  2019-04-24 15:25               ` Darrick J. Wong
  0 siblings, 1 reply; 48+ messages in thread
From: Brian Foster @ 2019-04-24 12:17 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: Allison Henderson, linux-xfs

On Tue, Apr 23, 2019 at 09:10:16PM -0700, Darrick J. Wong wrote:
> Sorry I'm late back to the party...
> 
> On Tue, Apr 23, 2019 at 07:24:40PM -0700, Allison Henderson wrote:
> > 
> > On 4/23/19 6:20 AM, Brian Foster wrote:
> > > On Mon, Apr 22, 2019 at 03:01:27PM -0700, Allison Henderson wrote:
> > > > 
> > > > 
> > > > On 4/22/19 6:03 AM, Brian Foster wrote:
> > > > > On Fri, Apr 12, 2019 at 03:50:34PM -0700, Allison Henderson wrote:
> > > > > > This patch modifies xfs_attr_item to store a xfs_da_args, a xfs_buf pointer
> > > > > > and a new state type. We will use these in the next patch when
> > > > > > we modify xfs_set_attr_args to roll transactions by returning EAGAIN.
> > > > > > Because the subroutines of this function modify the contents of these
> > > > > > structures, we need to find a place to store them where they remain
> > > > > > instantiated across multiple calls to xfs_set_attr_args.
> > > > > > 
> > > > > > Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
> > > > > > ---
> > > > > 
> > > > > I see Darrick has already commented on the whole state thing. I'll
> > > > > probably have to grok the next patch to comment further, but just a
> > > > > couple initial thoughts:
> > > > > 
> > > > > First, I hit a build failure with this patch. It looks like there's a
> > > > > missed include in the scrub code:
> > > > > 
> > > > >     ...
> > > > >     CC [M]  fs/xfs/scrub/repair.o
> > > > > In file included from fs/xfs/scrub/repair.c:32:
> > > > > fs/xfs/libxfs/xfs_attr.h:105:21: error: field ‘xattri_args’ has incomplete type
> > > > >     struct xfs_da_args xattri_args;   /* args context */
> > > > Hmm, ok.  I'll get that corrected, I probably need to clean out my workspace
> > > > and build from scratch.
> > > > 
> > > > >     ...
> > > > > 
> > > > > Second, the commit log suggests that the states will reflect the current
> > > > > transaction roll points (i.e., establishing re-entry points down in
> > > > > xfs_attr_set_args()). I'm kind of wondering if we should break these
> > > > > xattr set sub-sequences down into smaller helper functions (refactoring
> > > > > the existing code as we go) such that the mechanism could technically be
> 
> I had had the thought of "why not just give each step of setting an
> attribute its own log item, so we don't have to have this STATE_NNN
> business?" but then realized that will generate an insane amount of
> boilerplate, and you're already close to a better solution, so I shut up
> to think harder. :)
> 

The thought of separating things down into smaller "ops" popped into my
head (not necessarily separate/smaller log items), but I hadn't really
thought it through to this point...

> > > > > used deferred or not. Re: the previous thought on whether to defer xattr
> > > > > removes or not, there might also be cases where there's not a need to
> > > > > defer xattr sets.
> > > > > 
> > > > > E.g., taking a quick peek into the next patch, the state 1 case in
> > > > > xfs_attr_try_sf_addname() is actually a transaction commit, which I
> > > > > think means we're done. We'd have done an attr memory allocation,
> > > > > deferred op and transaction roll where none was necessary so it might
> > > > > not be worth it to defer in that scenario. Hmm, it also looks like we
> > > > > return -EAGAIN in places where we've not actually done any work, like if
> > > > > a shortform add attempt returns -ENOSPC (or the -EAGAIN return before we
> > > > > even attempt the sf add). That kind of looks like a waste of transaction
> > > > > rolls and further suggests it might be cleaner to break this whole path
> > > > > down into helpers and put it back together in a way more conducive to
> > > > > deferred operations.
> 
> Er, agreed:
> 
> > > > Yes, this area is a bit of a wart the way it is right now.  I think you're
> > > > right in that ultimately we may end up having to do a lot of refactoring in
> > > > order to have more efficient "re-entry points".  The state machine is hard
> > > > to get into subroutines, so it's limited in use in the top level function.
> 
> So my current understanding of the problem is that we have this big old
> xfs_attr_set_args function that has multiple responsibilities requiring
> transaction rolls, which we can't do directly inside a ->finish_item
> handler:
> 
>  1. If no attr fork, add one.
>  2. If shortform attr fork, try to put it in the sf area.
>  3. If shortform attr fork and out of space, convert to leaf format.
>  4. Add attr to leaf/node attr tree.
> 

And there are a bunch of tx rolls down in the #4 codepath that this
series currently just tosses away. I'm not quite sure how appropriate
that is, but I also don't think we necessarily need to preserve each and
every transaction roll as implemented by the current code.

IOW, I think it absolutely makes sense to step back from the current
behavior and reassess the best/required places to roll xattr ops in
progress as well as the transaction reservation itself.

> So how about this: refactor each of these pieces into a separate
> function, then add a separate XFS_ATTR_OP_FLAGS_* value for each of
> these little pieces.  xfs_trans_attr() can call the appropriate little
> function for the OP_FLAG and xfs_attr_finish_item can figure out which
> state comes next based on the return value.
> 
> By directly mapping distinct OP_FLAGS to each piece of the attr setting
> puzzle, you can use the existing "roll and come back" part of the defer
> ops machinery.
> 
> If _finish_item thinks we're done then we just exit.  Otherwise, store
> the new state in the (struct xfs_attr_item *) parameter passed into
> _finish_item and return -EAGAIN, which puts the defer item back on the
> defer op list, logs a new xattr intent with the new state, rolls the
> transaction, and tries to finish the attr again.  I think you've already
> done this last part.
> 

That sounds plausible to me. One concern I have is that I think we
should try to avoid creating more unnecessary complexity in the dfops
state mechanism simply to accommodate a messy xattr implementation. For
example, consider the following sequence for a simple set of an xattr
that requires leaf format and remote value block(s):

- try sf add
- returns -ENOSPC, convert to leaf and roll tx
- attempt to add the xattr (xfs_attr_leaf_addname())
	- if -ENOSPC, convert to node and call xfs_attr_node_addname()
	- else call xfs_attr3_leaf_add_work()
		- add entry
		- if remoteval, set INCOMPLETE
- roll tx
- if remoteval, call xfs_attr_rmtval_set()
	- block allocation, tx roll loop
	- copy remote value into bufs, xfs_bwrite()
- if remoteval, xfs_attr3_leaf_clearflag()
	- clear INCOMPLETE
	- update/log rmt pointers
	- roll tx

I'm wondering 1.) how much of this is necessary with an intent based
implementation and 2.) how much of this can be refactored to not require
complex state tracking.

For example, all of the format conversions that occur before we actually
make any modifications associated with the xattr (i.e., -ENOSPC returns
from the current format) look to me like they could easily be performed,
after which we immediately return -EAGAIN without any state tracking.
The retry should
pick up the current format of the fork and retry there. Thus, ISTM we
could drop the whole xfs_attr_leaf_addname() -> xfs_attr3_leaf_to_node()
-> xfs_attr_node_addname() codepath in favor of a format conversion and
-EAGAIN retry that calls directly into xfs_attr_node_addname().
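
To make the "convert, return -EAGAIN, and let the retry look at the
current format" idea concrete, here is a minimal userspace sketch. It is
not kernel code: every toy_* name and the capacities are made up for
illustration, and the while loop only stands in for what the defer ops
machinery would do on -EAGAIN. The point it tries to show is that no
state marker is stored anywhere; each pass derives what to do from the
fork itself.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-ins for the attr fork formats; not kernel types. */
enum toy_fork_format { TOY_FMT_SHORTFORM, TOY_FMT_LEAF, TOY_FMT_NODE };

struct toy_attr_fork {
	enum toy_fork_format	format;
	int			used;	/* entries currently stored */
};

/* Made-up capacities; real limits depend on inode and block geometry. */
static int toy_capacity(enum toy_fork_format fmt)
{
	switch (fmt) {
	case TOY_FMT_SHORTFORM:
		return 2;
	case TOY_FMT_LEAF:
		return 4;
	default:
		return 1000;
	}
}

/*
 * One pass of a deferred attr set. If the current format is full,
 * convert to the next format and return -EAGAIN so the caller rolls
 * the transaction and calls again. Otherwise add the entry.
 */
static int toy_attr_set_step(struct toy_attr_fork *fork)
{
	assert(fork->used <= toy_capacity(fork->format));

	if (fork->used == toy_capacity(fork->format)) {
		if (fork->format == TOY_FMT_NODE)
			return -ENOSPC;
		/* Format conversion is the only work done in this pass. */
		fork->format = (enum toy_fork_format)(fork->format + 1);
		return -EAGAIN;
	}

	fork->used++;
	return 0;
}

/* Stand-in for the defer ops loop: count a roll per -EAGAIN. */
static int toy_attr_set(struct toy_attr_fork *fork, int *rolls)
{
	int error;

	*rolls = 0;
	while ((error = toy_attr_set_step(fork)) == -EAGAIN)
		(*rolls)++;
	return error;
}
```

Starting from a full shortform fork, toy_attr_set() converts to leaf in
the first pass, rolls once, and adds the entry in the second pass. Every
-EAGAIN here follows a real modification, so no roll is spent on a pass
that did nothing.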

Once we have leaf format and we're doing remote block allocation, how
much could we get away with by re-looking up the entry, finding that
we're still short of remote blocks and performing another
xfs_bmapi_write() -> -EAGAIN cycle until we're good to copy in the xattr
value?
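
A rough userspace sketch of that allocate-and-retry cycle, under the
assumption that progress can be rediscovered from the entry on each
pass. The toy_* names and the per-transaction limit are hypothetical;
nothing here is the real xfs_bmapi_write() interface.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical limit on blocks allocated per transaction. */
#define TOY_MAX_BLOCKS_PER_TX	2

struct toy_rmt_attr {
	int	blocks_needed;	/* blocks the remote value requires */
	int	blocks_mapped;	/* blocks allocated so far */
	bool	value_copied;	/* set once the value has been written */
};

/*
 * One pass: "re-look up" the entry (here, just read the struct), and
 * if it is still short of remote blocks, allocate a bounded batch and
 * return -EAGAIN. Only when nothing is missing do we copy the value.
 */
static int toy_rmtval_step(struct toy_rmt_attr *attr)
{
	int want;

	assert(attr->blocks_mapped <= attr->blocks_needed);

	want = attr->blocks_needed - attr->blocks_mapped;
	if (want > 0) {
		if (want > TOY_MAX_BLOCKS_PER_TX)
			want = TOY_MAX_BLOCKS_PER_TX;
		attr->blocks_mapped += want;
		return -EAGAIN;
	}

	attr->value_copied = true;
	return 0;
}
```

With five blocks needed and two per transaction, this takes three
-EAGAIN passes before the copy. The sketch skips the part about not
setting the xattr name a second time on re-entry, which may still need
some kind of marker.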

What about all this INCOMPLETE stuff? Do we even need that with an
intent based implementation? My understanding was that was because we
had to roll the transaction and thus could leave an incomplete xattr on
disk. I haven't looked too far into it so perhaps there's more to it
than that, but if not and this is no longer a problem with an intent
based implementation then perhaps much of that code and associated tx
rolls can be bypassed as well.

This is not to say that we won't require any such state tracking as
you've described above. The whole block allocation thing above may
require a state marker to get around attempts to set the xattr name
again and get back to the remote value block allocation code. It also
looks like we can do post xattr set format changes (i.e., node -> leaf,
leaf -> sf) that might require something like that to make sure we don't
go and retry an xattr set we've already completed. The point is just that
I'd prefer that we first explore how far we can simplify this mess of an
implementation (the above is all very handwavy)
to reduce the state tracking complexity, particularly if these
states end up written to the log via the intent.

Hmm, I'm starting to think that maybe what we really need to do here is
step back from the code and logically map out what these states and the
resulting operation flow needs to be, particularly since there are so
many variations between different format conversions, renames, remote
blocks, etc. Once we have this whole mess mapped out, coding it up
should be more of an effort in refactoring.

> xfs_attri_recover then becomes much simpler -- we're passed in the
> reconstructed log item from which we figure out which step we need to
> do.  We call xfs_trans_attr() to do that one step, but unlike
> _finish_item, we use the new state to construct a *new* attr intent and
> attach it to the transaction, then call xfs_defer_move at the end to
> move all the queued defer_ops to the parent_tp because log recovery
> requires us to recover all the incomplete log intent items before
> finishing any new ones that were created as part of recovery.
> 
> This does mean that we end up with dramatically separate code paths for
> defer ops attr setting vs. regular attr setting, but as you point out
> the parent pointer feature will give the new code paths plenty of exercise.
> Tying the new log intent items to a new feature bit is key to preventing
> old kernels from stumbling across our new intent items, so we needed to
> preserve the old attr set paths anyway.
> 

That's a good point wrt the other discussion around the direct xattr
codepath. It sounds like we do need to keep that entire path around
regardless to support v4 filesystems and such. The current series just
unconditionally switches things over to deferred ops.

> Anyway, if this all seems confusing, you can track me down, because I
> wrote most of this system and therefore have forgotten all of
> it^W^W^W^W^Wam available to help. :)
> 
> > > > 
> > > > I was also starting to wonder if maybe I could do some refactoring in
> > > > xfs_defer_finish_noroll to capture the common code associated with the
> > > > -EAGAIN handling.  Then maybe we could make a function pointer that we can
> > > > pass through the finish_item interface.  The idea being that subroutines
> > > > could use the function pointer to cycle out the transaction when needed
> > > > instead of having to record states and back out like this. It'd be a new
> 
> The state tracking and rolling is already built into xfs_defer.c. :)
> 
> > > > parameter to pipe around, but it'd be more efficient than the state machine,
> > > > and less surgery in the refactor.  And maybe a blessing to any other
> > > > operations that might need to go through this transition in the future.
> > > > Thoughts?
> > > > 
> > > 
> > > That's an interesting idea. It still strikes me as a bit of a
> > > fallback/hack as opposed to organizing the code to properly fit into the
> > > dfops infrastructure, but it could be useful as a transient solution.
> > >  From a high level, it looks like we'd have to create a new intent, relog
> > > this item and all remaining items associated with the dfp to it, roll
> > > the tx, and finally create a done item associated with the intent in the
> > > new tx. You'd need access to the dfp for some of that, so it's not
> > > immediately clear to me that this ends up much easier than fixing up
> > > the xattr code.
> 
> (I think the code that handles EAGAIN being returned from finish_item
> does this for you....)
> 

Yeah, I'm not totally sure it's an ideal/feasible approach, but for the
sake of clarity I think what Allison is getting at is that if there was
a way to trigger a dfops -EAGAIN roll sequence via a callback/helper
function, we wouldn't need to refactor the xattr subsystem to have
-EAGAIN return points. Instead we could just invoke the callback at the
existing roll points and achieve the same behavior (in theory). It's
kind of like providing an inside-out xfs_defer_finish_noroll() -EAGAIN
implementation via a helper function for code down in ->finish_item().

Brian

> > > 
> > > BTW, if we did end up with something like that I'd probably prefer to
> > > see it as an exported dfops helper function as opposed to a function
> > > pointer being passed around, if possible.
> > > 
> > 
> > Alrighty, I think for now I may try to pursue something more like what you
> > proposed in the next patch and see where I get first.  Maybe I'll come back
> > to this later if for some reason it doesn't work out, but I think what you
> > have there is reasonable.
> 
> <nod>
> 
> --D
> 
> > 
> > Thanks again for the reviews!
> > Allison
> > 
> > > Brian
> > > 
> > > > Thanks again for the reviews!
> > > > 
> > > > Allison
> > > > 
> > > > > 
> > > > > Brian
> > > > > 
> > > > > 
> > > > > >    fs/xfs/libxfs/xfs_attr.h | 18 +++++++++++++++++-
> > > > > >    fs/xfs/scrub/common.c    |  2 ++
> > > > > >    fs/xfs/xfs_acl.c         |  2 ++
> > > > > >    fs/xfs/xfs_attr_item.c   |  2 +-
> > > > > >    fs/xfs/xfs_ioctl.c       |  2 ++
> > > > > >    fs/xfs/xfs_ioctl32.c     |  2 ++
> > > > > >    fs/xfs/xfs_iops.c        |  1 +
> > > > > >    fs/xfs/xfs_xattr.c       |  1 +
> > > > > >    8 files changed, 28 insertions(+), 2 deletions(-)
> > > > > > 
> > > > > > diff --git a/fs/xfs/libxfs/xfs_attr.h b/fs/xfs/libxfs/xfs_attr.h
> > > > > > index 974c963..4ce3b0a 100644
> > > > > > --- a/fs/xfs/libxfs/xfs_attr.h
> > > > > > +++ b/fs/xfs/libxfs/xfs_attr.h
> > > > > > @@ -77,6 +77,13 @@ typedef struct attrlist_ent {	/* data from attr_list() */
> > > > > >    	char	a_name[1];	/* attr name (NULL terminated) */
> > > > > >    } attrlist_ent_t;
> > > > > > +/* Attr state machine types */
> > > > > > +enum xfs_attr_state {
> > > > > > +	XFS_ATTR_STATE1 = 1,
> > > > > > +	XFS_ATTR_STATE2 = 2,
> > > > > > +	XFS_ATTR_STATE3 = 3,
> > > > > > +};
> > > > > > +
> > > > > >    /*
> > > > > >     * List of attrs to commit later.
> > > > > >     */
> > > > > > @@ -88,7 +95,16 @@ struct xfs_attr_item {
> > > > > >    	void		  *xattri_name;	      /* attr name */
> > > > > >    	uint32_t	  xattri_name_len;    /* length of name */
> > > > > >    	uint32_t	  xattri_flags;       /* attr flags */
> > > > > > -	struct list_head  xattri_list;
> > > > > > +
> > > > > > +	/*
> > > > > > +	 * Delayed attr parameters that need to remain instantiated
> > > > > > +	 * across transaction rolls during the defer finish
> > > > > > +	 */
> > > > > > +	struct xfs_buf		*xattri_leaf_bp;  /* Leaf buf to release */
> > > > > > +	enum xfs_attr_state	xattri_state;	  /* state machine marker */
> > > > > > +	struct xfs_da_args	xattri_args;	  /* args context */
> > > > > > +
> > > > > > +	struct list_head	xattri_list;
> > > > > >    	/*
> > > > > >    	 * A byte array follows the header containing the file name and
> > > > > > diff --git a/fs/xfs/scrub/common.c b/fs/xfs/scrub/common.c
> > > > > > index 0c54ff5..270c32e 100644
> > > > > > --- a/fs/xfs/scrub/common.c
> > > > > > +++ b/fs/xfs/scrub/common.c
> > > > > > @@ -30,6 +30,8 @@
> > > > > >    #include "xfs_rmap_btree.h"
> > > > > >    #include "xfs_log.h"
> > > > > >    #include "xfs_trans_priv.h"
> > > > > > +#include "xfs_da_format.h"
> > > > > > +#include "xfs_da_btree.h"
> > > > > >    #include "xfs_attr.h"
> > > > > >    #include "xfs_reflink.h"
> > > > > >    #include "scrub/xfs_scrub.h"
> > > > > > diff --git a/fs/xfs/xfs_acl.c b/fs/xfs/xfs_acl.c
> > > > > > index 142de8d..9b1b93e 100644
> > > > > > --- a/fs/xfs/xfs_acl.c
> > > > > > +++ b/fs/xfs/xfs_acl.c
> > > > > > @@ -10,6 +10,8 @@
> > > > > >    #include "xfs_mount.h"
> > > > > >    #include "xfs_inode.h"
> > > > > >    #include "xfs_acl.h"
> > > > > > +#include "xfs_da_format.h"
> > > > > > +#include "xfs_da_btree.h"
> > > > > >    #include "xfs_attr.h"
> > > > > >    #include "xfs_trace.h"
> > > > > >    #include <linux/slab.h>
> > > > > > diff --git a/fs/xfs/xfs_attr_item.c b/fs/xfs/xfs_attr_item.c
> > > > > > index 0ea19b4..36e6d1e 100644
> > > > > > --- a/fs/xfs/xfs_attr_item.c
> > > > > > +++ b/fs/xfs/xfs_attr_item.c
> > > > > > @@ -19,10 +19,10 @@
> > > > > >    #include "xfs_rmap.h"
> > > > > >    #include "xfs_inode.h"
> > > > > >    #include "xfs_icache.h"
> > > > > > -#include "xfs_attr.h"
> > > > > >    #include "xfs_shared.h"
> > > > > >    #include "xfs_da_format.h"
> > > > > >    #include "xfs_da_btree.h"
> > > > > > +#include "xfs_attr.h"
> > > > > >    static inline struct xfs_attri_log_item *ATTRI_ITEM(struct xfs_log_item *lip)
> > > > > >    {
> > > > > > diff --git a/fs/xfs/xfs_ioctl.c b/fs/xfs/xfs_ioctl.c
> > > > > > index ab341d6..c8728ca 100644
> > > > > > --- a/fs/xfs/xfs_ioctl.c
> > > > > > +++ b/fs/xfs/xfs_ioctl.c
> > > > > > @@ -16,6 +16,8 @@
> > > > > >    #include "xfs_rtalloc.h"
> > > > > >    #include "xfs_itable.h"
> > > > > >    #include "xfs_error.h"
> > > > > > +#include "xfs_da_format.h"
> > > > > > +#include "xfs_da_btree.h"
> > > > > >    #include "xfs_attr.h"
> > > > > >    #include "xfs_bmap.h"
> > > > > >    #include "xfs_bmap_util.h"
> > > > > > diff --git a/fs/xfs/xfs_ioctl32.c b/fs/xfs/xfs_ioctl32.c
> > > > > > index 5001dca..23f6990 100644
> > > > > > --- a/fs/xfs/xfs_ioctl32.c
> > > > > > +++ b/fs/xfs/xfs_ioctl32.c
> > > > > > @@ -21,6 +21,8 @@
> > > > > >    #include "xfs_fsops.h"
> > > > > >    #include "xfs_alloc.h"
> > > > > >    #include "xfs_rtalloc.h"
> > > > > > +#include "xfs_da_format.h"
> > > > > > +#include "xfs_da_btree.h"
> > > > > >    #include "xfs_attr.h"
> > > > > >    #include "xfs_ioctl.h"
> > > > > >    #include "xfs_ioctl32.h"
> > > > > > diff --git a/fs/xfs/xfs_iops.c b/fs/xfs/xfs_iops.c
> > > > > > index e73c21a..561c467 100644
> > > > > > --- a/fs/xfs/xfs_iops.c
> > > > > > +++ b/fs/xfs/xfs_iops.c
> > > > > > @@ -17,6 +17,7 @@
> > > > > >    #include "xfs_acl.h"
> > > > > >    #include "xfs_quota.h"
> > > > > >    #include "xfs_error.h"
> > > > > > +#include "xfs_da_btree.h"
> > > > > >    #include "xfs_attr.h"
> > > > > >    #include "xfs_trans.h"
> > > > > >    #include "xfs_trace.h"
> > > > > > diff --git a/fs/xfs/xfs_xattr.c b/fs/xfs/xfs_xattr.c
> > > > > > index 3013746..938e81d 100644
> > > > > > --- a/fs/xfs/xfs_xattr.c
> > > > > > +++ b/fs/xfs/xfs_xattr.c
> > > > > > @@ -11,6 +11,7 @@
> > > > > >    #include "xfs_mount.h"
> > > > > >    #include "xfs_da_format.h"
> > > > > >    #include "xfs_inode.h"
> > > > > > +#include "xfs_da_btree.h"
> > > > > >    #include "xfs_attr.h"
> > > > > >    #include "xfs_attr_leaf.h"
> > > > > >    #include "xfs_acl.h"
> > > > > > -- 
> > > > > > 2.7.4
> > > > > > 


* Re: [PATCH 7/9] xfs: Add attr context to log item
  2019-04-24 12:17             ` Brian Foster
@ 2019-04-24 15:25               ` Darrick J. Wong
  2019-04-24 16:57                 ` Brian Foster
  0 siblings, 1 reply; 48+ messages in thread
From: Darrick J. Wong @ 2019-04-24 15:25 UTC (permalink / raw)
  To: Brian Foster; +Cc: Allison Henderson, linux-xfs

On Wed, Apr 24, 2019 at 08:17:48AM -0400, Brian Foster wrote:
> On Tue, Apr 23, 2019 at 09:10:16PM -0700, Darrick J. Wong wrote:
> > Sorry I'm late back to the party...
> > 
> > On Tue, Apr 23, 2019 at 07:24:40PM -0700, Allison Henderson wrote:
> > > 
> > > On 4/23/19 6:20 AM, Brian Foster wrote:
> > > > On Mon, Apr 22, 2019 at 03:01:27PM -0700, Allison Henderson wrote:
> > > > > 
> > > > > 
> > > > > On 4/22/19 6:03 AM, Brian Foster wrote:
> > > > > > On Fri, Apr 12, 2019 at 03:50:34PM -0700, Allison Henderson wrote:
> > > > > > > This patch modifies xfs_attr_item to store a xfs_da_args, a xfs_buf pointer
> > > > > > > and a new state type. We will use these in the next patch when
> > > > > > > we modify xfs_set_attr_args to roll transactions by returning EAGAIN.
> > > > > > > Because the subroutines of this function modify the contents of these
> > > > > > > structures, we need to find a place to store them where they remain
> > > > > > > instantiated across multiple calls to xfs_set_attr_args.
> > > > > > > 
> > > > > > > Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
> > > > > > > ---
> > > > > > 
> > > > > > I see Darrick has already commented on the whole state thing. I'll
> > > > > > probably have to grok the next patch to comment further, but just a
> > > > > > couple initial thoughts:
> > > > > > 
> > > > > > First, I hit a build failure with this patch. It looks like there's a
> > > > > > missed include in the scrub code:
> > > > > > 
> > > > > >     ...
> > > > > >     CC [M]  fs/xfs/scrub/repair.o
> > > > > > In file included from fs/xfs/scrub/repair.c:32:
> > > > > > fs/xfs/libxfs/xfs_attr.h:105:21: error: field ‘xattri_args’ has incomplete type
> > > > > >     struct xfs_da_args xattri_args;   /* args context */
> > > > > Hmm, ok.  I'll get that corrected, I probably need to clean out my workspace
> > > > > and build from scratch.
> > > > > 
> > > > > >     ...
> > > > > > 
> > > > > > Second, the commit log suggests that the states will reflect the current
> > > > > > transaction roll points (i.e., establishing re-entry points down in
> > > > > > xfs_attr_set_args()). I'm kind of wondering if we should break these
> > > > > > xattr set sub-sequences down into smaller helper functions (refactoring
> > > > > > the existing code as we go) such that the mechanism could technically be
> > 
> > I had had the thought of "why not just give each step of setting an
> > attribute its own log item, so we don't have to have this STATE_NNN
> > business?" but then realized that will generate an insane amount of
> > boilerplate, and you're already close to a better solution, so I shut up
> > to think harder. :)
> > 
> 
> The thought of separating things down into smaller "ops" popped into my
> head (not necessarily separate/smaller log items), but I hadn't really
> thought it through to this point...
> 
> > > > > > used deferred or not. Re: the previous thought on whether to defer xattr
> > > > > > removes or not, there might also be cases where there's not a need to
> > > > > > defer xattr sets.
> > > > > > 
> > > > > > E.g., taking a quick peek into the next patch, the state 1 case in
> > > > > > xfs_attr_try_sf_addname() is actually a transaction commit, which I
> > > > > > think means we're done. We'd have done an attr memory allocation,
> > > > > > deferred op and transaction roll where none was necessary so it might
> > > > > > not be worth it to defer in that scenario. Hmm, it also looks like we
> > > > > > return -EAGAIN in places where we've not actually done any work, like if
> > > > > > a shortform add attempt returns -ENOSPC (or the -EAGAIN return before we
> > > > > > even attempt the sf add). That kind of looks like a waste of transaction
> > > > > > rolls and further suggests it might be cleaner to break this whole path
> > > > > > down into helpers and put it back together in a way more conducive to
> > > > > > deferred operations.
> > 
> > Er, agreed:
> > 
> > > > > Yes, this area is a bit of a wart the way it is right now.  I think you're
> > > > > right in that ultimately we may end up having to do a lot of refactoring in
> > > > > order to have more efficient "re-entry points".  The state machine is hard
> > > > > to get into subroutines, so it's limited in use in the top level function.
> > 
> > So my current understanding of the problem is that we have this big old
> > xfs_attr_set_args function that has multiple responsibilities requiring
> > transaction rolls, which we can't do directly inside a ->finish_item
> > handler:
> > 
> >  1. If no attr fork, add one.
> >  2. If shortform attr fork, try to put it in the sf area.
> >  3. If shortform attr fork and out of space, convert to leaf format.
> >  4. Add attr to leaf/node attr tree.
> > 
> 
> And there are a bunch of tx rolls down in the #4 codepath that this
> series currently just tosses away. I'm not quite sure how appropriate
> that is, but I also don't think we necessarily need to preserve each and
> every transaction roll as implemented by the current code.
> 
> IOW, I think it absolutely makes sense to step back from the current
> behavior and reassess the best/required places to roll xattr ops in
> progress as well as the transaction reservation itself.

Yes, it would help to make a list of every small step that could
possibly be required to set an attribute.  That will help narrow down
how many defer op pieces are needed.
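
As a strawman for what such pieces might look like once listed, here is
a small userspace sketch of a per-step dispatch table. It is only an
illustration: the toy_* ops are invented and are not the proposed
XFS_ATTR_OP_FLAGS_* values, and the sf step folds the shortform-to-leaf
conversion into itself so that a pass never returns -EAGAIN without
having changed something.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical op values, one per piece of the attr set puzzle. */
enum toy_attr_op {
	TOY_OP_ADD_FORK,
	TOY_OP_SF_ADD,
	TOY_OP_LEAF_ADD,
	TOY_OP_DONE,
};

struct toy_attr_item {
	enum toy_attr_op	op;		/* next step to run */
	int			has_fork;
	int			is_shortform;
	int			sf_free;	/* free shortform slots */
	int			stored;		/* attr has been added */
};

typedef enum toy_attr_op (*toy_step_fn)(struct toy_attr_item *);

/* Each step does one piece of work and names the step that follows. */
static enum toy_attr_op toy_add_fork(struct toy_attr_item *item)
{
	item->has_fork = 1;
	item->is_shortform = 1;
	return TOY_OP_SF_ADD;
}

static enum toy_attr_op toy_sf_add(struct toy_attr_item *item)
{
	if (item->sf_free > 0) {
		item->sf_free--;
		item->stored = 1;
		return TOY_OP_DONE;
	}
	/* Out of space: convert to leaf in this same pass. */
	item->is_shortform = 0;
	return TOY_OP_LEAF_ADD;
}

static enum toy_attr_op toy_leaf_add(struct toy_attr_item *item)
{
	item->stored = 1;
	return TOY_OP_DONE;
}

static const toy_step_fn toy_steps[] = {
	[TOY_OP_ADD_FORK]	= toy_add_fork,
	[TOY_OP_SF_ADD]		= toy_sf_add,
	[TOY_OP_LEAF_ADD]	= toy_leaf_add,
};

/*
 * Stand-in for a ->finish_item handler: run one step, store the next
 * state in the item, and return -EAGAIN unless we are done.
 */
static int toy_attr_finish_item(struct toy_attr_item *item)
{
	assert(item->op != TOY_OP_DONE);

	item->op = toy_steps[item->op](item);
	return item->op == TOY_OP_DONE ? 0 : -EAGAIN;
}
```

An item that starts at TOY_OP_ADD_FORK with no shortform space finishes
after two -EAGAIN returns; one that already has a fork and a free
shortform slot finishes in a single pass with no roll at all.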

Another thought I had is that having the finish_item continually log
a new intent with the latest state means that we can free the old intent
item, which helps us avoid the problem of pinning the log tail at that
first intent item while we scramble around doing a whole lot of rolling
and other work to get to the done item.
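
A toy model of that tail-pinning effect, in userspace C. This is a
deliberately simplified picture, not the real log: the toy_* names are
invented, LSNs are just a counter, and the tail is taken to be the
oldest intent that has no done item yet.

```c
#include <assert.h>

#define TOY_MAX_INTENTS	8

struct toy_log {
	unsigned long	head;				/* last LSN handed out */
	unsigned long	active[TOY_MAX_INTENTS];	/* unfinished intents, 0 = free */
};

/* Log a new intent and remember it as unfinished. */
static unsigned long toy_log_intent(struct toy_log *log)
{
	unsigned long lsn = ++log->head;
	int i;

	for (i = 0; i < TOY_MAX_INTENTS; i++) {
		if (log->active[i] == 0) {
			log->active[i] = lsn;
			return lsn;
		}
	}
	assert(!"too many unfinished intents");
	return 0;
}

/* Log a done item for an intent; the intent no longer pins the tail. */
static void toy_log_done(struct toy_log *log, unsigned long lsn)
{
	int i;

	log->head++;
	for (i = 0; i < TOY_MAX_INTENTS; i++)
		if (log->active[i] == lsn)
			log->active[i] = 0;
}

/* The tail cannot move past the oldest unfinished intent. */
static unsigned long toy_log_tail(const struct toy_log *log)
{
	unsigned long tail = log->head;
	int i;

	for (i = 0; i < TOY_MAX_INTENTS; i++)
		if (log->active[i] != 0 && log->active[i] < tail)
			tail = log->active[i];
	return tail;
}

/* One -EAGAIN cycle: log a fresh intent, then retire the old one. */
static unsigned long toy_relog(struct toy_log *log, unsigned long old)
{
	unsigned long next = toy_log_intent(log);

	toy_log_done(log, old);
	return next;
}
```

In this model the tail follows the most recent intent after every
toy_relog() call, whereas a single intent held across all the rolls
would keep the tail at its original LSN until the final done item.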

> > So how about this: refactor each of these pieces into a separate
> > function, then add a separate XFS_ATTR_OP_FLAGS_* value for each of
> > these little pieces.  xfs_trans_attr() can call the appropriate little
> > function for the OP_FLAG and xfs_attr_finish_item can figure out which
> > state comes next based on the return value.
> > 
> > By directly mapping distinct OP_FLAGS to each piece of the attr setting
> > puzzle, you can use the existing "roll and come back" part of the defer
> > ops machinery.
> > 
> > If _finish_item thinks we're done then we just exit.  Otherwise, store
> > the new state in the (struct xfs_attr_item *) parameter passed into
> > _finish_item and return -EAGAIN, which puts the defer item back on the
> > defer op list, logs a new xattr intent with the new state, rolls the
> > transaction, and tries to finish the attr again.  I think you've already
> > done this last part.
> > 
> 
> That sounds plausible to me. One concern I have is that I think we
> should try to avoid creating more unnecessary complexity in the dfops
> state mechanism simply to accommodate a messy xattr implementation. For
> example, consider the following sequence for a simple set of an xattr
> that requires leaf format and remote value block(s):
> 
> - try sf add
> - returns -ENOSPC, convert to leaf and roll tx
> - attempt to add the xattr (xfs_attr_leaf_addname())
> 	- if -ENOSPC, convert to node and call xfs_attr_node_addname()
> 	- else call xfs_attr3_leaf_add_work()
> 		- add entry
> 		- if remoteval, set INCOMPLETE
> - roll tx
> - if remoteval, call xfs_attr_rmtval_set()
> 	- block allocation, tx roll loop
> 	- copy remote value into bufs, xfs_bwrite()
> - if remoteval, xfs_attr3_leaf_clearflag()
> 	- clear INCOMPLETE
> 	- update/log rmt pointers
> 	- roll tx
> 
> I'm wondering 1.) how much of this is necessary with an intent based
> implementation and 2.) how much of this can be refactored to not require
> complex state tracking.
> 
> For example, all of the format conversions that occur before we actually
> make any modifications associated with the xattr (i.e., -ENOSPC returns
> from the current format) seem to me could easily be performed and
> immediately return -EAGAIN without any state tracking. The retry should
> pick up the current format of the fork and retry there. Thus, ISTM we
> could drop the whole xfs_attr_leaf_addname() -> xfs_attr3_leaf_to_node()
> -> xfs_attr_node_addname() codepath in favor of a format conversion and
> -EAGAIN retry that calls directly into xfs_attr_node_addname().

That had been my other thought -- in theory we keep the inode locked
across all the transaction rolls, so we could auto-detect what we need
to do.

> Once we have leaf format and we're doing remote block allocation, how
> much could we get away with by re-looking up the entry, finding that
> we're still short of remote blocks and performing another
> xfs_bmapi_write() -> -EAGAIN cycle until we're good to copy in the xattr
> value?
> 
> What about all this INCOMPLETE stuff? Do we even need that with an
> intent based implementation?

No.  AFAIK the INCOMPLETE flag exists to hide attrs from userspace until
we're totally done setting them up, and is therefore unnecessary with an
intent implementation.  Repair zaps any INCOMPLETE attrs it finds.

> My understanding was that was because we
> had to roll the transaction and thus could leave an incomplete xattr on
> disk. I haven't looked too far into it so perhaps there's more to it
> than that, but if not and this is no longer a problem with an intent
> based implementation then perhaps much of that code and associated tx
> rolls can be bypassed as well.

Getting rid of the INCOMPLETE wonkiness would be the strongest argument
for switching the regular attr manipulation paths to use intents, though
we'd have to toggle it with some feature or other.

(Some feature or other being parent pointers, or possibly just migrating
the free space tracking parts of dir3 to a "new" attr4 format for better
speed.)

> This is not to say that we won't require any such state tracking as
> you've described above. The whole block allocation thing above may
> require a state marker to get around attempts to set the xattr name
> again and get back to the remote value block allocation code. It also
> looks like we can do post xattr set format changes (i.e., node -> leaf,
> leaf -> sf) that might require something like that to make sure we don't
> go and retry an xattr set we've already completed. The point is just that
> I'd prefer that we explore how much we can simplify this mess of an
> implementation as much as possible (the above is all very handwavy)
> first to reduce the state tracking complexity, particularly if these
> states end up written to the log via the intent.
> 
> Hmm, I'm starting to think that maybe what we really need to do here is
> step back from the code and logically map out what these states and the
> resulting operation flow needs to be, particularly since there are so
> many variations between different format conversions, renames, remote
> blocks, etc. Once we have this whole mess mapped out, coding it up
> should be more of an effort in refactoring.

Yep.

> > xfs_attri_recover then becomes much simpler -- we're passed in the
> > reconstructed log item from which we figure out which step we need to
> > do.  We call xfs_trans_attr() to do that one step, but unlike
> > _finish_item, we use the new state to construct a *new* attr intent and
> > attach it to the transaction, then call xfs_defer_move at the end to
> > move all the queued defer_ops to the parent_tp because log recovery
> > requires us to recover all the incomplete log intent items before
> > finishing any new ones that were created as part of recovery.
> > 
> > This does mean that we end up with dramatically separate code paths for
> > defer ops attr setting vs. regular attr setting, but as you point out
> > the parent pointer feature will give the new code paths plenty of exercise.
> > Tying the new log intent items to a new feature bit is key to preventing
> > old kernels from stumbling across our new intent items, so we needed to
> > preserve the old attr set paths anyway.
> > 
> 
> That's a good point wrt to the other discussion around the direct xattr
> codepath. It sounds like we do need to keep that entire path around
> regardless to support v4 filesystems and such. The current series just
> unconditionally switches things over to deferred ops.

Er... yikes.  XFS cannot suddenly introduce new ondisk formats for
existing filesystems.

> > Anyway, if this all seems confusing, you can track me down, because I
> > wrote most of this system and therefore have forgotten all of
> > it^W^W^W^W^Wam available to help. :)
> > 
> > > > > 
> > > > > I was also starting to wonder if maybe I could do some refactoring in
> > > > > xfs_defer_finish_noroll to capture the common code associated with the
> > > > > -EAGAIN handling.  Then maybe we could make a function pointer that we can
> > > > > pass through the finish_item interface.  The idea being that subroutines
> > > > > could use the function pointer to cycle out the transaction when needed
> > > > > instead of having to record states and back out like this. It'd be a new
> > 
> > The state tracking and rolling is already built into xfs_defer.c. :)
> > 
> > > > > parameter to pipe around, but it'd be more efficient than the state machine,
> > > > > and less surgery in the refactor.  And maybe a blessing to any other
> > > > > operations that might need to go through this transition in the future.
> > > > > Thoughts?
> > > > > 
> > > > 
> > > > That's an interesting idea. It still strikes me as a bit of a
> > > > fallback/hack as opposed to organizing the code to properly fit into the
> > > > dfops infrastructure, but it could be useful as a transient solution.
> > > > > From a high level, it looks like we'd have to create a new intent, relog
> > > > this item and all remaining items associated with the dfp to it, roll
> > > > the tx, and finally create a done item associated with the intent in the
> > > > new tx. You'd need access to the dfp for some of that, so it's not
> > > > immediately clear to me that this ends up much easier than fixing up
> > > > the xattr code.
> > 
> > (I think the code that handles EAGAIN being returned from finish_item
> > does this for you....)
> > 
> 
> Yeah, I'm not totally sure it's an ideal/feasible approach, but for the
> sake of clarity I think what Allison is getting at is that if there was
> a way to trigger a dfops -EAGAIN roll sequence via a callback/helper
> function, we wouldn't need to refactor the xattr subsystem to have
> -EAGAIN return points. Instead we could just invoke the callback at the
> existing roll points and achieve the same behavior (in theory). It's
> kind of like providing an inside-out xfs_defer_finish_noroll() -EAGAIN
> implementation via a helper function for code down in ->finish_item().

<nod> I grok that, but wonder if we really can invoke a roll while in
the middle of ->finish_item...?  Anyway, we can set aside my confusion
for now because I really think we need to see a map of all the pieces 

--D

> Brian
> 
> > > > 
> > > > BTW, if we did end up with something like that I'd probably prefer to
> > > > see it as an exported dfops helper function as opposed to a function
> > > > pointer being passed around, if possible.
> > > > 
> > > 
> > > Alrighty, I think for now I may try to pursue something more like what you
> > > proposed in the next patch and see where I get first.  Maybe I'll come back
> > > to this later if for some reason it doesn't work out, but I think what you
> > > have there is reasonable.
> > 
> > <nod>
> > 
> > --D
> > 
> > > 
> > > Thanks again for the reviews!
> > > Allison
> > > 
> > > > Brian
> > > > 
> > > > > Thanks again for the reviews!
> > > > > 
> > > > > Allison
> > > > > 
> > > > > > 
> > > > > > Brian
> > > > > > 
> > > > > > 
> > > > > > >    fs/xfs/libxfs/xfs_attr.h | 18 +++++++++++++++++-
> > > > > > >    fs/xfs/scrub/common.c    |  2 ++
> > > > > > >    fs/xfs/xfs_acl.c         |  2 ++
> > > > > > >    fs/xfs/xfs_attr_item.c   |  2 +-
> > > > > > >    fs/xfs/xfs_ioctl.c       |  2 ++
> > > > > > >    fs/xfs/xfs_ioctl32.c     |  2 ++
> > > > > > >    fs/xfs/xfs_iops.c        |  1 +
> > > > > > >    fs/xfs/xfs_xattr.c       |  1 +
> > > > > > >    8 files changed, 28 insertions(+), 2 deletions(-)
> > > > > > > 
> > > > > > > diff --git a/fs/xfs/libxfs/xfs_attr.h b/fs/xfs/libxfs/xfs_attr.h
> > > > > > > index 974c963..4ce3b0a 100644
> > > > > > > --- a/fs/xfs/libxfs/xfs_attr.h
> > > > > > > +++ b/fs/xfs/libxfs/xfs_attr.h
> > > > > > > @@ -77,6 +77,13 @@ typedef struct attrlist_ent {	/* data from attr_list() */
> > > > > > >    	char	a_name[1];	/* attr name (NULL terminated) */
> > > > > > >    } attrlist_ent_t;
> > > > > > > +/* Attr state machine types */
> > > > > > > +enum xfs_attr_state {
> > > > > > > +	XFS_ATTR_STATE1 = 1,
> > > > > > > +	XFS_ATTR_STATE2 = 2,
> > > > > > > +	XFS_ATTR_STATE3 = 3,
> > > > > > > +};
> > > > > > > +
> > > > > > >    /*
> > > > > > >     * List of attrs to commit later.
> > > > > > >     */
> > > > > > > @@ -88,7 +95,16 @@ struct xfs_attr_item {
> > > > > > >    	void		  *xattri_name;	      /* attr name */
> > > > > > >    	uint32_t	  xattri_name_len;    /* length of name */
> > > > > > >    	uint32_t	  xattri_flags;       /* attr flags */
> > > > > > > -	struct list_head  xattri_list;
> > > > > > > +
> > > > > > > +	/*
> > > > > > > +	 * Delayed attr parameters that need to remain instantiated
> > > > > > > +	 * across transaction rolls during the defer finish
> > > > > > > +	 */
> > > > > > > +	struct xfs_buf		*xattri_leaf_bp;  /* Leaf buf to release */
> > > > > > > +	enum xfs_attr_state	xattri_state;	  /* state machine marker */
> > > > > > > +	struct xfs_da_args	xattri_args;	  /* args context */
> > > > > > > +
> > > > > > > +	struct list_head	xattri_list;
> > > > > > >    	/*
> > > > > > >    	 * A byte array follows the header containing the file name and
> > > > > > > diff --git a/fs/xfs/scrub/common.c b/fs/xfs/scrub/common.c
> > > > > > > index 0c54ff5..270c32e 100644
> > > > > > > --- a/fs/xfs/scrub/common.c
> > > > > > > +++ b/fs/xfs/scrub/common.c
> > > > > > > @@ -30,6 +30,8 @@
> > > > > > >    #include "xfs_rmap_btree.h"
> > > > > > >    #include "xfs_log.h"
> > > > > > >    #include "xfs_trans_priv.h"
> > > > > > > +#include "xfs_da_format.h"
> > > > > > > +#include "xfs_da_btree.h"
> > > > > > >    #include "xfs_attr.h"
> > > > > > >    #include "xfs_reflink.h"
> > > > > > >    #include "scrub/xfs_scrub.h"
> > > > > > > diff --git a/fs/xfs/xfs_acl.c b/fs/xfs/xfs_acl.c
> > > > > > > index 142de8d..9b1b93e 100644
> > > > > > > --- a/fs/xfs/xfs_acl.c
> > > > > > > +++ b/fs/xfs/xfs_acl.c
> > > > > > > @@ -10,6 +10,8 @@
> > > > > > >    #include "xfs_mount.h"
> > > > > > >    #include "xfs_inode.h"
> > > > > > >    #include "xfs_acl.h"
> > > > > > > +#include "xfs_da_format.h"
> > > > > > > +#include "xfs_da_btree.h"
> > > > > > >    #include "xfs_attr.h"
> > > > > > >    #include "xfs_trace.h"
> > > > > > >    #include <linux/slab.h>
> > > > > > > diff --git a/fs/xfs/xfs_attr_item.c b/fs/xfs/xfs_attr_item.c
> > > > > > > index 0ea19b4..36e6d1e 100644
> > > > > > > --- a/fs/xfs/xfs_attr_item.c
> > > > > > > +++ b/fs/xfs/xfs_attr_item.c
> > > > > > > @@ -19,10 +19,10 @@
> > > > > > >    #include "xfs_rmap.h"
> > > > > > >    #include "xfs_inode.h"
> > > > > > >    #include "xfs_icache.h"
> > > > > > > -#include "xfs_attr.h"
> > > > > > >    #include "xfs_shared.h"
> > > > > > >    #include "xfs_da_format.h"
> > > > > > >    #include "xfs_da_btree.h"
> > > > > > > +#include "xfs_attr.h"
> > > > > > >    static inline struct xfs_attri_log_item *ATTRI_ITEM(struct xfs_log_item *lip)
> > > > > > >    {
> > > > > > > diff --git a/fs/xfs/xfs_ioctl.c b/fs/xfs/xfs_ioctl.c
> > > > > > > index ab341d6..c8728ca 100644
> > > > > > > --- a/fs/xfs/xfs_ioctl.c
> > > > > > > +++ b/fs/xfs/xfs_ioctl.c
> > > > > > > @@ -16,6 +16,8 @@
> > > > > > >    #include "xfs_rtalloc.h"
> > > > > > >    #include "xfs_itable.h"
> > > > > > >    #include "xfs_error.h"
> > > > > > > +#include "xfs_da_format.h"
> > > > > > > +#include "xfs_da_btree.h"
> > > > > > >    #include "xfs_attr.h"
> > > > > > >    #include "xfs_bmap.h"
> > > > > > >    #include "xfs_bmap_util.h"
> > > > > > > diff --git a/fs/xfs/xfs_ioctl32.c b/fs/xfs/xfs_ioctl32.c
> > > > > > > index 5001dca..23f6990 100644
> > > > > > > --- a/fs/xfs/xfs_ioctl32.c
> > > > > > > +++ b/fs/xfs/xfs_ioctl32.c
> > > > > > > @@ -21,6 +21,8 @@
> > > > > > >    #include "xfs_fsops.h"
> > > > > > >    #include "xfs_alloc.h"
> > > > > > >    #include "xfs_rtalloc.h"
> > > > > > > +#include "xfs_da_format.h"
> > > > > > > +#include "xfs_da_btree.h"
> > > > > > >    #include "xfs_attr.h"
> > > > > > >    #include "xfs_ioctl.h"
> > > > > > >    #include "xfs_ioctl32.h"
> > > > > > > diff --git a/fs/xfs/xfs_iops.c b/fs/xfs/xfs_iops.c
> > > > > > > index e73c21a..561c467 100644
> > > > > > > --- a/fs/xfs/xfs_iops.c
> > > > > > > +++ b/fs/xfs/xfs_iops.c
> > > > > > > @@ -17,6 +17,7 @@
> > > > > > >    #include "xfs_acl.h"
> > > > > > >    #include "xfs_quota.h"
> > > > > > >    #include "xfs_error.h"
> > > > > > > +#include "xfs_da_btree.h"
> > > > > > >    #include "xfs_attr.h"
> > > > > > >    #include "xfs_trans.h"
> > > > > > >    #include "xfs_trace.h"
> > > > > > > diff --git a/fs/xfs/xfs_xattr.c b/fs/xfs/xfs_xattr.c
> > > > > > > index 3013746..938e81d 100644
> > > > > > > --- a/fs/xfs/xfs_xattr.c
> > > > > > > +++ b/fs/xfs/xfs_xattr.c
> > > > > > > @@ -11,6 +11,7 @@
> > > > > > >    #include "xfs_mount.h"
> > > > > > >    #include "xfs_da_format.h"
> > > > > > >    #include "xfs_inode.h"
> > > > > > > +#include "xfs_da_btree.h"
> > > > > > >    #include "xfs_attr.h"
> > > > > > >    #include "xfs_attr_leaf.h"
> > > > > > >    #include "xfs_acl.h"
> > > > > > > -- 
> > > > > > > 2.7.4
> > > > > > > 


* Re: [PATCH 7/9] xfs: Add attr context to log item
  2019-04-24 15:25               ` Darrick J. Wong
@ 2019-04-24 16:57                 ` Brian Foster
  0 siblings, 0 replies; 48+ messages in thread
From: Brian Foster @ 2019-04-24 16:57 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: Allison Henderson, linux-xfs

On Wed, Apr 24, 2019 at 08:25:33AM -0700, Darrick J. Wong wrote:
> On Wed, Apr 24, 2019 at 08:17:48AM -0400, Brian Foster wrote:
> > On Tue, Apr 23, 2019 at 09:10:16PM -0700, Darrick J. Wong wrote:
> > > Sorry I'm late back to the party...
> > > 
> > > On Tue, Apr 23, 2019 at 07:24:40PM -0700, Allison Henderson wrote:
> > > > 
> > > > On 4/23/19 6:20 AM, Brian Foster wrote:
> > > > > On Mon, Apr 22, 2019 at 03:01:27PM -0700, Allison Henderson wrote:
> > > > > > 
> > > > > > 
> > > > > > On 4/22/19 6:03 AM, Brian Foster wrote:
> > > > > > > On Fri, Apr 12, 2019 at 03:50:34PM -0700, Allison Henderson wrote:
> > > > > > > > This patch modifies xfs_attr_item to store a xfs_da_args, a xfs_buf pointer
> > > > > > > > and a new state type. We will use these in the next patch when
> > > > > > > > > we modify xfs_attr_set_args to roll transactions by returning EAGAIN.
> > > > > > > > > Because the subroutines of this function modify the contents of these
> > > > > > > > > structures, we need to find a place to store them where they remain
> > > > > > > > > instantiated across multiple calls to xfs_attr_set_args.
> > > > > > > > 
> > > > > > > > Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
> > > > > > > > ---
> > > > > > > 
> > > > > > > I see Darrick has already commented on the whole state thing. I'll
> > > > > > > probably have to grok the next patch to comment further, but just a
> > > > > > > couple initial thoughts:
> > > > > > > 
> > > > > > > First, I hit a build failure with this patch. It looks like there's a
> > > > > > > missed include in the scrub code:
> > > > > > > 
> > > > > > >     ...
> > > > > > >     CC [M]  fs/xfs/scrub/repair.o
> > > > > > > In file included from fs/xfs/scrub/repair.c:32:
> > > > > > > fs/xfs/libxfs/xfs_attr.h:105:21: error: field ‘xattri_args’ has incomplete type
> > > > > > >     struct xfs_da_args xattri_args;   /* args context */
> > > > > > Hmm, ok.  I'll get that corrected, I probably need to clean out my workspace
> > > > > > and build from scratch.
> > > > > > 
> > > > > > >     ...
> > > > > > > 
> > > > > > > Second, the commit log suggests that the states will reflect the current
> > > > > > > transaction roll points (i.e., establishing re-entry points down in
> > > > > > > > xfs_attr_set_args()). I'm kind of wondering if we should break these
> > > > > > > xattr set sub-sequences down into smaller helper functions (refactoring
> > > > > > > the existing code as we go) such that the mechanism could technically be
> > > 
> > > I had had the thought of "why not just give each step of setting an
> > > attribute its own log item, so we don't have to have this STATE_NNN
> > > business?" but then realized that will generate an insane amount of
> > > boilerplate, and you're already close to a better solution, so I shut up
> > > to think harder. :)
> > > 
> > 
> > The thought of separating things down into smaller "ops" popped into my
> > head (not necessarily separate/smaller log items), but I hadn't really
> > thought it through to this point...
> > 
> > > > > > > used deferred or not. Re: the previous thought on whether to defer xattr
> > > > > > > removes or not, there might also be cases where there's not a need to
> > > > > > > defer xattr sets.
> > > > > > > 
> > > > > > > E.g., taking a quick peek into the next patch, the state 1 case in
> > > > > > > xfs_attr_try_sf_addname() is actually a transaction commit, which I
> > > > > > > think means we're done. We'd have done an attr memory allocation,
> > > > > > > deferred op and transaction roll where none was necessary so it might
> > > > > > > not be worth it to defer in that scenario. Hmm, it also looks like we
> > > > > > > return -EAGAIN in places where we've not actually done any work, like if
> > > > > > > a shortform add attempt returns -ENOSPC (or the -EAGAIN return before we
> > > > > > > even attempt the sf add). That kind of looks like a waste of transaction
> > > > > > > rolls and further suggests it might be cleaner to break this whole path
> > > > > > > down into helpers and put it back together in a way more conducive to
> > > > > > > deferred operations.
> > > 
> > > Er, agreed:
> > > 
> > > > > > Yes, this area is a bit of a wart the way it is right now.  I think you're
> > > > > > right in that ultimately we may end up having to do a lot of refactoring in
> > > > > > order to have more efficient "re-entry points".  The state machine is hard
> > > > > > to get into subroutines, so it's limited in use in the top level function.
> > > 
> > > So my current understanding of the problem is that we have this big old
> > > > xfs_attr_set_args function that handles multiple responsibilities requiring
> > > transaction rolls, which we can't do directly inside a ->finish_item
> > > handler:
> > > 
> > >  1. If no attr fork, add one.
> > >  2. If shortform attr fork, try to put it in the sf area.
> > >  3. If shortform attr fork and out of space, convert to leaf format.
> > >  4. Add attr to leaf/node attr tree.
> > > 
> > 
> > And there are a bunch of tx rolls down in the #4 codepath that this
> > series currently just tosses away. I'm not quite sure how appropriate
> > that is, but I also don't think we necessarily need to preserve each and
> > every transaction roll as implemented by the current code.
> > 
> > IOW, I think it absolutely makes sense to step back from the current
> > behavior and reassess the best/required places to roll xattr ops in
> > progress as well as the transaction reservation itself.
> 
> Yes, it would help to make a list of every small step that could
> possibly be required to set an attribute.  That will help narrow down
> how many defer op pieces are needed.
> 
> Another thought I had is that having the finish_item continually logging
> a new intent with the latest state means that we can free the old intent
> item, which helps us avoid the problem of pinning the log tail at that
> first intent item while we scramble around doing a whole lot of rolling
> and other work to get to the done item.
> 
> > > So how about this: refactor each of these pieces into a separate
> > > function, then add a separate XFS_ATTR_OP_FLAGS_* value for each of
> > > these little pieces.  xfs_trans_attr() can call the appropriate little
> > > function for the OP_FLAG and xfs_attr_finish_item can figure out which
> > > state comes next based on the return value.
> > > 
> > > By directly mapping distinct OP_FLAGS to each piece of the attr setting
> > > puzzle, you can use the existing "roll and come back" part of the defer
> > > ops machinery.
> > > 
> > > If _finish_item thinks we're done then we just exit.  Otherwise, store
> > > the new state in the (struct xfs_attr_item *) parameter passed into
> > > _finish_item and return -EAGAIN, which puts the defer item back on the
> > > defer op list, logs a new xattr intent with the new state, rolls the
> > > transaction, and tries to finish the attr again.  I think you've already
> > > done this last part.
> > > 
> > 
> > That sounds plausible to me. One concern I have is that I think we
> > should try to avoid creating more unnecessary complexity in the dfops
> > state mechanism simply to accommodate a messy xattr implementation. For
> > example, consider the following sequence for a simple set of an xattr
> > that requires leaf format and remote value block(s):
> > 
> > - try sf add
> > - returns -ENOSPC, convert to leaf and roll tx
> > - attempt to add the xattr (xfs_attr_leaf_addname())
> > 	- if -ENOSPC, convert to node and call xfs_attr_node_addname()
> > 	- else call xfs_attr3_leaf_add_work()
> > 		- add entry
> > 		- if remoteval, set INCOMPLETE
> > - roll tx
> > - if remoteval, call xfs_attr_rmtval_set()
> > 	- block allocation, tx roll loop
> > 	- copy remote value into bufs, xfs_bwrite()
> > - if remoteval, xfs_attr3_leaf_clearflag()
> > 	- clear INCOMPLETE
> > 	- update/log rmt pointers
> > 	- roll tx
> > 
> > I'm wondering 1.) how much of this is necessary with an intent based
> > implementation and 2.) how much of this can be refactored to not require
> > complex state tracking.
> > 
> > For example, all of the format conversions that occur before we actually
> > make any modifications associated with the xattr (i.e., -ENOSPC returns
> > from the current format) seem to me like they could easily be performed and
> > immediately return -EAGAIN without any state tracking. The retry should
> > pick up the current format of the fork and retry there. Thus, ISTM we
> > could drop the whole xfs_attr_leaf_addname() -> xfs_attr3_leaf_to_node()
> > -> xfs_attr_node_addname() codepath in favor of a format conversion and
> > -EAGAIN retry that calls directly into xfs_attr_node_addname().
> 
> That had been my other thought -- in theory we keep the inode locked
> across all the transaction rolls, so we could auto-detect what we need
> to do.
> 

Indeed, at the very least it might reduce the number of "on-disk" state
markers we have to define.
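
To make the "auto-detect" idea concrete, here is a rough userspace sketch,
not kernel code. Every name in it (toy_attr_fork, toy_attr_set_step, the
capacities) is made up for illustration. It assumes the inode stays locked
across rolls, so a retry can simply look at the current fork format instead
of consulting a saved state marker:

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-ins for the attr fork formats; not the kernel's types. */
enum toy_fork_format { TOY_FMT_SHORTFORM, TOY_FMT_LEAF, TOY_FMT_NODE };

struct toy_attr_fork {
	enum toy_fork_format	format;
	int			used;	/* entries stored so far */
};

/* Made-up capacities, chosen only so the demo triggers a conversion. */
static int toy_capacity(enum toy_fork_format fmt)
{
	switch (fmt) {
	case TOY_FMT_SHORTFORM:	return 2;
	case TOY_FMT_LEAF:	return 4;
	default:		return 1 << 20;
	}
}

/*
 * One step of a set operation. No state marker is passed in: the step
 * inspects the current format, and if the attr does not fit it converts
 * to the next format and returns -EAGAIN so the caller can roll and retry.
 */
static int toy_attr_set_step(struct toy_attr_fork *fork)
{
	if (fork->used >= toy_capacity(fork->format)) {
		if (fork->format == TOY_FMT_NODE)
			return -ENOSPC;
		fork->format++;		/* format conversion only */
		return -EAGAIN;
	}
	fork->used++;			/* the actual attr add */
	return 0;
}

/* Stand-in for the defer loop: each -EAGAIN counts as one transaction roll. */
static int toy_set_by_format(struct toy_attr_fork *fork, int *rolls)
{
	int error;

	assert(fork && rolls);
	*rolls = 0;
	while ((error = toy_attr_set_step(fork)) == -EAGAIN)
		(*rolls)++;
	return error;
}
```

Starting from a full shortform fork, the loop converts to leaf, rolls once,
and the retry lands the attr in the leaf without any recorded state. The
remote value allocation case would likely still need a marker, which this
sketch does not model.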

> > Once we have leaf format and we're doing remote block allocation, how
> > much could we get away with by re-looking up the entry, finding that
> > we're still short of remote blocks and performing another
> > xfs_bmapi_write() -> -EAGAIN cycle until we're good to copy in the xattr
> > value?
> > 
> > What about all this INCOMPLETE stuff? Do we even need that with an
> > intent based implementation?
> 
> No.  AFAIK the INCOMPLETE flag exists to hide attrs from userspace until
> we're totally done setting them up, and is therefore unnecessary with an
> intent implementation.  Repair zaps any INCOMPLETE attrs it finds.
> 
> > My understanding was that was because we
> > had to roll the transaction and thus could leave an incomplete xattr on
> > disk. I haven't looked too far into it so perhaps there's more to it
> > than that, but if not and this is no longer a problem with an intent
> > based implementation then perhaps much of that code and associated tx
> > rolls can be bypassed as well.
> 
> Getting rid of the INCOMPLETE wonkiness would be the strongest argument
> for switching the regular attr manipulation paths to use intents, though
> we'd have to toggle it with some feature or other.
> 
> (Some feature or other being parent pointers, or possibly just migrating
> the free space tracking parts of dir3 to a "new" attr4 format for better
> speed.)
> 
> > This is not to say that we won't require any such state tracking as
> > you've described above. The whole block allocation thing above may
> > require a state marker to get around attempts to set the xattr name
> > again and get back to the remote value block allocation code. It also
> > looks like we can do post xattr set format changes (i.e., node -> leaf,
> > leaf -> sf) that might require something like that to make sure we don't
> > go and retry an xattr set we've already completed. The point is just that
> > I'd prefer that we explore how much we can simplify this mess of an
> > implementation as much as possible (the above is all very handwavy)
> > first to reduce the state tracking complexity, particularly if these
> > states end up written to the log via the intent.
> > 
> > Hmm, I'm starting to think that maybe what we really need to do here is
> > step back from the code and logically map out what these states and the
> > resulting operation flow needs to be, particularly since there are so
> > many variations between different format conversions, renames, remote
> > blocks, etc. Once we have this whole mess mapped out, coding it up
> > should be more of an effort in refactoring.
> 
> Yep.
> 
> > > xfs_attri_recover then becomes much simpler -- we're passed in the
> > > reconstructed log item from which we figure out which step we need to
> > > do.  We call xfs_trans_attr() to do that one step, but unlike
> > > _finish_item, we use the new state to construct a *new* attr intent and
> > > attach it to the transaction, then call xfs_defer_move at the end to
> > > move all the queued defer_ops to the parent_tp because log recovery
> > > requires us to recover all the incomplete log intent items before
> > > finishing any new ones that were created as part of recovery.
> > > 
> > > This does mean that we end up with dramatically separate code paths for
> > > defer ops attr setting vs. regular attr setting, but as you point out
> > > the parent pointer feature will give the new code paths plenty of exercise.
> > > Tying the new log intent items to a new feature bit is key to preventing
> > > old kernels from stumbling across our new intent items, so we needed to
> > > preserve the old attr set paths anyway.
> > > 
> > 
> > That's a good point wrt the other discussion around the direct xattr
> > codepath. It sounds like we do need to keep that entire path around
> > regardless to support v4 filesystems and such. The current series just
> > unconditionally switches things over to deferred ops.
> 
> Er... yikes.  XFS cannot suddenly introduce new ondisk formats for
> existing filesystems.
> 

We were discussing whether to preserve the existing codepath with
respect to flexibility/efficiency, but the whole backwards compatibility
aspect just didn't register with me until you mentioned it. I think that
kind of makes that decision for us. :P
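
As a minimal sketch of what that gating could look like (userspace C, with a
hypothetical feature bit and hypothetical names throughout; the thread has
not settled on which feature the intents would be tied to):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical feature bit; no real bit has been named in this thread. */
#define TOY_FEAT_DELAYED_ATTR	(1u << 0)

/* Hypothetical stand-in for the mount structure. */
struct toy_mount {
	uint32_t	features;
};

enum toy_attr_path { TOY_PATH_DIRECT, TOY_PATH_DEFERRED };

static bool toy_has_delayed_attr(const struct toy_mount *mp)
{
	return (mp->features & TOY_FEAT_DELAYED_ATTR) != 0;
}

/*
 * Pick the attr set path. Filesystems without the feature bit keep using
 * the existing direct path, so no new intent items reach their log.
 */
static enum toy_attr_path toy_attr_set_path(const struct toy_mount *mp)
{
	assert(mp);
	return toy_has_delayed_attr(mp) ? TOY_PATH_DEFERRED : TOY_PATH_DIRECT;
}
```

The point is only that the choice is made per filesystem from a feature
check, which is why both paths have to stay around.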

> > > Anyway, if this all seems confusing, you can track me down, because I
> > > wrote most of this system and therefore have forgotten all of
> > > it^W^W^W^W^Wam available to help. :)
> > > 
> > > > > > 
> > > > > > I was also starting to wonder if maybe I could do some refactoring in
> > > > > > xfs_defer_finish_noroll to capture the common code associated with the
> > > > > > -EAGAIN handling.  Then maybe we could make a function pointer that we can
> > > > > > pass through the finish_item interface.  The idea being that subroutines
> > > > > > could use the function pointer to cycle out the transaction when needed
> > > > > > instead of having to record states and back out like this. It'd be a new
> > > 
> > > The state tracking and rolling is already built into xfs_defer.c. :)
> > > 
> > > > > > parameter to pipe around, but it'd be more efficient than the state machine,
> > > > > > and less surgery in the refactor.  And maybe a blessing to any other
> > > > > > operations that might need to go through this transition in the future.
> > > > > > Thoughts?
> > > > > > 
> > > > > 
> > > > > That's an interesting idea. It still strikes me as a bit of a
> > > > > fallback/hack as opposed to organizing the code to properly fit into the
> > > > > dfops infrastructure, but it could be useful as a transient solution.
> > > > > > From a high level, it looks like we'd have to create a new intent, relog
> > > > > this item and all remaining items associated with the dfp to it, roll
> > > > > the tx, and finally create a done item associated with the intent in the
> > > > > new tx. You'd need access to the dfp for some of that, so it's not
> > > > > immediately clear to me that this ends up much easier than fixing up
> > > > > the xattr code.
> > > 
> > > (I think the code that handles EAGAIN being returned from finish_item
> > > does this for you....)
> > > 
> > 
> > Yeah, I'm not totally sure it's an ideal/feasible approach, but for the
> > sake of clarity I think what Allison is getting at is that if there was
> > a way to trigger a dfops -EAGAIN roll sequence via a callback/helper
> > function, we wouldn't need to refactor the xattr subsystem to have
> > -EAGAIN return points. Instead we could just invoke the callback at the
> > existing roll points and achieve the same behavior (in theory). It's
> > kind of like providing an inside-out xfs_defer_finish_noroll() -EAGAIN
> > implementation via a helper function for code down in ->finish_item().
> 
> <nod> I grok that, but wonder if we really can invoke a roll while in
> the middle of ->finish_item...?  Anyway, we can set aside my confusion
> for now because I really think we need to see a map of all the pieces 
> 

Ok, I'm not really sure either. It was just an idea to bat around with
the rest. I agree that an informal, logical map/breakdown is the best
next step here. That gives us something concrete to review and refine.
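
As a strawman for such a map, here is a userspace sketch of the four pieces
listed earlier in the thread (add fork, try shortform, convert to leaf, add
to leaf), each as a step recorded in the item. It is not kernel code and all
names are hypothetical. It assumes the step only returns -EAGAIN after it
has actually modified something, so no roll is spent on a no-op, and it
leaves out node format and remote values entirely:

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical steps, one per piece of the attr set sequence. */
enum toy_attr_step {
	TOY_STEP_ADD_FORK,
	TOY_STEP_TRY_SHORTFORM,
	TOY_STEP_SF_TO_LEAF,
	TOY_STEP_ADD_LEAF,
	TOY_STEP_DONE,
};

/* Hypothetical inode model, just enough to drive the steps. */
struct toy_inode {
	bool	has_attr_fork;
	bool	shortform;	/* attr fork is in shortform format */
	int	sf_free;	/* free shortform slots */
	int	nattrs;
};

struct toy_attr_item {
	enum toy_attr_step	step;	/* survives across rolls */
};

/* Run steps until one dirties the inode (-EAGAIN) or the set is done (0). */
static int toy_attr_finish_item(struct toy_inode *ip,
				struct toy_attr_item *item)
{
	for (;;) {
		switch (item->step) {
		case TOY_STEP_ADD_FORK:
			item->step = TOY_STEP_TRY_SHORTFORM;
			if (!ip->has_attr_fork) {
				ip->has_attr_fork = true;
				ip->shortform = true;
				return -EAGAIN;	/* inode modified: roll */
			}
			break;			/* nothing done: keep going */
		case TOY_STEP_TRY_SHORTFORM:
			if (ip->shortform && ip->sf_free > 0) {
				ip->sf_free--;
				ip->nattrs++;
				item->step = TOY_STEP_DONE;
				break;
			}
			item->step = ip->shortform ? TOY_STEP_SF_TO_LEAF
						   : TOY_STEP_ADD_LEAF;
			break;
		case TOY_STEP_SF_TO_LEAF:
			ip->shortform = false;
			item->step = TOY_STEP_ADD_LEAF;
			return -EAGAIN;		/* format converted: roll */
		case TOY_STEP_ADD_LEAF:
			ip->nattrs++;
			item->step = TOY_STEP_DONE;
			break;
		case TOY_STEP_DONE:
			return 0;
		}
	}
}

/* Stand-in for the defer loop: each -EAGAIN counts as one transaction roll. */
static int toy_run_item(struct toy_inode *ip, int *rolls)
{
	struct toy_attr_item item = { .step = TOY_STEP_ADD_FORK };
	int error;

	assert(ip && rolls);
	*rolls = 0;
	while ((error = toy_attr_finish_item(ip, &item)) == -EAGAIN)
		(*rolls)++;
	return error;
}
```

With no attr fork and no shortform space this takes two rolls (fork add,
then sf to leaf); with room in an existing shortform fork it takes none.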

Brian

> --D
> 
> > Brian
> > 
> > > > > 
> > > > > BTW, if we did end up with something like that I'd probably prefer to
> > > > > see it as an exported dfops helper function as opposed to a function
> > > > > pointer being passed around, if possible.
> > > > > 
> > > > 
> > > > Alrighty, I think for now I may try to pursue something more like what you
> > > > proposed in the next patch and see where I get first.  Maybe I'll come back
> > > > to this later if for some reason it doesn't work out, but I think what you
> > > > have there is reasonable.
> > > 
> > > <nod>
> > > 
> > > --D
> > > 
> > > > 
> > > > Thanks again for the reviews!
> > > > Allison
> > > > 
> > > > > Brian
> > > > > 
> > > > > > Thanks again for the reviews!
> > > > > > 
> > > > > > Allison
> > > > > > 
> > > > > > > 
> > > > > > > Brian
> > > > > > > 
> > > > > > > 
> > > > > > > >    fs/xfs/libxfs/xfs_attr.h | 18 +++++++++++++++++-
> > > > > > > >    fs/xfs/scrub/common.c    |  2 ++
> > > > > > > >    fs/xfs/xfs_acl.c         |  2 ++
> > > > > > > >    fs/xfs/xfs_attr_item.c   |  2 +-
> > > > > > > >    fs/xfs/xfs_ioctl.c       |  2 ++
> > > > > > > >    fs/xfs/xfs_ioctl32.c     |  2 ++
> > > > > > > >    fs/xfs/xfs_iops.c        |  1 +
> > > > > > > >    fs/xfs/xfs_xattr.c       |  1 +
> > > > > > > >    8 files changed, 28 insertions(+), 2 deletions(-)
> > > > > > > > 
> > > > > > > > diff --git a/fs/xfs/libxfs/xfs_attr.h b/fs/xfs/libxfs/xfs_attr.h
> > > > > > > > index 974c963..4ce3b0a 100644
> > > > > > > > --- a/fs/xfs/libxfs/xfs_attr.h
> > > > > > > > +++ b/fs/xfs/libxfs/xfs_attr.h
> > > > > > > > @@ -77,6 +77,13 @@ typedef struct attrlist_ent {	/* data from attr_list() */
> > > > > > > >    	char	a_name[1];	/* attr name (NULL terminated) */
> > > > > > > >    } attrlist_ent_t;
> > > > > > > > +/* Attr state machine types */
> > > > > > > > +enum xfs_attr_state {
> > > > > > > > +	XFS_ATTR_STATE1 = 1,
> > > > > > > > +	XFS_ATTR_STATE2 = 2,
> > > > > > > > +	XFS_ATTR_STATE3 = 3,
> > > > > > > > +};
> > > > > > > > +
> > > > > > > >    /*
> > > > > > > >     * List of attrs to commit later.
> > > > > > > >     */
> > > > > > > > @@ -88,7 +95,16 @@ struct xfs_attr_item {
> > > > > > > >    	void		  *xattri_name;	      /* attr name */
> > > > > > > >    	uint32_t	  xattri_name_len;    /* length of name */
> > > > > > > >    	uint32_t	  xattri_flags;       /* attr flags */
> > > > > > > > -	struct list_head  xattri_list;
> > > > > > > > +
> > > > > > > > +	/*
> > > > > > > > +	 * Delayed attr parameters that need to remain instantiated
> > > > > > > > +	 * across transaction rolls during the defer finish
> > > > > > > > +	 */
> > > > > > > > +	struct xfs_buf		*xattri_leaf_bp;  /* Leaf buf to release */
> > > > > > > > +	enum xfs_attr_state	xattri_state;	  /* state machine marker */
> > > > > > > > +	struct xfs_da_args	xattri_args;	  /* args context */
> > > > > > > > +
> > > > > > > > +	struct list_head	xattri_list;
> > > > > > > >    	/*
> > > > > > > >    	 * A byte array follows the header containing the file name and
> > > > > > > > diff --git a/fs/xfs/scrub/common.c b/fs/xfs/scrub/common.c
> > > > > > > > index 0c54ff5..270c32e 100644
> > > > > > > > --- a/fs/xfs/scrub/common.c
> > > > > > > > +++ b/fs/xfs/scrub/common.c
> > > > > > > > @@ -30,6 +30,8 @@
> > > > > > > >    #include "xfs_rmap_btree.h"
> > > > > > > >    #include "xfs_log.h"
> > > > > > > >    #include "xfs_trans_priv.h"
> > > > > > > > +#include "xfs_da_format.h"
> > > > > > > > +#include "xfs_da_btree.h"
> > > > > > > >    #include "xfs_attr.h"
> > > > > > > >    #include "xfs_reflink.h"
> > > > > > > >    #include "scrub/xfs_scrub.h"
> > > > > > > > diff --git a/fs/xfs/xfs_acl.c b/fs/xfs/xfs_acl.c
> > > > > > > > index 142de8d..9b1b93e 100644
> > > > > > > > --- a/fs/xfs/xfs_acl.c
> > > > > > > > +++ b/fs/xfs/xfs_acl.c
> > > > > > > > @@ -10,6 +10,8 @@
> > > > > > > >    #include "xfs_mount.h"
> > > > > > > >    #include "xfs_inode.h"
> > > > > > > >    #include "xfs_acl.h"
> > > > > > > > +#include "xfs_da_format.h"
> > > > > > > > +#include "xfs_da_btree.h"
> > > > > > > >    #include "xfs_attr.h"
> > > > > > > >    #include "xfs_trace.h"
> > > > > > > >    #include <linux/slab.h>
> > > > > > > > diff --git a/fs/xfs/xfs_attr_item.c b/fs/xfs/xfs_attr_item.c
> > > > > > > > index 0ea19b4..36e6d1e 100644
> > > > > > > > --- a/fs/xfs/xfs_attr_item.c
> > > > > > > > +++ b/fs/xfs/xfs_attr_item.c
> > > > > > > > @@ -19,10 +19,10 @@
> > > > > > > >    #include "xfs_rmap.h"
> > > > > > > >    #include "xfs_inode.h"
> > > > > > > >    #include "xfs_icache.h"
> > > > > > > > -#include "xfs_attr.h"
> > > > > > > >    #include "xfs_shared.h"
> > > > > > > >    #include "xfs_da_format.h"
> > > > > > > >    #include "xfs_da_btree.h"
> > > > > > > > +#include "xfs_attr.h"
> > > > > > > >    static inline struct xfs_attri_log_item *ATTRI_ITEM(struct xfs_log_item *lip)
> > > > > > > >    {
> > > > > > > > diff --git a/fs/xfs/xfs_ioctl.c b/fs/xfs/xfs_ioctl.c
> > > > > > > > index ab341d6..c8728ca 100644
> > > > > > > > --- a/fs/xfs/xfs_ioctl.c
> > > > > > > > +++ b/fs/xfs/xfs_ioctl.c
> > > > > > > > @@ -16,6 +16,8 @@
> > > > > > > >    #include "xfs_rtalloc.h"
> > > > > > > >    #include "xfs_itable.h"
> > > > > > > >    #include "xfs_error.h"
> > > > > > > > +#include "xfs_da_format.h"
> > > > > > > > +#include "xfs_da_btree.h"
> > > > > > > >    #include "xfs_attr.h"
> > > > > > > >    #include "xfs_bmap.h"
> > > > > > > >    #include "xfs_bmap_util.h"
> > > > > > > > diff --git a/fs/xfs/xfs_ioctl32.c b/fs/xfs/xfs_ioctl32.c
> > > > > > > > index 5001dca..23f6990 100644
> > > > > > > > --- a/fs/xfs/xfs_ioctl32.c
> > > > > > > > +++ b/fs/xfs/xfs_ioctl32.c
> > > > > > > > @@ -21,6 +21,8 @@
> > > > > > > >    #include "xfs_fsops.h"
> > > > > > > >    #include "xfs_alloc.h"
> > > > > > > >    #include "xfs_rtalloc.h"
> > > > > > > > +#include "xfs_da_format.h"
> > > > > > > > +#include "xfs_da_btree.h"
> > > > > > > >    #include "xfs_attr.h"
> > > > > > > >    #include "xfs_ioctl.h"
> > > > > > > >    #include "xfs_ioctl32.h"
> > > > > > > > diff --git a/fs/xfs/xfs_iops.c b/fs/xfs/xfs_iops.c
> > > > > > > > index e73c21a..561c467 100644
> > > > > > > > --- a/fs/xfs/xfs_iops.c
> > > > > > > > +++ b/fs/xfs/xfs_iops.c
> > > > > > > > @@ -17,6 +17,7 @@
> > > > > > > >    #include "xfs_acl.h"
> > > > > > > >    #include "xfs_quota.h"
> > > > > > > >    #include "xfs_error.h"
> > > > > > > > +#include "xfs_da_btree.h"
> > > > > > > >    #include "xfs_attr.h"
> > > > > > > >    #include "xfs_trans.h"
> > > > > > > >    #include "xfs_trace.h"
> > > > > > > > diff --git a/fs/xfs/xfs_xattr.c b/fs/xfs/xfs_xattr.c
> > > > > > > > index 3013746..938e81d 100644
> > > > > > > > --- a/fs/xfs/xfs_xattr.c
> > > > > > > > +++ b/fs/xfs/xfs_xattr.c
> > > > > > > > @@ -11,6 +11,7 @@
> > > > > > > >    #include "xfs_mount.h"
> > > > > > > >    #include "xfs_da_format.h"
> > > > > > > >    #include "xfs_inode.h"
> > > > > > > > +#include "xfs_da_btree.h"
> > > > > > > >    #include "xfs_attr.h"
> > > > > > > >    #include "xfs_attr_leaf.h"
> > > > > > > >    #include "xfs_acl.h"
> > > > > > > > -- 
> > > > > > > > 2.7.4
> > > > > > > > 


end of thread, other threads:[~2019-04-24 16:57 UTC | newest]

Thread overview: 48+ messages
2019-04-12 22:50 [PATCH 0/9] xfs: Delayed Attributes Allison Henderson
2019-04-12 22:50 ` [PATCH 1/9] xfs: Remove all strlen in all xfs_attr_* functions for attr names Allison Henderson
2019-04-14 23:02   ` Dave Chinner
2019-04-15 20:08     ` Allison Henderson
2019-04-15 21:18       ` Dave Chinner
2019-04-16  1:33         ` Allison Henderson
2019-04-17 15:42   ` Brian Foster
2019-04-12 22:50 ` [PATCH 2/9] xfs: Hold inode locks in xfs_ialloc Allison Henderson
2019-04-17 15:44   ` Brian Foster
2019-04-17 17:35     ` Allison Henderson
2019-04-12 22:50 ` [PATCH 3/9] xfs: Add trans toggle to attr routines Allison Henderson
2019-04-18 15:27   ` Brian Foster
2019-04-18 21:23     ` Allison Henderson
2019-04-12 22:50 ` [PATCH 4/9] xfs: Set up infastructure for deferred attribute operations Allison Henderson
2019-04-18 15:48   ` Brian Foster
2019-04-18 21:27     ` Allison Henderson
2019-04-22 11:00       ` Brian Foster
2019-04-22 22:00         ` Allison Henderson
2019-04-12 22:50 ` [PATCH 5/9] xfs: Add xfs_attr_set_deferred and xfs_attr_remove_deferred Allison Henderson
2019-04-18 15:49   ` Brian Foster
2019-04-18 21:28     ` Allison Henderson
2019-04-22 11:01       ` Brian Foster
2019-04-22 22:01         ` Allison Henderson
2019-04-23 13:00           ` Brian Foster
2019-04-24  2:24             ` Allison Henderson
2019-04-12 22:50 ` [PATCH 6/9] xfs: Add xfs_has_attr and subroutines Allison Henderson
2019-04-15  2:46   ` Su Yue
2019-04-15 20:13     ` Allison Henderson
2019-04-22 13:00   ` Brian Foster
2019-04-22 22:01     ` Allison Henderson
2019-04-12 22:50 ` [PATCH 7/9] xfs: Add attr context to log item Allison Henderson
2019-04-15 22:50   ` Darrick J. Wong
2019-04-16  2:30     ` Allison Henderson
2019-04-16  3:21       ` Allison Henderson
2019-04-22 13:03   ` Brian Foster
2019-04-22 22:01     ` Allison Henderson
2019-04-23 13:20       ` Brian Foster
2019-04-24  2:24         ` Allison Henderson
2019-04-24  4:10           ` Darrick J. Wong
2019-04-24 12:17             ` Brian Foster
2019-04-24 15:25               ` Darrick J. Wong
2019-04-24 16:57                 ` Brian Foster
2019-04-12 22:50 ` [PATCH 8/9] xfs: Roll delayed attr operations by returning EAGAIN Allison Henderson
2019-04-15 23:31   ` Darrick J. Wong
2019-04-16 19:54     ` Allison Henderson
2019-04-23 14:19   ` Brian Foster
2019-04-24  2:24     ` Allison Henderson
2019-04-12 22:50 ` [PATCH 9/9] xfs: Remove roll_trans boolean Allison Henderson
