All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 00/17] Parent Pointers V3
@ 2017-10-18 22:55 Allison Henderson
  2017-10-18 22:55 ` [PATCH 01/17] Add helper functions xfs_attr_set_args and xfs_attr_remove_args Allison Henderson
                   ` (17 more replies)
  0 siblings, 18 replies; 66+ messages in thread
From: Allison Henderson @ 2017-10-18 22:55 UTC (permalink / raw)
  To: linux-xfs; +Cc: Allison Henderson

Hi all,

This is the third version of parent pointer attributes for xfs.
I've integrated the suggestions made since v2, mostly moving the
attr buffers in the xfs_attr_log_item to pointers that point to
xfs_attr_item. I've also implementing the recovery routines for
the xfs_attr_log_format.  If I missed anything please point it
out.  As always, comments and feedback are appreciated.  Thank
you!

Allison Henderson (7):
  Add helper functions xfs_attr_set_args and xfs_attr_remove_args
  Set up infastructure for deferred attribute operations
  Add xfs_attr_set_defered and xfs_attr_remove_defered
  Remove all strlen calls in all xfs_attr_* functions for attr names.
  Add the extra space requirements for parent pointer attributes when
    calculating the minimum log size during mkfs
  Add parent pointers to rename
  Add the parent pointer support to the superblock version 5.

Brian Foster (1):
  xfs_bmap_add_attrfork(): re-add error handling from set_attrforkoff()
    call

Dave Chinner (5):
  xfs: define parent pointer xattr format
  :xfs: extent transaction reservations for parent attributes
  xfs: parent pointer attribute creation
  xfs: add parent attributes to link
  xfs: remove parent pointers in unlink

Mark Tinguely (4):
  xfs: get directory offset when adding directory name
  xfs: get directory offset when removing directory name
  xfs: get directory offset when replacing a directory name
  xfs: add parent pointer support to attribute code

 fs/xfs/Makefile                |   3 +
 fs/xfs/libxfs/xfs_attr.c       | 476 +++++++++++++++++++++++++++-----------
 fs/xfs/libxfs/xfs_bmap.c       |  51 ++--
 fs/xfs/libxfs/xfs_bmap.h       |   1 +
 fs/xfs/libxfs/xfs_da_btree.h   |   1 +
 fs/xfs/libxfs/xfs_da_format.h  |  12 +-
 fs/xfs/libxfs/xfs_defer.h      |   1 +
 fs/xfs/libxfs/xfs_dir2.c       |  41 ++--
 fs/xfs/libxfs/xfs_dir2.h       |  10 +-
 fs/xfs/libxfs/xfs_dir2_block.c |   9 +-
 fs/xfs/libxfs/xfs_dir2_leaf.c  |   8 +-
 fs/xfs/libxfs/xfs_dir2_node.c  |   8 +-
 fs/xfs/libxfs/xfs_dir2_sf.c    |   6 +
 fs/xfs/libxfs/xfs_format.h     |  37 ++-
 fs/xfs/libxfs/xfs_fs.h         |   1 +
 fs/xfs/libxfs/xfs_log_format.h |  36 ++-
 fs/xfs/libxfs/xfs_log_rlimit.c |  34 +++
 fs/xfs/libxfs/xfs_parent.c     | 163 +++++++++++++
 fs/xfs/libxfs/xfs_trans_resv.c | 103 +++++++--
 fs/xfs/libxfs/xfs_types.h      |   1 +
 fs/xfs/xfs_acl.c               |  12 +-
 fs/xfs/xfs_attr.h              |  68 +++++-
 fs/xfs/xfs_attr_item.c         | 512 +++++++++++++++++++++++++++++++++++++++++
 fs/xfs/xfs_attr_item.h         | 111 +++++++++
 fs/xfs/xfs_fsops.c             |   4 +-
 fs/xfs/xfs_inode.c             | 146 +++++++++---
 fs/xfs/xfs_ioctl.c             |  13 +-
 fs/xfs/xfs_iops.c              |   6 +-
 fs/xfs/xfs_log_recover.c       | 140 +++++++++++
 fs/xfs/xfs_qm.c                |   2 +-
 fs/xfs/xfs_qm.h                |   1 +
 fs/xfs/xfs_super.c             |   1 +
 fs/xfs/xfs_symlink.c           |   2 +-
 fs/xfs/xfs_trans.h             |  13 ++
 fs/xfs/xfs_trans_attr.c        | 286 +++++++++++++++++++++++
 fs/xfs/xfs_xattr.c             |  10 +-
 36 files changed, 2064 insertions(+), 265 deletions(-)
 create mode 100644 fs/xfs/libxfs/xfs_parent.c
 create mode 100644 fs/xfs/xfs_attr_item.c
 create mode 100644 fs/xfs/xfs_attr_item.h
 create mode 100644 fs/xfs/xfs_trans_attr.c

-- 
2.7.4


^ permalink raw reply	[flat|nested] 66+ messages in thread

* [PATCH 01/17] Add helper functions xfs_attr_set_args and xfs_attr_remove_args
  2017-10-18 22:55 [PATCH 00/17] Parent Pointers V3 Allison Henderson
@ 2017-10-18 22:55 ` Allison Henderson
  2017-10-19 20:03   ` Darrick J. Wong
  2017-10-18 22:55 ` [PATCH 02/17] Set up infastructure for deferred attribute operations Allison Henderson
                   ` (16 subsequent siblings)
  17 siblings, 1 reply; 66+ messages in thread
From: Allison Henderson @ 2017-10-18 22:55 UTC (permalink / raw)
  To: linux-xfs; +Cc: Allison Henderson

These sub-routines set or remove the attributes specified in
@args. We will use this later for setting parent pointers as a
deferred attribute operation.

Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
---
 fs/xfs/libxfs/xfs_attr.c | 322 +++++++++++++++++++++++++++--------------------
 fs/xfs/xfs_attr.h        |   2 +
 2 files changed, 189 insertions(+), 135 deletions(-)

diff --git a/fs/xfs/libxfs/xfs_attr.c b/fs/xfs/libxfs/xfs_attr.c
index 6249c92..b00ec1f 100644
--- a/fs/xfs/libxfs/xfs_attr.c
+++ b/fs/xfs/libxfs/xfs_attr.c
@@ -203,6 +203,185 @@ xfs_attr_calc_size(
 	return nblks;
 }
 
+/*
+ * set the attribute specified in @args. In the case of the parent attribute
+ * being set, we do not want to roll the transaction on shortform-to-leaf
+ * conversion, as the attribute must be added in the same transaction as the
+ * parent directory modifications. Hence @roll_trans needs to be set
+ * appropriately to control whether the transaction is committed during this
+ * function.
+ */
+int
+xfs_attr_set_args(
+	struct xfs_da_args	*args,
+	int			flags,
+	bool			roll_trans)
+{
+	struct xfs_inode	*dp = args->dp;
+	struct xfs_mount        *mp = dp->i_mount;
+	struct xfs_trans_res    tres;
+	int			rsvd = 0;
+	int			error = 0;
+
+	tres.tr_logres = M_RES(mp)->tr_attrsetm.tr_logres +
+			 M_RES(mp)->tr_attrsetrt.tr_logres * args->total;
+	tres.tr_logcount = XFS_ATTRSET_LOG_COUNT;
+	tres.tr_logflags = XFS_TRANS_PERM_LOG_RES;
+
+	/*
+	 * Root fork attributes can use reserved data blocks for this
+	 * operation if necessary
+	 */
+	error = xfs_trans_alloc(mp, &tres, args->total, 0,
+				rsvd ? XFS_TRANS_RESERVE : 0, &args->trans);
+	if (error)
+		goto out;
+
+	error = xfs_trans_reserve_quota_nblks(args->trans, dp, args->total, 0,
+					      rsvd ? XFS_QMOPT_RES_REGBLKS |
+						     XFS_QMOPT_FORCE_RES :
+						     XFS_QMOPT_RES_REGBLKS);
+	if (error)
+		goto out;
+
+	xfs_trans_ijoin(args->trans, dp, 0);
+	/*
+	 * If the attribute list is non-existent or a shortform list,
+	 * upgrade it to a single-leaf-block attribute list.
+	 */
+	if (dp->i_d.di_aformat == XFS_DINODE_FMT_LOCAL ||
+	    (dp->i_d.di_aformat == XFS_DINODE_FMT_EXTENTS &&
+	     dp->i_d.di_anextents == 0)) {
+
+		/*
+		 * Build initial attribute list (if required).
+		 */
+		if (dp->i_d.di_aformat == XFS_DINODE_FMT_EXTENTS)
+			xfs_attr_shortform_create(args);
+
+		/*
+		 * Try to add the attr to the attribute list in the inode.
+		 */
+		error = xfs_attr_shortform_addname(args);
+		if (error != -ENOSPC) {
+			ASSERT(args->trans);
+			if (!error && (flags & ATTR_KERNOTIME) == 0)
+				xfs_trans_ichgtime(args->trans, dp,
+						   XFS_ICHGTIME_CHG);
+			goto out;
+		}
+
+		/*
+		 * It won't fit in the shortform, transform to a leaf block.
+		 * GROT: another possible req'mt for a double-split btree op.
+		 */
+		error = xfs_attr_shortform_to_leaf(args);
+		if (error)
+			goto out;
+		xfs_defer_ijoin(args->dfops, dp);
+		if (roll_trans) {
+			error = xfs_defer_finish(&args->trans, args->dfops);
+			if (error) {
+				args->trans = NULL;
+				goto out;
+			}
+
+			/*
+			 * Commit the leaf transformation.  We'll need another
+			 * (linked) transaction to add the new attribute to the
+			 * leaf.
+			 */
+			error = xfs_trans_roll_inode(&args->trans, dp);
+			if (error)
+				goto out;
+		}
+	}
+
+	if (xfs_bmap_one_block(dp, XFS_ATTR_FORK))
+		error = xfs_attr_leaf_addname(args);
+	else
+		error = xfs_attr_node_addname(args);
+	if (error)
+		goto out;
+
+	if ((flags & ATTR_KERNOTIME) == 0)
+		xfs_trans_ichgtime(args->trans, dp, XFS_ICHGTIME_CHG);
+
+	xfs_trans_log_inode(args->trans, dp, XFS_ILOG_CORE);
+out:
+	return error;
+}
+
+/*
+ * Remove the attribute specified in @args.
+ */
+int
+xfs_attr_remove_args(
+	struct xfs_da_args      *args,
+	int			flags)
+{
+	struct xfs_inode	*dp = args->dp;
+	struct xfs_mount	*mp = dp->i_mount;
+	int			error;
+	int                     rsvd = 0;
+
+	error = xfs_qm_dqattach_locked(dp, 0);
+	if (error)
+		return error;
+
+	/*
+	 * Root fork attributes can use reserved data blocks for this
+	 * operation if necessary
+	 */
+	if (flags & ATTR_ROOT)
+		rsvd = XFS_TRANS_RESERVE;
+	error = xfs_trans_alloc(mp, &M_RES(mp)->tr_attrrm,
+		XFS_ATTRRM_SPACE_RES(mp), 0, rsvd, &args->trans);
+
+	if (error)
+		goto out;
+
+	/*
+	 * No need to make quota reservations here. We expect to release some
+	 * blocks not allocate in the common case.
+	 */
+	xfs_trans_ijoin(args->trans, dp, 0);
+
+	if (!xfs_inode_hasattr(dp)) {
+		error = -ENOATTR;
+	} else if (dp->i_d.di_aformat == XFS_DINODE_FMT_LOCAL) {
+		ASSERT(dp->i_afp->if_flags & XFS_IFINLINE);
+		error = xfs_attr_shortform_remove(args);
+	} else if (xfs_bmap_one_block(dp, XFS_ATTR_FORK)) {
+		error = xfs_attr_leaf_removename(args);
+	} else {
+		error = xfs_attr_node_removename(args);
+	}
+
+	if (error)
+		goto out;
+
+	/*
+	 * If this is a synchronous mount, make sure that the
+	 * transaction goes to disk before returning to the user.
+	 */
+	if (mp->m_flags & XFS_MOUNT_WSYNC)
+		xfs_trans_set_sync(args->trans);
+
+	if ((flags & ATTR_KERNOTIME) == 0)
+		xfs_trans_ichgtime(args->trans, dp, XFS_ICHGTIME_CHG);
+
+	xfs_trans_log_inode(args->trans, dp, XFS_ILOG_CORE);
+
+	return error;
+
+out:
+	if (args->trans)
+		xfs_trans_cancel(args->trans);
+
+	return error;
+}
+
 int
 xfs_attr_set(
 	struct xfs_inode	*dp,
@@ -214,10 +393,9 @@ xfs_attr_set(
 	struct xfs_mount	*mp = dp->i_mount;
 	struct xfs_da_args	args;
 	struct xfs_defer_ops	dfops;
-	struct xfs_trans_res	tres;
 	xfs_fsblock_t		firstblock;
 	int			rsvd = (flags & ATTR_ROOT) != 0;
-	int			error, err2, local;
+	int			error, local;
 
 	XFS_STATS_INC(mp, xs_attr_set);
 
@@ -252,106 +430,11 @@ xfs_attr_set(
 			return error;
 	}
 
-	tres.tr_logres = M_RES(mp)->tr_attrsetm.tr_logres +
-			 M_RES(mp)->tr_attrsetrt.tr_logres * args.total;
-	tres.tr_logcount = XFS_ATTRSET_LOG_COUNT;
-	tres.tr_logflags = XFS_TRANS_PERM_LOG_RES;
-
-	/*
-	 * Root fork attributes can use reserved data blocks for this
-	 * operation if necessary
-	 */
-	error = xfs_trans_alloc(mp, &tres, args.total, 0,
-			rsvd ? XFS_TRANS_RESERVE : 0, &args.trans);
-	if (error)
-		return error;
-
 	xfs_ilock(dp, XFS_ILOCK_EXCL);
-	error = xfs_trans_reserve_quota_nblks(args.trans, dp, args.total, 0,
-				rsvd ? XFS_QMOPT_RES_REGBLKS | XFS_QMOPT_FORCE_RES :
-				       XFS_QMOPT_RES_REGBLKS);
-	if (error) {
-		xfs_iunlock(dp, XFS_ILOCK_EXCL);
-		xfs_trans_cancel(args.trans);
-		return error;
-	}
-
-	xfs_trans_ijoin(args.trans, dp, 0);
-
-	/*
-	 * If the attribute list is non-existent or a shortform list,
-	 * upgrade it to a single-leaf-block attribute list.
-	 */
-	if (dp->i_d.di_aformat == XFS_DINODE_FMT_LOCAL ||
-	    (dp->i_d.di_aformat == XFS_DINODE_FMT_EXTENTS &&
-	     dp->i_d.di_anextents == 0)) {
-
-		/*
-		 * Build initial attribute list (if required).
-		 */
-		if (dp->i_d.di_aformat == XFS_DINODE_FMT_EXTENTS)
-			xfs_attr_shortform_create(&args);
-
-		/*
-		 * Try to add the attr to the attribute list in
-		 * the inode.
-		 */
-		error = xfs_attr_shortform_addname(&args);
-		if (error != -ENOSPC) {
-			/*
-			 * Commit the shortform mods, and we're done.
-			 * NOTE: this is also the error path (EEXIST, etc).
-			 */
-			ASSERT(args.trans != NULL);
-
-			/*
-			 * If this is a synchronous mount, make sure that
-			 * the transaction goes to disk before returning
-			 * to the user.
-			 */
-			if (mp->m_flags & XFS_MOUNT_WSYNC)
-				xfs_trans_set_sync(args.trans);
-
-			if (!error && (flags & ATTR_KERNOTIME) == 0) {
-				xfs_trans_ichgtime(args.trans, dp,
-							XFS_ICHGTIME_CHG);
-			}
-			err2 = xfs_trans_commit(args.trans);
-			xfs_iunlock(dp, XFS_ILOCK_EXCL);
-
-			return error ? error : err2;
-		}
-
-		/*
-		 * It won't fit in the shortform, transform to a leaf block.
-		 * GROT: another possible req'mt for a double-split btree op.
-		 */
-		xfs_defer_init(args.dfops, args.firstblock);
-		error = xfs_attr_shortform_to_leaf(&args);
-		if (error)
-			goto out_defer_cancel;
-		xfs_defer_ijoin(args.dfops, dp);
-		error = xfs_defer_finish(&args.trans, args.dfops);
-		if (error)
-			goto out_defer_cancel;
-
-		/*
-		 * Commit the leaf transformation.  We'll need another (linked)
-		 * transaction to add the new attribute to the leaf.
-		 */
-
-		error = xfs_trans_roll_inode(&args.trans, dp);
-		if (error)
-			goto out;
-
-	}
-
-	if (xfs_bmap_one_block(dp, XFS_ATTR_FORK))
-		error = xfs_attr_leaf_addname(&args);
-	else
-		error = xfs_attr_node_addname(&args);
+	xfs_defer_init(args.dfops, args.firstblock);
+	error = xfs_attr_set_args(&args, flags, true);
 	if (error)
-		goto out;
+		goto out_defer_cancel;
 
 	/*
 	 * If this is a synchronous mount, make sure that the
@@ -360,9 +443,6 @@ xfs_attr_set(
 	if (mp->m_flags & XFS_MOUNT_WSYNC)
 		xfs_trans_set_sync(args.trans);
 
-	if ((flags & ATTR_KERNOTIME) == 0)
-		xfs_trans_ichgtime(args.trans, dp, XFS_ICHGTIME_CHG);
-
 	/*
 	 * Commit the last in the sequence of transactions.
 	 */
@@ -374,10 +454,6 @@ xfs_attr_set(
 
 out_defer_cancel:
 	xfs_defer_cancel(&dfops);
-	args.trans = NULL;
-out:
-	if (args.trans)
-		xfs_trans_cancel(args.trans);
 	xfs_iunlock(dp, XFS_ILOCK_EXCL);
 	return error;
 }
@@ -417,38 +493,15 @@ xfs_attr_remove(
 	 */
 	args.op_flags = XFS_DA_OP_OKNOENT;
 
-	error = xfs_qm_dqattach(dp, 0);
-	if (error)
-		return error;
-
-	/*
-	 * Root fork attributes can use reserved data blocks for this
-	 * operation if necessary
-	 */
-	error = xfs_trans_alloc(mp, &M_RES(mp)->tr_attrrm,
-			XFS_ATTRRM_SPACE_RES(mp), 0,
-			(flags & ATTR_ROOT) ? XFS_TRANS_RESERVE : 0,
-			&args.trans);
-	if (error)
-		return error;
-
 	xfs_ilock(dp, XFS_ILOCK_EXCL);
 	/*
 	 * No need to make quota reservations here. We expect to release some
 	 * blocks not allocate in the common case.
 	 */
 	xfs_trans_ijoin(args.trans, dp, 0);
+	xfs_defer_init(args.dfops, args.firstblock);
 
-	if (!xfs_inode_hasattr(dp)) {
-		error = -ENOATTR;
-	} else if (dp->i_d.di_aformat == XFS_DINODE_FMT_LOCAL) {
-		ASSERT(dp->i_afp->if_flags & XFS_IFINLINE);
-		error = xfs_attr_shortform_remove(&args);
-	} else if (xfs_bmap_one_block(dp, XFS_ATTR_FORK)) {
-		error = xfs_attr_leaf_removename(&args);
-	} else {
-		error = xfs_attr_node_removename(&args);
-	}
+	error = xfs_attr_remove_args(&args, flags);
 
 	if (error)
 		goto out;
@@ -460,9 +513,6 @@ xfs_attr_remove(
 	if (mp->m_flags & XFS_MOUNT_WSYNC)
 		xfs_trans_set_sync(args.trans);
 
-	if ((flags & ATTR_KERNOTIME) == 0)
-		xfs_trans_ichgtime(args.trans, dp, XFS_ICHGTIME_CHG);
-
 	/*
 	 * Commit the last in the sequence of transactions.
 	 */
@@ -473,6 +523,8 @@ xfs_attr_remove(
 	return error;
 
 out:
+	xfs_defer_cancel(&dfops);
+
 	if (args.trans)
 		xfs_trans_cancel(args.trans);
 	xfs_iunlock(dp, XFS_ILOCK_EXCL);
diff --git a/fs/xfs/xfs_attr.h b/fs/xfs/xfs_attr.h
index 5d5a5e2..8542606 100644
--- a/fs/xfs/xfs_attr.h
+++ b/fs/xfs/xfs_attr.h
@@ -149,7 +149,9 @@ int xfs_attr_get(struct xfs_inode *ip, const unsigned char *name,
 		 unsigned char *value, int *valuelenp, int flags);
 int xfs_attr_set(struct xfs_inode *dp, const unsigned char *name,
 		 unsigned char *value, int valuelen, int flags);
+int xfs_attr_set_args(struct xfs_da_args *args, int flags, bool roll_trans);
 int xfs_attr_remove(struct xfs_inode *dp, const unsigned char *name, int flags);
+int xfs_attr_remove_args(struct xfs_da_args *args, int flags);
 int xfs_attr_list(struct xfs_inode *dp, char *buffer, int bufsize,
 		  int flags, struct attrlist_cursor_kern *cursor);
 
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 66+ messages in thread

* [PATCH 02/17] Set up infastructure for deferred attribute operations
  2017-10-18 22:55 [PATCH 00/17] Parent Pointers V3 Allison Henderson
  2017-10-18 22:55 ` [PATCH 01/17] Add helper functions xfs_attr_set_args and xfs_attr_remove_args Allison Henderson
@ 2017-10-18 22:55 ` Allison Henderson
  2017-10-19 19:02   ` Darrick J. Wong
  2017-10-18 22:55 ` [PATCH 03/17] Add xfs_attr_set_defered and xfs_attr_remove_defered Allison Henderson
                   ` (15 subsequent siblings)
  17 siblings, 1 reply; 66+ messages in thread
From: Allison Henderson @ 2017-10-18 22:55 UTC (permalink / raw)
  To: linux-xfs; +Cc: Allison Henderson

This patch adds two new log item types for setting or
removing attributes as deferred operations.  The
xfs_attri_log_item logs an intent to set or remove an
attribute.  The corresponding xfs_attrd_log_item holds
a reference to the xfs_attri_log_item and is freed once
the transaction is done.  Both log items use a generic
xfs_attr_log_format structure that contains the attribute
name, value, flags, inode, and an op_flag that indicates
if the operations is a set or remove.

At the moment, this feature will only be used by the parent
pointer patch set which uses attributes to store information
about an inodes parent.

Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
---
 fs/xfs/Makefile                |   2 +
 fs/xfs/libxfs/xfs_attr.c       |   2 +-
 fs/xfs/libxfs/xfs_defer.h      |   1 +
 fs/xfs/libxfs/xfs_log_format.h |  36 ++-
 fs/xfs/libxfs/xfs_types.h      |   1 +
 fs/xfs/xfs_attr.h              |  20 +-
 fs/xfs/xfs_attr_item.c         | 512 +++++++++++++++++++++++++++++++++++++++++
 fs/xfs/xfs_attr_item.h         | 111 +++++++++
 fs/xfs/xfs_log_recover.c       | 140 +++++++++++
 fs/xfs/xfs_super.c             |   1 +
 fs/xfs/xfs_trans.h             |  13 ++
 fs/xfs/xfs_trans_attr.c        | 286 +++++++++++++++++++++++
 12 files changed, 1121 insertions(+), 4 deletions(-)

diff --git a/fs/xfs/Makefile b/fs/xfs/Makefile
index a6e955b..ec6486b 100644
--- a/fs/xfs/Makefile
+++ b/fs/xfs/Makefile
@@ -106,6 +106,7 @@ xfs-y				+= xfs_log.o \
 				   xfs_bmap_item.o \
 				   xfs_buf_item.o \
 				   xfs_extfree_item.o \
+				   xfs_attr_item.o \
 				   xfs_icreate_item.o \
 				   xfs_inode_item.o \
 				   xfs_refcount_item.o \
@@ -115,6 +116,7 @@ xfs-y				+= xfs_log.o \
 				   xfs_trans_bmap.o \
 				   xfs_trans_buf.o \
 				   xfs_trans_extfree.o \
+				   xfs_trans_attr.o \
 				   xfs_trans_inode.o \
 				   xfs_trans_refcount.o \
 				   xfs_trans_rmap.o \
diff --git a/fs/xfs/libxfs/xfs_attr.c b/fs/xfs/libxfs/xfs_attr.c
index b00ec1f..5325ec2 100644
--- a/fs/xfs/libxfs/xfs_attr.c
+++ b/fs/xfs/libxfs/xfs_attr.c
@@ -74,7 +74,7 @@ STATIC int xfs_attr_fillstate(xfs_da_state_t *state);
 STATIC int xfs_attr_refillstate(xfs_da_state_t *state);
 
 
-STATIC int
+int
 xfs_attr_args_init(
 	struct xfs_da_args	*args,
 	struct xfs_inode	*dp,
diff --git a/fs/xfs/libxfs/xfs_defer.h b/fs/xfs/libxfs/xfs_defer.h
index d4f046d..ef0f8bf 100644
--- a/fs/xfs/libxfs/xfs_defer.h
+++ b/fs/xfs/libxfs/xfs_defer.h
@@ -55,6 +55,7 @@ enum xfs_defer_ops_type {
 	XFS_DEFER_OPS_TYPE_REFCOUNT,
 	XFS_DEFER_OPS_TYPE_RMAP,
 	XFS_DEFER_OPS_TYPE_FREE,
+	XFS_DEFER_OPS_TYPE_ATTR,
 	XFS_DEFER_OPS_TYPE_MAX,
 };
 
diff --git a/fs/xfs/libxfs/xfs_log_format.h b/fs/xfs/libxfs/xfs_log_format.h
index 8372e9b..b0ce87e 100644
--- a/fs/xfs/libxfs/xfs_log_format.h
+++ b/fs/xfs/libxfs/xfs_log_format.h
@@ -18,6 +18,8 @@
 #ifndef	__XFS_LOG_FORMAT_H__
 #define __XFS_LOG_FORMAT_H__
 
+#include "xfs_attr.h"
+
 struct xfs_mount;
 struct xfs_trans_res;
 
@@ -116,7 +118,12 @@ static inline uint xlog_get_cycle(char *ptr)
 #define XLOG_REG_TYPE_CUD_FORMAT	24
 #define XLOG_REG_TYPE_BUI_FORMAT	25
 #define XLOG_REG_TYPE_BUD_FORMAT	26
-#define XLOG_REG_TYPE_MAX		26
+#define XLOG_REG_TYPE_ATTRI_FORMAT	27
+#define XLOG_REG_TYPE_ATTRD_FORMAT	28
+#define XLOG_REG_TYPE_ATTR_NAME		29
+#define XLOG_REG_TYPE_ATTR_VALUE	30
+#define XLOG_REG_TYPE_MAX		31
+
 
 /*
  * Flags to log operation header
@@ -239,6 +246,8 @@ typedef struct xfs_trans_header {
 #define	XFS_LI_CUD		0x1243
 #define	XFS_LI_BUI		0x1244	/* bmbt update intent */
 #define	XFS_LI_BUD		0x1245
+#define	XFS_LI_ATTRI		0x1246  /* attr set/remove intent*/
+#define	XFS_LI_ATTRD		0x1247  /* attr set/remove done */
 
 #define XFS_LI_TYPE_DESC \
 	{ XFS_LI_EFI,		"XFS_LI_EFI" }, \
@@ -254,7 +263,9 @@ typedef struct xfs_trans_header {
 	{ XFS_LI_CUI,		"XFS_LI_CUI" }, \
 	{ XFS_LI_CUD,		"XFS_LI_CUD" }, \
 	{ XFS_LI_BUI,		"XFS_LI_BUI" }, \
-	{ XFS_LI_BUD,		"XFS_LI_BUD" }
+	{ XFS_LI_BUD,		"XFS_LI_BUD" }, \
+	{ XFS_LI_ATTRI,		"XFS_LI_ATTRI" }, \
+	{ XFS_LI_ATTRD,		"XFS_LI_ATTRD" }
 
 /*
  * Inode Log Item Format definitions.
@@ -863,4 +874,25 @@ struct xfs_icreate_log {
 	__be32		icl_gen;	/* inode generation number to use */
 };
 
+/* Flags for deferred attribute operations */
+#define ATTR_OP_FLAGS_SET	0x01	/* Set the attribute */
+#define ATTR_OP_FLAGS_REMOVE	0x02	/* Remove the attribute */
+#define ATTR_OP_FLAGS_MAX	0x02	/* Max flags */
+
+/*
+ * This is the structure used to lay out an attr log item in the
+ * log.
+ */
+struct xfs_attr_log_format {
+	uint64_t	id;		/* attri identifier */
+	xfs_ino_t       ino;		/* the inode for this attr operation */
+	uint32_t        op_flags;	/* marks the op as a set or remove */
+	uint32_t        name_len;	/* attr name length */
+	uint32_t        value_len;	/* attr value length */
+	uint32_t        attr_flags;	/* attr flags */
+	uint16_t	type;		/* attri log item type */
+	uint16_t	size;		/* size of this item */
+	uint32_t	pad;		/* pad to 64 bit aligned */
+};
+
 #endif /* __XFS_LOG_FORMAT_H__ */
diff --git a/fs/xfs/libxfs/xfs_types.h b/fs/xfs/libxfs/xfs_types.h
index 0220159..5372063 100644
--- a/fs/xfs/libxfs/xfs_types.h
+++ b/fs/xfs/libxfs/xfs_types.h
@@ -23,6 +23,7 @@ typedef uint32_t	prid_t;		/* project ID */
 typedef uint32_t	xfs_agblock_t;	/* blockno in alloc. group */
 typedef uint32_t	xfs_agino_t;	/* inode # within allocation grp */
 typedef uint32_t	xfs_extlen_t;	/* extent length in blocks */
+typedef uint32_t	xfs_attrlen_t;	/* attr length */
 typedef uint32_t	xfs_agnumber_t;	/* allocation group number */
 typedef int32_t		xfs_extnum_t;	/* # of extents in a file */
 typedef int16_t		xfs_aextnum_t;	/* # extents in an attribute fork */
diff --git a/fs/xfs/xfs_attr.h b/fs/xfs/xfs_attr.h
index 8542606..34bb4cb 100644
--- a/fs/xfs/xfs_attr.h
+++ b/fs/xfs/xfs_attr.h
@@ -18,6 +18,8 @@
 #ifndef __XFS_ATTR_H__
 #define	__XFS_ATTR_H__
 
+#include "libxfs/xfs_defer.h"
+
 struct xfs_inode;
 struct xfs_da_args;
 struct xfs_attr_list_context;
@@ -87,6 +89,20 @@ typedef struct attrlist_ent {	/* data from attr_list() */
 } attrlist_ent_t;
 
 /*
+ * List of attrs to commit later.
+ */
+struct xfs_attr_item {
+	xfs_ino_t	  xattri_ino;
+	uint32_t	  xattri_op_flags;
+	uint32_t	  xattri_value_len;   /* length of name and val */
+	uint32_t	  xattri_name_len;    /* length of name */
+	uint32_t	  xattri_flags;       /* attr flags */
+	char		  xattri_name[XATTR_NAME_MAX];
+	char              xattri_value[XATTR_SIZE_MAX];
+	struct list_head  xattri_list;
+};
+
+/*
  * Given a pointer to the (char*) buffer containing the attr_list() result,
  * and an index, return a pointer to the indicated attribute in the buffer.
  */
@@ -154,6 +170,8 @@ int xfs_attr_remove(struct xfs_inode *dp, const unsigned char *name, int flags);
 int xfs_attr_remove_args(struct xfs_da_args *args, int flags);
 int xfs_attr_list(struct xfs_inode *dp, char *buffer, int bufsize,
 		  int flags, struct attrlist_cursor_kern *cursor);
-
+int xfs_attr_args_init(struct xfs_da_args *args, struct xfs_inode *dp,
+		       const unsigned char *name, int flags);
+int xfs_attr_calc_size(struct xfs_da_args *args, int *local);
 
 #endif	/* __XFS_ATTR_H__ */
diff --git a/fs/xfs/xfs_attr_item.c b/fs/xfs/xfs_attr_item.c
new file mode 100644
index 0000000..8cbe9b0
--- /dev/null
+++ b/fs/xfs/xfs_attr_item.c
@@ -0,0 +1,512 @@
+/*
+ * Copyright (c) 2017 Oracle, Inc.
+ * All Rights Reserved.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it would be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write the Free Software Foundation Inc.
+ */
+#include "xfs.h"
+#include "xfs_fs.h"
+#include "xfs_format.h"
+#include "xfs_log_format.h"
+#include "xfs_trans_resv.h"
+#include "xfs_bit.h"
+#include "xfs_mount.h"
+#include "xfs_trans.h"
+#include "xfs_trans_priv.h"
+#include "xfs_buf_item.h"
+#include "xfs_attr_item.h"
+#include "xfs_log.h"
+#include "xfs_btree.h"
+#include "xfs_rmap.h"
+
+
+static inline struct xfs_attri_log_item *ATTRI_ITEM(struct xfs_log_item *lip)
+{
+	return container_of(lip, struct xfs_attri_log_item, item);
+}
+
+void
+xfs_attri_item_free(
+	struct xfs_attri_log_item	*attrip)
+{
+	kmem_free(attrip->item.li_lv_shadow);
+	kmem_free(attrip);
+}
+
+/*
+ * This returns the number of iovecs needed to log the given attri item.
+ * We only need 1 iovec for an attri item.  It just logs the attr_log_format
+ * structure.
+ */
+static inline int
+xfs_attri_item_sizeof(
+	struct xfs_attri_log_item *attrip)
+{
+	return sizeof(struct xfs_attr_log_format);
+}
+
+STATIC void
+xfs_attri_item_size(
+	struct xfs_log_item	*lip,
+	int			*nvecs,
+	int			*nbytes)
+{
+	struct xfs_attri_log_item       *attrip = ATTRI_ITEM(lip);
+
+	*nvecs += 1;
+	*nbytes += xfs_attri_item_sizeof(attrip);
+
+	if (attrip->name_len > 0) {
+		*nvecs += 1;
+		nbytes += attrip->name_len;
+	}
+
+	if (attrip->value_len > 0) {
+		*nvecs += 1;
+		nbytes += attrip->value_len;
+	}
+}
+
+/*
+ * This is called to fill in the vector of log iovecs for the
+ * given attri log item. We use only 1 iovec, and we point that
+ * at the attri_log_format structure embedded in the attri item.
+ * It is at this point that we assert that all of the attr
+ * slots in the attri item have been filled.
+ */
+STATIC void
+xfs_attri_item_format(
+	struct xfs_log_item	*lip,
+	struct xfs_log_vec	*lv)
+{
+	struct xfs_attri_log_item	*attrip = ATTRI_ITEM(lip);
+	struct xfs_log_iovec	*vecp = NULL;
+
+	attrip->format.type = XFS_LI_ATTRI;
+	attrip->format.size = 1;
+
+	xlog_copy_iovec(lv, &vecp, XLOG_REG_TYPE_ATTRI_FORMAT,
+			&attrip->format,
+			xfs_attri_item_sizeof(attrip));
+	if (attrip->name_len > 0)
+		xlog_copy_iovec(lv, &vecp, XLOG_REG_TYPE_ATTR_NAME,
+				attrip->name, attrip->name_len);
+
+	if (attrip->value_len > 0)
+		xlog_copy_iovec(lv, &vecp, XLOG_REG_TYPE_ATTR_VALUE,
+				attrip->value, attrip->value_len);
+}
+
+
+/*
+ * Pinning has no meaning for an attri item, so just return.
+ */
+STATIC void
+xfs_attri_item_pin(
+	struct xfs_log_item	*lip)
+{
+}
+
+/*
+ * The unpin operation is the last place an ATTRI is manipulated in the log. It
+ * is either inserted in the AIL or aborted in the event of a log I/O error. In
+ * either case, the EFI transaction has been successfully committed to make it
+ * this far. Therefore, we expect whoever committed the ATTRI to either
+ * construct and commit the ATTRD or drop the ATTRD's reference in the event of
+ * error. Simply drop the log's ATTRI reference now that the log is done with
+ * it.
+ */
+STATIC void
+xfs_attri_item_unpin(
+	struct xfs_log_item	*lip,
+	int			remove)
+{
+	struct xfs_attri_log_item	*attrip = ATTRI_ITEM(lip);
+
+	xfs_attri_release(attrip);
+}
+
+/*
+ * attri items have no locking or pushing.  However, since ATTRIs are pulled
+ * from the AIL when their corresponding ATTRDs are committed to disk, their
+ * situation is very similar to being pinned.  Return XFS_ITEM_PINNED so that
+ * the caller will eventually flush the log.  This should help in getting the
+ * ATTRI out of the AIL.
+ */
+STATIC uint
+xfs_attri_item_push(
+	struct xfs_log_item	*lip,
+	struct list_head	*buffer_list)
+{
+	return XFS_ITEM_PINNED;
+}
+
+/*
+ * The ATTRI has been either committed or aborted if the transaction has been
+ * cancelled. If the transaction was cancelled, an ATTRD isn't going to be
+ * constructed and thus we free the ATTRI here directly.
+ */
+STATIC void
+xfs_attri_item_unlock(
+	struct xfs_log_item	*lip)
+{
+	if (lip->li_flags & XFS_LI_ABORTED)
+		xfs_attri_item_free(ATTRI_ITEM(lip));
+}
+
+/*
+ * The ATTRI is logged only once and cannot be moved in the log, so simply
+ * return the lsn at which it's been logged.
+ */
+STATIC xfs_lsn_t
+xfs_attri_item_committed(
+	struct xfs_log_item	*lip,
+	xfs_lsn_t		lsn)
+{
+	return lsn;
+}
+
+STATIC void
+xfs_attri_item_committing(
+	struct xfs_log_item	*lip,
+	xfs_lsn_t		lsn)
+{
+}
+
+/*
+ * This is the ops vector shared by all attri log items.
+ */
+static const struct xfs_item_ops xfs_attri_item_ops = {
+	.iop_size	= xfs_attri_item_size,
+	.iop_format	= xfs_attri_item_format,
+	.iop_pin	= xfs_attri_item_pin,
+	.iop_unpin	= xfs_attri_item_unpin,
+	.iop_unlock	= xfs_attri_item_unlock,
+	.iop_committed	= xfs_attri_item_committed,
+	.iop_push	= xfs_attri_item_push,
+	.iop_committing = xfs_attri_item_committing
+};
+
+
+/*
+ * Allocate and initialize an attri item
+ */
+struct xfs_attri_log_item *
+xfs_attri_init(
+	struct xfs_mount	*mp)
+
+{
+	struct xfs_attri_log_item	*attrip;
+	uint			size;
+
+	size = (uint)(sizeof(struct xfs_attri_log_item));
+	attrip = kmem_zalloc(size, KM_SLEEP);
+
+	xfs_log_item_init(mp, &(attrip->item), XFS_LI_ATTRI,
+			  &xfs_attri_item_ops);
+	attrip->format.id = (uintptr_t)(void *)attrip;
+	atomic_set(&attrip->refcount, 2);
+
+	return attrip;
+}
+
+/*
+ * Copy an attr format buffer from the given buf, and into the destination
+ * attr format structure.
+ */
+int
+xfs_attr_copy_format(struct xfs_log_iovec *buf,
+		      struct xfs_attr_log_format *dst_attr_fmt)
+{
+	struct xfs_attr_log_format *src_attr_fmt = buf->i_addr;
+	uint len = sizeof(struct xfs_attr_log_format);
+
+	if (buf->i_len == len) {
+		memcpy((char *)dst_attr_fmt, (char *)src_attr_fmt, len);
+		return 0;
+	}
+	return -EFSCORRUPTED;
+}
+
+/*
+ * Freeing the attri requires that we remove it from the AIL if it has already
+ * been placed there. However, the ATTRI may not yet have been placed in the
+ * AIL when called by xfs_attri_release() from ATTRD processing due to the
+ * ordering of committed vs unpin operations in bulk insert operations. Hence
+ * the reference count to ensure only the last caller frees the ATTRI.
+ */
+void
+xfs_attri_release(
+	struct xfs_attri_log_item	*attrip)
+{
+	ASSERT(atomic_read(&attrip->refcount) > 0);
+	if (atomic_dec_and_test(&attrip->refcount)) {
+		xfs_trans_ail_remove(&attrip->item,
+				     SHUTDOWN_LOG_IO_ERROR);
+		xfs_attri_item_free(attrip);
+	}
+}
+
+static inline struct xfs_attrd_log_item *ATTRD_ITEM(struct xfs_log_item *lip)
+{
+	return container_of(lip, struct xfs_attrd_log_item, item);
+}
+
+STATIC void
+xfs_attrd_item_free(struct xfs_attrd_log_item *attrdp)
+{
+	kmem_free(attrdp->item.li_lv_shadow);
+	kmem_free(attrdp);
+}
+
+/*
+ * This returns the number of iovecs needed to log the given attrd item.
+ * We only need 1 iovec for an attrd item.  It just logs the attr_log_format
+ * structure.
+ */
+static inline int
+xfs_attrd_item_sizeof(
+	struct xfs_attrd_log_item *attrdp)
+{
+	return sizeof(struct xfs_attr_log_format);
+}
+
+STATIC void
+xfs_attrd_item_size(
+	struct xfs_log_item	*lip,
+	int			*nvecs,
+	int			*nbytes)
+{
+	struct xfs_attrd_log_item	*attrdp = ATTRD_ITEM(lip);
+	*nvecs += 1;
+	*nbytes += xfs_attrd_item_sizeof(attrdp);
+
+	if (attrdp->name_len > 0) {
+		*nvecs += 1;
+		nbytes += attrdp->name_len;
+	}
+
+	if (attrdp->value_len > 0) {
+		*nvecs += 1;
+		nbytes += attrdp->value_len;
+	}
+}
+
+/*
+ * This is called to fill in the vector of log iovecs for the
+ * given attrd log item. We use only 1 iovec, and we point that
+ * at the attr_log_format structure embedded in the attrd item.
+ * It is at this point that we assert that all of the attr
+ * slots in the attrd item have been filled.
+ */
+STATIC void
+xfs_attrd_item_format(
+	struct xfs_log_item	*lip,
+	struct xfs_log_vec	*lv)
+{
+	struct xfs_attrd_log_item	*attrdp = ATTRD_ITEM(lip);
+	struct xfs_log_iovec	*vecp = NULL;
+
+	attrdp->format.type = XFS_LI_ATTRD;
+	attrdp->format.size = 1;
+
+	xlog_copy_iovec(lv, &vecp, XLOG_REG_TYPE_ATTRD_FORMAT,
+			&attrdp->format,
+			xfs_attrd_item_sizeof(attrdp));
+
+	if (attrdp->name_len > 0)
+		xlog_copy_iovec(lv, &vecp, XLOG_REG_TYPE_ATTR_NAME,
+				attrdp->name, attrdp->name_len);
+
+	if (attrdp->value_len > 0)
+		xlog_copy_iovec(lv, &vecp, XLOG_REG_TYPE_ATTR_VALUE,
+				attrdp->value, attrdp->value_len);
+}
+
+/*
+ * Pinning has no meaning for an attrd item, so just return.
+ */
+STATIC void
+xfs_attrd_item_pin(
+	struct xfs_log_item	*lip)
+{
+}
+
+/*
+ * Since pinning has no meaning for an attrd item, unpinning does
+ * not either.
+ */
+STATIC void
+xfs_attrd_item_unpin(
+	struct xfs_log_item	*lip,
+	int			remove)
+{
+}
+
+/*
+ * There isn't much you can do to push on an attrd item.  It is simply stuck
+ * waiting for the log to be flushed to disk.
+ */
+STATIC uint
+xfs_attrd_item_push(
+	struct xfs_log_item	*lip,
+	struct list_head	*buffer_list)
+{
+	return XFS_ITEM_PINNED;
+}
+
+/*
+ * The ATTRD is either committed or aborted if the transaction is cancelled. If
+ * the transaction is cancelled, drop our reference to the ATTRI and free the
+ * ATTRD.
+ */
+STATIC void
+xfs_attrd_item_unlock(
+	struct xfs_log_item	*lip)
+{
+	struct xfs_attrd_log_item	*attrdp = ATTRD_ITEM(lip);
+
+	if (lip->li_flags & XFS_LI_ABORTED) {
+		xfs_attri_release(attrdp->attrip);
+		xfs_attrd_item_free(attrdp);
+	}
+}
+
+/*
+ * When the attrd item is committed to disk, all we need to do is delete our
+ * reference to our partner attri item and then free ourselves. Since we're
+ * freeing ourselves we must return -1 to keep the transaction code from
+ * further referencing this item.
+ */
+STATIC xfs_lsn_t
+xfs_attrd_item_committed(
+	struct xfs_log_item	*lip,
+	xfs_lsn_t		lsn)
+{
+	struct xfs_attrd_log_item	*attrdp = ATTRD_ITEM(lip);
+
+	/*
+	 * Drop the ATTRI reference regardless of whether the ATTRD has been
+	 * aborted. Once the ATTRD transaction is constructed, it is the sole
+	 * responsibility of the ATTRD to release the ATTRI (even if the ATTRI
+	 * is aborted due to log I/O error).
+	 */
+	xfs_attri_release(attrdp->attrip);
+	xfs_attrd_item_free(attrdp);
+
+	return (xfs_lsn_t)-1;
+}
+
+STATIC void
+xfs_attrd_item_committing(
+	struct xfs_log_item	*lip,
+	xfs_lsn_t		lsn)
+{
+}
+
+/*
+ * This is the ops vector shared by all attrd log items.
+ */
+static const struct xfs_item_ops xfs_attrd_item_ops = {
+	.iop_size	= xfs_attrd_item_size,
+	.iop_format	= xfs_attrd_item_format,
+	.iop_pin	= xfs_attrd_item_pin,
+	.iop_unpin	= xfs_attrd_item_unpin,
+	.iop_unlock	= xfs_attrd_item_unlock,
+	.iop_committed	= xfs_attrd_item_committed,
+	.iop_push	= xfs_attrd_item_push,
+	.iop_committing = xfs_attrd_item_committing
+};
+
+/*
+ * Allocate and initialize an attrd item
+ */
+struct xfs_attrd_log_item *
+xfs_attrd_init(
+	struct xfs_mount	*mp,
+	struct xfs_attri_log_item	*attrip)
+
+{
+	struct xfs_attrd_log_item	*attrdp;
+	uint			size;
+
+	size = (uint)(sizeof(struct xfs_attrd_log_item));
+	attrdp = kmem_zalloc(size, KM_SLEEP);
+
+	xfs_log_item_init(mp, &attrdp->item, XFS_LI_ATTRD,
+			  &xfs_attrd_item_ops);
+	attrdp->attrip = attrip;
+	attrdp->format.id = attrip->format.id;
+
+	return attrdp;
+}
+
+/*
+ * Process an attr intent item that was recovered from
+ * the log.  We need to delete the attr that it describes.
+ */
+int
+xfs_attri_recover(
+	struct xfs_mount	*mp,
+	struct xfs_attri_log_item	*attrip)
+{
+	struct xfs_attrd_log_item	*attrdp;
+	struct xfs_trans	*tp;
+	int			error = 0;
+	struct xfs_attr_log_format	*attrp;
+
+	ASSERT(!test_bit(XFS_ATTRI_RECOVERED, &attrip->flags));
+
+	/*
+	 * First check the validity of the attr described by the
+	 * ATTRI.  If any are bad, then assume that all are bad and
+	 * just toss the ATTRI.
+	 */
+	attrp = &attrip->format;
+	if (attrp->value_len == 0 ||
+	    attrp->name_len == 0 ||
+	    attrp->op_flags > ATTR_OP_FLAGS_MAX) {
+		/*
+		 * This will pull the ATTRI from the AIL and
+		 * free the memory associated with it.
+		 */
+		set_bit(XFS_ATTRI_RECOVERED, &attrip->flags);
+		xfs_attri_release(attrip);
+		return -EIO;
+	}
+
+	error = xfs_trans_alloc(mp, &M_RES(mp)->tr_itruncate, 0, 0, 0, &tp);
+	if (error)
+		return error;
+	attrdp = xfs_trans_get_attrd(tp, attrip);
+	attrp = &attrip->format;
+
+	error = xfs_trans_attr(tp, attrdp, attrp->ino,
+				attrp->op_flags,
+				attrp->attr_flags,
+				attrp->name_len,
+				attrp->value_len,
+				attrip->name,
+				attrip->value);
+	if (error)
+		goto abort_error;
+
+
+	set_bit(XFS_ATTRI_RECOVERED, &attrip->flags);
+	error = xfs_trans_commit(tp);
+	return error;
+
+abort_error:
+	xfs_trans_cancel(tp);
+	return error;
+}
diff --git a/fs/xfs/xfs_attr_item.h b/fs/xfs/xfs_attr_item.h
new file mode 100644
index 0000000..023675d
--- /dev/null
+++ b/fs/xfs/xfs_attr_item.h
@@ -0,0 +1,111 @@
+/*
+ * Copyright (c) 2017 Oracle, Inc.
+ * All Rights Reserved.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it would be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write the Free Software Foundation Inc.
+ */
+#ifndef	__XFS_ATTR_ITEM_H__
+#define	__XFS_ATTR_ITEM_H__
+
+/* kernel only ATTRI/ATTRD definitions */
+
+struct xfs_mount;
+struct kmem_zone;
+
+/*
+ * Max number of attrs in fast allocation path.
+ */
+#define XFS_ATTRI_MAX_FAST_ATTRS        16
+
+
+/*
+ * Define ATTR flag bits. Manipulated by set/clear/test_bit operators.
+ */
+#define	XFS_ATTRI_RECOVERED	1
+
+/*
+ * This is the "attr intention" log item.  It is used to log the fact
+ * that some attrs need to be processed.  It is used in conjunction with the
+ * "attr done" log item described below.
+ *
+ * The ATTRI is reference counted so that it is not freed prior to both the
+ * ATTRI and ATTRD being committed and unpinned. This ensures the ATTRI is
+ * inserted into the AIL even in the event of out of order ATTRI/ATTRD
+ * processing. In other words, an ATTRI is born with two references:
+ *
+ *      1.) an ATTRI held reference to track ATTRI AIL insertion
+ *      2.) an ATTRD held reference to track ATTRD commit
+ *
+ * On allocation, both references are the responsibility of the caller. Once
+ * the ATTRI is added to and dirtied in a transaction, ownership of reference
+ * one transfers to the transaction. The reference is dropped once the ATTRI is
+ * inserted to the AIL or in the event of failure along the way (e.g., commit
+ * failure, log I/O error, etc.). Note that the caller remains responsible for
+ * the ATTRD reference under all circumstances to this point. The caller has no
+ * means to detect failure once the transaction is committed, however.
+ * Therefore, an ATTRD is required after this point, even in the event of
+ * unrelated failure.
+ *
+ * Once an ATTRD is allocated and dirtied in a transaction, reference two
+ * transfers to the transaction. The ATTRD reference is dropped once it reaches
+ * the unpin handler. Similar to the ATTRI, the reference also drops in the
+ * event of commit failure or log I/O errors. Note that the ATTRD is not
+ * inserted in the AIL, so at this point both the ATTI and ATTRD are freed.
+ */
+struct xfs_attri_log_item {
+	xfs_log_item_t			item;
+	atomic_t			refcount;
+	unsigned long			flags;	/* misc flags */
+	int				name_len;
+	void				*name;
+	int				value_len;
+	void				*value;
+	struct xfs_attr_log_format	format;
+};
+
+/*
+ * This is the "attr done" log item.  It is used to log
+ * the fact that some attrs earlier mentioned in an attri item
+ * have been freed.
+ */
+struct xfs_attrd_log_item {
+	struct xfs_log_item		item;
+	struct xfs_attri_log_item	*attrip;
+	uint				next_attr;
+	int				name_len;
+	void				*name;
+	int				value_len;
+	void				*value;
+	struct xfs_attr_log_format	format;
+};
+
+/*
+ * Max number of attrs in fast allocation path.
+ */
+#define	XFS_ATTRD_MAX_FAST_ATTRS	16
+
+extern struct kmem_zone	*xfs_attri_zone;
+extern struct kmem_zone	*xfs_attrd_zone;
+
+struct xfs_attri_log_item	*xfs_attri_init(struct xfs_mount *mp);
+struct xfs_attrd_log_item	*xfs_attrd_init(struct xfs_mount *mp,
+					struct xfs_attri_log_item *attrip);
+int xfs_attr_copy_format(struct xfs_log_iovec *buf,
+			 struct xfs_attr_log_format *dst_attri_fmt);
+void			xfs_attri_item_free(struct xfs_attri_log_item *attrip);
+void			xfs_attri_release(struct xfs_attri_log_item *attrip);
+
+int			xfs_attri_recover(struct xfs_mount *mp,
+					struct xfs_attri_log_item *attrip);
+
+#endif	/* __XFS_ATTR_ITEM_H__ */
diff --git a/fs/xfs/xfs_log_recover.c b/fs/xfs/xfs_log_recover.c
index ee34899..8326f56 100644
--- a/fs/xfs/xfs_log_recover.c
+++ b/fs/xfs/xfs_log_recover.c
@@ -33,6 +33,7 @@
 #include "xfs_log_recover.h"
 #include "xfs_inode_item.h"
 #include "xfs_extfree_item.h"
+#include "xfs_attr_item.h"
 #include "xfs_trans_priv.h"
 #include "xfs_alloc.h"
 #include "xfs_ialloc.h"
@@ -1956,6 +1957,8 @@ xlog_recover_reorder_trans(
 		case XFS_LI_CUD:
 		case XFS_LI_BUI:
 		case XFS_LI_BUD:
+		case XFS_LI_ATTRI:
+		case XFS_LI_ATTRD:
 			trace_xfs_log_recover_item_reorder_tail(log,
 							trans, item, pass);
 			list_move_tail(&item->ri_list, &inode_list);
@@ -3489,6 +3492,92 @@ xlog_recover_efd_pass2(
 	return 0;
 }
 
+STATIC int
+xlog_recover_attri_pass2(
+	struct xlog                     *log,
+	struct xlog_recover_item        *item,
+	xfs_lsn_t                       lsn)
+{
+	int                             error;
+	struct xfs_mount                *mp = log->l_mp;
+	struct xfs_attri_log_item       *attrip;
+	struct xfs_attr_log_format     *attri_formatp;
+
+	attri_formatp = item->ri_buf[0].i_addr;
+
+	attrip = xfs_attri_init(mp);
+	error = xfs_attr_copy_format(&item->ri_buf[0], &attrip->format);
+	if (error) {
+		xfs_attri_item_free(attrip);
+		return error;
+	}
+
+	spin_lock(&log->l_ailp->xa_lock);
+	/*
+	 * The ATTRI has two references. One for the ATTRD and one for ATTRI to
+	 * ensure it makes it into the AIL. Insert the ATTRI into the AIL
+	 * directly and drop the ATTRI reference. Note that
+	 * xfs_trans_ail_update() drops the AIL lock.
+	 */
+	xfs_trans_ail_update(log->l_ailp, &attrip->item, lsn);
+	xfs_attri_release(attrip);
+	return 0;
+}
+
+
+/*
+ * This routine is called when an ATTRD format structure is found in a committed
+ * transaction in the log. Its purpose is to cancel the corresponding ATTRI if
+ * it was still in the log. To do this it searches the AIL for the ATTRI with
+ * an id equal to that in the ATTRD format structure. If we find it we drop
+ * the ATTRD reference, which removes the ATTRI from the AIL and frees it.
+ */
+STATIC int
+xlog_recover_attrd_pass2(
+	struct xlog                     *log,
+	struct xlog_recover_item        *item)
+{
+	struct xfs_attr_log_format    *attrd_formatp;
+	struct xfs_attri_log_item      *attrip = NULL;
+	struct xfs_log_item          *lip;
+	uint64_t                attri_id;
+	struct xfs_ail_cursor   cur;
+	struct xfs_ail          *ailp = log->l_ailp;
+
+	attrd_formatp = item->ri_buf[0].i_addr;
+	ASSERT((item->ri_buf[0].i_len ==
+				(sizeof(struct xfs_attr_log_format))));
+	attri_id = attrd_formatp->id;
+
+	/*
+	 * Search for the ATTRI with the id in the ATTRD format structure in the
+	 * AIL.
+	 */
+	spin_lock(&ailp->xa_lock);
+	lip = xfs_trans_ail_cursor_first(ailp, &cur, 0);
+	while (lip != NULL) {
+		if (lip->li_type == XFS_LI_ATTRI) {
+			attrip = (struct xfs_attri_log_item *)lip;
+			if (attrip->format.id == attri_id) {
+				/*
+				 * Drop the ATTRD reference to the ATTRI. This
+				 * removes the ATTRI from the AIL and frees it.
+				 */
+				spin_unlock(&ailp->xa_lock);
+				xfs_attri_release(attrip);
+				spin_lock(&ailp->xa_lock);
+				break;
+			}
+		}
+		lip = xfs_trans_ail_cursor_next(ailp, &cur);
+	}
+
+	xfs_trans_ail_cursor_done(&cur);
+	spin_unlock(&ailp->xa_lock);
+
+	return 0;
+}
+
 /*
  * This routine is called to create an in-core extent rmap update
  * item from the rui format structure which was logged on disk.
@@ -4108,6 +4197,10 @@ xlog_recover_commit_pass2(
 		return xlog_recover_efi_pass2(log, item, trans->r_lsn);
 	case XFS_LI_EFD:
 		return xlog_recover_efd_pass2(log, item);
+	case XFS_LI_ATTRI:
+		return xlog_recover_attri_pass2(log, item, trans->r_lsn);
+	case XFS_LI_ATTRD:
+		return xlog_recover_attrd_pass2(log, item);
 	case XFS_LI_RUI:
 		return xlog_recover_rui_pass2(log, item, trans->r_lsn);
 	case XFS_LI_RUD:
@@ -4669,6 +4762,49 @@ xlog_recover_cancel_efi(
 	spin_lock(&ailp->xa_lock);
 }
 
+/* Recover the ATTRI if necessary. */
+STATIC int
+xlog_recover_process_attri(
+	struct xfs_mount                *mp,
+	struct xfs_ail                  *ailp,
+	struct xfs_log_item             *lip)
+{
+	struct xfs_attri_log_item       *attrip;
+	int                             error;
+
+	/*
+	 * Skip ATTRIs that we've already processed.
+	 */
+	attrip = container_of(lip, struct xfs_attri_log_item, item);
+	if (test_bit(XFS_ATTRI_RECOVERED, &attrip->flags))
+		return 0;
+
+	spin_unlock(&ailp->xa_lock);
+	error = xfs_attri_recover(mp, attrip);
+	spin_lock(&ailp->xa_lock);
+
+	return error;
+}
+
+/* Release the ATTRI since we're cancelling everything. */
+STATIC void
+xlog_recover_cancel_attri(
+	struct xfs_mount                *mp,
+	struct xfs_ail                  *ailp,
+	struct xfs_log_item             *lip)
+{
+	struct xfs_attri_log_item         *attrip;
+
+	attrip = container_of(lip, struct xfs_attri_log_item, item);
+
+	spin_unlock(&ailp->xa_lock);
+	xfs_attri_release(attrip);
+	spin_lock(&ailp->xa_lock);
+}
+
+
+
+
 /* Recover the RUI if necessary. */
 STATIC int
 xlog_recover_process_rui(
@@ -4861,6 +4997,10 @@ xlog_recover_process_intents(
 		case XFS_LI_EFI:
 			error = xlog_recover_process_efi(log->l_mp, ailp, lip);
 			break;
+		case XFS_LI_ATTRI:
+			error = xlog_recover_process_attri(log->l_mp,
+							   ailp, lip);
+			break;
 		case XFS_LI_RUI:
 			error = xlog_recover_process_rui(log->l_mp, ailp, lip);
 			break;
diff --git a/fs/xfs/xfs_super.c b/fs/xfs/xfs_super.c
index 584cf2d..046ced4 100644
--- a/fs/xfs/xfs_super.c
+++ b/fs/xfs/xfs_super.c
@@ -2024,6 +2024,7 @@ init_xfs_fs(void)
 	xfs_rmap_update_init_defer_op();
 	xfs_refcount_update_init_defer_op();
 	xfs_bmap_update_init_defer_op();
+	xfs_attr_init_defer_op();
 
 	xfs_dir_startup();
 
diff --git a/fs/xfs/xfs_trans.h b/fs/xfs/xfs_trans.h
index 815b53d2..d003637 100644
--- a/fs/xfs/xfs_trans.h
+++ b/fs/xfs/xfs_trans.h
@@ -40,6 +40,9 @@ struct xfs_cud_log_item;
 struct xfs_defer_ops;
 struct xfs_bui_log_item;
 struct xfs_bud_log_item;
+struct xfs_attrd_log_item;
+struct xfs_attri_log_item;
+
 
 typedef struct xfs_log_item {
 	struct list_head		li_ail;		/* AIL pointers */
@@ -223,12 +226,22 @@ void		xfs_trans_dirty_buf(struct xfs_trans *, struct xfs_buf *);
 void		xfs_trans_log_inode(xfs_trans_t *, struct xfs_inode *, uint);
 
 void		xfs_extent_free_init_defer_op(void);
+void            xfs_attr_init_defer_op(void);
+
 struct xfs_efd_log_item	*xfs_trans_get_efd(struct xfs_trans *,
 				  struct xfs_efi_log_item *,
 				  uint);
 int		xfs_trans_free_extent(struct xfs_trans *,
 				      struct xfs_efd_log_item *, xfs_fsblock_t,
 				      xfs_extlen_t, struct xfs_owner_info *);
+struct xfs_attrd_log_item *
+xfs_trans_get_attrd(struct xfs_trans *tp,
+		    struct xfs_attri_log_item *attrip);
+int xfs_trans_attr(struct xfs_trans *tp, struct xfs_attrd_log_item *attrdp,
+			xfs_ino_t ino, uint32_t attr_op_flags, uint32_t flags,
+			uint32_t name_len, uint32_t value_len,
+			char *name, char *value);
+
 int		xfs_trans_commit(struct xfs_trans *);
 int		xfs_trans_roll(struct xfs_trans **);
 int		xfs_trans_roll_inode(struct xfs_trans **, struct xfs_inode *);
diff --git a/fs/xfs/xfs_trans_attr.c b/fs/xfs/xfs_trans_attr.c
new file mode 100644
index 0000000..39eb18d
--- /dev/null
+++ b/fs/xfs/xfs_trans_attr.c
@@ -0,0 +1,286 @@
+/*
+ * Copyright (c) 2017, Oracle Inc.
+ * All Rights Reserved.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it would be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write the Free Software Foundation Inc.
+ */
+#include "xfs.h"
+#include "xfs_fs.h"
+#include "xfs_shared.h"
+#include "xfs_format.h"
+#include "xfs_log_format.h"
+#include "xfs_trans_resv.h"
+#include "xfs_bit.h"
+#include "xfs_mount.h"
+#include "xfs_defer.h"
+#include "xfs_trans.h"
+#include "xfs_trans_priv.h"
+#include "xfs_attr_item.h"
+#include "xfs_alloc.h"
+#include "xfs_bmap.h"
+#include "xfs_trace.h"
+#include "libxfs/xfs_da_format.h"
+#include "xfs_da_btree.h"
+#include "xfs_attr.h"
+#include "xfs_inode.h"
+#include "xfs_icache.h"
+
+/*
+ * This routine is called to allocate an "extent free done"
+ * log item that will hold nextents worth of extents.  The
+ * caller must use all nextents extents, because we are not
+ * flexible about this at all.
+ */
+struct xfs_attrd_log_item *
+xfs_trans_get_attrd(struct xfs_trans		*tp,
+		  struct xfs_attri_log_item	*attrip)
+{
+	struct xfs_attrd_log_item			*attrdp;
+
+	ASSERT(tp != NULL);
+
+	attrdp = xfs_attrd_init(tp->t_mountp, attrip);
+	ASSERT(attrdp != NULL);
+
+	/*
+	 * Get a log_item_desc to point at the new item.
+	 */
+	xfs_trans_add_item(tp, &attrdp->item);
+	return attrdp;
+}
+
+/*
+ * Delete an attr and log it to the ATTRD. Note that the transaction is marked
+ * dirty regardless of whether the attr delete succeeds or fails to support the
+ * ATTRI/ATTRD lifecycle rules.
+ */
+int
+xfs_trans_attr(
+	struct xfs_trans	*tp,
+	struct xfs_attrd_log_item	*attrdp,
+	xfs_ino_t		ino,
+	uint32_t		op_flags,
+	uint32_t                flags,
+	uint32_t		name_len,
+	uint32_t		value_len,
+	char			*name,
+	char			*value)
+{
+	uint			next_attr;
+	struct xfs_attr_log_format *attrp;
+	int			error;
+	int                     local;
+	struct xfs_da_args      args;
+	struct xfs_inode	*dp;
+	struct xfs_defer_ops    dfops;
+	xfs_fsblock_t		firstblock = NULLFSBLOCK;
+	struct xfs_mount	*mp = tp->t_mountp;
+
+	error = xfs_iget(mp, tp, ino, flags, 0, &dp);
+	if (error)
+		return error;
+
+	ASSERT(XFS_IFORK_Q((dp)));
+	tp->t_flags |= XFS_TRANS_RESERVE;
+
+	error = xfs_attr_args_init(&args, dp, name, flags);
+	if (error)
+		return error;
+
+	args.name = name;
+	args.namelen = name_len;
+	args.hashval = xfs_da_hashname(args.name, args.namelen);
+	args.value = value;
+	args.valuelen = value_len;
+	args.dfops = &dfops;
+	args.firstblock = &firstblock;
+	args.op_flags = XFS_DA_OP_OKNOENT;
+	args.total = xfs_attr_calc_size(&args, &local);
+	args.trans = tp;
+	ASSERT(local);
+
+	xfs_ilock(dp, XFS_ILOCK_EXCL);
+	xfs_defer_init(args.dfops, args.firstblock);
+
+	if (op_flags & ATTR_OP_FLAGS_SET) {
+		args.op_flags |= XFS_DA_OP_ADDNAME;
+		error = xfs_attr_set_args(&args, flags, false);
+	} else if (op_flags & ATTR_OP_FLAGS_REMOVE) {
+		error = xfs_attr_remove_args(&args, flags);
+	} else {
+		ASSERT(0);
+	}
+
+	if (error)
+		xfs_defer_cancel(&dfops);
+
+	xfs_iunlock(dp, XFS_ILOCK_EXCL);
+
+	/*
+	 * Mark the transaction dirty, even on error. This ensures the
+	 * transaction is aborted, which:
+	 *
+	 * 1.) releases the ATTRI and frees the ATTRD
+	 * 2.) shuts down the filesystem
+	 */
+	tp->t_flags |= XFS_TRANS_DIRTY;
+	attrdp->item.li_desc->lid_flags |= XFS_LID_DIRTY;
+
+	next_attr = attrdp->next_attr;
+	attrp = &(attrdp->format);
+	attrp->ino = ino;
+	attrp->op_flags = op_flags;
+	attrp->value_len = value_len;
+	attrp->name_len = name_len;
+	attrp->attr_flags = flags;
+
+	attrdp->name = name;
+	attrdp->value = value;
+	attrdp->name_len = name_len;
+	attrdp->value_len = value_len;
+	attrdp->next_attr++;
+
+	return error;
+}
+
+static int
+xfs_attr_diff_items(
+	void				*priv,
+	struct list_head		*a,
+	struct list_head		*b)
+{
+	return 0;
+}
+
+/* Get an ATTRI. */
+STATIC void *
+xfs_attr_create_intent(
+	struct xfs_trans		*tp,
+	unsigned int			count)
+{
+	struct xfs_attri_log_item		*attrip;
+
+	ASSERT(tp != NULL);
+	ASSERT(count > 0);
+
+	attrip = xfs_attri_init(tp->t_mountp);
+	ASSERT(attrip != NULL);
+
+	/*
+	 * Get a log_item_desc to point at the new item.
+	 */
+	xfs_trans_add_item(tp, &attrip->item);
+	return attrip;
+}
+
+/* Log an attr to the intent item. */
+STATIC void
+xfs_attr_log_item(
+	struct xfs_trans		*tp,
+	void				*intent,
+	struct list_head		*item)
+{
+	struct xfs_attri_log_item	*attrip = intent;
+	struct xfs_attr_item		*free;
+	struct xfs_attr_log_format	*attrp;
+
+	free = container_of(item, struct xfs_attr_item, xattri_list);
+
+	tp->t_flags |= XFS_TRANS_DIRTY;
+	attrip->item.li_desc->lid_flags |= XFS_LID_DIRTY;
+
+	attrp = &attrip->format;
+	attrp->ino = free->xattri_ino;
+	attrp->op_flags = free->xattri_op_flags;
+	attrp->value_len = free->xattri_value_len;
+	attrp->name_len = free->xattri_name_len;
+	attrp->attr_flags = free->xattri_flags;
+
+	attrip->name = &(free->xattri_name[0]);
+	attrip->value = &(free->xattri_value[0]);
+	attrip->name_len = free->xattri_name_len;
+	attrip->value_len = free->xattri_value_len;
+}
+
+/* Get an ATTRD so we can process all the attrs. */
+STATIC void *
+xfs_attr_create_done(
+	struct xfs_trans		*tp,
+	void				*intent,
+	unsigned int			count)
+{
+	return xfs_trans_get_attrd(tp, intent);
+}
+
+/* Process an attr. */
+STATIC int
+xfs_attr_finish_item(
+	struct xfs_trans		*tp,
+	struct xfs_defer_ops		*dop,
+	struct list_head		*item,
+	void				*done_item,
+	void				**state)
+{
+	struct xfs_attr_item	*free;
+	int				error;
+
+	free = container_of(item, struct xfs_attr_item, xattri_list);
+	error = xfs_trans_attr(tp, done_item,
+			free->xattri_ino,
+			free->xattri_op_flags,
+			free->xattri_flags,
+			free->xattri_name_len,
+			free->xattri_value_len,
+			free->xattri_name,
+			free->xattri_value);
+	kmem_free(free);
+	return error;
+}
+
+/* Abort all pending EFIs. */
+STATIC void
+xfs_attr_abort_intent(
+	void				*intent)
+{
+	xfs_attri_release(intent);
+}
+
+/* Cancel an attr */
+STATIC void
+xfs_attr_cancel_item(
+	struct list_head		*item)
+{
+	struct xfs_attr_item	*free;
+
+	free = container_of(item, struct xfs_attr_item, xattri_list);
+	kmem_free(free);
+}
+
+static const struct xfs_defer_op_type xfs_attr_defer_type = {
+	.type		= XFS_DEFER_OPS_TYPE_ATTR,
+	.max_items	= XFS_ATTRI_MAX_FAST_ATTRS,
+	.diff_items	= xfs_attr_diff_items,
+	.create_intent	= xfs_attr_create_intent,
+	.abort_intent	= xfs_attr_abort_intent,
+	.log_item	= xfs_attr_log_item,
+	.create_done	= xfs_attr_create_done,
+	.finish_item	= xfs_attr_finish_item,
+	.cancel_item	= xfs_attr_cancel_item,
+};
+
+/* Register the deferred op type. */
+void
+xfs_attr_init_defer_op(void)
+{
+	xfs_defer_init_op_type(&xfs_attr_defer_type);
+}
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 66+ messages in thread

* [PATCH 03/17] Add xfs_attr_set_defered and xfs_attr_remove_defered
  2017-10-18 22:55 [PATCH 00/17] Parent Pointers V3 Allison Henderson
  2017-10-18 22:55 ` [PATCH 01/17] Add helper functions xfs_attr_set_args and xfs_attr_remove_args Allison Henderson
  2017-10-18 22:55 ` [PATCH 02/17] Set up infastructure for deferred attribute operations Allison Henderson
@ 2017-10-18 22:55 ` Allison Henderson
  2017-10-19 19:13   ` Darrick J. Wong
  2017-10-18 22:55 ` [PATCH 04/17] Remove all strlen calls in all xfs_attr_* functions for attr names Allison Henderson
                   ` (14 subsequent siblings)
  17 siblings, 1 reply; 66+ messages in thread
From: Allison Henderson @ 2017-10-18 22:55 UTC (permalink / raw)
  To: linux-xfs; +Cc: Allison Henderson

These routines set up set and start a new deferred attribute
operation.  These functions are meant to be called by other
code needing to initiate a deferred attribute operation.  We
will use these routines later in the parent pointer patches.

Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
---
 fs/xfs/libxfs/xfs_attr.c | 58 ++++++++++++++++++++++++++++++++++++++++++++++++
 fs/xfs/xfs_attr.h        |  7 ++++++
 2 files changed, 65 insertions(+)

diff --git a/fs/xfs/libxfs/xfs_attr.c b/fs/xfs/libxfs/xfs_attr.c
index 5325ec2..59f3502 100644
--- a/fs/xfs/libxfs/xfs_attr.c
+++ b/fs/xfs/libxfs/xfs_attr.c
@@ -458,6 +458,37 @@ xfs_attr_set(
 	return error;
 }
 
+int
+xfs_attr_set_deferred(
+	struct xfs_inode	*dp,
+	struct xfs_defer_ops    *dfops,
+	const unsigned char	*name,
+	unsigned int		namelen,
+	unsigned char		*value,
+	unsigned int		valuelen,
+	int			flags)
+{
+
+	struct xfs_attr_item     *new;
+
+	ASSERT(namelen != 0);
+	ASSERT(valuelen != 0);
+
+	new = kmem_alloc(sizeof(struct xfs_attr_item), KM_SLEEP|KM_NOFS);
+	memset(new, 0, sizeof(struct xfs_attr_item));
+	new->xattri_ino = dp->i_ino;
+	new->xattri_op_flags = ATTR_OP_FLAGS_SET;
+	new->xattri_name_len = namelen;
+	new->xattri_value_len = valuelen;
+	new->xattri_flags = flags;
+	memcpy(new->xattri_name, name, namelen);
+	memcpy(&new->xattri_value, value, valuelen);
+
+	xfs_defer_add(dfops, XFS_DEFER_OPS_TYPE_ATTR, &new->xattri_list);
+
+	return 0;
+}
+
 /*
  * Generic handler routine to remove a name from an attribute list.
  * Transitions attribute list from Btree to shortform as necessary.
@@ -531,6 +562,33 @@ xfs_attr_remove(
 	return error;
 }
 
+int
+xfs_attr_remove_deferred(
+	struct xfs_inode        *dp,
+	struct xfs_defer_ops    *dfops,
+	const unsigned char     *name,
+	unsigned int		namelen,
+	int                     flags)
+{
+
+	struct xfs_attr_item     *new;
+
+	ASSERT(namelen != 0);
+
+	new = kmem_alloc(sizeof(struct xfs_attr_item), KM_SLEEP|KM_NOFS);
+	memset(new, 0, sizeof(struct xfs_attr_item));
+	new->xattri_ino = dp->i_ino;
+	new->xattri_op_flags = ATTR_OP_FLAGS_REMOVE;
+	new->xattri_name_len = namelen;
+	new->xattri_value_len = 0;
+	new->xattri_flags = flags;
+	memcpy(new->xattri_name, name, namelen);
+
+	xfs_defer_add(dfops, XFS_DEFER_OPS_TYPE_ATTR, &new->xattri_list);
+
+	return 0;
+}
+
 /*========================================================================
  * External routines when attribute list is inside the inode
  *========================================================================*/
diff --git a/fs/xfs/xfs_attr.h b/fs/xfs/xfs_attr.h
index 34bb4cb..f4a53fd 100644
--- a/fs/xfs/xfs_attr.h
+++ b/fs/xfs/xfs_attr.h
@@ -173,5 +173,12 @@ int xfs_attr_list(struct xfs_inode *dp, char *buffer, int bufsize,
 int xfs_attr_args_init(struct xfs_da_args *args, struct xfs_inode *dp,
 		       const unsigned char *name, int flags);
 int xfs_attr_calc_size(struct xfs_da_args *args, int *local);
+int xfs_attr_set_deferred(struct xfs_inode *dp, struct xfs_defer_ops *dfops,
+			  const unsigned char *name, unsigned int name_len,
+			  unsigned char *value, unsigned int valuelen,
+			  int flags);
+int xfs_attr_remove_deferred(struct xfs_inode *dp, struct xfs_defer_ops *dfops,
+			    const unsigned char *name, unsigned int namelen,
+			    int flags);
 
 #endif	/* __XFS_ATTR_H__ */
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 66+ messages in thread

* [PATCH 04/17] Remove all strlen calls in all xfs_attr_* functions for attr names.
  2017-10-18 22:55 [PATCH 00/17] Parent Pointers V3 Allison Henderson
                   ` (2 preceding siblings ...)
  2017-10-18 22:55 ` [PATCH 03/17] Add xfs_attr_set_defered and xfs_attr_remove_defered Allison Henderson
@ 2017-10-18 22:55 ` Allison Henderson
  2017-10-19 19:15   ` Darrick J. Wong
  2017-10-18 22:55 ` [PATCH 05/17] xfs: get directory offset when adding directory name Allison Henderson
                   ` (13 subsequent siblings)
  17 siblings, 1 reply; 66+ messages in thread
From: Allison Henderson @ 2017-10-18 22:55 UTC (permalink / raw)
  To: linux-xfs; +Cc: Allison Henderson

Parent pointer attributes use a binary name, so strlen will not work.
Calling functions will need to pass in the name length

Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
---
 fs/xfs/libxfs/xfs_attr.c | 12 ++++++++----
 fs/xfs/xfs_acl.c         | 12 +++++++-----
 fs/xfs/xfs_attr.h        |  9 +++++----
 fs/xfs/xfs_ioctl.c       | 13 ++++++++++---
 fs/xfs/xfs_iops.c        |  6 ++++--
 fs/xfs/xfs_trans_attr.c  |  2 +-
 fs/xfs/xfs_xattr.c       | 10 +++++++---
 7 files changed, 42 insertions(+), 22 deletions(-)

diff --git a/fs/xfs/libxfs/xfs_attr.c b/fs/xfs/libxfs/xfs_attr.c
index 59f3502..b94f0cd 100644
--- a/fs/xfs/libxfs/xfs_attr.c
+++ b/fs/xfs/libxfs/xfs_attr.c
@@ -79,6 +79,7 @@ xfs_attr_args_init(
 	struct xfs_da_args	*args,
 	struct xfs_inode	*dp,
 	const unsigned char	*name,
+	int			namelen,
 	int			flags)
 {
 
@@ -91,7 +92,7 @@ xfs_attr_args_init(
 	args->dp = dp;
 	args->flags = flags;
 	args->name = name;
-	args->namelen = strlen((const char *)name);
+	args->namelen = namelen;
 	if (args->namelen >= MAXNAMELEN)
 		return -EFAULT;		/* match IRIX behaviour */
 
@@ -137,6 +138,7 @@ int
 xfs_attr_get(
 	struct xfs_inode	*ip,
 	const unsigned char	*name,
+	int			namelen,
 	unsigned char		*value,
 	int			*valuelenp,
 	int			flags)
@@ -150,7 +152,7 @@ xfs_attr_get(
 	if (XFS_FORCED_SHUTDOWN(ip->i_mount))
 		return -EIO;
 
-	error = xfs_attr_args_init(&args, ip, name, flags);
+	error = xfs_attr_args_init(&args, ip, name, namelen, flags);
 	if (error)
 		return error;
 
@@ -386,6 +388,7 @@ int
 xfs_attr_set(
 	struct xfs_inode	*dp,
 	const unsigned char	*name,
+	int			namelen,
 	unsigned char		*value,
 	int			valuelen,
 	int			flags)
@@ -402,7 +405,7 @@ xfs_attr_set(
 	if (XFS_FORCED_SHUTDOWN(dp->i_mount))
 		return -EIO;
 
-	error = xfs_attr_args_init(&args, dp, name, flags);
+	error = xfs_attr_args_init(&args, dp, name, namelen, flags);
 	if (error)
 		return error;
 
@@ -497,6 +500,7 @@ int
 xfs_attr_remove(
 	struct xfs_inode	*dp,
 	const unsigned char	*name,
+	int			namelen,
 	int			flags)
 {
 	struct xfs_mount	*mp = dp->i_mount;
@@ -510,7 +514,7 @@ xfs_attr_remove(
 	if (XFS_FORCED_SHUTDOWN(dp->i_mount))
 		return -EIO;
 
-	error = xfs_attr_args_init(&args, dp, name, flags);
+	error = xfs_attr_args_init(&args, dp, name, namelen, flags);
 	if (error)
 		return error;
 
diff --git a/fs/xfs/xfs_acl.c b/fs/xfs/xfs_acl.c
index 7034e17..72eca24 100644
--- a/fs/xfs/xfs_acl.c
+++ b/fs/xfs/xfs_acl.c
@@ -153,8 +153,8 @@ xfs_get_acl(struct inode *inode, int type)
 	if (!xfs_acl)
 		return ERR_PTR(-ENOMEM);
 
-	error = xfs_attr_get(ip, ea_name, (unsigned char *)xfs_acl,
-							&len, ATTR_ROOT);
+	error = xfs_attr_get(ip, ea_name, strlen((const char *)ea_name),
+			     (unsigned char *)xfs_acl, &len, ATTR_ROOT);
 	if (error) {
 		/*
 		 * If the attribute doesn't exist make sure we have a negative
@@ -204,15 +204,17 @@ __xfs_set_acl(struct inode *inode, struct posix_acl *acl, int type)
 		len -= sizeof(struct xfs_acl_entry) *
 			 (XFS_ACL_MAX_ENTRIES(ip->i_mount) - acl->a_count);
 
-		error = xfs_attr_set(ip, ea_name, (unsigned char *)xfs_acl,
-				len, ATTR_ROOT);
+		error = xfs_attr_set(ip, ea_name, strlen((const char *)ea_name),
+				     (unsigned char *)xfs_acl, len, ATTR_ROOT);
 
 		kmem_free(xfs_acl);
 	} else {
 		/*
 		 * A NULL ACL argument means we want to remove the ACL.
 		 */
-		error = xfs_attr_remove(ip, ea_name, ATTR_ROOT);
+		error = xfs_attr_remove(ip, ea_name,
+					strlen((const char *)ea_name),
+					ATTR_ROOT);
 
 		/*
 		 * If the attribute didn't exist to start with that's fine.
diff --git a/fs/xfs/xfs_attr.h b/fs/xfs/xfs_attr.h
index f4a53fd..532567e 100644
--- a/fs/xfs/xfs_attr.h
+++ b/fs/xfs/xfs_attr.h
@@ -161,17 +161,18 @@ int xfs_attr_list_int_ilocked(struct xfs_attr_list_context *);
 int xfs_attr_list_int(struct xfs_attr_list_context *);
 int xfs_inode_hasattr(struct xfs_inode *ip);
 int xfs_attr_get_ilocked(struct xfs_inode *ip, struct xfs_da_args *args);
-int xfs_attr_get(struct xfs_inode *ip, const unsigned char *name,
+int xfs_attr_get(struct xfs_inode *ip, const unsigned char *name, int namelen,
 		 unsigned char *value, int *valuelenp, int flags);
-int xfs_attr_set(struct xfs_inode *dp, const unsigned char *name,
+int xfs_attr_set(struct xfs_inode *dp, const unsigned char *name, int namelen,
 		 unsigned char *value, int valuelen, int flags);
 int xfs_attr_set_args(struct xfs_da_args *args, int flags, bool roll_trans);
-int xfs_attr_remove(struct xfs_inode *dp, const unsigned char *name, int flags);
+int xfs_attr_remove(struct xfs_inode *dp, const unsigned char *name,
+		    int namelen, int flags);
 int xfs_attr_remove_args(struct xfs_da_args *args, int flags);
 int xfs_attr_list(struct xfs_inode *dp, char *buffer, int bufsize,
 		  int flags, struct attrlist_cursor_kern *cursor);
 int xfs_attr_args_init(struct xfs_da_args *args, struct xfs_inode *dp,
-		       const unsigned char *name, int flags);
+		       const unsigned char *name, int namelen, int flags);
 int xfs_attr_calc_size(struct xfs_da_args *args, int *local);
 int xfs_attr_set_deferred(struct xfs_inode *dp, struct xfs_defer_ops *dfops,
 			  const unsigned char *name, unsigned int name_len,
diff --git a/fs/xfs/xfs_ioctl.c b/fs/xfs/xfs_ioctl.c
index aa75389..1c9f813 100644
--- a/fs/xfs/xfs_ioctl.c
+++ b/fs/xfs/xfs_ioctl.c
@@ -448,6 +448,7 @@ xfs_attrmulti_attr_get(
 {
 	unsigned char		*kbuf;
 	int			error = -EFAULT;
+	int			namelen;
 
 	if (*len > XFS_XATTR_SIZE_MAX)
 		return -EINVAL;
@@ -455,7 +456,9 @@ xfs_attrmulti_attr_get(
 	if (!kbuf)
 		return -ENOMEM;
 
-	error = xfs_attr_get(XFS_I(inode), name, kbuf, (int *)len, flags);
+	namelen = strlen((const char *)name);
+	error = xfs_attr_get(XFS_I(inode), name, namelen,
+			     kbuf, (int *)len, flags);
 	if (error)
 		goto out_kfree;
 
@@ -477,6 +480,7 @@ xfs_attrmulti_attr_set(
 {
 	unsigned char		*kbuf;
 	int			error;
+	int			namelen;
 
 	if (IS_IMMUTABLE(inode) || IS_APPEND(inode))
 		return -EPERM;
@@ -487,7 +491,8 @@ xfs_attrmulti_attr_set(
 	if (IS_ERR(kbuf))
 		return PTR_ERR(kbuf);
 
-	error = xfs_attr_set(XFS_I(inode), name, kbuf, len, flags);
+	namelen = strlen((const char *)name);
+	error = xfs_attr_set(XFS_I(inode), name, namelen, kbuf, len, flags);
 	if (!error)
 		xfs_forget_acl(inode, name, flags);
 	kfree(kbuf);
@@ -501,10 +506,12 @@ xfs_attrmulti_attr_remove(
 	uint32_t		flags)
 {
 	int			error;
+	int			namelen;
 
 	if (IS_IMMUTABLE(inode) || IS_APPEND(inode))
 		return -EPERM;
-	error = xfs_attr_remove(XFS_I(inode), name, flags);
+	namelen = strlen((const char *)name);
+	error = xfs_attr_remove(XFS_I(inode), name, namelen, flags);
 	if (!error)
 		xfs_forget_acl(inode, name, flags);
 	return error;
diff --git a/fs/xfs/xfs_iops.c b/fs/xfs/xfs_iops.c
index 17081c7..5247bfc 100644
--- a/fs/xfs/xfs_iops.c
+++ b/fs/xfs/xfs_iops.c
@@ -70,8 +70,10 @@ xfs_initxattrs(
 	int			error = 0;
 
 	for (xattr = xattr_array; xattr->name != NULL; xattr++) {
-		error = xfs_attr_set(ip, xattr->name, xattr->value,
-				      xattr->value_len, ATTR_SECURE);
+		error = xfs_attr_set(ip, xattr->name,
+				     strlen((const char *)xattr->name),
+				     xattr->value, xattr->value_len,
+				     ATTR_SECURE);
 		if (error < 0)
 			break;
 	}
diff --git a/fs/xfs/xfs_trans_attr.c b/fs/xfs/xfs_trans_attr.c
index 39eb18d..a45e9d0 100644
--- a/fs/xfs/xfs_trans_attr.c
+++ b/fs/xfs/xfs_trans_attr.c
@@ -93,7 +93,7 @@ xfs_trans_attr(
 	ASSERT(XFS_IFORK_Q((dp)));
 	tp->t_flags |= XFS_TRANS_RESERVE;
 
-	error = xfs_attr_args_init(&args, dp, name, flags);
+	error = xfs_attr_args_init(&args, dp, name, name_len, flags);
 	if (error)
 		return error;
 
diff --git a/fs/xfs/xfs_xattr.c b/fs/xfs/xfs_xattr.c
index 0594db4..4ef09c4 100644
--- a/fs/xfs/xfs_xattr.c
+++ b/fs/xfs/xfs_xattr.c
@@ -38,6 +38,7 @@ xfs_xattr_get(const struct xattr_handler *handler, struct dentry *unused,
 	int xflags = handler->flags;
 	struct xfs_inode *ip = XFS_I(inode);
 	int error, asize = size;
+	int namelen = strlen((const char *)name);
 
 	/* Convert Linux syscall to XFS internal ATTR flags */
 	if (!size) {
@@ -45,7 +46,8 @@ xfs_xattr_get(const struct xattr_handler *handler, struct dentry *unused,
 		value = NULL;
 	}
 
-	error = xfs_attr_get(ip, (unsigned char *)name, value, &asize, xflags);
+	error = xfs_attr_get(ip, (unsigned char *)name, namelen, value,
+			     &asize, xflags);
 	if (error)
 		return error;
 	return asize;
@@ -81,6 +83,7 @@ xfs_xattr_set(const struct xattr_handler *handler, struct dentry *unused,
 	int			xflags = handler->flags;
 	struct xfs_inode	*ip = XFS_I(inode);
 	int			error;
+	int			namelen = strlen((const char *)name);
 
 	/* Convert Linux syscall to XFS internal ATTR flags */
 	if (flags & XATTR_CREATE)
@@ -89,8 +92,9 @@ xfs_xattr_set(const struct xattr_handler *handler, struct dentry *unused,
 		xflags |= ATTR_REPLACE;
 
 	if (!value)
-		return xfs_attr_remove(ip, (unsigned char *)name, xflags);
-	error = xfs_attr_set(ip, (unsigned char *)name,
+		return xfs_attr_remove(ip, (unsigned char *)name,
+				       namelen, xflags);
+	error = xfs_attr_set(ip, (unsigned char *)name, namelen,
 				(void *)value, size, xflags);
 	if (!error)
 		xfs_forget_acl(inode, name, xflags);
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 66+ messages in thread

* [PATCH 05/17] xfs: get directory offset when adding directory name
  2017-10-18 22:55 [PATCH 00/17] Parent Pointers V3 Allison Henderson
                   ` (3 preceding siblings ...)
  2017-10-18 22:55 ` [PATCH 04/17] Remove all strlen calls in all xfs_attr_* functions for attr names Allison Henderson
@ 2017-10-18 22:55 ` Allison Henderson
  2017-10-18 22:55 ` [PATCH 06/17] xfs: get directory offset when removing " Allison Henderson
                   ` (12 subsequent siblings)
  17 siblings, 0 replies; 66+ messages in thread
From: Allison Henderson @ 2017-10-18 22:55 UTC (permalink / raw)
  To: linux-xfs; +Cc: Mark Tinguely, Dave Chinner, Allison Henderson

From: Mark Tinguely <tinguely@sgi.com>

Return the directory offset information when adding an entry to the
directory.

This offset will be used as the parent pointer offset in xfs_create,
xfs_symlink, xfs_link and xfs_rename.

[dchinner: forward ported and cleaned up]
[dchinner: no s-o-b from Mark]
[bfoster: rebased, use args->geo in dir code]
[achender: rebased, chaged __uint32_t to xfs_dir2_dataptr_t]

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
---
 fs/xfs/libxfs/xfs_da_btree.h   | 1 +
 fs/xfs/libxfs/xfs_dir2.c       | 8 ++++++--
 fs/xfs/libxfs/xfs_dir2.h       | 3 ++-
 fs/xfs/libxfs/xfs_dir2_block.c | 1 +
 fs/xfs/libxfs/xfs_dir2_leaf.c  | 2 ++
 fs/xfs/libxfs/xfs_dir2_node.c  | 2 ++
 fs/xfs/libxfs/xfs_dir2_sf.c    | 2 ++
 fs/xfs/xfs_inode.c             | 9 +++++----
 fs/xfs/xfs_symlink.c           | 2 +-
 9 files changed, 22 insertions(+), 8 deletions(-)

diff --git a/fs/xfs/libxfs/xfs_da_btree.h b/fs/xfs/libxfs/xfs_da_btree.h
index ae6de17..bce96d6 100644
--- a/fs/xfs/libxfs/xfs_da_btree.h
+++ b/fs/xfs/libxfs/xfs_da_btree.h
@@ -86,6 +86,7 @@ typedef struct xfs_da_args {
 	int		rmtvaluelen2;	/* remote attr value length in bytes */
 	int		op_flags;	/* operation flags */
 	enum xfs_dacmp	cmpresult;	/* name compare result for lookups */
+	xfs_dir2_dataptr_t offset;	/* OUT: offset in directory */
 } xfs_da_args_t;
 
 /*
diff --git a/fs/xfs/libxfs/xfs_dir2.c b/fs/xfs/libxfs/xfs_dir2.c
index ccf9783..a1ca460 100644
--- a/fs/xfs/libxfs/xfs_dir2.c
+++ b/fs/xfs/libxfs/xfs_dir2.c
@@ -268,7 +268,8 @@ xfs_dir_createname(
 	xfs_ino_t		inum,		/* new entry inode number */
 	xfs_fsblock_t		*first,		/* bmap's firstblock */
 	struct xfs_defer_ops	*dfops,		/* bmap's freeblock list */
-	xfs_extlen_t		total)		/* bmap's total block count */
+	xfs_extlen_t		total,		/* bmap's total block count */
+	xfs_dir2_dataptr_t	*offset)	/* OUT entry's dir offset */
 {
 	struct xfs_da_args	*args;
 	int			rval;
@@ -323,6 +324,9 @@ xfs_dir_createname(
 	else
 		rval = xfs_dir2_node_addname(args);
 
+	/* return the location that this entry was place in the parent inode */
+	if (offset)
+		*offset = args->offset;
 out_free:
 	kmem_free(args);
 	return rval;
@@ -570,7 +574,7 @@ xfs_dir_canenter(
 	xfs_inode_t	*dp,
 	struct xfs_name	*name)		/* name of entry to add */
 {
-	return xfs_dir_createname(tp, dp, name, 0, NULL, NULL, 0);
+	return xfs_dir_createname(tp, dp, name, 0, NULL, NULL, 0, NULL);
 }
 
 /*
diff --git a/fs/xfs/libxfs/xfs_dir2.h b/fs/xfs/libxfs/xfs_dir2.h
index 21c8f8b..e349900 100644
--- a/fs/xfs/libxfs/xfs_dir2.h
+++ b/fs/xfs/libxfs/xfs_dir2.h
@@ -131,7 +131,8 @@ extern int xfs_dir_init(struct xfs_trans *tp, struct xfs_inode *dp,
 extern int xfs_dir_createname(struct xfs_trans *tp, struct xfs_inode *dp,
 				struct xfs_name *name, xfs_ino_t inum,
 				xfs_fsblock_t *first,
-				struct xfs_defer_ops *dfops, xfs_extlen_t tot);
+				struct xfs_defer_ops *dfops, xfs_extlen_t tot,
+				xfs_dir2_dataptr_t *offset);
 extern int xfs_dir_lookup(struct xfs_trans *tp, struct xfs_inode *dp,
 				struct xfs_name *name, xfs_ino_t *inum,
 				struct xfs_name *ci_name);
diff --git a/fs/xfs/libxfs/xfs_dir2_block.c b/fs/xfs/libxfs/xfs_dir2_block.c
index 43c902f..79684d5 100644
--- a/fs/xfs/libxfs/xfs_dir2_block.c
+++ b/fs/xfs/libxfs/xfs_dir2_block.c
@@ -552,6 +552,7 @@ xfs_dir2_block_addname(
 	dp->d_ops->data_put_ftype(dep, args->filetype);
 	tagp = dp->d_ops->data_entry_tag_p(dep);
 	*tagp = cpu_to_be16((char *)dep - (char *)hdr);
+	args->offset = xfs_dir2_byte_to_dataptr((char *)dep - (char *)hdr);
 	/*
 	 * Clean up the bestfree array and log the header, tail, and entry.
 	 */
diff --git a/fs/xfs/libxfs/xfs_dir2_leaf.c b/fs/xfs/libxfs/xfs_dir2_leaf.c
index 27297a6..2ac7a7e 100644
--- a/fs/xfs/libxfs/xfs_dir2_leaf.c
+++ b/fs/xfs/libxfs/xfs_dir2_leaf.c
@@ -863,6 +863,8 @@ xfs_dir2_leaf_addname(
 	dp->d_ops->data_put_ftype(dep, args->filetype);
 	tagp = dp->d_ops->data_entry_tag_p(dep);
 	*tagp = cpu_to_be16((char *)dep - (char *)hdr);
+	args->offset = xfs_dir2_db_off_to_dataptr(args->geo, use_block,
+						(char *)dep - (char *)hdr);
 	/*
 	 * Need to scan fix up the bestfree table.
 	 */
diff --git a/fs/xfs/libxfs/xfs_dir2_node.c b/fs/xfs/libxfs/xfs_dir2_node.c
index 682e2bf..8bc91f8 100644
--- a/fs/xfs/libxfs/xfs_dir2_node.c
+++ b/fs/xfs/libxfs/xfs_dir2_node.c
@@ -2022,6 +2022,8 @@ xfs_dir2_node_addname_int(
 	dp->d_ops->data_put_ftype(dep, args->filetype);
 	tagp = dp->d_ops->data_entry_tag_p(dep);
 	*tagp = cpu_to_be16((char *)dep - (char *)hdr);
+	args->offset = xfs_dir2_db_off_to_dataptr(args->geo, dbno,
+						  (char *)dep - (char *)hdr);
 	xfs_dir2_data_log_entry(args, dbp, dep);
 	/*
 	 * Rescan the block for bestfree if needed.
diff --git a/fs/xfs/libxfs/xfs_dir2_sf.c b/fs/xfs/libxfs/xfs_dir2_sf.c
index be8b975..489bdef 100644
--- a/fs/xfs/libxfs/xfs_dir2_sf.c
+++ b/fs/xfs/libxfs/xfs_dir2_sf.c
@@ -407,6 +407,7 @@ xfs_dir2_sf_addname_easy(
 	memcpy(sfep->name, args->name, sfep->namelen);
 	dp->d_ops->sf_put_ino(sfp, sfep, args->inumber);
 	dp->d_ops->sf_put_ftype(sfep, args->filetype);
+	args->offset = xfs_dir2_byte_to_dataptr(offset);
 
 	/*
 	 * Update the header and inode.
@@ -498,6 +499,7 @@ xfs_dir2_sf_addname_hard(
 	memcpy(sfep->name, args->name, sfep->namelen);
 	dp->d_ops->sf_put_ino(sfp, sfep, args->inumber);
 	dp->d_ops->sf_put_ftype(sfep, args->filetype);
+	args->offset = xfs_dir2_byte_to_dataptr(offset);
 	sfp->count++;
 	if (args->inumber > XFS_DIR2_MAX_SHORT_INUM && !objchange)
 		sfp->i8count++;
diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c
index 4ec5b7f..3abcb17 100644
--- a/fs/xfs/xfs_inode.c
+++ b/fs/xfs/xfs_inode.c
@@ -1252,7 +1252,8 @@ xfs_create(
 
 	error = xfs_dir_createname(tp, dp, name, ip->i_ino,
 					&first_block, &dfops, resblks ?
-					resblks - XFS_IALLOC_SPACE_RES(mp) : 0);
+					resblks - XFS_IALLOC_SPACE_RES(mp) : 0,
+					NULL);
 	if (error) {
 		ASSERT(error != -ENOSPC);
 		goto out_trans_cancel;
@@ -1495,7 +1496,7 @@ xfs_link(
 	}
 
 	error = xfs_dir_createname(tp, tdp, target_name, sip->i_ino,
-					&first_block, &dfops, resblks);
+				   &first_block, &dfops, resblks, NULL);
 	if (error)
 		goto error_return;
 	xfs_trans_ichgtime(tp, tdp, XFS_ICHGTIME_MOD | XFS_ICHGTIME_CHG);
@@ -3031,8 +3032,8 @@ xfs_rename(
 		 * to account for the ".." reference from the new entry.
 		 */
 		error = xfs_dir_createname(tp, target_dp, target_name,
-						src_ip->i_ino, &first_block,
-						&dfops, spaceres);
+					   src_ip->i_ino, &first_block, &dfops,
+					   spaceres, NULL);
 		if (error)
 			goto out_bmap_cancel;
 
diff --git a/fs/xfs/xfs_symlink.c b/fs/xfs/xfs_symlink.c
index 68d3ca2..fc803ae 100644
--- a/fs/xfs/xfs_symlink.c
+++ b/fs/xfs/xfs_symlink.c
@@ -363,7 +363,7 @@ xfs_symlink(
 	 * Create the directory entry for the symlink.
 	 */
 	error = xfs_dir_createname(tp, dp, link_name, ip->i_ino,
-					&first_block, &dfops, resblks);
+				   &first_block, &dfops, resblks, NULL);
 	if (error)
 		goto out_bmap_cancel;
 	xfs_trans_ichgtime(tp, dp, XFS_ICHGTIME_MOD | XFS_ICHGTIME_CHG);
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 66+ messages in thread

* [PATCH 06/17] xfs: get directory offset when removing directory name
  2017-10-18 22:55 [PATCH 00/17] Parent Pointers V3 Allison Henderson
                   ` (4 preceding siblings ...)
  2017-10-18 22:55 ` [PATCH 05/17] xfs: get directory offset when adding directory name Allison Henderson
@ 2017-10-18 22:55 ` Allison Henderson
  2017-10-19 19:17   ` Darrick J. Wong
  2017-10-18 22:55 ` [PATCH 07/17] xfs: get directory offset when replacing a " Allison Henderson
                   ` (11 subsequent siblings)
  17 siblings, 1 reply; 66+ messages in thread
From: Allison Henderson @ 2017-10-18 22:55 UTC (permalink / raw)
  To: linux-xfs; +Cc: Mark Tinguely, Dave Chinner, Allison Henderson

From: Mark Tinguely <tinguely@sgi.com>

Return the directory offset information when removing an entry to the
directory.

This offset will be used as the parent pointer offset in xfs_remove.

[dchinner: forward ported and cleaned up]
[achender: rebased, changed __unint32_t to xfs_dir2_dataptr_t]

Signed-off-by: Mark Tinguely <tinguely@sgi.com>
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
---
v2: Changed typedefs to raw struct types

Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
---
 fs/xfs/libxfs/xfs_dir2.c       | 15 +++++++++------
 fs/xfs/libxfs/xfs_dir2.h       |  4 +++-
 fs/xfs/libxfs/xfs_dir2_block.c |  4 ++--
 fs/xfs/libxfs/xfs_dir2_leaf.c  |  5 +++--
 fs/xfs/libxfs/xfs_dir2_node.c  |  5 +++--
 fs/xfs/libxfs/xfs_dir2_sf.c    |  2 ++
 fs/xfs/xfs_inode.c             |  7 ++++---
 7 files changed, 26 insertions(+), 16 deletions(-)

diff --git a/fs/xfs/libxfs/xfs_dir2.c b/fs/xfs/libxfs/xfs_dir2.c
index a1ca460..0511eb9 100644
--- a/fs/xfs/libxfs/xfs_dir2.c
+++ b/fs/xfs/libxfs/xfs_dir2.c
@@ -443,13 +443,14 @@ xfs_dir_lookup(
  */
 int
 xfs_dir_removename(
-	xfs_trans_t	*tp,
-	xfs_inode_t	*dp,
-	struct xfs_name	*name,
-	xfs_ino_t	ino,
-	xfs_fsblock_t	*first,		/* bmap's firstblock */
+	struct xfs_trans	*tp,
+	struct xfs_inode	*dp,
+	struct xfs_name		*name,
+	xfs_ino_t		ino,
+	xfs_fsblock_t	*first,			/* bmap's firstblock */
 	struct xfs_defer_ops	*dfops,		/* bmap's freeblock list */
-	xfs_extlen_t	total)		/* bmap's total block count */
+	xfs_extlen_t		total,		/* bmap's total block count */
+	xfs_dir2_dataptr_t	*offset)	/* OUT: offset in directory */
 {
 	struct xfs_da_args *args;
 	int		rval;
@@ -495,6 +496,8 @@ xfs_dir_removename(
 		rval = xfs_dir2_leaf_removename(args);
 	else
 		rval = xfs_dir2_node_removename(args);
+	if (offset)
+		*offset = args->offset;
 out_free:
 	kmem_free(args);
 	return rval;
diff --git a/fs/xfs/libxfs/xfs_dir2.h b/fs/xfs/libxfs/xfs_dir2.h
index e349900..e1bd05d 100644
--- a/fs/xfs/libxfs/xfs_dir2.h
+++ b/fs/xfs/libxfs/xfs_dir2.h
@@ -139,7 +139,9 @@ extern int xfs_dir_lookup(struct xfs_trans *tp, struct xfs_inode *dp,
 extern int xfs_dir_removename(struct xfs_trans *tp, struct xfs_inode *dp,
 				struct xfs_name *name, xfs_ino_t ino,
 				xfs_fsblock_t *first,
-				struct xfs_defer_ops *dfops, xfs_extlen_t tot);
+				struct xfs_defer_ops *dfops,
+				xfs_extlen_t tot,
+				xfs_dir2_dataptr_t *offset);
 extern int xfs_dir_replace(struct xfs_trans *tp, struct xfs_inode *dp,
 				struct xfs_name *name, xfs_ino_t inum,
 				xfs_fsblock_t *first,
diff --git a/fs/xfs/libxfs/xfs_dir2_block.c b/fs/xfs/libxfs/xfs_dir2_block.c
index 79684d5..4dbe2fc 100644
--- a/fs/xfs/libxfs/xfs_dir2_block.c
+++ b/fs/xfs/libxfs/xfs_dir2_block.c
@@ -791,9 +791,9 @@ xfs_dir2_block_removename(
 	/*
 	 * Point to the data entry using the leaf entry.
 	 */
+	args->offset = be32_to_cpu(blp[ent].address);
 	dep = (xfs_dir2_data_entry_t *)((char *)hdr +
-			xfs_dir2_dataptr_to_off(args->geo,
-						be32_to_cpu(blp[ent].address)));
+			xfs_dir2_dataptr_to_off(args->geo, args->offset));
 	/*
 	 * Mark the data entry's space free.
 	 */
diff --git a/fs/xfs/libxfs/xfs_dir2_leaf.c b/fs/xfs/libxfs/xfs_dir2_leaf.c
index 2ac7a7e..197e627 100644
--- a/fs/xfs/libxfs/xfs_dir2_leaf.c
+++ b/fs/xfs/libxfs/xfs_dir2_leaf.c
@@ -1383,9 +1383,10 @@ xfs_dir2_leaf_removename(
 	 * Point to the leaf entry, use that to point to the data entry.
 	 */
 	lep = &ents[index];
-	db = xfs_dir2_dataptr_to_db(args->geo, be32_to_cpu(lep->address));
+	args->offset = be32_to_cpu(lep->address);
+	db = xfs_dir2_dataptr_to_db(args->geo, args->offset);
 	dep = (xfs_dir2_data_entry_t *)((char *)hdr +
-		xfs_dir2_dataptr_to_off(args->geo, be32_to_cpu(lep->address)));
+		xfs_dir2_dataptr_to_off(args->geo, args->offset));
 	needscan = needlog = 0;
 	oldbest = be16_to_cpu(bf[0].length);
 	ltp = xfs_dir2_leaf_tail_p(args->geo, leaf);
diff --git a/fs/xfs/libxfs/xfs_dir2_node.c b/fs/xfs/libxfs/xfs_dir2_node.c
index 8bc91f8..13d5244 100644
--- a/fs/xfs/libxfs/xfs_dir2_node.c
+++ b/fs/xfs/libxfs/xfs_dir2_node.c
@@ -1238,9 +1238,10 @@ xfs_dir2_leafn_remove(
 	/*
 	 * Extract the data block and offset from the entry.
 	 */
-	db = xfs_dir2_dataptr_to_db(args->geo, be32_to_cpu(lep->address));
+	args->offset = be32_to_cpu(lep->address);
+	db = xfs_dir2_dataptr_to_db(args->geo, args->offset);
 	ASSERT(dblk->blkno == db);
-	off = xfs_dir2_dataptr_to_off(args->geo, be32_to_cpu(lep->address));
+	off = xfs_dir2_dataptr_to_off(args->geo, args->offset);
 	ASSERT(dblk->index == off);
 
 	/*
diff --git a/fs/xfs/libxfs/xfs_dir2_sf.c b/fs/xfs/libxfs/xfs_dir2_sf.c
index 489bdef..9e90c22 100644
--- a/fs/xfs/libxfs/xfs_dir2_sf.c
+++ b/fs/xfs/libxfs/xfs_dir2_sf.c
@@ -919,6 +919,8 @@ xfs_dir2_sf_removename(
 								XFS_CMP_EXACT) {
 			ASSERT(dp->d_ops->sf_get_ino(sfp, sfep) ==
 			       args->inumber);
+			args->offset = xfs_dir2_byte_to_dataptr(
+						xfs_dir2_sf_get_offset(sfep));
 			break;
 		}
 	}
diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c
index 3abcb17..358a98a 100644
--- a/fs/xfs/xfs_inode.c
+++ b/fs/xfs/xfs_inode.c
@@ -2639,8 +2639,8 @@ xfs_remove(
 		goto out_trans_cancel;
 
 	xfs_defer_init(&dfops, &first_block);
-	error = xfs_dir_removename(tp, dp, name, ip->i_ino,
-					&first_block, &dfops, resblks);
+	error = xfs_dir_removename(tp, dp, name, ip->i_ino, &first_block,
+				   &dfops, resblks, NULL);
 	if (error) {
 		ASSERT(error != -ENOENT);
 		goto out_bmap_cancel;
@@ -3150,7 +3150,8 @@ xfs_rename(
 					&first_block, &dfops, spaceres);
 	} else
 		error = xfs_dir_removename(tp, src_dp, src_name, src_ip->i_ino,
-					   &first_block, &dfops, spaceres);
+					   &first_block, &dfops, spaceres,
+					   NULL);
 	if (error)
 		goto out_bmap_cancel;
 
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 66+ messages in thread

* [PATCH 07/17] xfs: get directory offset when replacing a directory name
  2017-10-18 22:55 [PATCH 00/17] Parent Pointers V3 Allison Henderson
                   ` (5 preceding siblings ...)
  2017-10-18 22:55 ` [PATCH 06/17] xfs: get directory offset when removing " Allison Henderson
@ 2017-10-18 22:55 ` Allison Henderson
  2017-10-18 22:55 ` [PATCH 08/17] xfs: add parent pointer support to attribute code Allison Henderson
                   ` (10 subsequent siblings)
  17 siblings, 0 replies; 66+ messages in thread
From: Allison Henderson @ 2017-10-18 22:55 UTC (permalink / raw)
  To: linux-xfs; +Cc: Mark Tinguely, Dave Chinner, Allison Henderson

From: Mark Tinguely <tinguely@sgi.com>

Return the directory offset information when replacing an entry to the
directory.

This offset will be used as the parent pointer offset in xfs_rename.

[dchinner: forward ported and cleaned up]
[achender: rebased, changed __unint32_t to xfs_dir2_dataptr_t]

Signed-off-by: Mark Tinguely <tinguely@sgi.com>
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
---
v2: Changed typedefs to raw struct types

Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
---
 fs/xfs/libxfs/xfs_dir2.c       | 16 ++++++++++------
 fs/xfs/libxfs/xfs_dir2.h       |  3 ++-
 fs/xfs/libxfs/xfs_dir2_block.c |  4 ++--
 fs/xfs/libxfs/xfs_dir2_leaf.c  |  1 +
 fs/xfs/libxfs/xfs_dir2_node.c  |  1 +
 fs/xfs/libxfs/xfs_dir2_sf.c    |  2 ++
 fs/xfs/xfs_inode.c             | 28 +++++++++++++---------------
 7 files changed, 31 insertions(+), 24 deletions(-)

diff --git a/fs/xfs/libxfs/xfs_dir2.c b/fs/xfs/libxfs/xfs_dir2.c
index 0511eb9..486f808 100644
--- a/fs/xfs/libxfs/xfs_dir2.c
+++ b/fs/xfs/libxfs/xfs_dir2.c
@@ -508,13 +508,14 @@ xfs_dir_removename(
  */
 int
 xfs_dir_replace(
-	xfs_trans_t	*tp,
-	xfs_inode_t	*dp,
-	struct xfs_name	*name,		/* name of entry to replace */
-	xfs_ino_t	inum,		/* new inode number */
-	xfs_fsblock_t	*first,		/* bmap's firstblock */
+	struct xfs_trans	*tp,
+	struct xfs_inode	*dp,
+	struct xfs_name		*name,		/* name of entry to replace */
+	xfs_ino_t		inum,		/* new inode number */
+	xfs_fsblock_t		*first,		/* bmap's firstblock */
 	struct xfs_defer_ops	*dfops,		/* bmap's freeblock list */
-	xfs_extlen_t	total)		/* bmap's total block count */
+	xfs_extlen_t		total,		/* bmap's total block count */
+	xfs_dir2_dataptr_t	*offset)	/* OUT: offset in directory */
 {
 	struct xfs_da_args *args;
 	int		rval;
@@ -563,6 +564,9 @@ xfs_dir_replace(
 		rval = xfs_dir2_leaf_replace(args);
 	else
 		rval = xfs_dir2_node_replace(args);
+
+	if (offset)
+		*offset = args->offset;
 out_free:
 	kmem_free(args);
 	return rval;
diff --git a/fs/xfs/libxfs/xfs_dir2.h b/fs/xfs/libxfs/xfs_dir2.h
index e1bd05d..5cc0b3f 100644
--- a/fs/xfs/libxfs/xfs_dir2.h
+++ b/fs/xfs/libxfs/xfs_dir2.h
@@ -145,7 +145,8 @@ extern int xfs_dir_removename(struct xfs_trans *tp, struct xfs_inode *dp,
 extern int xfs_dir_replace(struct xfs_trans *tp, struct xfs_inode *dp,
 				struct xfs_name *name, xfs_ino_t inum,
 				xfs_fsblock_t *first,
-				struct xfs_defer_ops *dfops, xfs_extlen_t tot);
+				struct xfs_defer_ops *dfops, xfs_extlen_t tot,
+				xfs_dir2_dataptr_t *offset);
 extern int xfs_dir_canenter(struct xfs_trans *tp, struct xfs_inode *dp,
 				struct xfs_name *name);
 
diff --git a/fs/xfs/libxfs/xfs_dir2_block.c b/fs/xfs/libxfs/xfs_dir2_block.c
index 4dbe2fc..69dfe64 100644
--- a/fs/xfs/libxfs/xfs_dir2_block.c
+++ b/fs/xfs/libxfs/xfs_dir2_block.c
@@ -865,9 +865,9 @@ xfs_dir2_block_replace(
 	/*
 	 * Point to the data entry we need to change.
 	 */
+	args->offset = be32_to_cpu(blp[ent].address);
 	dep = (xfs_dir2_data_entry_t *)((char *)hdr +
-			xfs_dir2_dataptr_to_off(args->geo,
-						be32_to_cpu(blp[ent].address)));
+			xfs_dir2_dataptr_to_off(args->geo, args->offset));
 	ASSERT(be64_to_cpu(dep->inumber) != args->inumber);
 	/*
 	 * Change the inode number to the new value.
diff --git a/fs/xfs/libxfs/xfs_dir2_leaf.c b/fs/xfs/libxfs/xfs_dir2_leaf.c
index 197e627..770b93f 100644
--- a/fs/xfs/libxfs/xfs_dir2_leaf.c
+++ b/fs/xfs/libxfs/xfs_dir2_leaf.c
@@ -1518,6 +1518,7 @@ xfs_dir2_leaf_replace(
 	/*
 	 * Point to the data entry.
 	 */
+	args->offset = be32_to_cpu(lep->address);
 	dep = (xfs_dir2_data_entry_t *)
 	      ((char *)dbp->b_addr +
 	       xfs_dir2_dataptr_to_off(args->geo, be32_to_cpu(lep->address)));
diff --git a/fs/xfs/libxfs/xfs_dir2_node.c b/fs/xfs/libxfs/xfs_dir2_node.c
index 13d5244..860a612 100644
--- a/fs/xfs/libxfs/xfs_dir2_node.c
+++ b/fs/xfs/libxfs/xfs_dir2_node.c
@@ -2237,6 +2237,7 @@ xfs_dir2_node_replace(
 		hdr = state->extrablk.bp->b_addr;
 		ASSERT(hdr->magic == cpu_to_be32(XFS_DIR2_DATA_MAGIC) ||
 		       hdr->magic == cpu_to_be32(XFS_DIR3_DATA_MAGIC));
+		args->offset = be32_to_cpu(lep->address);
 		dep = (xfs_dir2_data_entry_t *)
 		      ((char *)hdr +
 		       xfs_dir2_dataptr_to_off(args->geo,
diff --git a/fs/xfs/libxfs/xfs_dir2_sf.c b/fs/xfs/libxfs/xfs_dir2_sf.c
index 9e90c22..295458f 100644
--- a/fs/xfs/libxfs/xfs_dir2_sf.c
+++ b/fs/xfs/libxfs/xfs_dir2_sf.c
@@ -1045,6 +1045,8 @@ xfs_dir2_sf_replace(
 				ASSERT(args->inumber != ino);
 				dp->d_ops->sf_put_ino(sfp, sfep, args->inumber);
 				dp->d_ops->sf_put_ftype(sfep, args->filetype);
+				args->offset = xfs_dir2_byte_to_dataptr(
+						  xfs_dir2_sf_get_offset(sfep));
 				break;
 			}
 		}
diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c
index 358a98a..f7986d8 100644
--- a/fs/xfs/xfs_inode.c
+++ b/fs/xfs/xfs_inode.c
@@ -2774,16 +2774,14 @@ xfs_cross_rename(
 	int		dp2_flags = 0;
 
 	/* Swap inode number for dirent in first parent */
-	error = xfs_dir_replace(tp, dp1, name1,
-				ip2->i_ino,
-				first_block, dfops, spaceres);
+	error = xfs_dir_replace(tp, dp1, name1, ip2->i_ino, first_block, dfops,
+				spaceres, NULL);
 	if (error)
 		goto out_trans_abort;
 
 	/* Swap inode number for dirent in second parent */
-	error = xfs_dir_replace(tp, dp2, name2,
-				ip1->i_ino,
-				first_block, dfops, spaceres);
+	error = xfs_dir_replace(tp, dp2, name2, ip1->i_ino, first_block, dfops,
+				spaceres, NULL);
 	if (error)
 		goto out_trans_abort;
 
@@ -2797,8 +2795,8 @@ xfs_cross_rename(
 
 		if (S_ISDIR(VFS_I(ip2)->i_mode)) {
 			error = xfs_dir_replace(tp, ip2, &xfs_name_dotdot,
-						dp1->i_ino, first_block,
-						dfops, spaceres);
+						dp1->i_ino, first_block, dfops,
+						spaceres, NULL);
 			if (error)
 				goto out_trans_abort;
 
@@ -2824,8 +2822,8 @@ xfs_cross_rename(
 
 		if (S_ISDIR(VFS_I(ip1)->i_mode)) {
 			error = xfs_dir_replace(tp, ip1, &xfs_name_dotdot,
-						dp2->i_ino, first_block,
-						dfops, spaceres);
+						dp2->i_ino, first_block, dfops,
+						spaceres, NULL);
 			if (error)
 				goto out_trans_abort;
 
@@ -3072,8 +3070,8 @@ xfs_rename(
 		 * name at the destination directory, remove it first.
 		 */
 		error = xfs_dir_replace(tp, target_dp, target_name,
-					src_ip->i_ino,
-					&first_block, &dfops, spaceres);
+					src_ip->i_ino, &first_block, &dfops,
+					spaceres, NULL);
 		if (error)
 			goto out_bmap_cancel;
 
@@ -3107,8 +3105,8 @@ xfs_rename(
 		 * directory.
 		 */
 		error = xfs_dir_replace(tp, src_ip, &xfs_name_dotdot,
-					target_dp->i_ino,
-					&first_block, &dfops, spaceres);
+					target_dp->i_ino, &first_block, &dfops,
+					spaceres, NULL);
 		ASSERT(error != -EEXIST);
 		if (error)
 			goto out_bmap_cancel;
@@ -3147,7 +3145,7 @@ xfs_rename(
 	 */
 	if (wip) {
 		error = xfs_dir_replace(tp, src_dp, src_name, wip->i_ino,
-					&first_block, &dfops, spaceres);
+					&first_block, &dfops, spaceres, NULL);
 	} else
 		error = xfs_dir_removename(tp, src_dp, src_name, src_ip->i_ino,
 					   &first_block, &dfops, spaceres,
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 66+ messages in thread

* [PATCH 08/17] xfs: add parent pointer support to attribute code
  2017-10-18 22:55 [PATCH 00/17] Parent Pointers V3 Allison Henderson
                   ` (6 preceding siblings ...)
  2017-10-18 22:55 ` [PATCH 07/17] xfs: get directory offset when replacing a " Allison Henderson
@ 2017-10-18 22:55 ` Allison Henderson
  2017-10-18 22:55 ` [PATCH 09/17] xfs: define parent pointer xattr format Allison Henderson
                   ` (9 subsequent siblings)
  17 siblings, 0 replies; 66+ messages in thread
From: Allison Henderson @ 2017-10-18 22:55 UTC (permalink / raw)
  To: linux-xfs; +Cc: Mark Tinguely, Dave Chinner, Allison Henderson

From: Mark Tinguely <tinguely@sgi.com>

Add the new parent attribute type. XFS_ATTR_PARENT is used only for
parent pointer entries; it uses reserved blocks like XFS_ATTR_ROOT.

[dchinner: forward ported and cleaned up]
[achender: rebased]

Signed-off-by: Mark Tinguely <tinguely@sgi.com>
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
---
 fs/xfs/libxfs/xfs_attr.c      |  2 +-
 fs/xfs/libxfs/xfs_da_format.h | 12 ++++++++----
 fs/xfs/xfs_attr.h             |  2 ++
 3 files changed, 11 insertions(+), 5 deletions(-)

diff --git a/fs/xfs/libxfs/xfs_attr.c b/fs/xfs/libxfs/xfs_attr.c
index b94f0cd..8f8bfff9 100644
--- a/fs/xfs/libxfs/xfs_attr.c
+++ b/fs/xfs/libxfs/xfs_attr.c
@@ -397,7 +397,7 @@ xfs_attr_set(
 	struct xfs_da_args	args;
 	struct xfs_defer_ops	dfops;
 	xfs_fsblock_t		firstblock;
-	int			rsvd = (flags & ATTR_ROOT) != 0;
+	bool			rsvd = (flags & (ATTR_ROOT | ATTR_PARENT)) != 0;
 	int			error, local;
 
 	XFS_STATS_INC(mp, xs_attr_set);
diff --git a/fs/xfs/libxfs/xfs_da_format.h b/fs/xfs/libxfs/xfs_da_format.h
index 3771edc..5f94c84 100644
--- a/fs/xfs/libxfs/xfs_da_format.h
+++ b/fs/xfs/libxfs/xfs_da_format.h
@@ -758,24 +758,28 @@ struct xfs_attr3_icleaf_hdr {
 #define	XFS_ATTR_LOCAL_BIT	0	/* attr is stored locally */
 #define	XFS_ATTR_ROOT_BIT	1	/* limit access to trusted attrs */
 #define	XFS_ATTR_SECURE_BIT	2	/* limit access to secure attrs */
+#define XFS_ATTR_PARENT_BIT	3	/* parent pointer secure attrs */
 #define	XFS_ATTR_INCOMPLETE_BIT	7	/* attr in middle of create/delete */
 #define XFS_ATTR_LOCAL		(1 << XFS_ATTR_LOCAL_BIT)
 #define XFS_ATTR_ROOT		(1 << XFS_ATTR_ROOT_BIT)
 #define XFS_ATTR_SECURE		(1 << XFS_ATTR_SECURE_BIT)
+#define XFS_ATTR_PARENT		(1 << XFS_ATTR_PARENT_BIT)
 #define XFS_ATTR_INCOMPLETE	(1 << XFS_ATTR_INCOMPLETE_BIT)
 
 /*
  * Conversion macros for converting namespace bits from argument flags
  * to ondisk flags.
  */
-#define XFS_ATTR_NSP_ARGS_MASK		(ATTR_ROOT | ATTR_SECURE)
-#define XFS_ATTR_NSP_ONDISK_MASK	(XFS_ATTR_ROOT | XFS_ATTR_SECURE)
+#define XFS_ATTR_NSP_ARGS_MASK		(ATTR_ROOT | ATTR_SECURE | XFS_ATTR_PARENT)
+#define XFS_ATTR_NSP_ONDISK_MASK	(XFS_ATTR_ROOT | XFS_ATTR_SECURE | XFS_ATTR_PARENT)
 #define XFS_ATTR_NSP_ONDISK(flags)	((flags) & XFS_ATTR_NSP_ONDISK_MASK)
 #define XFS_ATTR_NSP_ARGS(flags)	((flags) & XFS_ATTR_NSP_ARGS_MASK)
 #define XFS_ATTR_NSP_ARGS_TO_ONDISK(x)	(((x) & ATTR_ROOT ? XFS_ATTR_ROOT : 0) |\
-					 ((x) & ATTR_SECURE ? XFS_ATTR_SECURE : 0))
+					 ((x) & ATTR_SECURE ? XFS_ATTR_SECURE : 0) | \
+					 ((x) & ATTR_PARENT ? XFS_ATTR_PARENT : 0))
 #define XFS_ATTR_NSP_ONDISK_TO_ARGS(x)	(((x) & XFS_ATTR_ROOT ? ATTR_ROOT : 0) |\
-					 ((x) & XFS_ATTR_SECURE ? ATTR_SECURE : 0))
+					 ((x) & XFS_ATTR_SECURE ? ATTR_SECURE : 0) | \
+					 ((x) & XFS_ATTR_PARENT ? ATTR_PARENT : 0))
 
 /*
  * Alignment for namelist and valuelist entries (since they are mixed
diff --git a/fs/xfs/xfs_attr.h b/fs/xfs/xfs_attr.h
index 532567e..7901c3b 100644
--- a/fs/xfs/xfs_attr.h
+++ b/fs/xfs/xfs_attr.h
@@ -46,6 +46,7 @@ struct xfs_attr_list_context;
 #define ATTR_SECURE	0x0008	/* use attrs in security namespace */
 #define ATTR_CREATE	0x0010	/* pure create: fail if attr already exists */
 #define ATTR_REPLACE	0x0020	/* pure set: fail if attr does not exist */
+#define ATTR_PARENT	0x0040	/*  use attrs in parent namespace */
 
 #define ATTR_KERNOTIME	0x1000	/* [kernel] don't update inode timestamps */
 #define ATTR_KERNOVAL	0x2000	/* [kernel] get attr size only, not value */
@@ -57,6 +58,7 @@ struct xfs_attr_list_context;
 	{ ATTR_SECURE,		"SECURE" }, \
 	{ ATTR_CREATE,		"CREATE" }, \
 	{ ATTR_REPLACE,		"REPLACE" }, \
+	{ ATTR_PARENT,		"PARENT" }, \
 	{ ATTR_KERNOTIME,	"KERNOTIME" }, \
 	{ ATTR_KERNOVAL,	"KERNOVAL" }
 
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 66+ messages in thread

* [PATCH 09/17] xfs: define parent pointer xattr format
  2017-10-18 22:55 [PATCH 00/17] Parent Pointers V3 Allison Henderson
                   ` (7 preceding siblings ...)
  2017-10-18 22:55 ` [PATCH 08/17] xfs: add parent pointer support to attribute code Allison Henderson
@ 2017-10-18 22:55 ` Allison Henderson
  2017-10-18 22:55 ` [PATCH 10/17] :xfs: extent transaction reservations for parent attributes Allison Henderson
                   ` (8 subsequent siblings)
  17 siblings, 0 replies; 66+ messages in thread
From: Allison Henderson @ 2017-10-18 22:55 UTC (permalink / raw)
  To: linux-xfs; +Cc: Dave Chinner, Allison Henderson

From: Dave Chinner <dchinner@redhat.com>

We need to define the parent pointer attribute format before we
start adding support for it into all the code that needs to use it.
The EA format we will use encodes the following information:

	name={parent inode #, parent inode generation, dirent offset}
	value={dirent filename}

The inode/gen gives all the information we need to reliably identify
the parent without requiring child->parent lock ordering, and allows
userspace to do pathname component level reconstruction without the
kernel ever needing to verify the parent itself as part of ioctl
calls.

By using the dirent offset in the EA name, we have a method of
knowing the exact parent pointer EA we need to modify/remove in
rename/unlink without an unbound EA name search.

By keeping the dirent name in the value, we have enough information
to be able to validate and reconstruct damaged directory trees.
While the diroffset of a filename alone is not unique enough to
identify the child, the {diroffset,filename,child_inode} tuple is
sufficient. That is, if the diroffset gets reused and points to a
different filename, we can detect that from the contents of EA. If a
link of the same name is created, then we can check whether it
points at the same inode as the parent EA we current have.

[achender: rebased, changed __unint32_t to xfs_dir2_dataptr_t]

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
---
v2: changed p_ino to xfs_ino_t and p_namelen to uint8_t

Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
---
 fs/xfs/libxfs/xfs_format.h | 27 +++++++++++++++++++++++++++
 1 file changed, 27 insertions(+)

diff --git a/fs/xfs/libxfs/xfs_format.h b/fs/xfs/libxfs/xfs_format.h
index 23229f0..b9ea5bf 100644
--- a/fs/xfs/libxfs/xfs_format.h
+++ b/fs/xfs/libxfs/xfs_format.h
@@ -18,6 +18,8 @@
 #ifndef __XFS_FORMAT_H__
 #define __XFS_FORMAT_H__
 
+#include "xfs_da_format.h"
+
 /*
  * XFS On Disk Format Definitions
  *
@@ -1716,4 +1718,29 @@ struct xfs_acl {
 #define SGI_ACL_FILE_SIZE	(sizeof(SGI_ACL_FILE)-1)
 #define SGI_ACL_DEFAULT_SIZE	(sizeof(SGI_ACL_DEFAULT)-1)
 
+/*
+ * Parent pointer attribute format definition
+ *
+ * EA name encodes the parent inode number, generation and the offset of
+ * the dirent that points to the child inode. The EA value contains the
+ * same name as the dirent in the parent directory.
+ */
+struct xfs_parent_name_rec {
+	__be64	p_ino;
+	__be32	p_gen;
+	__be32	p_diroffset;
+};
+
+/*
+ * incore version of the above, also contains name pointers so callers
+ * can pass/obtain all the parent pointer information in a single structure
+ */
+struct xfs_parent_name_irec {
+	xfs_ino_t		p_ino;
+	uint32_t		p_gen;
+	xfs_dir2_dataptr_t	p_diroffset;
+	const char		*p_name;
+	uint8_t			p_namelen;
+};
+
 #endif /* __XFS_FORMAT_H__ */
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 66+ messages in thread

* [PATCH 10/17] :xfs: extent transaction reservations for parent attributes
  2017-10-18 22:55 [PATCH 00/17] Parent Pointers V3 Allison Henderson
                   ` (8 preceding siblings ...)
  2017-10-18 22:55 ` [PATCH 09/17] xfs: define parent pointer xattr format Allison Henderson
@ 2017-10-18 22:55 ` Allison Henderson
  2017-10-19 18:24   ` Darrick J. Wong
  2017-10-18 22:55 ` [PATCH 11/17] Add the extra space requirements for parent pointer attributes when calculating the minimum log size during mkfs Allison Henderson
                   ` (7 subsequent siblings)
  17 siblings, 1 reply; 66+ messages in thread
From: Allison Henderson @ 2017-10-18 22:55 UTC (permalink / raw)
  To: linux-xfs; +Cc: Dave Chinner, Allison Henderson

From: Dave Chinner <dchinner@redhat.com>

We need to add, remove or modify parent pointer attributes during
create/link/unlink/rename operations atomically with the dirents in the parent
directories being modified. This means they need to be modified in the same
transaction as the parent directories, and so we need to add the required
space for the attribute modifications to the transaction reservations.

[achender: rebased, added xfs_sb_version_hasparent stub]

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
---
 fs/xfs/libxfs/xfs_format.h     |   5 ++
 fs/xfs/libxfs/xfs_trans_resv.c | 103 ++++++++++++++++++++++++++++++++---------
 2 files changed, 85 insertions(+), 23 deletions(-)

diff --git a/fs/xfs/libxfs/xfs_format.h b/fs/xfs/libxfs/xfs_format.h
index b9ea5bf..121862a 100644
--- a/fs/xfs/libxfs/xfs_format.h
+++ b/fs/xfs/libxfs/xfs_format.h
@@ -556,6 +556,11 @@ static inline bool xfs_sb_version_hasreflink(struct xfs_sb *sbp)
 		(sbp->sb_features_ro_compat & XFS_SB_FEAT_RO_COMPAT_REFLINK);
 }
 
+static inline bool xfs_sb_version_hasparent(struct xfs_sb *sbp)
+{
+	return false; /* We'll enable this at the end of the set */
+}
+
 /*
  * end of superblock version macros
  */
diff --git a/fs/xfs/libxfs/xfs_trans_resv.c b/fs/xfs/libxfs/xfs_trans_resv.c
index 6bd916b..54399e2 100644
--- a/fs/xfs/libxfs/xfs_trans_resv.c
+++ b/fs/xfs/libxfs/xfs_trans_resv.c
@@ -802,29 +802,30 @@ xfs_calc_sb_reservation(
 	return xfs_calc_buf_res(1, mp->m_sb.sb_sectsize);
 }
 
+/*
+ * Namespace reservations.
+ *
+ * These get tricky when parent pointers are enabled as we have attribute
+ * modifications occurring from within these transactions. Rather than confuse
+ * each of these reservation calculations with the conditional attribute
+ * reservations, add them here in a clear and concise manner. This assumes that
+ * the attribute reservations have already been calculated.
+ *
+ * Note that we only include the static attribute reservation here; the runtime
+ * reservation will have to be modified by the size of the attributes being
+ * added/removed/modified. See the comments on the attribute reservation
+ * calculations for more details.
+ *
+ * Note for rename: rename will vastly overestimate requirements. This will be
+ * addressed later when modifications are made to ensure parent attribute
+ * modifications can be done atomically with the rename operation.
+ */
 void
-xfs_trans_resv_calc(
+xfs_calc_namespace_reservations(
 	struct xfs_mount	*mp,
 	struct xfs_trans_resv	*resp)
 {
-	/*
-	 * The following transactions are logged in physical format and
-	 * require a permanent reservation on space.
-	 */
-	resp->tr_write.tr_logres = xfs_calc_write_reservation(mp);
-	if (xfs_sb_version_hasreflink(&mp->m_sb))
-		resp->tr_write.tr_logcount = XFS_WRITE_LOG_COUNT_REFLINK;
-	else
-		resp->tr_write.tr_logcount = XFS_WRITE_LOG_COUNT;
-	resp->tr_write.tr_logflags |= XFS_TRANS_PERM_LOG_RES;
-
-	resp->tr_itruncate.tr_logres = xfs_calc_itruncate_reservation(mp);
-	if (xfs_sb_version_hasreflink(&mp->m_sb))
-		resp->tr_itruncate.tr_logcount =
-				XFS_ITRUNCATE_LOG_COUNT_REFLINK;
-	else
-		resp->tr_itruncate.tr_logcount = XFS_ITRUNCATE_LOG_COUNT;
-	resp->tr_itruncate.tr_logflags |= XFS_TRANS_PERM_LOG_RES;
+	ASSERT(resp->tr_attrsetm.tr_logres > 0);
 
 	resp->tr_rename.tr_logres = xfs_calc_rename_reservation(mp);
 	resp->tr_rename.tr_logcount = XFS_RENAME_LOG_COUNT;
@@ -846,15 +847,69 @@ xfs_trans_resv_calc(
 	resp->tr_create.tr_logcount = XFS_CREATE_LOG_COUNT;
 	resp->tr_create.tr_logflags |= XFS_TRANS_PERM_LOG_RES;
 
+	resp->tr_mkdir.tr_logres = xfs_calc_mkdir_reservation(mp);
+	resp->tr_mkdir.tr_logcount = XFS_MKDIR_LOG_COUNT;
+	resp->tr_mkdir.tr_logflags |= XFS_TRANS_PERM_LOG_RES;
+
+	if (!xfs_sb_version_hasparent(&mp->m_sb))
+		return;
+
+	/* rename can add/remove/modify 2 parent attributes */
+	resp->tr_rename.tr_logres += 2 * max(resp->tr_attrsetm.tr_logres,
+					     resp->tr_attrrm.tr_logres);
+	resp->tr_rename.tr_logcount += 2 * max(resp->tr_attrsetm.tr_logcount,
+					       resp->tr_attrrm.tr_logcount);
+
+	/* create will add 1 parent attribute */
+	resp->tr_create.tr_logres += resp->tr_attrsetm.tr_logres;
+	resp->tr_create.tr_logcount += resp->tr_attrsetm.tr_logcount;
+
+	/* mkdir will add 1 parent attribute */
+	resp->tr_mkdir.tr_logres += resp->tr_attrsetm.tr_logres;
+	resp->tr_mkdir.tr_logcount += resp->tr_attrsetm.tr_logcount;
+
+	/* link will add 1 parent attribute */
+	resp->tr_link.tr_logres += resp->tr_attrsetm.tr_logres;
+	resp->tr_link.tr_logcount += resp->tr_attrsetm.tr_logcount;
+
+	/* symlink will add 1 parent attribute */
+	resp->tr_symlink.tr_logres += resp->tr_attrsetm.tr_logres;
+	resp->tr_symlink.tr_logcount += resp->tr_attrsetm.tr_logcount;
+
+	/* remove will remove 1 parent attribute */
+	resp->tr_remove.tr_logres += resp->tr_attrrm.tr_logres;
+	resp->tr_remove.tr_logcount = resp->tr_attrrm.tr_logcount;
+}
+
+void
+xfs_trans_resv_calc(
+	struct xfs_mount	*mp,
+	struct xfs_trans_resv	*resp)
+{
+	/*
+	 * The following transactions are logged in physical format and
+	 * require a permanent reservation on space.
+	 */
+	resp->tr_write.tr_logres = xfs_calc_write_reservation(mp);
+	if (xfs_sb_version_hasreflink(&mp->m_sb))
+		resp->tr_write.tr_logcount = XFS_WRITE_LOG_COUNT_REFLINK;
+	else
+		resp->tr_write.tr_logcount = XFS_WRITE_LOG_COUNT;
+	resp->tr_write.tr_logflags |= XFS_TRANS_PERM_LOG_RES;
+
+	resp->tr_itruncate.tr_logres = xfs_calc_itruncate_reservation(mp);
+	if (xfs_sb_version_hasreflink(&mp->m_sb))
+		resp->tr_itruncate.tr_logcount =
+				XFS_ITRUNCATE_LOG_COUNT_REFLINK;
+	else
+		resp->tr_itruncate.tr_logcount = XFS_ITRUNCATE_LOG_COUNT;
+	resp->tr_itruncate.tr_logflags |= XFS_TRANS_PERM_LOG_RES;
+
 	resp->tr_create_tmpfile.tr_logres =
 			xfs_calc_create_tmpfile_reservation(mp);
 	resp->tr_create_tmpfile.tr_logcount = XFS_CREATE_TMPFILE_LOG_COUNT;
 	resp->tr_create_tmpfile.tr_logflags |= XFS_TRANS_PERM_LOG_RES;
 
-	resp->tr_mkdir.tr_logres = xfs_calc_mkdir_reservation(mp);
-	resp->tr_mkdir.tr_logcount = XFS_MKDIR_LOG_COUNT;
-	resp->tr_mkdir.tr_logflags |= XFS_TRANS_PERM_LOG_RES;
-
 	resp->tr_ifree.tr_logres = xfs_calc_ifree_reservation(mp);
 	resp->tr_ifree.tr_logcount = XFS_INACTIVE_LOG_COUNT;
 	resp->tr_ifree.tr_logflags |= XFS_TRANS_PERM_LOG_RES;
@@ -886,6 +941,8 @@ xfs_trans_resv_calc(
 		resp->tr_qm_dqalloc.tr_logcount = XFS_WRITE_LOG_COUNT;
 	resp->tr_qm_dqalloc.tr_logflags |= XFS_TRANS_PERM_LOG_RES;
 
+	xfs_calc_namespace_reservations(mp, resp);
+
 	/*
 	 * The following transactions are logged in logical format with
 	 * a default log count.
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 66+ messages in thread

* [PATCH 11/17] Add the extra space requirements for parent pointer attributes when calculating the minimum log size during mkfs
  2017-10-18 22:55 [PATCH 00/17] Parent Pointers V3 Allison Henderson
                   ` (9 preceding siblings ...)
  2017-10-18 22:55 ` [PATCH 10/17] :xfs: extent transaction reservations for parent attributes Allison Henderson
@ 2017-10-18 22:55 ` Allison Henderson
  2017-10-19 18:13   ` Darrick J. Wong
  2017-10-18 22:55 ` [PATCH 12/17] xfs: parent pointer attribute creation Allison Henderson
                   ` (6 subsequent siblings)
  17 siblings, 1 reply; 66+ messages in thread
From: Allison Henderson @ 2017-10-18 22:55 UTC (permalink / raw)
  To: linux-xfs; +Cc: Allison Henderson

Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
---
 fs/xfs/libxfs/xfs_log_rlimit.c | 34 ++++++++++++++++++++++++++++++++++
 1 file changed, 34 insertions(+)

diff --git a/fs/xfs/libxfs/xfs_log_rlimit.c b/fs/xfs/libxfs/xfs_log_rlimit.c
index c105979..beec9bf 100644
--- a/fs/xfs/libxfs/xfs_log_rlimit.c
+++ b/fs/xfs/libxfs/xfs_log_rlimit.c
@@ -39,6 +39,40 @@ xfs_log_calc_max_attrsetm_res(
 {
 	int			size;
 	int			nblks;
+	struct xfs_trans_resv   *resp = M_RES(mp);
+
+	/* Calculate extra space needed for parent pointer attributes */
+	if (!xfs_sb_version_hasparent(&mp->m_sb)) {
+
+		/* rename can add/remove/modify 2 parent attributes */
+		resp->tr_rename.tr_logres +=
+			2 * max(resp->tr_attrsetm.tr_logres,
+				resp->tr_attrrm.tr_logres);
+		resp->tr_rename.tr_logcount +=
+			2 * max(resp->tr_attrsetm.tr_logcount,
+				resp->tr_attrrm.tr_logcount);
+
+		/* create will add 1 parent attribute */
+		resp->tr_create.tr_logres += resp->tr_attrsetm.tr_logres;
+		resp->tr_create.tr_logcount += resp->tr_attrsetm.tr_logcount;
+
+		/* mkdir will add 1 parent attribute */
+		resp->tr_mkdir.tr_logres += resp->tr_attrsetm.tr_logres;
+		resp->tr_mkdir.tr_logcount += resp->tr_attrsetm.tr_logcount;
+
+		/* link will add 1 parent attribute */
+		resp->tr_link.tr_logres += resp->tr_attrsetm.tr_logres;
+		resp->tr_link.tr_logcount += resp->tr_attrsetm.tr_logcount;
+
+		/* symlink will add 1 parent attribute */
+		resp->tr_symlink.tr_logres += resp->tr_attrsetm.tr_logres;
+		resp->tr_symlink.tr_logcount += resp->tr_attrsetm.tr_logcount;
+
+		/* remove will remove 1 parent attribute */
+		resp->tr_remove.tr_logres += resp->tr_attrrm.tr_logres;
+		resp->tr_remove.tr_logcount = resp->tr_attrrm.tr_logcount;
+	}
+
 
 	size = xfs_attr_leaf_entsize_local_max(mp->m_attr_geo->blksize) -
 	       MAXNAMELEN - 1;
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 66+ messages in thread

* [PATCH 12/17] xfs: parent pointer attribute creation
  2017-10-18 22:55 [PATCH 00/17] Parent Pointers V3 Allison Henderson
                   ` (10 preceding siblings ...)
  2017-10-18 22:55 ` [PATCH 11/17] Add the extra space requirements for parent pointer attributes when calculating the minimum log size during mkfs Allison Henderson
@ 2017-10-18 22:55 ` Allison Henderson
  2017-10-19 19:36   ` Darrick J. Wong
  2017-10-18 22:55 ` [PATCH 13/17] xfs: add parent attributes to link Allison Henderson
                   ` (5 subsequent siblings)
  17 siblings, 1 reply; 66+ messages in thread
From: Allison Henderson @ 2017-10-18 22:55 UTC (permalink / raw)
  To: linux-xfs; +Cc: Dave Chinner, Allison Henderson

From: Dave Chinner <dchinner@redhat.com>

[bfoster: rebase, use VFS inode generation]
[achender: rebased, changed __unint32_t to xfs_dir2_dataptr_t,
	   fixed some null pointer bugs]

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
---
v2: remove unnecessary ENOSPC handling in xfs_attr_set_first_parent

Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
---
 fs/xfs/Makefile            |  1 +
 fs/xfs/libxfs/xfs_attr.c   | 71 ++++++++++++++++++++++++++++++---
 fs/xfs/libxfs/xfs_bmap.c   | 51 ++++++++++++++----------
 fs/xfs/libxfs/xfs_bmap.h   |  1 +
 fs/xfs/libxfs/xfs_parent.c | 98 ++++++++++++++++++++++++++++++++++++++++++++++
 fs/xfs/xfs_attr.h          | 15 ++++++-
 fs/xfs/xfs_inode.c         | 16 +++++++-
 7 files changed, 225 insertions(+), 28 deletions(-)

diff --git a/fs/xfs/Makefile b/fs/xfs/Makefile
index ec6486b..3015bca 100644
--- a/fs/xfs/Makefile
+++ b/fs/xfs/Makefile
@@ -52,6 +52,7 @@ xfs-y				+= $(addprefix libxfs/, \
 				   xfs_inode_fork.o \
 				   xfs_inode_buf.o \
 				   xfs_log_rlimit.o \
+				   xfs_parent.o \
 				   xfs_ag_resv.o \
 				   xfs_rmap.o \
 				   xfs_rmap_btree.o \
diff --git a/fs/xfs/libxfs/xfs_attr.c b/fs/xfs/libxfs/xfs_attr.c
index 8f8bfff9..8aad242 100644
--- a/fs/xfs/libxfs/xfs_attr.c
+++ b/fs/xfs/libxfs/xfs_attr.c
@@ -91,12 +91,14 @@ xfs_attr_args_init(
 	args->whichfork = XFS_ATTR_FORK;
 	args->dp = dp;
 	args->flags = flags;
-	args->name = name;
-	args->namelen = namelen;
-	if (args->namelen >= MAXNAMELEN)
-		return -EFAULT;		/* match IRIX behaviour */
+	if (name) {
+		args->name = name;
+		args->namelen = namelen;
+		if (args->namelen >= MAXNAMELEN)
+			return -EFAULT;		/* match IRIX behaviour */
 
-	args->hashval = xfs_da_hashname(args->name, args->namelen);
+		args->hashval = xfs_da_hashname(args->name, args->namelen);
+	}
 	return 0;
 }
 
@@ -206,6 +208,65 @@ xfs_attr_calc_size(
 }
 
 /*
+ * Add the initial parent pointer attribute.
+ *
+ * Inode must be locked and completely empty as we are adding the attribute
+ * fork to the inode. This open codes bits of xfs_bmap_add_attrfork() and
+ * xfs_attr_set() because we know the inode is completely empty at this point
+ * and so don't need to handle all the different combinations of fork
+ * configurations here.
+ */
+int
+xfs_attr_set_first_parent(
+	struct xfs_trans	*tp,
+	struct xfs_inode	*ip,
+	struct xfs_parent_name_rec *rec,
+	int			reclen,
+	const char		*value,
+	int			valuelen,
+	struct xfs_defer_ops	*dfops,
+	xfs_fsblock_t		*firstblock)
+{
+	struct xfs_da_args	args;
+	int			flags = ATTR_PARENT;
+	int			local;
+	int			sf_size;
+	int			error;
+
+	tp->t_flags |= XFS_TRANS_RESERVE;
+
+	error = xfs_attr_args_init(&args, ip, (char *)rec, reclen, flags);
+	if (error)
+		return error;
+
+	args.name = (char *)rec;
+	args.namelen = reclen;
+	args.hashval = xfs_da_hashname(args.name, args.namelen);
+	args.value = (char *)value;
+	args.valuelen = valuelen;
+	args.firstblock = firstblock;
+	args.dfops = dfops;
+	args.op_flags = XFS_DA_OP_ADDNAME | XFS_DA_OP_OKNOENT;
+	args.total = xfs_attr_calc_size(&args, &local);
+	args.trans = tp;
+	ASSERT(local);
+
+	/* set the attribute fork appropriately */
+	sf_size = sizeof(struct xfs_attr_sf_hdr) +
+			XFS_ATTR_SF_ENTSIZE_BYNAME(reclen, valuelen);
+	xfs_bmap_set_attrforkoff(ip, sf_size, NULL);
+	ip->i_afp = kmem_zone_zalloc(xfs_ifork_zone, KM_SLEEP);
+	ip->i_afp->if_flags = XFS_IFEXTENTS;
+
+
+	/* Try to add the attr to the attribute list in the inode. */
+	xfs_attr_shortform_create(&args);
+	error = xfs_attr_shortform_addname(&args);
+
+	return error;
+}
+
+/*
  * set the attribute specified in @args. In the case of the parent attribute
  * being set, we do not want to roll the transaction on shortform-to-leaf
  * conversion, as the attribute must be added in the same transaction as the
diff --git a/fs/xfs/libxfs/xfs_bmap.c b/fs/xfs/libxfs/xfs_bmap.c
index 044a363..7ee98be 100644
--- a/fs/xfs/libxfs/xfs_bmap.c
+++ b/fs/xfs/libxfs/xfs_bmap.c
@@ -1066,6 +1066,35 @@ xfs_bmap_add_attrfork_local(
 	return -EFSCORRUPTED;
 }
 
+int
+xfs_bmap_set_attrforkoff(
+	struct xfs_inode	*ip,
+	int			size,
+	int			*version)
+{
+	switch (ip->i_d.di_format) {
+	case XFS_DINODE_FMT_DEV:
+		ip->i_d.di_forkoff = roundup(sizeof(xfs_dev_t), 8) >> 3;
+		break;
+	case XFS_DINODE_FMT_UUID:
+		ip->i_d.di_forkoff = roundup(sizeof(uuid_t), 8) >> 3;
+		break;
+	case XFS_DINODE_FMT_LOCAL:
+	case XFS_DINODE_FMT_EXTENTS:
+	case XFS_DINODE_FMT_BTREE:
+		ip->i_d.di_forkoff = xfs_attr_shortform_bytesfit(ip, size);
+		if (!ip->i_d.di_forkoff)
+			ip->i_d.di_forkoff = xfs_default_attroffset(ip) >> 3;
+		else if ((ip->i_mount->m_flags & XFS_MOUNT_ATTR2) && version)
+			*version = 2;
+		break;
+	default:
+		ASSERT(0);
+		return -EINVAL;
+	}
+	return 0;
+}
+
 /*
  * Convert inode from non-attributed to attributed.
  * Must not be in a transaction, ip must not be locked.
@@ -1120,27 +1149,7 @@ xfs_bmap_add_attrfork(
 	xfs_trans_ijoin(tp, ip, 0);
 	xfs_trans_log_inode(tp, ip, XFS_ILOG_CORE);
 
-	switch (ip->i_d.di_format) {
-	case XFS_DINODE_FMT_DEV:
-		ip->i_d.di_forkoff = roundup(sizeof(xfs_dev_t), 8) >> 3;
-		break;
-	case XFS_DINODE_FMT_UUID:
-		ip->i_d.di_forkoff = roundup(sizeof(uuid_t), 8) >> 3;
-		break;
-	case XFS_DINODE_FMT_LOCAL:
-	case XFS_DINODE_FMT_EXTENTS:
-	case XFS_DINODE_FMT_BTREE:
-		ip->i_d.di_forkoff = xfs_attr_shortform_bytesfit(ip, size);
-		if (!ip->i_d.di_forkoff)
-			ip->i_d.di_forkoff = xfs_default_attroffset(ip) >> 3;
-		else if (mp->m_flags & XFS_MOUNT_ATTR2)
-			version = 2;
-		break;
-	default:
-		ASSERT(0);
-		error = -EINVAL;
-		goto trans_cancel;
-	}
+	xfs_bmap_set_attrforkoff(ip, size, &version);
 
 	ASSERT(ip->i_afp == NULL);
 	ip->i_afp = kmem_zone_zalloc(xfs_ifork_zone, KM_SLEEP);
diff --git a/fs/xfs/libxfs/xfs_bmap.h b/fs/xfs/libxfs/xfs_bmap.h
index 851982a..533f40f 100644
--- a/fs/xfs/libxfs/xfs_bmap.h
+++ b/fs/xfs/libxfs/xfs_bmap.h
@@ -209,6 +209,7 @@ void	xfs_bmap_trace_exlist(struct xfs_inode *ip, xfs_extnum_t cnt,
 void	xfs_trim_extent(struct xfs_bmbt_irec *irec, xfs_fileoff_t bno,
 		xfs_filblks_t len);
 int	xfs_bmap_add_attrfork(struct xfs_inode *ip, int size, int rsvd);
+int	xfs_bmap_set_attrforkoff(struct xfs_inode *ip, int size, int *version);
 void	xfs_bmap_local_to_extents_empty(struct xfs_inode *ip, int whichfork);
 void	xfs_bmap_add_free(struct xfs_mount *mp, struct xfs_defer_ops *dfops,
 			  xfs_fsblock_t bno, xfs_filblks_t len,
diff --git a/fs/xfs/libxfs/xfs_parent.c b/fs/xfs/libxfs/xfs_parent.c
new file mode 100644
index 0000000..88f7edc
--- /dev/null
+++ b/fs/xfs/libxfs/xfs_parent.c
@@ -0,0 +1,98 @@
+/*
+ * Copyright (c) 2015 Red Hat, Inc.
+ * All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it would be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write the Free Software Foundation
+ */
+#include "xfs.h"
+#include "xfs_fs.h"
+#include "xfs_format.h"
+#include "xfs_log_format.h"
+#include "xfs_shared.h"
+#include "xfs_trans_resv.h"
+#include "xfs_mount.h"
+#include "xfs_bmap_btree.h"
+#include "xfs_inode.h"
+#include "xfs_error.h"
+#include "xfs_trace.h"
+#include "xfs_trans.h"
+#include "xfs_attr.h"
+
+/*
+ * Parent pointer attribute handling.
+ *
+ * Because the attribute value is a filename component, it will never be longer
+ * than 255 bytes. This means the attribute will always be a local format
+ * attribute as it is xfs_attr_leaf_entsize_local_max() for v5 filesystems will
+ * always be larger than this (max is 75% of block size).
+ *
+ * Creating a new parent attribute will always create a new attribute - there
+ * should never, ever be an existing attribute in the tree for a new inode.
+ * ENOSPC behaviour is problematic - creating the inode without the parent
+ * pointer is effectively a corruption, so we allow parent attribute creation
+ * to dip into the reserve block pool to avoid unexpected ENOSPC errors from
+ * occurring.
+ */
+
+/*
+ * Create the initial parent attribute.
+ *
+ * The initial attribute creation also needs to be atomic w.r.t the parent
+ * directory modification. Hence it needs to run in the same transaction and the
+ * transaction committed by the caller.  Because the attribute created is
+ * guaranteed to be a local attribute and is always going to be the first
+ * attribute in the attribute fork, we can do this safely in the single
+ * transaction context as it is impossible for an overwrite to occur and hence
+ * we'll never have a rolling overwrite transaction occurring here. Hence we
+ * can short-cut a lot of the normal xfs_attr_set() code paths that are needed
+ * to handle the generic cases.
+ */
+static int
+xfs_parent_create_nrec(
+	struct xfs_trans	*tp,
+	struct xfs_inode	*child,
+	struct xfs_parent_name_irec *nrec,
+	struct xfs_defer_ops	*dfops,
+	xfs_fsblock_t		*firstblock)
+{
+	struct xfs_parent_name_rec rec;
+
+	rec.p_ino = cpu_to_be64(nrec->p_ino);
+	rec.p_gen = cpu_to_be32(nrec->p_gen);
+	rec.p_diroffset = cpu_to_be32(nrec->p_diroffset);
+
+	return xfs_attr_set_first_parent(tp, child, &rec, sizeof(rec),
+				   nrec->p_name, nrec->p_namelen,
+				   dfops, firstblock);
+}
+
+int
+xfs_parent_create(
+	struct xfs_trans	*tp,
+	struct xfs_inode	*parent,
+	struct xfs_inode	*child,
+	struct xfs_name		*child_name,
+	xfs_dir2_dataptr_t	diroffset,
+	struct xfs_defer_ops	*dfops,
+	xfs_fsblock_t		*firstblock)
+{
+	struct xfs_parent_name_irec nrec;
+
+	nrec.p_ino = parent->i_ino;
+	nrec.p_gen = VFS_I(parent)->i_generation;
+	nrec.p_diroffset = diroffset;
+	nrec.p_name = child_name->name;
+	nrec.p_namelen = child_name->len;
+
+	return xfs_parent_create_nrec(tp, child, &nrec, dfops, firstblock);
+}
diff --git a/fs/xfs/xfs_attr.h b/fs/xfs/xfs_attr.h
index 7901c3b..b48e31b 100644
--- a/fs/xfs/xfs_attr.h
+++ b/fs/xfs/xfs_attr.h
@@ -19,6 +19,8 @@
 #define	__XFS_ATTR_H__
 
 #include "libxfs/xfs_defer.h"
+#include "libxfs/xfs_da_format.h"
+#include "libxfs/xfs_format.h"
 
 struct xfs_inode;
 struct xfs_da_args;
@@ -183,5 +185,16 @@ int xfs_attr_set_deferred(struct xfs_inode *dp, struct xfs_defer_ops *dfops,
 int xfs_attr_remove_deferred(struct xfs_inode *dp, struct xfs_defer_ops *dfops,
 			    const unsigned char *name, unsigned int namelen,
 			    int flags);
-
+/*
+ * Parent pointer attribute prototypes
+ */
+int xfs_parent_create(struct xfs_trans *tp, struct xfs_inode *parent,
+		      struct xfs_inode *child, struct xfs_name *child_name,
+		      xfs_dir2_dataptr_t diroffset, struct xfs_defer_ops *dfops,
+		      xfs_fsblock_t *firstblock);
+int xfs_attr_set_first_parent(struct xfs_trans *tp, struct xfs_inode *ip,
+			      struct xfs_parent_name_rec *rec, int reclen,
+			      const char *value, int valuelen,
+			      struct xfs_defer_ops *dfops,
+			      xfs_fsblock_t *firstblock);
 #endif	/* __XFS_ATTR_H__ */
diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c
index f7986d8..4396561 100644
--- a/fs/xfs/xfs_inode.c
+++ b/fs/xfs/xfs_inode.c
@@ -1164,6 +1164,7 @@ xfs_create(
 	struct xfs_dquot	*pdqp = NULL;
 	struct xfs_trans_res	*tres;
 	uint			resblks;
+	xfs_dir2_dataptr_t	diroffset;
 
 	trace_xfs_create(dp, name);
 
@@ -1253,7 +1254,7 @@ xfs_create(
 	error = xfs_dir_createname(tp, dp, name, ip->i_ino,
 					&first_block, &dfops, resblks ?
 					resblks - XFS_IALLOC_SPACE_RES(mp) : 0,
-					NULL);
+					&diroffset);
 	if (error) {
 		ASSERT(error != -ENOSPC);
 		goto out_trans_cancel;
@@ -1272,6 +1273,19 @@ xfs_create(
 	}
 
 	/*
+	 * If we have parent pointers, we need to add the attribute containing
+	 * the parent information now. This must be done within the same
+	 * transaction the directory entry is created, while the new inode
+	 * contains nothing in the inode literal area.
+	 */
+	if (xfs_sb_version_hasparent(&mp->m_sb)) {
+		error = xfs_parent_create(tp, dp, ip, name, diroffset,
+					  &dfops, &first_block);
+		if (error)
+			goto out_bmap_cancel;
+	}
+
+	/*
 	 * If this is a synchronous mount, make sure that the
 	 * create transaction goes to disk before returning to
 	 * the user.
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 66+ messages in thread

* [PATCH 13/17] xfs: add parent attributes to link
  2017-10-18 22:55 [PATCH 00/17] Parent Pointers V3 Allison Henderson
                   ` (11 preceding siblings ...)
  2017-10-18 22:55 ` [PATCH 12/17] xfs: parent pointer attribute creation Allison Henderson
@ 2017-10-18 22:55 ` Allison Henderson
  2017-10-19 19:40   ` Darrick J. Wong
  2017-10-18 22:55 ` [PATCH 14/17] xfs: remove parent pointers in unlink Allison Henderson
                   ` (4 subsequent siblings)
  17 siblings, 1 reply; 66+ messages in thread
From: Allison Henderson @ 2017-10-18 22:55 UTC (permalink / raw)
  To: linux-xfs; +Cc: Dave Chinner, Allison Henderson

From: Dave Chinner <dchinner@redhat.com>

[bfoster: rebase, use VFS inode fields, fix xfs_bmap_finish() usage]
[achender: rebased, changed __unint32_t to xfs_dir2_dataptr_t,
	   fixed null pointer bugs]

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
---
 fs/xfs/libxfs/xfs_attr.c   | 20 +++++++++++++-
 fs/xfs/libxfs/xfs_parent.c | 43 ++++++++++++++++++++++++++++++
 fs/xfs/xfs_attr.h          | 10 +++++++
 fs/xfs/xfs_inode.c         | 66 ++++++++++++++++++++++++++++++++++++----------
 4 files changed, 124 insertions(+), 15 deletions(-)

diff --git a/fs/xfs/libxfs/xfs_attr.c b/fs/xfs/libxfs/xfs_attr.c
index 8aad242..e7692ef 100644
--- a/fs/xfs/libxfs/xfs_attr.c
+++ b/fs/xfs/libxfs/xfs_attr.c
@@ -35,6 +35,7 @@
 #include "xfs_bmap_util.h"
 #include "xfs_bmap_btree.h"
 #include "xfs_attr.h"
+#include "xfs_attr_sf.h"
 #include "xfs_attr_leaf.h"
 #include "xfs_attr_remote.h"
 #include "xfs_error.h"
@@ -266,6 +267,23 @@ xfs_attr_set_first_parent(
 	return error;
 }
 
+int
+xfs_attr_set_parent(
+	struct xfs_trans	*tp,
+	struct xfs_inode	*ip,
+	struct xfs_parent_name_rec *rec,
+	int			reclen,
+	const char		*value,
+	int			valuelen,
+	struct xfs_defer_ops	*dfops,
+	xfs_fsblock_t		*firstblock)
+{
+	int                     flags = ATTR_PARENT;
+
+	return xfs_attr_set_deferred(ip, dfops, (char *)rec, reclen,
+				    (char *)value, valuelen, flags);
+}
+
 /*
  * set the attribute specified in @args. In the case of the parent attribute
  * being set, we do not want to roll the transaction on shortform-to-leaf
@@ -512,8 +530,8 @@ xfs_attr_set(
 	 */
 	xfs_trans_log_inode(args.trans, dp, XFS_ILOG_CORE);
 	error = xfs_trans_commit(args.trans);
-	xfs_iunlock(dp, XFS_ILOCK_EXCL);
 
+	xfs_iunlock(dp, XFS_ILOCK_EXCL);
 	return error;
 
 out_defer_cancel:
diff --git a/fs/xfs/libxfs/xfs_parent.c b/fs/xfs/libxfs/xfs_parent.c
index 88f7edc..0707336 100644
--- a/fs/xfs/libxfs/xfs_parent.c
+++ b/fs/xfs/libxfs/xfs_parent.c
@@ -96,3 +96,46 @@ xfs_parent_create(
 
 	return xfs_parent_create_nrec(tp, child, &nrec, dfops, firstblock);
 }
+
+static int
+xfs_parent_add_nrec(
+	struct xfs_trans	*tp,
+	struct xfs_inode	*child,
+	struct xfs_parent_name_irec *nrec,
+	struct xfs_defer_ops	*dfops,
+	xfs_fsblock_t		*firstblock)
+{
+	struct xfs_parent_name_rec rec;
+
+	rec.p_ino = cpu_to_be64(nrec->p_ino);
+	rec.p_gen = cpu_to_be32(nrec->p_gen);
+	rec.p_diroffset = cpu_to_be32(nrec->p_diroffset);
+
+	return xfs_attr_set_parent(tp, child, &rec, sizeof(rec),
+				   nrec->p_name, nrec->p_namelen,
+				   dfops, firstblock);
+}
+
+/*
+ * Add a parent record to an inode with existing parent records.
+ */
+int
+xfs_parent_add(
+	struct xfs_trans	*tp,
+	struct xfs_inode	*parent,
+	struct xfs_inode	*child,
+	struct xfs_name		*child_name,
+	uint32_t		diroffset,
+	struct xfs_defer_ops	*dfops,
+	xfs_fsblock_t		*firstblock)
+{
+	struct xfs_parent_name_irec nrec;
+
+	nrec.p_ino = parent->i_ino;
+	nrec.p_gen = VFS_I(parent)->i_generation;
+	nrec.p_diroffset = diroffset;
+	nrec.p_name = child_name->name;
+	nrec.p_namelen = child_name->len;
+
+	return xfs_parent_add_nrec(tp, child, &nrec, dfops, firstblock);
+}
diff --git a/fs/xfs/xfs_attr.h b/fs/xfs/xfs_attr.h
index b48e31b..acb6157 100644
--- a/fs/xfs/xfs_attr.h
+++ b/fs/xfs/xfs_attr.h
@@ -197,4 +197,14 @@ int xfs_attr_set_first_parent(struct xfs_trans *tp, struct xfs_inode *ip,
 			      const char *value, int valuelen,
 			      struct xfs_defer_ops *dfops,
 			      xfs_fsblock_t *firstblock);
+
+int xfs_parent_add(struct xfs_trans *tp, struct xfs_inode *parent,
+		   struct xfs_inode *child, struct xfs_name *child_name,
+		   xfs_dir2_dataptr_t diroffset, struct xfs_defer_ops *dfops,
+		   xfs_fsblock_t *firstblock);
+int xfs_attr_set_parent(struct xfs_trans *tp, struct xfs_inode *ip,
+			struct xfs_parent_name_rec *rec, int reclen,
+			const char *value, int valuelen,
+			struct xfs_defer_ops *dfops, xfs_fsblock_t *firstblock);
+
 #endif	/* __XFS_ATTR_H__ */
diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c
index 4396561..51b623b 100644
--- a/fs/xfs/xfs_inode.c
+++ b/fs/xfs/xfs_inode.c
@@ -1451,6 +1451,8 @@ xfs_link(
 	struct xfs_defer_ops	dfops;
 	xfs_fsblock_t           first_block;
 	int			resblks;
+	uint32_t		diroffset;
+	bool			first_parent = false;
 
 	trace_xfs_link(tdp, target_name);
 
@@ -1467,6 +1469,25 @@ xfs_link(
 	if (error)
 		goto std_return;
 
+	/*
+	 * If we have parent pointers and there is no attribute fork (i.e. we
+	 * are linking in a O_TMPFILE created inode) we need to add the
+	 * attribute fork to the inode. Because we may have an existing data
+	 * fork, we do this before we start the link transaction as adding an
+	 * attribute fork requires it's own transaction.
+	 */
+	if (xfs_sb_version_hasparent(&mp->m_sb) && !xfs_inode_hasattr(sip)) {
+		int sf_size = sizeof(struct xfs_attr_sf_hdr) +
+				XFS_ATTR_SF_ENTSIZE_BYNAME(
+					sizeof(struct xfs_parent_name_rec),
+					target_name->len);
+		ASSERT(VFS_I(sip)->i_nlink == 0);
+		error = xfs_bmap_add_attrfork(sip, sf_size, 0);
+		if (error)
+			goto std_return;
+		first_parent = true;
+	}
+
 	resblks = XFS_LINK_SPACE_RES(mp, target_name->len);
 	error = xfs_trans_alloc(mp, &M_RES(mp)->tr_link, resblks, 0, 0, &tp);
 	if (error == -ENOSPC) {
@@ -1498,8 +1519,6 @@ xfs_link(
 			goto error_return;
 	}
 
-	xfs_defer_init(&dfops, &first_block);
-
 	/*
 	 * Handle initial link state of O_TMPFILE inode
 	 */
@@ -1509,36 +1528,55 @@ xfs_link(
 			goto error_return;
 	}
 
+	xfs_defer_init(&dfops, &first_block);
 	error = xfs_dir_createname(tp, tdp, target_name, sip->i_ino,
-				   &first_block, &dfops, resblks, NULL);
+				   &first_block, &dfops, resblks, &diroffset);
 	if (error)
-		goto error_return;
+		goto out_defer_cancel;
 	xfs_trans_ichgtime(tp, tdp, XFS_ICHGTIME_MOD | XFS_ICHGTIME_CHG);
 	xfs_trans_log_inode(tp, tdp, XFS_ILOG_CORE);
 
 	error = xfs_bumplink(tp, sip);
 	if (error)
-		goto error_return;
+		goto out_defer_cancel;
 
 	/*
-	 * If this is a synchronous mount, make sure that the
-	 * link transaction goes to disk before returning to
-	 * the user.
+	 * If we have parent pointers, we now need to add the parent record to
+	 * the attribute fork of the inode. If this is the initial parent
+	 * atribute, we need to create it correctly, otherwise we can just add
+	 * the parent to the inode.
+	 */
+	if (xfs_sb_version_hasparent(&mp->m_sb)) {
+		if (first_parent)
+			error = xfs_parent_create(tp, tdp, sip, target_name,
+						  diroffset, &dfops,
+						  &first_block);
+		else
+			error = xfs_parent_add(tp, tdp, sip, target_name,
+					       diroffset, &dfops,
+					       &first_block);
+		if (error)
+			goto out_defer_cancel;
+	}
+
+	/*
+	 * If this is a synchronous mount, make sure that the link transaction
+	 * goes to disk before returning to the user.
 	 */
 	if (mp->m_flags & (XFS_MOUNT_WSYNC|XFS_MOUNT_DIRSYNC))
 		xfs_trans_set_sync(tp);
 
 	error = xfs_defer_finish(&tp, &dfops);
-	if (error) {
-		xfs_defer_cancel(&dfops);
-		goto error_return;
-	}
+	if (error)
+		goto out_defer_cancel;
 
 	return xfs_trans_commit(tp);
 
- error_return:
+out_defer_cancel:
+	xfs_defer_cancel(&dfops);
+error_return:
 	xfs_trans_cancel(tp);
- std_return:
+std_return:
 	return error;
 }
 
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 66+ messages in thread

* [PATCH 14/17] xfs: remove parent pointers in unlink
  2017-10-18 22:55 [PATCH 00/17] Parent Pointers V3 Allison Henderson
                   ` (12 preceding siblings ...)
  2017-10-18 22:55 ` [PATCH 13/17] xfs: add parent attributes to link Allison Henderson
@ 2017-10-18 22:55 ` Allison Henderson
  2017-10-19 19:43   ` Darrick J. Wong
  2017-10-18 22:55 ` [PATCH 15/17] xfs_bmap_add_attrfork(): re-add error handling from set_attrforkoff() call Allison Henderson
                   ` (3 subsequent siblings)
  17 siblings, 1 reply; 66+ messages in thread
From: Allison Henderson @ 2017-10-18 22:55 UTC (permalink / raw)
  To: linux-xfs; +Cc: Dave Chinner, Allison Henderson

From: Dave Chinner <dchinner@redhat.com>

[bfoster: rebase, use VFS inode generation]
[achender: rebased, changed __unint32_t to xfs_dir2_dataptr_t
	   implemented xfs_attr_remove_parent]

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
---
 fs/xfs/libxfs/xfs_attr.c   | 15 +++++++++++++++
 fs/xfs/libxfs/xfs_parent.c | 22 ++++++++++++++++++++++
 fs/xfs/xfs_attr.h          |  7 +++++++
 fs/xfs/xfs_inode.c         | 10 +++++++++-
 fs/xfs/xfs_qm.c            |  2 +-
 fs/xfs/xfs_qm.h            |  1 +
 6 files changed, 55 insertions(+), 2 deletions(-)

diff --git a/fs/xfs/libxfs/xfs_attr.c b/fs/xfs/libxfs/xfs_attr.c
index e7692ef..7547eb7 100644
--- a/fs/xfs/libxfs/xfs_attr.c
+++ b/fs/xfs/libxfs/xfs_attr.c
@@ -42,6 +42,7 @@
 #include "xfs_quota.h"
 #include "xfs_trans_space.h"
 #include "xfs_trace.h"
+#include "xfs_qm.h"
 
 /*
  * xfs_attr.c
@@ -571,6 +572,20 @@ xfs_attr_set_deferred(
 	return 0;
 }
 
+int
+xfs_attr_remove_parent(
+	struct xfs_trans		*tp,
+	struct xfs_inode		*dp,
+	struct xfs_parent_name_rec	*rec,
+	int				reclen,
+	struct xfs_defer_ops		*dfops,
+	xfs_fsblock_t			*firstblock)
+{
+	int flags = ATTR_PARENT;
+
+	return xfs_attr_remove_deferred(dp, dfops, (char *) rec, reclen, flags);
+}
+
 /*
  * Generic handler routine to remove a name from an attribute list.
  * Transitions attribute list from Btree to shortform as necessary.
diff --git a/fs/xfs/libxfs/xfs_parent.c b/fs/xfs/libxfs/xfs_parent.c
index 0707336..ca695c4 100644
--- a/fs/xfs/libxfs/xfs_parent.c
+++ b/fs/xfs/libxfs/xfs_parent.c
@@ -139,3 +139,25 @@ xfs_parent_add(
 
 	return xfs_parent_add_nrec(tp, child, &nrec, dfops, firstblock);
 }
+
+/*
+ * Remove a parent record from a child inode.
+ */
+int
+xfs_parent_remove(
+	struct xfs_trans	*tp,
+	struct xfs_inode	*parent,
+	struct xfs_inode	*child,
+	xfs_dir2_dataptr_t	diroffset,
+	struct xfs_defer_ops	*dfops,
+	xfs_fsblock_t		*firstblock)
+{
+	struct xfs_parent_name_rec rec;
+
+	rec.p_ino = cpu_to_be64(parent->i_ino);
+	rec.p_gen = cpu_to_be32(VFS_I(parent)->i_generation);
+	rec.p_diroffset = cpu_to_be32(diroffset);
+
+	return xfs_attr_remove_parent(tp, child, &rec, sizeof(rec),
+				      dfops, firstblock);
+}
diff --git a/fs/xfs/xfs_attr.h b/fs/xfs/xfs_attr.h
index acb6157..7a3bf8b 100644
--- a/fs/xfs/xfs_attr.h
+++ b/fs/xfs/xfs_attr.h
@@ -207,4 +207,11 @@ int xfs_attr_set_parent(struct xfs_trans *tp, struct xfs_inode *ip,
 			const char *value, int valuelen,
 			struct xfs_defer_ops *dfops, xfs_fsblock_t *firstblock);
 
+int xfs_parent_remove(struct xfs_trans *tp, struct xfs_inode *parent,
+		      struct xfs_inode *child, xfs_dir2_dataptr_t diroffset,
+		      struct xfs_defer_ops *dfops, xfs_fsblock_t *firstblock);
+int xfs_attr_remove_parent(struct xfs_trans *tp, struct xfs_inode *ip,
+			struct xfs_parent_name_rec *rec, int reclen,
+			struct xfs_defer_ops *dfops, xfs_fsblock_t *firstblock);
+
 #endif	/* __XFS_ATTR_H__ */
diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c
index 51b623b..a360c3d 100644
--- a/fs/xfs/xfs_inode.c
+++ b/fs/xfs/xfs_inode.c
@@ -2612,6 +2612,7 @@ xfs_remove(
 	struct xfs_defer_ops	dfops;
 	xfs_fsblock_t           first_block;
 	uint			resblks;
+	uint32_t		dir_offset;
 
 	trace_xfs_remove(dp, name);
 
@@ -2692,12 +2693,19 @@ xfs_remove(
 
 	xfs_defer_init(&dfops, &first_block);
 	error = xfs_dir_removename(tp, dp, name, ip->i_ino, &first_block,
-				   &dfops, resblks, NULL);
+				   &dfops, resblks, &dir_offset);
 	if (error) {
 		ASSERT(error != -ENOENT);
 		goto out_bmap_cancel;
 	}
 
+	if (xfs_sb_version_hasparent(&mp->m_sb)) {
+		error = xfs_parent_remove(tp, dp, ip, dir_offset, &dfops,
+					  &first_block);
+		if (error)
+			goto out_bmap_cancel;
+	}
+
 	/*
 	 * If this is a synchronous mount, make sure that the
 	 * remove transaction goes to disk before returning to
diff --git a/fs/xfs/xfs_qm.c b/fs/xfs/xfs_qm.c
index 010a13a..a047f0f 100644
--- a/fs/xfs/xfs_qm.c
+++ b/fs/xfs/xfs_qm.c
@@ -307,7 +307,7 @@ xfs_qm_dqattach_one(
 	return 0;
 }
 
-static bool
+bool
 xfs_qm_need_dqattach(
 	struct xfs_inode	*ip)
 {
diff --git a/fs/xfs/xfs_qm.h b/fs/xfs/xfs_qm.h
index 2975a82..9976369 100644
--- a/fs/xfs/xfs_qm.h
+++ b/fs/xfs/xfs_qm.h
@@ -176,6 +176,7 @@ extern int		xfs_qm_scall_setqlim(struct xfs_mount *, xfs_dqid_t, uint,
 					struct qc_dqblk *);
 extern int		xfs_qm_scall_quotaon(struct xfs_mount *, uint);
 extern int		xfs_qm_scall_quotaoff(struct xfs_mount *, uint);
+extern bool		xfs_qm_need_dqattach(struct xfs_inode *ip);
 
 static inline struct xfs_def_quota *
 xfs_get_defquota(struct xfs_dquot *dqp, struct xfs_quotainfo *qi)
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 66+ messages in thread

* [PATCH 15/17] xfs_bmap_add_attrfork(): re-add error handling from set_attrforkoff() call
  2017-10-18 22:55 [PATCH 00/17] Parent Pointers V3 Allison Henderson
                   ` (13 preceding siblings ...)
  2017-10-18 22:55 ` [PATCH 14/17] xfs: remove parent pointers in unlink Allison Henderson
@ 2017-10-18 22:55 ` Allison Henderson
  2017-10-19 19:43   ` Darrick J. Wong
  2017-10-18 22:55 ` [PATCH 16/17] Add parent pointers to rename Allison Henderson
                   ` (2 subsequent siblings)
  17 siblings, 1 reply; 66+ messages in thread
From: Allison Henderson @ 2017-10-18 22:55 UTC (permalink / raw)
  To: linux-xfs; +Cc: Brian Foster, Allison Henderson

From: Brian Foster <bfoster@redhat.com>

- fix for "xfs: parent pointer attribute creation"

[achender: rebased]

Signed-off-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
---
 fs/xfs/libxfs/xfs_bmap.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/fs/xfs/libxfs/xfs_bmap.c b/fs/xfs/libxfs/xfs_bmap.c
index 7ee98be..a631fe1 100644
--- a/fs/xfs/libxfs/xfs_bmap.c
+++ b/fs/xfs/libxfs/xfs_bmap.c
@@ -1149,7 +1149,9 @@ xfs_bmap_add_attrfork(
 	xfs_trans_ijoin(tp, ip, 0);
 	xfs_trans_log_inode(tp, ip, XFS_ILOG_CORE);
 
-	xfs_bmap_set_attrforkoff(ip, size, &version);
+	error = xfs_bmap_set_attrforkoff(ip, size, &version);
+	if (error)
+		goto trans_cancel;
 
 	ASSERT(ip->i_afp == NULL);
 	ip->i_afp = kmem_zone_zalloc(xfs_ifork_zone, KM_SLEEP);
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 66+ messages in thread

* [PATCH 16/17] Add parent pointers to rename
  2017-10-18 22:55 [PATCH 00/17] Parent Pointers V3 Allison Henderson
                   ` (14 preceding siblings ...)
  2017-10-18 22:55 ` [PATCH 15/17] xfs_bmap_add_attrfork(): re-add error handling from set_attrforkoff() call Allison Henderson
@ 2017-10-18 22:55 ` Allison Henderson
  2017-10-18 22:55 ` [PATCH 17/17] Add the parent pointer support to the superblock version 5 Allison Henderson
  2017-10-19  4:11 ` [PATCH 00/17] Parent Pointers V3 Amir Goldstein
  17 siblings, 0 replies; 66+ messages in thread
From: Allison Henderson @ 2017-10-18 22:55 UTC (permalink / raw)
  To: linux-xfs; +Cc: Allison Henderson

Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
---
 fs/xfs/libxfs/xfs_dir2.c |  6 ++++--
 fs/xfs/xfs_inode.c       | 26 ++++++++++++++++++++------
 2 files changed, 24 insertions(+), 8 deletions(-)

diff --git a/fs/xfs/libxfs/xfs_dir2.c b/fs/xfs/libxfs/xfs_dir2.c
index 486f808..35667fd 100644
--- a/fs/xfs/libxfs/xfs_dir2.c
+++ b/fs/xfs/libxfs/xfs_dir2.c
@@ -324,10 +324,11 @@ xfs_dir_createname(
 	else
 		rval = xfs_dir2_node_addname(args);
 
+out_free:
 	/* return the location that this entry was place in the parent inode */
 	if (offset)
 		*offset = args->offset;
-out_free:
+
 	kmem_free(args);
 	return rval;
 }
@@ -496,9 +497,10 @@ xfs_dir_removename(
 		rval = xfs_dir2_leaf_removename(args);
 	else
 		rval = xfs_dir2_node_removename(args);
+out_free:
 	if (offset)
 		*offset = args->offset;
-out_free:
+
 	kmem_free(args);
 	return rval;
 }
diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c
index a360c3d..98bd7c2 100644
--- a/fs/xfs/xfs_inode.c
+++ b/fs/xfs/xfs_inode.c
@@ -2989,6 +2989,8 @@ xfs_rename(
 	bool			src_is_directory = S_ISDIR(VFS_I(src_ip)->i_mode);
 	int			spaceres;
 	int			error;
+	xfs_dir2_dataptr_t	new_diroffset;
+	xfs_dir2_dataptr_t	old_diroffset;
 
 	trace_xfs_rename(src_dp, target_dp, src_name, target_name);
 
@@ -3091,13 +3093,12 @@ xfs_rename(
 		 */
 		error = xfs_dir_createname(tp, target_dp, target_name,
 					   src_ip->i_ino, &first_block, &dfops,
-					   spaceres, NULL);
+					   spaceres, &new_diroffset);
 		if (error)
 			goto out_bmap_cancel;
 
 		xfs_trans_ichgtime(tp, target_dp,
 					XFS_ICHGTIME_MOD | XFS_ICHGTIME_CHG);
-
 		if (new_parent && src_is_directory) {
 			error = xfs_bumplink(tp, target_dp);
 			if (error)
@@ -3131,7 +3132,7 @@ xfs_rename(
 		 */
 		error = xfs_dir_replace(tp, target_dp, target_name,
 					src_ip->i_ino, &first_block, &dfops,
-					spaceres, NULL);
+					spaceres, &new_diroffset);
 		if (error)
 			goto out_bmap_cancel;
 
@@ -3166,7 +3167,7 @@ xfs_rename(
 		 */
 		error = xfs_dir_replace(tp, src_ip, &xfs_name_dotdot,
 					target_dp->i_ino, &first_block, &dfops,
-					spaceres, NULL);
+					spaceres, &new_diroffset);
 		ASSERT(error != -EEXIST);
 		if (error)
 			goto out_bmap_cancel;
@@ -3205,11 +3206,12 @@ xfs_rename(
 	 */
 	if (wip) {
 		error = xfs_dir_replace(tp, src_dp, src_name, wip->i_ino,
-					&first_block, &dfops, spaceres, NULL);
+					&first_block, &dfops, spaceres,
+					&old_diroffset);
 	} else
 		error = xfs_dir_removename(tp, src_dp, src_name, src_ip->i_ino,
 					   &first_block, &dfops, spaceres,
-					   NULL);
+					   &old_diroffset);
 	if (error)
 		goto out_bmap_cancel;
 
@@ -3239,6 +3241,18 @@ xfs_rename(
 		VFS_I(wip)->i_state &= ~I_LINKABLE;
 	}
 
+	if (new_parent && xfs_sb_version_hasparent(&mp->m_sb)) {
+		error = xfs_parent_add(tp, target_dp, src_ip, target_name,
+				       new_diroffset, &dfops, &first_block);
+		if (error)
+			goto out_bmap_cancel;
+
+		error = xfs_parent_remove(tp, src_dp, src_ip,
+					  old_diroffset, &dfops, &first_block);
+		if (error)
+			goto out_bmap_cancel;
+	}
+
 	xfs_trans_ichgtime(tp, src_dp, XFS_ICHGTIME_MOD | XFS_ICHGTIME_CHG);
 	xfs_trans_log_inode(tp, src_dp, XFS_ILOG_CORE);
 	if (new_parent)
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 66+ messages in thread

* [PATCH 17/17] Add the parent pointer support to the superblock version 5.
  2017-10-18 22:55 [PATCH 00/17] Parent Pointers V3 Allison Henderson
                   ` (15 preceding siblings ...)
  2017-10-18 22:55 ` [PATCH 16/17] Add parent pointers to rename Allison Henderson
@ 2017-10-18 22:55 ` Allison Henderson
  2017-10-19  3:57   ` Amir Goldstein
  2017-10-19 19:45   ` Darrick J. Wong
  2017-10-19  4:11 ` [PATCH 00/17] Parent Pointers V3 Amir Goldstein
  17 siblings, 2 replies; 66+ messages in thread
From: Allison Henderson @ 2017-10-18 22:55 UTC (permalink / raw)
  To: linux-xfs; +Cc: Allison Henderson, Mark Tinguely, Dave Chinner

[dchinner: forward ported and cleaned up]
[achender: rebased and added parent pointer attribute to
           compatible attributes mask]

Signed-off-by: Mark Tinguely <tinguely@sgi.com>
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
---
v2: remove unrelated type clean up in xfs_format.h

Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
---
 fs/xfs/libxfs/xfs_format.h | 7 +++++--
 fs/xfs/libxfs/xfs_fs.h     | 1 +
 fs/xfs/xfs_fsops.c         | 4 +++-
 3 files changed, 9 insertions(+), 3 deletions(-)

diff --git a/fs/xfs/libxfs/xfs_format.h b/fs/xfs/libxfs/xfs_format.h
index 121862a..f3e3132 100644
--- a/fs/xfs/libxfs/xfs_format.h
+++ b/fs/xfs/libxfs/xfs_format.h
@@ -459,10 +459,12 @@ xfs_sb_has_compat_feature(
 #define XFS_SB_FEAT_RO_COMPAT_FINOBT   (1 << 0)		/* free inode btree */
 #define XFS_SB_FEAT_RO_COMPAT_RMAPBT   (1 << 1)		/* reverse map btree */
 #define XFS_SB_FEAT_RO_COMPAT_REFLINK  (1 << 2)		/* reflinked files */
+#define XFS_SB_FEAT_RO_COMPAT_PARENT	(1 << 3)	/* parent inode ptr */
 #define XFS_SB_FEAT_RO_COMPAT_ALL \
 		(XFS_SB_FEAT_RO_COMPAT_FINOBT | \
 		 XFS_SB_FEAT_RO_COMPAT_RMAPBT | \
-		 XFS_SB_FEAT_RO_COMPAT_REFLINK)
+		 XFS_SB_FEAT_RO_COMPAT_REFLINK| \
+		 XFS_SB_FEAT_RO_COMPAT_PARENT)
 #define XFS_SB_FEAT_RO_COMPAT_UNKNOWN	~XFS_SB_FEAT_RO_COMPAT_ALL
 static inline bool
 xfs_sb_has_ro_compat_feature(
@@ -558,7 +560,8 @@ static inline bool xfs_sb_version_hasreflink(struct xfs_sb *sbp)
 
 static inline bool xfs_sb_version_hasparent(struct xfs_sb *sbp)
 {
-	return false; /* We'll enable this at the end of the set */
+	return (XFS_SB_VERSION_NUM(sbp) == XFS_SB_VERSION_5 &&
+		(sbp->sb_features_ro_compat & XFS_SB_FEAT_RO_COMPAT_PARENT));
 }
 
 /*
diff --git a/fs/xfs/libxfs/xfs_fs.h b/fs/xfs/libxfs/xfs_fs.h
index 8c61f21..b8108f8 100644
--- a/fs/xfs/libxfs/xfs_fs.h
+++ b/fs/xfs/libxfs/xfs_fs.h
@@ -222,6 +222,7 @@ typedef struct xfs_fsop_resblks {
 #define XFS_FSOP_GEOM_FLAGS_SPINODES	0x40000	/* sparse inode chunks	*/
 #define XFS_FSOP_GEOM_FLAGS_RMAPBT	0x80000	/* reverse mapping btree */
 #define XFS_FSOP_GEOM_FLAGS_REFLINK	0x100000 /* files can share blocks */
+#define XFS_FSOP_GEOM_FLAGS_PARENT	0x200000 /* parent pointers */
 
 /*
  * Minimum and maximum sizes need for growth checks.
diff --git a/fs/xfs/xfs_fsops.c b/fs/xfs/xfs_fsops.c
index 8f22fc5..9a0ce52 100644
--- a/fs/xfs/xfs_fsops.c
+++ b/fs/xfs/xfs_fsops.c
@@ -111,7 +111,9 @@ xfs_fs_geometry(
 			(xfs_sb_version_hasrmapbt(&mp->m_sb) ?
 				XFS_FSOP_GEOM_FLAGS_RMAPBT : 0) |
 			(xfs_sb_version_hasreflink(&mp->m_sb) ?
-				XFS_FSOP_GEOM_FLAGS_REFLINK : 0);
+				XFS_FSOP_GEOM_FLAGS_REFLINK : 0) |
+			(xfs_sb_version_hasparent(&mp->m_sb) ?
+				XFS_FSOP_GEOM_FLAGS_PARENT : 0);
 		geo->logsectsize = xfs_sb_version_hassector(&mp->m_sb) ?
 				mp->m_sb.sb_logsectsize : BBSIZE;
 		geo->rtsectsize = mp->m_sb.sb_blocksize;
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 66+ messages in thread

* Re: [PATCH 17/17] Add the parent pointer support to the superblock version 5.
  2017-10-18 22:55 ` [PATCH 17/17] Add the parent pointer support to the superblock version 5 Allison Henderson
@ 2017-10-19  3:57   ` Amir Goldstein
  2017-10-19 20:06     ` Darrick J. Wong
  2017-10-19 19:45   ` Darrick J. Wong
  1 sibling, 1 reply; 66+ messages in thread
From: Amir Goldstein @ 2017-10-19  3:57 UTC (permalink / raw)
  To: Allison Henderson; +Cc: linux-xfs, Mark Tinguely, Dave Chinner

On Thu, Oct 19, 2017 at 1:55 AM, Allison Henderson
<allison.henderson@oracle.com> wrote:
> [dchinner: forward ported and cleaned up]
> [achender: rebased and added parent pointer attribute to
>            compatible attributes mask]
>
> Signed-off-by: Mark Tinguely <tinguely@sgi.com>
> Signed-off-by: Dave Chinner <dchinner@redhat.com>
> Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
> ---
> v2: remove unrelated type clean up in xfs_format.h

I'm curious how XFS_SB_VERSION2_PARENTBIT fits into the picture?
old relic?

>
> Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
> ---
>  fs/xfs/libxfs/xfs_format.h | 7 +++++--
>  fs/xfs/libxfs/xfs_fs.h     | 1 +
>  fs/xfs/xfs_fsops.c         | 4 +++-
>  3 files changed, 9 insertions(+), 3 deletions(-)
>
> diff --git a/fs/xfs/libxfs/xfs_format.h b/fs/xfs/libxfs/xfs_format.h
> index 121862a..f3e3132 100644
> --- a/fs/xfs/libxfs/xfs_format.h
> +++ b/fs/xfs/libxfs/xfs_format.h
> @@ -459,10 +459,12 @@ xfs_sb_has_compat_feature(
>  #define XFS_SB_FEAT_RO_COMPAT_FINOBT   (1 << 0)                /* free inode btree */
>  #define XFS_SB_FEAT_RO_COMPAT_RMAPBT   (1 << 1)                /* reverse map btree */
>  #define XFS_SB_FEAT_RO_COMPAT_REFLINK  (1 << 2)                /* reflinked files */
> +#define XFS_SB_FEAT_RO_COMPAT_PARENT   (1 << 3)        /* parent inode ptr */
>  #define XFS_SB_FEAT_RO_COMPAT_ALL \
>                 (XFS_SB_FEAT_RO_COMPAT_FINOBT | \
>                  XFS_SB_FEAT_RO_COMPAT_RMAPBT | \
> -                XFS_SB_FEAT_RO_COMPAT_REFLINK)
> +                XFS_SB_FEAT_RO_COMPAT_REFLINK| \
> +                XFS_SB_FEAT_RO_COMPAT_PARENT)
>  #define XFS_SB_FEAT_RO_COMPAT_UNKNOWN  ~XFS_SB_FEAT_RO_COMPAT_ALL
>  static inline bool
>  xfs_sb_has_ro_compat_feature(
> @@ -558,7 +560,8 @@ static inline bool xfs_sb_version_hasreflink(struct xfs_sb *sbp)
>
>  static inline bool xfs_sb_version_hasparent(struct xfs_sb *sbp)
>  {
> -       return false; /* We'll enable this at the end of the set */
> +       return (XFS_SB_VERSION_NUM(sbp) == XFS_SB_VERSION_5 &&
> +               (sbp->sb_features_ro_compat & XFS_SB_FEAT_RO_COMPAT_PARENT));
>  }
>
>  /*
> diff --git a/fs/xfs/libxfs/xfs_fs.h b/fs/xfs/libxfs/xfs_fs.h
> index 8c61f21..b8108f8 100644
> --- a/fs/xfs/libxfs/xfs_fs.h
> +++ b/fs/xfs/libxfs/xfs_fs.h
> @@ -222,6 +222,7 @@ typedef struct xfs_fsop_resblks {
>  #define XFS_FSOP_GEOM_FLAGS_SPINODES   0x40000 /* sparse inode chunks  */
>  #define XFS_FSOP_GEOM_FLAGS_RMAPBT     0x80000 /* reverse mapping btree */
>  #define XFS_FSOP_GEOM_FLAGS_REFLINK    0x100000 /* files can share blocks */
> +#define XFS_FSOP_GEOM_FLAGS_PARENT     0x200000 /* parent pointers */
>
>  /*
>   * Minimum and maximum sizes need for growth checks.
> diff --git a/fs/xfs/xfs_fsops.c b/fs/xfs/xfs_fsops.c
> index 8f22fc5..9a0ce52 100644
> --- a/fs/xfs/xfs_fsops.c
> +++ b/fs/xfs/xfs_fsops.c
> @@ -111,7 +111,9 @@ xfs_fs_geometry(
>                         (xfs_sb_version_hasrmapbt(&mp->m_sb) ?
>                                 XFS_FSOP_GEOM_FLAGS_RMAPBT : 0) |
>                         (xfs_sb_version_hasreflink(&mp->m_sb) ?
> -                               XFS_FSOP_GEOM_FLAGS_REFLINK : 0);
> +                               XFS_FSOP_GEOM_FLAGS_REFLINK : 0) |
> +                       (xfs_sb_version_hasparent(&mp->m_sb) ?
> +                               XFS_FSOP_GEOM_FLAGS_PARENT : 0);
>                 geo->logsectsize = xfs_sb_version_hassector(&mp->m_sb) ?
>                                 mp->m_sb.sb_logsectsize : BBSIZE;
>                 geo->rtsectsize = mp->m_sb.sb_blocksize;
> --
> 2.7.4
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH 00/17] Parent Pointers V3
  2017-10-18 22:55 [PATCH 00/17] Parent Pointers V3 Allison Henderson
                   ` (16 preceding siblings ...)
  2017-10-18 22:55 ` [PATCH 17/17] Add the parent pointer support to the superblock version 5 Allison Henderson
@ 2017-10-19  4:11 ` Amir Goldstein
  2017-10-20  3:22   ` Amir Goldstein
  2017-10-20 22:41   ` Dave Chinner
  17 siblings, 2 replies; 66+ messages in thread
From: Amir Goldstein @ 2017-10-19  4:11 UTC (permalink / raw)
  To: Allison Henderson; +Cc: linux-xfs

On Thu, Oct 19, 2017 at 1:55 AM, Allison Henderson
<allison.henderson@oracle.com> wrote:
> Hi all,
>
> This is the third version of parent pointer attributes for xfs.
> I've integrated the suggestions made since v2, mostly moving the
> attr buffers in the xfs_attr_log_item to pointers that point to
> xfs_attr_item. I've also implementing the recovery routines for
> the xfs_attr_log_format.  If I missed anything please point it
> out.  As always, comments and feedback are appreciated.  Thank
> you!
>

A minor comment about the cover letter.
All designated reviewers must know exactly what "parent pointers" are for,
but it could be useful to add some context in the cover letter about the purpose
of this work for the sake of other readers on the list. Useful to refer to the
upcoming scrub support patches.

BTW, not sure if this was mentioned in the previous lifetime of those
patches, but parent pointers can be used to implement exportfs operation
xfs_fs_fh_to_parent() for "non-connectable" file handles (FILEID_INO32_GEN)
and to implement xfs_fs_get_name(), which would make reconnect_path()
*much* more efficient.

Also, you may want to use git format-patch -v3 for V3
makes it easier to browse old versions of patches on the list.

Cheers,
Amir.

> Allison Henderson (7):
>   Add helper functions xfs_attr_set_args and xfs_attr_remove_args
>   Set up infastructure for deferred attribute operations
>   Add xfs_attr_set_defered and xfs_attr_remove_defered
>   Remove all strlen calls in all xfs_attr_* functions for attr names.
>   Add the extra space requirements for parent pointer attributes when
>     calculating the minimum log size during mkfs
>   Add parent pointers to rename
>   Add the parent pointer support to the superblock version 5.
>
> Brian Foster (1):
>   xfs_bmap_add_attrfork(): re-add error handling from set_attrforkoff()
>     call
>
> Dave Chinner (5):
>   xfs: define parent pointer xattr format
>   :xfs: extent transaction reservations for parent attributes
>   xfs: parent pointer attribute creation
>   xfs: add parent attributes to link
>   xfs: remove parent pointers in unlink
>
> Mark Tinguely (4):
>   xfs: get directory offset when adding directory name
>   xfs: get directory offset when removing directory name
>   xfs: get directory offset when replacing a directory name
>   xfs: add parent pointer support to attribute code
>
>  fs/xfs/Makefile                |   3 +
>  fs/xfs/libxfs/xfs_attr.c       | 476 +++++++++++++++++++++++++++-----------
>  fs/xfs/libxfs/xfs_bmap.c       |  51 ++--
>  fs/xfs/libxfs/xfs_bmap.h       |   1 +
>  fs/xfs/libxfs/xfs_da_btree.h   |   1 +
>  fs/xfs/libxfs/xfs_da_format.h  |  12 +-
>  fs/xfs/libxfs/xfs_defer.h      |   1 +
>  fs/xfs/libxfs/xfs_dir2.c       |  41 ++--
>  fs/xfs/libxfs/xfs_dir2.h       |  10 +-
>  fs/xfs/libxfs/xfs_dir2_block.c |   9 +-
>  fs/xfs/libxfs/xfs_dir2_leaf.c  |   8 +-
>  fs/xfs/libxfs/xfs_dir2_node.c  |   8 +-
>  fs/xfs/libxfs/xfs_dir2_sf.c    |   6 +
>  fs/xfs/libxfs/xfs_format.h     |  37 ++-
>  fs/xfs/libxfs/xfs_fs.h         |   1 +
>  fs/xfs/libxfs/xfs_log_format.h |  36 ++-
>  fs/xfs/libxfs/xfs_log_rlimit.c |  34 +++
>  fs/xfs/libxfs/xfs_parent.c     | 163 +++++++++++++
>  fs/xfs/libxfs/xfs_trans_resv.c | 103 +++++++--
>  fs/xfs/libxfs/xfs_types.h      |   1 +
>  fs/xfs/xfs_acl.c               |  12 +-
>  fs/xfs/xfs_attr.h              |  68 +++++-
>  fs/xfs/xfs_attr_item.c         | 512 +++++++++++++++++++++++++++++++++++++++++
>  fs/xfs/xfs_attr_item.h         | 111 +++++++++
>  fs/xfs/xfs_fsops.c             |   4 +-
>  fs/xfs/xfs_inode.c             | 146 +++++++++---
>  fs/xfs/xfs_ioctl.c             |  13 +-
>  fs/xfs/xfs_iops.c              |   6 +-
>  fs/xfs/xfs_log_recover.c       | 140 +++++++++++
>  fs/xfs/xfs_qm.c                |   2 +-
>  fs/xfs/xfs_qm.h                |   1 +
>  fs/xfs/xfs_super.c             |   1 +
>  fs/xfs/xfs_symlink.c           |   2 +-
>  fs/xfs/xfs_trans.h             |  13 ++
>  fs/xfs/xfs_trans_attr.c        | 286 +++++++++++++++++++++++
>  fs/xfs/xfs_xattr.c             |  10 +-
>  36 files changed, 2064 insertions(+), 265 deletions(-)
>  create mode 100644 fs/xfs/libxfs/xfs_parent.c
>  create mode 100644 fs/xfs/xfs_attr_item.c
>  create mode 100644 fs/xfs/xfs_attr_item.h
>  create mode 100644 fs/xfs/xfs_trans_attr.c
>
> --
> 2.7.4
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH 11/17] Add the extra space requirements for parent pointer attributes when calculating the minimum log size during mkfs
  2017-10-18 22:55 ` [PATCH 11/17] Add the extra space requirements for parent pointer attributes when calculating the minimum log size during mkfs Allison Henderson
@ 2017-10-19 18:13   ` Darrick J. Wong
  2017-10-21  1:07     ` Allison Henderson
  0 siblings, 1 reply; 66+ messages in thread
From: Darrick J. Wong @ 2017-10-19 18:13 UTC (permalink / raw)
  To: Allison Henderson; +Cc: linux-xfs

On Wed, Oct 18, 2017 at 03:55:27PM -0700, Allison Henderson wrote:
> Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
> ---
>  fs/xfs/libxfs/xfs_log_rlimit.c | 34 ++++++++++++++++++++++++++++++++++
>  1 file changed, 34 insertions(+)
> 
> diff --git a/fs/xfs/libxfs/xfs_log_rlimit.c b/fs/xfs/libxfs/xfs_log_rlimit.c
> index c105979..beec9bf 100644
> --- a/fs/xfs/libxfs/xfs_log_rlimit.c
> +++ b/fs/xfs/libxfs/xfs_log_rlimit.c
> @@ -39,6 +39,40 @@ xfs_log_calc_max_attrsetm_res(
>  {
>  	int			size;
>  	int			nblks;
> +	struct xfs_trans_resv   *resp = M_RES(mp);
> +
> +	/* Calculate extra space needed for parent pointer attributes */
> +	if (!xfs_sb_version_hasparent(&mp->m_sb)) {

Aren't we supposed to be enlarging tr_log{res,count} if hasparent is true?

> +
> +		/* rename can add/remove/modify 2 parent attributes */
> +		resp->tr_rename.tr_logres +=
> +			2 * max(resp->tr_attrsetm.tr_logres,
> +				resp->tr_attrrm.tr_logres);
> +		resp->tr_rename.tr_logcount +=
> +			2 * max(resp->tr_attrsetm.tr_logcount,
> +				resp->tr_attrrm.tr_logcount);
> +
> +		/* create will add 1 parent attribute */
> +		resp->tr_create.tr_logres += resp->tr_attrsetm.tr_logres;
> +		resp->tr_create.tr_logcount += resp->tr_attrsetm.tr_logcount;
> +
> +		/* mkdir will add 1 parent attribute */
> +		resp->tr_mkdir.tr_logres += resp->tr_attrsetm.tr_logres;
> +		resp->tr_mkdir.tr_logcount += resp->tr_attrsetm.tr_logcount;
> +
> +		/* link will add 1 parent attribute */
> +		resp->tr_link.tr_logres += resp->tr_attrsetm.tr_logres;
> +		resp->tr_link.tr_logcount += resp->tr_attrsetm.tr_logcount;
> +
> +		/* symlink will add 1 parent attribute */
> +		resp->tr_symlink.tr_logres += resp->tr_attrsetm.tr_logres;
> +		resp->tr_symlink.tr_logcount += resp->tr_attrsetm.tr_logcount;
> +
> +		/* remove will remove 1 parent attribute */
> +		resp->tr_remove.tr_logres += resp->tr_attrrm.tr_logres;
> +		resp->tr_remove.tr_logcount = resp->tr_attrrm.tr_logcount;

+= ?

--D

> +	}
> +
>  
>  	size = xfs_attr_leaf_entsize_local_max(mp->m_attr_geo->blksize) -
>  	       MAXNAMELEN - 1;
> -- 
> 2.7.4
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH 10/17] :xfs: extent transaction reservations for parent attributes
  2017-10-18 22:55 ` [PATCH 10/17] :xfs: extent transaction reservations for parent attributes Allison Henderson
@ 2017-10-19 18:24   ` Darrick J. Wong
       [not found]     ` <8680e0c1-ada8-06e3-e397-61a5076030be@oracle.com>
  2017-10-21  1:07     ` Allison Henderson
  0 siblings, 2 replies; 66+ messages in thread
From: Darrick J. Wong @ 2017-10-19 18:24 UTC (permalink / raw)
  To: Allison Henderson; +Cc: linux-xfs, Dave Chinner

On Wed, Oct 18, 2017 at 03:55:26PM -0700, Allison Henderson wrote:
> From: Dave Chinner <dchinner@redhat.com>
> 
> We need to add, remove or modify parent pointer attributes during
> create/link/unlink/rename operations atomically with the dirents in the parent
> directories being modified. This means they need to be modified in the same
> transaction as the parent directories, and so we need to add the required
> space for the attribute modifications to the transaction reservations.
> 
> [achender: rebased, added xfs_sb_version_hasparent stub]
> 
> Signed-off-by: Dave Chinner <dchinner@redhat.com>
> Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
> ---
>  fs/xfs/libxfs/xfs_format.h     |   5 ++
>  fs/xfs/libxfs/xfs_trans_resv.c | 103 ++++++++++++++++++++++++++++++++---------
>  2 files changed, 85 insertions(+), 23 deletions(-)
> 
> diff --git a/fs/xfs/libxfs/xfs_format.h b/fs/xfs/libxfs/xfs_format.h
> index b9ea5bf..121862a 100644
> --- a/fs/xfs/libxfs/xfs_format.h
> +++ b/fs/xfs/libxfs/xfs_format.h
> @@ -556,6 +556,11 @@ static inline bool xfs_sb_version_hasreflink(struct xfs_sb *sbp)
>  		(sbp->sb_features_ro_compat & XFS_SB_FEAT_RO_COMPAT_REFLINK);
>  }
>  
> +static inline bool xfs_sb_version_hasparent(struct xfs_sb *sbp)
> +{
> +	return false; /* We'll enable this at the end of the set */

I think this chunk should just add the proper testing code here.

You only add RO_COMPAT_PARENT to XFS_SB_FEAT_RO_COMPAT_ALL at the end of
the patch series, so anyone bisecting their way through the series won't
be able to mount such an fs.

> +}
> +
>  /*
>   * end of superblock version macros
>   */
> diff --git a/fs/xfs/libxfs/xfs_trans_resv.c b/fs/xfs/libxfs/xfs_trans_resv.c
> index 6bd916b..54399e2 100644
> --- a/fs/xfs/libxfs/xfs_trans_resv.c
> +++ b/fs/xfs/libxfs/xfs_trans_resv.c
> @@ -802,29 +802,30 @@ xfs_calc_sb_reservation(
>  	return xfs_calc_buf_res(1, mp->m_sb.sb_sectsize);
>  }
>  
> +/*
> + * Namespace reservations.
> + *
> + * These get tricky when parent pointers are enabled as we have attribute
> + * modifications occurring from within these transactions. Rather than confuse
> + * each of these reservation calculations with the conditional attribute
> + * reservations, add them here in a clear and concise manner. This assumes that
> + * the attribute reservations have already been calculated.
> + *
> + * Note that we only include the static attribute reservation here; the runtime
> + * reservation will have to be modified by the size of the attributes being
> + * added/removed/modified. See the comments on the attribute reservation
> + * calculations for more details.

I don't know that we can properly use a different runtime reservations
than what we statically reserve here, since the static reservations are
used to ensure that the log is of sufficient size given the fs geometry.

<shrug> Maybe we can figure out how much extra space is allowable given
the actual size of the log?  Or perhaps in the end we'll just end up
restricting the maximum size of what we can log through intents?  Or
just set the reservation to 64k I guess.... :)

--D

> + * Note for rename: rename will vastly overestimate requirements. This will be
> + * addressed later when modifications are made to ensure parent attribute
> + * modifications can be done atomically with the rename operation.
> + */
>  void
> -xfs_trans_resv_calc(
> +xfs_calc_namespace_reservations(
>  	struct xfs_mount	*mp,
>  	struct xfs_trans_resv	*resp)
>  {
> -	/*
> -	 * The following transactions are logged in physical format and
> -	 * require a permanent reservation on space.
> -	 */
> -	resp->tr_write.tr_logres = xfs_calc_write_reservation(mp);
> -	if (xfs_sb_version_hasreflink(&mp->m_sb))
> -		resp->tr_write.tr_logcount = XFS_WRITE_LOG_COUNT_REFLINK;
> -	else
> -		resp->tr_write.tr_logcount = XFS_WRITE_LOG_COUNT;
> -	resp->tr_write.tr_logflags |= XFS_TRANS_PERM_LOG_RES;
> -
> -	resp->tr_itruncate.tr_logres = xfs_calc_itruncate_reservation(mp);
> -	if (xfs_sb_version_hasreflink(&mp->m_sb))
> -		resp->tr_itruncate.tr_logcount =
> -				XFS_ITRUNCATE_LOG_COUNT_REFLINK;
> -	else
> -		resp->tr_itruncate.tr_logcount = XFS_ITRUNCATE_LOG_COUNT;
> -	resp->tr_itruncate.tr_logflags |= XFS_TRANS_PERM_LOG_RES;
> +	ASSERT(resp->tr_attrsetm.tr_logres > 0);
>  
>  	resp->tr_rename.tr_logres = xfs_calc_rename_reservation(mp);
>  	resp->tr_rename.tr_logcount = XFS_RENAME_LOG_COUNT;
> @@ -846,15 +847,69 @@ xfs_trans_resv_calc(
>  	resp->tr_create.tr_logcount = XFS_CREATE_LOG_COUNT;
>  	resp->tr_create.tr_logflags |= XFS_TRANS_PERM_LOG_RES;
>  
> +	resp->tr_mkdir.tr_logres = xfs_calc_mkdir_reservation(mp);
> +	resp->tr_mkdir.tr_logcount = XFS_MKDIR_LOG_COUNT;
> +	resp->tr_mkdir.tr_logflags |= XFS_TRANS_PERM_LOG_RES;
> +
> +	if (!xfs_sb_version_hasparent(&mp->m_sb))
> +		return;
> +
> +	/* rename can add/remove/modify 2 parent attributes */
> +	resp->tr_rename.tr_logres += 2 * max(resp->tr_attrsetm.tr_logres,
> +					     resp->tr_attrrm.tr_logres);
> +	resp->tr_rename.tr_logcount += 2 * max(resp->tr_attrsetm.tr_logcount,
> +					       resp->tr_attrrm.tr_logcount);
> +
> +	/* create will add 1 parent attribute */
> +	resp->tr_create.tr_logres += resp->tr_attrsetm.tr_logres;
> +	resp->tr_create.tr_logcount += resp->tr_attrsetm.tr_logcount;
> +
> +	/* mkdir will add 1 parent attribute */
> +	resp->tr_mkdir.tr_logres += resp->tr_attrsetm.tr_logres;
> +	resp->tr_mkdir.tr_logcount += resp->tr_attrsetm.tr_logcount;
> +
> +	/* link will add 1 parent attribute */
> +	resp->tr_link.tr_logres += resp->tr_attrsetm.tr_logres;
> +	resp->tr_link.tr_logcount += resp->tr_attrsetm.tr_logcount;
> +
> +	/* symlink will add 1 parent attribute */
> +	resp->tr_symlink.tr_logres += resp->tr_attrsetm.tr_logres;
> +	resp->tr_symlink.tr_logcount += resp->tr_attrsetm.tr_logcount;
> +
> +	/* remove will remove 1 parent attribute */
> +	resp->tr_remove.tr_logres += resp->tr_attrrm.tr_logres;
> +	resp->tr_remove.tr_logcount = resp->tr_attrrm.tr_logcount;
> +}
> +
> +void
> +xfs_trans_resv_calc(
> +	struct xfs_mount	*mp,
> +	struct xfs_trans_resv	*resp)
> +{
> +	/*
> +	 * The following transactions are logged in physical format and
> +	 * require a permanent reservation on space.
> +	 */
> +	resp->tr_write.tr_logres = xfs_calc_write_reservation(mp);
> +	if (xfs_sb_version_hasreflink(&mp->m_sb))
> +		resp->tr_write.tr_logcount = XFS_WRITE_LOG_COUNT_REFLINK;
> +	else
> +		resp->tr_write.tr_logcount = XFS_WRITE_LOG_COUNT;
> +	resp->tr_write.tr_logflags |= XFS_TRANS_PERM_LOG_RES;
> +
> +	resp->tr_itruncate.tr_logres = xfs_calc_itruncate_reservation(mp);
> +	if (xfs_sb_version_hasreflink(&mp->m_sb))
> +		resp->tr_itruncate.tr_logcount =
> +				XFS_ITRUNCATE_LOG_COUNT_REFLINK;
> +	else
> +		resp->tr_itruncate.tr_logcount = XFS_ITRUNCATE_LOG_COUNT;
> +	resp->tr_itruncate.tr_logflags |= XFS_TRANS_PERM_LOG_RES;
> +
>  	resp->tr_create_tmpfile.tr_logres =
>  			xfs_calc_create_tmpfile_reservation(mp);
>  	resp->tr_create_tmpfile.tr_logcount = XFS_CREATE_TMPFILE_LOG_COUNT;
>  	resp->tr_create_tmpfile.tr_logflags |= XFS_TRANS_PERM_LOG_RES;
>  
> -	resp->tr_mkdir.tr_logres = xfs_calc_mkdir_reservation(mp);
> -	resp->tr_mkdir.tr_logcount = XFS_MKDIR_LOG_COUNT;
> -	resp->tr_mkdir.tr_logflags |= XFS_TRANS_PERM_LOG_RES;
> -
>  	resp->tr_ifree.tr_logres = xfs_calc_ifree_reservation(mp);
>  	resp->tr_ifree.tr_logcount = XFS_INACTIVE_LOG_COUNT;
>  	resp->tr_ifree.tr_logflags |= XFS_TRANS_PERM_LOG_RES;
> @@ -886,6 +941,8 @@ xfs_trans_resv_calc(
>  		resp->tr_qm_dqalloc.tr_logcount = XFS_WRITE_LOG_COUNT;
>  	resp->tr_qm_dqalloc.tr_logflags |= XFS_TRANS_PERM_LOG_RES;
>  
> +	xfs_calc_namespace_reservations(mp, resp);
> +
>  	/*
>  	 * The following transactions are logged in logical format with
>  	 * a default log count.
> -- 
> 2.7.4
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH 02/17] Set up infastructure for deferred attribute operations
  2017-10-18 22:55 ` [PATCH 02/17] Set up infastructure for deferred attribute operations Allison Henderson
@ 2017-10-19 19:02   ` Darrick J. Wong
  2017-10-21  1:08     ` Allison Henderson
  0 siblings, 1 reply; 66+ messages in thread
From: Darrick J. Wong @ 2017-10-19 19:02 UTC (permalink / raw)
  To: Allison Henderson; +Cc: linux-xfs

On Wed, Oct 18, 2017 at 03:55:18PM -0700, Allison Henderson wrote:
> This patch adds two new log item types for setting or
> removing attributes as deferred operations.  The
> xfs_attri_log_item logs an intent to set or remove an
> attribute.  The corresponding xfs_attrd_log_item holds
> a reference to the xfs_attri_log_item and is freed once
> the transaction is done.  Both log items use a generic
> xfs_attr_log_format structure that contains the attribute
> name, value, flags, inode, and an op_flag that indicates
> if the operations is a set or remove.
> 
> At the moment, this feature will only be used by the parent
> pointer patch set which uses attributes to store information
> about an inodes parent.
> 
> Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
> ---
>  fs/xfs/Makefile                |   2 +
>  fs/xfs/libxfs/xfs_attr.c       |   2 +-
>  fs/xfs/libxfs/xfs_defer.h      |   1 +
>  fs/xfs/libxfs/xfs_log_format.h |  36 ++-
>  fs/xfs/libxfs/xfs_types.h      |   1 +
>  fs/xfs/xfs_attr.h              |  20 +-
>  fs/xfs/xfs_attr_item.c         | 512 +++++++++++++++++++++++++++++++++++++++++
>  fs/xfs/xfs_attr_item.h         | 111 +++++++++
>  fs/xfs/xfs_log_recover.c       | 140 +++++++++++
>  fs/xfs/xfs_super.c             |   1 +
>  fs/xfs/xfs_trans.h             |  13 ++
>  fs/xfs/xfs_trans_attr.c        | 286 +++++++++++++++++++++++
>  12 files changed, 1121 insertions(+), 4 deletions(-)
> 
> diff --git a/fs/xfs/Makefile b/fs/xfs/Makefile
> index a6e955b..ec6486b 100644
> --- a/fs/xfs/Makefile
> +++ b/fs/xfs/Makefile
> @@ -106,6 +106,7 @@ xfs-y				+= xfs_log.o \
>  				   xfs_bmap_item.o \
>  				   xfs_buf_item.o \
>  				   xfs_extfree_item.o \
> +				   xfs_attr_item.o \
>  				   xfs_icreate_item.o \
>  				   xfs_inode_item.o \
>  				   xfs_refcount_item.o \
> @@ -115,6 +116,7 @@ xfs-y				+= xfs_log.o \
>  				   xfs_trans_bmap.o \
>  				   xfs_trans_buf.o \
>  				   xfs_trans_extfree.o \
> +				   xfs_trans_attr.o \
>  				   xfs_trans_inode.o \
>  				   xfs_trans_refcount.o \
>  				   xfs_trans_rmap.o \
> diff --git a/fs/xfs/libxfs/xfs_attr.c b/fs/xfs/libxfs/xfs_attr.c
> index b00ec1f..5325ec2 100644
> --- a/fs/xfs/libxfs/xfs_attr.c
> +++ b/fs/xfs/libxfs/xfs_attr.c
> @@ -74,7 +74,7 @@ STATIC int xfs_attr_fillstate(xfs_da_state_t *state);
>  STATIC int xfs_attr_refillstate(xfs_da_state_t *state);
>  
>  
> -STATIC int
> +int
>  xfs_attr_args_init(
>  	struct xfs_da_args	*args,
>  	struct xfs_inode	*dp,
> diff --git a/fs/xfs/libxfs/xfs_defer.h b/fs/xfs/libxfs/xfs_defer.h
> index d4f046d..ef0f8bf 100644
> --- a/fs/xfs/libxfs/xfs_defer.h
> +++ b/fs/xfs/libxfs/xfs_defer.h
> @@ -55,6 +55,7 @@ enum xfs_defer_ops_type {
>  	XFS_DEFER_OPS_TYPE_REFCOUNT,
>  	XFS_DEFER_OPS_TYPE_RMAP,
>  	XFS_DEFER_OPS_TYPE_FREE,
> +	XFS_DEFER_OPS_TYPE_ATTR,
>  	XFS_DEFER_OPS_TYPE_MAX,
>  };
>  
> diff --git a/fs/xfs/libxfs/xfs_log_format.h b/fs/xfs/libxfs/xfs_log_format.h
> index 8372e9b..b0ce87e 100644
> --- a/fs/xfs/libxfs/xfs_log_format.h
> +++ b/fs/xfs/libxfs/xfs_log_format.h
> @@ -18,6 +18,8 @@
>  #ifndef	__XFS_LOG_FORMAT_H__
>  #define __XFS_LOG_FORMAT_H__
>  
> +#include "xfs_attr.h"
> +
>  struct xfs_mount;
>  struct xfs_trans_res;
>  
> @@ -116,7 +118,12 @@ static inline uint xlog_get_cycle(char *ptr)
>  #define XLOG_REG_TYPE_CUD_FORMAT	24
>  #define XLOG_REG_TYPE_BUI_FORMAT	25
>  #define XLOG_REG_TYPE_BUD_FORMAT	26
> -#define XLOG_REG_TYPE_MAX		26
> +#define XLOG_REG_TYPE_ATTRI_FORMAT	27
> +#define XLOG_REG_TYPE_ATTRD_FORMAT	28
> +#define XLOG_REG_TYPE_ATTR_NAME		29
> +#define XLOG_REG_TYPE_ATTR_VALUE	30
> +#define XLOG_REG_TYPE_MAX		31
> +
>  
>  /*
>   * Flags to log operation header
> @@ -239,6 +246,8 @@ typedef struct xfs_trans_header {
>  #define	XFS_LI_CUD		0x1243
>  #define	XFS_LI_BUI		0x1244	/* bmbt update intent */
>  #define	XFS_LI_BUD		0x1245
> +#define	XFS_LI_ATTRI		0x1246  /* attr set/remove intent*/
> +#define	XFS_LI_ATTRD		0x1247  /* attr set/remove done */
>  
>  #define XFS_LI_TYPE_DESC \
>  	{ XFS_LI_EFI,		"XFS_LI_EFI" }, \
> @@ -254,7 +263,9 @@ typedef struct xfs_trans_header {
>  	{ XFS_LI_CUI,		"XFS_LI_CUI" }, \
>  	{ XFS_LI_CUD,		"XFS_LI_CUD" }, \
>  	{ XFS_LI_BUI,		"XFS_LI_BUI" }, \
> -	{ XFS_LI_BUD,		"XFS_LI_BUD" }
> +	{ XFS_LI_BUD,		"XFS_LI_BUD" }, \
> +	{ XFS_LI_ATTRI,		"XFS_LI_ATTRI" }, \
> +	{ XFS_LI_ATTRD,		"XFS_LI_ATTRD" }
>  
>  /*
>   * Inode Log Item Format definitions.
> @@ -863,4 +874,25 @@ struct xfs_icreate_log {
>  	__be32		icl_gen;	/* inode generation number to use */
>  };
>  
> +/* Flags for deferred attribute operations */
> +#define ATTR_OP_FLAGS_SET	0x01	/* Set the attribute */

The names need to have "XFS_" prefixed here because this is a public
header file, e.g. XFS_ATTR_OP_FLAGS_SET.

> +#define ATTR_OP_FLAGS_REMOVE	0x02	/* Remove the attribute */
> +#define ATTR_OP_FLAGS_MAX	0x02	/* Max flags */

I would be a little more explicit about which bits of op_flags are
actual bit flags and which parts are mutually exclusive type codes.
IOWs, from just these definitions here it's not clear that _SET and
_REMOVE can't both be set at the same time.

/* upper bits are flags, lower byte is type code */
#define XFS_ATTR_OP_FLAGS_SET		1
#define XFS_ATTR_OP_FLAGS_REMOVE	2
#define XFS_ATTR_OP_FLAGS_TYPE_MASK	0xFF

(TBH this opcode part could be an enum defined elsewhere like what RUI
pe_flags does...)

#define XFS_ATTR_OP_FLAGS_AFLAG		(1U << 31)
#define XFS_ATTR_OP_FLAGS_SOMEOTHERFLAG	(1U << 30)

> +
> +/*
> + * This is the structure used to lay out an attr log item in the
> + * log.
> + */
> +struct xfs_attr_log_format {
> +	uint64_t	id;		/* attri identifier */
> +	xfs_ino_t       ino;		/* the inode for this attr operation */
> +	uint32_t        op_flags;	/* marks the op as a set or remove */
> +	uint32_t        name_len;	/* attr name length */
> +	uint32_t        value_len;	/* attr value length */
> +	uint32_t        attr_flags;	/* attr flags */
> +	uint16_t	type;		/* attri log item type */
> +	uint16_t	size;		/* size of this item */
> +	uint32_t	pad;		/* pad to 64 bit aligned */

The names ought to have structure prefixes because this is a public header.

uint64_t	alf_id;
uint64_t	alf_ino;
...
uint32_t	alf_pad;

> +};
> +
>  #endif /* __XFS_LOG_FORMAT_H__ */
> diff --git a/fs/xfs/libxfs/xfs_types.h b/fs/xfs/libxfs/xfs_types.h
> index 0220159..5372063 100644
> --- a/fs/xfs/libxfs/xfs_types.h
> +++ b/fs/xfs/libxfs/xfs_types.h
> @@ -23,6 +23,7 @@ typedef uint32_t	prid_t;		/* project ID */
>  typedef uint32_t	xfs_agblock_t;	/* blockno in alloc. group */
>  typedef uint32_t	xfs_agino_t;	/* inode # within allocation grp */
>  typedef uint32_t	xfs_extlen_t;	/* extent length in blocks */
> +typedef uint32_t	xfs_attrlen_t;	/* attr length */
>  typedef uint32_t	xfs_agnumber_t;	/* allocation group number */
>  typedef int32_t		xfs_extnum_t;	/* # of extents in a file */
>  typedef int16_t		xfs_aextnum_t;	/* # extents in an attribute fork */
> diff --git a/fs/xfs/xfs_attr.h b/fs/xfs/xfs_attr.h
> index 8542606..34bb4cb 100644
> --- a/fs/xfs/xfs_attr.h
> +++ b/fs/xfs/xfs_attr.h
> @@ -18,6 +18,8 @@
>  #ifndef __XFS_ATTR_H__
>  #define	__XFS_ATTR_H__
>  
> +#include "libxfs/xfs_defer.h"

What does this header file need from xfs_defer.h?

>  struct xfs_inode;
>  struct xfs_da_args;
>  struct xfs_attr_list_context;
> @@ -87,6 +89,20 @@ typedef struct attrlist_ent {	/* data from attr_list() */
>  } attrlist_ent_t;
>  
>  /*
> + * List of attrs to commit later.
> + */
> +struct xfs_attr_item {
> +	xfs_ino_t	  xattri_ino;
> +	uint32_t	  xattri_op_flags;
> +	uint32_t	  xattri_value_len;   /* length of name and val */
> +	uint32_t	  xattri_name_len;    /* length of name */
> +	uint32_t	  xattri_flags;       /* attr flags */
> +	char		  xattri_name[XATTR_NAME_MAX];
> +	char              xattri_value[XATTR_SIZE_MAX];

MAXNAMELEN and XFS_XATTR_SIZE_MAX ?

(Ugh, really, we should just clean that up to XFS_MAXNAMELEN...)

> +	struct list_head  xattri_list;
> +};
> +
> +/*
>   * Given a pointer to the (char*) buffer containing the attr_list() result,
>   * and an index, return a pointer to the indicated attribute in the buffer.
>   */
> @@ -154,6 +170,8 @@ int xfs_attr_remove(struct xfs_inode *dp, const unsigned char *name, int flags);
>  int xfs_attr_remove_args(struct xfs_da_args *args, int flags);
>  int xfs_attr_list(struct xfs_inode *dp, char *buffer, int bufsize,
>  		  int flags, struct attrlist_cursor_kern *cursor);
> -
> +int xfs_attr_args_init(struct xfs_da_args *args, struct xfs_inode *dp,
> +		       const unsigned char *name, int flags);
> +int xfs_attr_calc_size(struct xfs_da_args *args, int *local);
>  
>  #endif	/* __XFS_ATTR_H__ */
> diff --git a/fs/xfs/xfs_attr_item.c b/fs/xfs/xfs_attr_item.c
> new file mode 100644
> index 0000000..8cbe9b0
> --- /dev/null
> +++ b/fs/xfs/xfs_attr_item.c
> @@ -0,0 +1,512 @@
> +/*
> + * Copyright (c) 2017 Oracle, Inc.
> + * All Rights Reserved.
> + *
> + * This program is free software; you can redistribute it and/or
> + * modify it under the terms of the GNU General Public License as
> + * published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope that it would be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + *
> + * You should have received a copy of the GNU General Public License
> + * along with this program; if not, write the Free Software Foundation Inc.
> + */
> +#include "xfs.h"
> +#include "xfs_fs.h"
> +#include "xfs_format.h"
> +#include "xfs_log_format.h"
> +#include "xfs_trans_resv.h"
> +#include "xfs_bit.h"
> +#include "xfs_mount.h"
> +#include "xfs_trans.h"
> +#include "xfs_trans_priv.h"
> +#include "xfs_buf_item.h"
> +#include "xfs_attr_item.h"
> +#include "xfs_log.h"
> +#include "xfs_btree.h"
> +#include "xfs_rmap.h"
> +
> +
> +static inline struct xfs_attri_log_item *ATTRI_ITEM(struct xfs_log_item *lip)
> +{
> +	return container_of(lip, struct xfs_attri_log_item, item);
> +}
> +
> +void
> +xfs_attri_item_free(
> +	struct xfs_attri_log_item	*attrip)
> +{
> +	kmem_free(attrip->item.li_lv_shadow);
> +	kmem_free(attrip);
> +}
> +
> +/*
> + * This returns the number of iovecs needed to log the given attri item.
> + * We only need 1 iovec for an attri item.  It just logs the attr_log_format
> + * structure.
> + */
> +static inline int
> +xfs_attri_item_sizeof(
> +	struct xfs_attri_log_item *attrip)
> +{
> +	return sizeof(struct xfs_attr_log_format);
> +}
> +
> +STATIC void
> +xfs_attri_item_size(
> +	struct xfs_log_item	*lip,
> +	int			*nvecs,
> +	int			*nbytes)
> +{
> +	struct xfs_attri_log_item       *attrip = ATTRI_ITEM(lip);
> +
> +	*nvecs += 1;
> +	*nbytes += xfs_attri_item_sizeof(attrip);
> +
> +	if (attrip->name_len > 0) {
> +		*nvecs += 1;
> +		nbytes += attrip->name_len;
> +	}
> +
> +	if (attrip->value_len > 0) {
> +		*nvecs += 1;
> +		nbytes += attrip->value_len;
> +	}
> +}
> +
> +/*
> + * This is called to fill in the vector of log iovecs for the
> + * given attri log item. We use only 1 iovec, and we point that
> + * at the attri_log_format structure embedded in the attri item.
> + * It is at this point that we assert that all of the attr
> + * slots in the attri item have been filled.
> + */
> +STATIC void
> +xfs_attri_item_format(
> +	struct xfs_log_item	*lip,
> +	struct xfs_log_vec	*lv)
> +{
> +	struct xfs_attri_log_item	*attrip = ATTRI_ITEM(lip);
> +	struct xfs_log_iovec	*vecp = NULL;
> +
> +	attrip->format.type = XFS_LI_ATTRI;
> +	attrip->format.size = 1;
> +
> +	xlog_copy_iovec(lv, &vecp, XLOG_REG_TYPE_ATTRI_FORMAT,
> +			&attrip->format,
> +			xfs_attri_item_sizeof(attrip));
> +	if (attrip->name_len > 0)
> +		xlog_copy_iovec(lv, &vecp, XLOG_REG_TYPE_ATTR_NAME,
> +				attrip->name, attrip->name_len);
> +
> +	if (attrip->value_len > 0)
> +		xlog_copy_iovec(lv, &vecp, XLOG_REG_TYPE_ATTR_VALUE,
> +				attrip->value, attrip->value_len);
> +}
> +
> +
> +/*
> + * Pinning has no meaning for an attri item, so just return.
> + */
> +STATIC void
> +xfs_attri_item_pin(
> +	struct xfs_log_item	*lip)
> +{
> +}
> +
> +/*
> + * The unpin operation is the last place an ATTRI is manipulated in the log. It
> + * is either inserted in the AIL or aborted in the event of a log I/O error. In
> + * either case, the EFI transaction has been successfully committed to make it

EFI?

> + * this far. Therefore, we expect whoever committed the ATTRI to either
> + * construct and commit the ATTRD or drop the ATTRD's reference in the event of
> + * error. Simply drop the log's ATTRI reference now that the log is done with
> + * it.
> + */
> +STATIC void
> +xfs_attri_item_unpin(
> +	struct xfs_log_item	*lip,
> +	int			remove)
> +{
> +	struct xfs_attri_log_item	*attrip = ATTRI_ITEM(lip);
> +
> +	xfs_attri_release(attrip);
> +}
> +
> +/*
> + * attri items have no locking or pushing.  However, since ATTRIs are pulled
> + * from the AIL when their corresponding ATTRDs are committed to disk, their
> + * situation is very similar to being pinned.  Return XFS_ITEM_PINNED so that
> + * the caller will eventually flush the log.  This should help in getting the
> + * ATTRI out of the AIL.
> + */
> +STATIC uint
> +xfs_attri_item_push(
> +	struct xfs_log_item	*lip,
> +	struct list_head	*buffer_list)
> +{
> +	return XFS_ITEM_PINNED;
> +}
> +
> +/*
> + * The ATTRI has been either committed or aborted if the transaction has been
> + * cancelled. If the transaction was cancelled, an ATTRD isn't going to be
> + * constructed and thus we free the ATTRI here directly.
> + */
> +STATIC void
> +xfs_attri_item_unlock(
> +	struct xfs_log_item	*lip)
> +{
> +	if (lip->li_flags & XFS_LI_ABORTED)
> +		xfs_attri_item_free(ATTRI_ITEM(lip));
> +}
> +
> +/*
> + * The ATTRI is logged only once and cannot be moved in the log, so simply
> + * return the lsn at which it's been logged.
> + */
> +STATIC xfs_lsn_t
> +xfs_attri_item_committed(
> +	struct xfs_log_item	*lip,
> +	xfs_lsn_t		lsn)
> +{
> +	return lsn;
> +}
> +
> +STATIC void
> +xfs_attri_item_committing(
> +	struct xfs_log_item	*lip,
> +	xfs_lsn_t		lsn)
> +{
> +}
> +
> +/*
> + * This is the ops vector shared by all attri log items.
> + */
> +static const struct xfs_item_ops xfs_attri_item_ops = {
> +	.iop_size	= xfs_attri_item_size,
> +	.iop_format	= xfs_attri_item_format,
> +	.iop_pin	= xfs_attri_item_pin,
> +	.iop_unpin	= xfs_attri_item_unpin,
> +	.iop_unlock	= xfs_attri_item_unlock,
> +	.iop_committed	= xfs_attri_item_committed,
> +	.iop_push	= xfs_attri_item_push,
> +	.iop_committing = xfs_attri_item_committing
> +};
> +
> +
> +/*
> + * Allocate and initialize an attri item
> + */
> +struct xfs_attri_log_item *
> +xfs_attri_init(
> +	struct xfs_mount	*mp)
> +
> +{
> +	struct xfs_attri_log_item	*attrip;
> +	uint			size;
> +
> +	size = (uint)(sizeof(struct xfs_attri_log_item));
> +	attrip = kmem_zalloc(size, KM_SLEEP);
> +
> +	xfs_log_item_init(mp, &(attrip->item), XFS_LI_ATTRI,
> +			  &xfs_attri_item_ops);
> +	attrip->format.id = (uintptr_t)(void *)attrip;
> +	atomic_set(&attrip->refcount, 2);
> +
> +	return attrip;
> +}
> +
> +/*
> + * Copy an attr format buffer from the given buf, and into the destination
> + * attr format structure.
> + */
> +int
> +xfs_attr_copy_format(struct xfs_log_iovec *buf,
> +		      struct xfs_attr_log_format *dst_attr_fmt)
> +{
> +	struct xfs_attr_log_format *src_attr_fmt = buf->i_addr;
> +	uint len = sizeof(struct xfs_attr_log_format);
> +
> +	if (buf->i_len == len) {
> +		memcpy((char *)dst_attr_fmt, (char *)src_attr_fmt, len);
> +		return 0;
> +	}
> +	return -EFSCORRUPTED;
> +}
> +
> +/*
> + * Freeing the attri requires that we remove it from the AIL if it has already
> + * been placed there. However, the ATTRI may not yet have been placed in the
> + * AIL when called by xfs_attri_release() from ATTRD processing due to the
> + * ordering of committed vs unpin operations in bulk insert operations. Hence
> + * the reference count to ensure only the last caller frees the ATTRI.
> + */
> +void
> +xfs_attri_release(
> +	struct xfs_attri_log_item	*attrip)
> +{
> +	ASSERT(atomic_read(&attrip->refcount) > 0);
> +	if (atomic_dec_and_test(&attrip->refcount)) {
> +		xfs_trans_ail_remove(&attrip->item,
> +				     SHUTDOWN_LOG_IO_ERROR);
> +		xfs_attri_item_free(attrip);
> +	}
> +}
> +
> +static inline struct xfs_attrd_log_item *ATTRD_ITEM(struct xfs_log_item *lip)
> +{
> +	return container_of(lip, struct xfs_attrd_log_item, item);
> +}
> +
> +STATIC void
> +xfs_attrd_item_free(struct xfs_attrd_log_item *attrdp)
> +{
> +	kmem_free(attrdp->item.li_lv_shadow);
> +	kmem_free(attrdp);
> +}
> +
> +/*
> + * This returns the number of iovecs needed to log the given attrd item.
> + * We only need 1 iovec for an attrd item.  It just logs the attr_log_format
> + * structure.
> + */
> +static inline int
> +xfs_attrd_item_sizeof(
> +	struct xfs_attrd_log_item *attrdp)
> +{
> +	return sizeof(struct xfs_attr_log_format);
> +}
> +
> +STATIC void
> +xfs_attrd_item_size(
> +	struct xfs_log_item	*lip,
> +	int			*nvecs,
> +	int			*nbytes)
> +{
> +	struct xfs_attrd_log_item	*attrdp = ATTRD_ITEM(lip);
> +	*nvecs += 1;
> +	*nbytes += xfs_attrd_item_sizeof(attrdp);
> +
> +	if (attrdp->name_len > 0) {
> +		*nvecs += 1;
> +		nbytes += attrdp->name_len;
> +	}
> +
> +	if (attrdp->value_len > 0) {
> +		*nvecs += 1;
> +		nbytes += attrdp->value_len;
> +	}
> +}
> +
> +/*
> + * This is called to fill in the vector of log iovecs for the
> + * given attrd log item. We use only 1 iovec, and we point that
> + * at the attr_log_format structure embedded in the attrd item.
> + * It is at this point that we assert that all of the attr
> + * slots in the attrd item have been filled.
> + */
> +STATIC void
> +xfs_attrd_item_format(
> +	struct xfs_log_item	*lip,
> +	struct xfs_log_vec	*lv)
> +{
> +	struct xfs_attrd_log_item	*attrdp = ATTRD_ITEM(lip);
> +	struct xfs_log_iovec	*vecp = NULL;
> +
> +	attrdp->format.type = XFS_LI_ATTRD;
> +	attrdp->format.size = 1;
> +
> +	xlog_copy_iovec(lv, &vecp, XLOG_REG_TYPE_ATTRD_FORMAT,
> +			&attrdp->format,
> +			xfs_attrd_item_sizeof(attrdp));
> +
> +	if (attrdp->name_len > 0)
> +		xlog_copy_iovec(lv, &vecp, XLOG_REG_TYPE_ATTR_NAME,
> +				attrdp->name, attrdp->name_len);
> +
> +	if (attrdp->value_len > 0)
> +		xlog_copy_iovec(lv, &vecp, XLOG_REG_TYPE_ATTR_VALUE,
> +				attrdp->value, attrdp->value_len);

No need to log name/value for ATTRD items, since we only care about
matching alf_id between ATTRI and ATTRD items.

> +}
> +
> +/*
> + * Pinning has no meaning for an attrd item, so just return.
> + */
> +STATIC void
> +xfs_attrd_item_pin(
> +	struct xfs_log_item	*lip)
> +{
> +}
> +
> +/*
> + * Since pinning has no meaning for an attrd item, unpinning does
> + * not either.
> + */
> +STATIC void
> +xfs_attrd_item_unpin(
> +	struct xfs_log_item	*lip,
> +	int			remove)
> +{
> +}
> +
> +/*
> + * There isn't much you can do to push on an attrd item.  It is simply stuck
> + * waiting for the log to be flushed to disk.
> + */
> +STATIC uint
> +xfs_attrd_item_push(
> +	struct xfs_log_item	*lip,
> +	struct list_head	*buffer_list)
> +{
> +	return XFS_ITEM_PINNED;
> +}
> +
> +/*
> + * The ATTRD is either committed or aborted if the transaction is cancelled. If
> + * the transaction is cancelled, drop our reference to the ATTRI and free the
> + * ATTRD.
> + */
> +STATIC void
> +xfs_attrd_item_unlock(
> +	struct xfs_log_item	*lip)
> +{
> +	struct xfs_attrd_log_item	*attrdp = ATTRD_ITEM(lip);
> +
> +	if (lip->li_flags & XFS_LI_ABORTED) {
> +		xfs_attri_release(attrdp->attrip);
> +		xfs_attrd_item_free(attrdp);
> +	}
> +}
> +
> +/*
> + * When the attrd item is committed to disk, all we need to do is delete our
> + * reference to our partner attri item and then free ourselves. Since we're
> + * freeing ourselves we must return -1 to keep the transaction code from
> + * further referencing this item.
> + */
> +STATIC xfs_lsn_t
> +xfs_attrd_item_committed(
> +	struct xfs_log_item	*lip,
> +	xfs_lsn_t		lsn)
> +{
> +	struct xfs_attrd_log_item	*attrdp = ATTRD_ITEM(lip);
> +
> +	/*
> +	 * Drop the ATTRI reference regardless of whether the ATTRD has been
> +	 * aborted. Once the ATTRD transaction is constructed, it is the sole
> +	 * responsibility of the ATTRD to release the ATTRI (even if the ATTRI
> +	 * is aborted due to log I/O error).
> +	 */
> +	xfs_attri_release(attrdp->attrip);
> +	xfs_attrd_item_free(attrdp);
> +
> +	return (xfs_lsn_t)-1;
> +}
> +
> +STATIC void
> +xfs_attrd_item_committing(
> +	struct xfs_log_item	*lip,
> +	xfs_lsn_t		lsn)
> +{
> +}
> +
> +/*
> + * This is the ops vector shared by all attrd log items.
> + */
> +static const struct xfs_item_ops xfs_attrd_item_ops = {
> +	.iop_size	= xfs_attrd_item_size,
> +	.iop_format	= xfs_attrd_item_format,
> +	.iop_pin	= xfs_attrd_item_pin,
> +	.iop_unpin	= xfs_attrd_item_unpin,
> +	.iop_unlock	= xfs_attrd_item_unlock,
> +	.iop_committed	= xfs_attrd_item_committed,
> +	.iop_push	= xfs_attrd_item_push,
> +	.iop_committing = xfs_attrd_item_committing
> +};
> +
> +/*
> + * Allocate and initialize an attrd item
> + */
> +struct xfs_attrd_log_item *
> +xfs_attrd_init(
> +	struct xfs_mount	*mp,
> +	struct xfs_attri_log_item	*attrip)
> +
> +{
> +	struct xfs_attrd_log_item	*attrdp;
> +	uint			size;
> +
> +	size = (uint)(sizeof(struct xfs_attrd_log_item));
> +	attrdp = kmem_zalloc(size, KM_SLEEP);
> +
> +	xfs_log_item_init(mp, &attrdp->item, XFS_LI_ATTRD,
> +			  &xfs_attrd_item_ops);
> +	attrdp->attrip = attrip;
> +	attrdp->format.id = attrip->format.id;
> +
> +	return attrdp;
> +}
> +
> +/*
> + * Process an attr intent item that was recovered from
> + * the log.  We need to delete the attr that it describes.
> + */
> +int
> +xfs_attri_recover(
> +	struct xfs_mount	*mp,
> +	struct xfs_attri_log_item	*attrip)
> +{
> +	struct xfs_attrd_log_item	*attrdp;
> +	struct xfs_trans	*tp;
> +	int			error = 0;
> +	struct xfs_attr_log_format	*attrp;
> +
> +	ASSERT(!test_bit(XFS_ATTRI_RECOVERED, &attrip->flags));
> +
> +	/*
> +	 * First check the validity of the attr described by the
> +	 * ATTRI.  If any are bad, then assume that all are bad and
> +	 * just toss the ATTRI.
> +	 */
> +	attrp = &attrip->format;
> +	if (attrp->value_len == 0 ||
> +	    attrp->name_len == 0 ||
> +	    attrp->op_flags > ATTR_OP_FLAGS_MAX) {
> +		/*
> +		 * This will pull the ATTRI from the AIL and
> +		 * free the memory associated with it.
> +		 */
> +		set_bit(XFS_ATTRI_RECOVERED, &attrip->flags);
> +		xfs_attri_release(attrip);
> +		return -EIO;
> +	}
> +
> +	error = xfs_trans_alloc(mp, &M_RES(mp)->tr_itruncate, 0, 0, 0, &tp);
> +	if (error)
> +		return error;
> +	attrdp = xfs_trans_get_attrd(tp, attrip);
> +	attrp = &attrip->format;
> +
> +	error = xfs_trans_attr(tp, attrdp, attrp->ino,
> +				attrp->op_flags,
> +				attrp->attr_flags,
> +				attrp->name_len,
> +				attrp->value_len,
> +				attrip->name,
> +				attrip->value);
> +	if (error)
> +		goto abort_error;
> +
> +
> +	set_bit(XFS_ATTRI_RECOVERED, &attrip->flags);
> +	error = xfs_trans_commit(tp);
> +	return error;
> +
> +abort_error:
> +	xfs_trans_cancel(tp);
> +	return error;
> +}
> diff --git a/fs/xfs/xfs_attr_item.h b/fs/xfs/xfs_attr_item.h
> new file mode 100644
> index 0000000..023675d
> --- /dev/null
> +++ b/fs/xfs/xfs_attr_item.h
> @@ -0,0 +1,111 @@
> +/*
> + * Copyright (c) 2017 Oracle, Inc.
> + * All Rights Reserved.
> + *
> + * This program is free software; you can redistribute it and/or
> + * modify it under the terms of the GNU General Public License as
> + * published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope that it would be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + *
> + * You should have received a copy of the GNU General Public License
> + * along with this program; if not, write the Free Software Foundation Inc.
> + */
> +#ifndef	__XFS_ATTR_ITEM_H__
> +#define	__XFS_ATTR_ITEM_H__
> +
> +/* kernel only ATTRI/ATTRD definitions */
> +
> +struct xfs_mount;
> +struct kmem_zone;
> +
> +/*
> + * Max number of attrs in fast allocation path.
> + */
> +#define XFS_ATTRI_MAX_FAST_ATTRS        16

Daaaang, we can cram *sixteen* different attribute name/value updates in
a single defer_ops item? :)

Each defer_ops item gets its own transaction to make changes and log the
done item.  Sixteen attr updates is a /lot/ to be pushing through one
transaction since (AFAICT) the static log reservations provide worst
case space for one update.  I think this ought to be 1, especially since
the actual attr log item structure only appears to have space to store
one name and one value.

> +
> +
> +/*
> + * Define ATTR flag bits. Manipulated by set/clear/test_bit operators.
> + */
> +#define	XFS_ATTRI_RECOVERED	1
> +
> +/*
> + * This is the "attr intention" log item.  It is used to log the fact
> + * that some attrs need to be processed.  It is used in conjunction with the
> + * "attr done" log item described below.
> + *
> + * The ATTRI is reference counted so that it is not freed prior to both the
> + * ATTRI and ATTRD being committed and unpinned. This ensures the ATTRI is
> + * inserted into the AIL even in the event of out of order ATTRI/ATTRD
> + * processing. In other words, an ATTRI is born with two references:
> + *
> + *      1.) an ATTRI held reference to track ATTRI AIL insertion
> + *      2.) an ATTRD held reference to track ATTRD commit
> + *
> + * On allocation, both references are the responsibility of the caller. Once
> + * the ATTRI is added to and dirtied in a transaction, ownership of reference
> + * one transfers to the transaction. The reference is dropped once the ATTRI is
> + * inserted to the AIL or in the event of failure along the way (e.g., commit
> + * failure, log I/O error, etc.). Note that the caller remains responsible for
> + * the ATTRD reference under all circumstances to this point. The caller has no
> + * means to detect failure once the transaction is committed, however.
> + * Therefore, an ATTRD is required after this point, even in the event of
> + * unrelated failure.
> + *
> + * Once an ATTRD is allocated and dirtied in a transaction, reference two
> + * transfers to the transaction. The ATTRD reference is dropped once it reaches
> + * the unpin handler. Similar to the ATTRI, the reference also drops in the
> + * event of commit failure or log I/O errors. Note that the ATTRD is not
> + * inserted in the AIL, so at this point both the ATTI and ATTRD are freed.
> + */
> +struct xfs_attri_log_item {
> +	xfs_log_item_t			item;
> +	atomic_t			refcount;
> +	unsigned long			flags;	/* misc flags */
> +	int				name_len;
> +	void				*name;
> +	int				value_len;
> +	void				*value;
> +	struct xfs_attr_log_format	format;
> +};
> +
> +/*
> + * This is the "attr done" log item.  It is used to log
> + * the fact that some attrs earlier mentioned in an attri item
> + * have been freed.
> + */
> +struct xfs_attrd_log_item {
> +	struct xfs_log_item		item;
> +	struct xfs_attri_log_item	*attrip;
> +	uint				next_attr;
> +	int				name_len;
> +	void				*name;
> +	int				value_len;
> +	void				*value;
> +	struct xfs_attr_log_format	format;
> +};
> +
> +/*
> + * Max number of attrs in fast allocation path.
> + */
> +#define	XFS_ATTRD_MAX_FAST_ATTRS	16
> +
> +extern struct kmem_zone	*xfs_attri_zone;
> +extern struct kmem_zone	*xfs_attrd_zone;
> +
> +struct xfs_attri_log_item	*xfs_attri_init(struct xfs_mount *mp);
> +struct xfs_attrd_log_item	*xfs_attrd_init(struct xfs_mount *mp,
> +					struct xfs_attri_log_item *attrip);
> +int xfs_attr_copy_format(struct xfs_log_iovec *buf,
> +			 struct xfs_attr_log_format *dst_attri_fmt);
> +void			xfs_attri_item_free(struct xfs_attri_log_item *attrip);
> +void			xfs_attri_release(struct xfs_attri_log_item *attrip);
> +
> +int			xfs_attri_recover(struct xfs_mount *mp,
> +					struct xfs_attri_log_item *attrip);
> +
> +#endif	/* __XFS_ATTR_ITEM_H__ */
> diff --git a/fs/xfs/xfs_log_recover.c b/fs/xfs/xfs_log_recover.c
> index ee34899..8326f56 100644
> --- a/fs/xfs/xfs_log_recover.c
> +++ b/fs/xfs/xfs_log_recover.c
> @@ -33,6 +33,7 @@
>  #include "xfs_log_recover.h"
>  #include "xfs_inode_item.h"
>  #include "xfs_extfree_item.h"
> +#include "xfs_attr_item.h"
>  #include "xfs_trans_priv.h"
>  #include "xfs_alloc.h"
>  #include "xfs_ialloc.h"
> @@ -1956,6 +1957,8 @@ xlog_recover_reorder_trans(
>  		case XFS_LI_CUD:
>  		case XFS_LI_BUI:
>  		case XFS_LI_BUD:
> +		case XFS_LI_ATTRI:
> +		case XFS_LI_ATTRD:
>  			trace_xfs_log_recover_item_reorder_tail(log,
>  							trans, item, pass);
>  			list_move_tail(&item->ri_list, &inode_list);
> @@ -3489,6 +3492,92 @@ xlog_recover_efd_pass2(
>  	return 0;
>  }
>  
> +STATIC int
> +xlog_recover_attri_pass2(
> +	struct xlog                     *log,
> +	struct xlog_recover_item        *item,
> +	xfs_lsn_t                       lsn)
> +{
> +	int                             error;
> +	struct xfs_mount                *mp = log->l_mp;
> +	struct xfs_attri_log_item       *attrip;
> +	struct xfs_attr_log_format     *attri_formatp;
> +
> +	attri_formatp = item->ri_buf[0].i_addr;
> +
> +	attrip = xfs_attri_init(mp);
> +	error = xfs_attr_copy_format(&item->ri_buf[0], &attrip->format);
> +	if (error) {
> +		xfs_attri_item_free(attrip);
> +		return error;
> +	}
> +
> +	spin_lock(&log->l_ailp->xa_lock);
> +	/*
> +	 * The ATTRI has two references. One for the ATTRD and one for ATTRI to
> +	 * ensure it makes it into the AIL. Insert the ATTRI into the AIL
> +	 * directly and drop the ATTRI reference. Note that
> +	 * xfs_trans_ail_update() drops the AIL lock.
> +	 */
> +	xfs_trans_ail_update(log->l_ailp, &attrip->item, lsn);
> +	xfs_attri_release(attrip);
> +	return 0;
> +}
> +
> +
> +/*
> + * This routine is called when an ATTRD format structure is found in a committed
> + * transaction in the log. Its purpose is to cancel the corresponding ATTRI if
> + * it was still in the log. To do this it searches the AIL for the ATTRI with
> + * an id equal to that in the ATTRD format structure. If we find it we drop
> + * the ATTRD reference, which removes the ATTRI from the AIL and frees it.
> + */
> +STATIC int
> +xlog_recover_attrd_pass2(
> +	struct xlog                     *log,
> +	struct xlog_recover_item        *item)
> +{
> +	struct xfs_attr_log_format    *attrd_formatp;
> +	struct xfs_attri_log_item      *attrip = NULL;
> +	struct xfs_log_item          *lip;
> +	uint64_t                attri_id;
> +	struct xfs_ail_cursor   cur;
> +	struct xfs_ail          *ailp = log->l_ailp;
> +
> +	attrd_formatp = item->ri_buf[0].i_addr;
> +	ASSERT((item->ri_buf[0].i_len ==
> +				(sizeof(struct xfs_attr_log_format))));
> +	attri_id = attrd_formatp->id;
> +
> +	/*
> +	 * Search for the ATTRI with the id in the ATTRD format structure in the
> +	 * AIL.
> +	 */
> +	spin_lock(&ailp->xa_lock);
> +	lip = xfs_trans_ail_cursor_first(ailp, &cur, 0);
> +	while (lip != NULL) {
> +		if (lip->li_type == XFS_LI_ATTRI) {
> +			attrip = (struct xfs_attri_log_item *)lip;
> +			if (attrip->format.id == attri_id) {
> +				/*
> +				 * Drop the ATTRD reference to the ATTRI. This
> +				 * removes the ATTRI from the AIL and frees it.
> +				 */
> +				spin_unlock(&ailp->xa_lock);
> +				xfs_attri_release(attrip);
> +				spin_lock(&ailp->xa_lock);
> +				break;
> +			}
> +		}
> +		lip = xfs_trans_ail_cursor_next(ailp, &cur);
> +	}
> +
> +	xfs_trans_ail_cursor_done(&cur);
> +	spin_unlock(&ailp->xa_lock);
> +
> +	return 0;
> +}
> +
>  /*
>   * This routine is called to create an in-core extent rmap update
>   * item from the rui format structure which was logged on disk.
> @@ -4108,6 +4197,10 @@ xlog_recover_commit_pass2(
>  		return xlog_recover_efi_pass2(log, item, trans->r_lsn);
>  	case XFS_LI_EFD:
>  		return xlog_recover_efd_pass2(log, item);
> +	case XFS_LI_ATTRI:
> +		return xlog_recover_attri_pass2(log, item, trans->r_lsn);
> +	case XFS_LI_ATTRD:
> +		return xlog_recover_attrd_pass2(log, item);
>  	case XFS_LI_RUI:
>  		return xlog_recover_rui_pass2(log, item, trans->r_lsn);
>  	case XFS_LI_RUD:
> @@ -4669,6 +4762,49 @@ xlog_recover_cancel_efi(
>  	spin_lock(&ailp->xa_lock);
>  }
>  
> +/* Recover the ATTRI if necessary. */
> +STATIC int
> +xlog_recover_process_attri(
> +	struct xfs_mount                *mp,
> +	struct xfs_ail                  *ailp,
> +	struct xfs_log_item             *lip)
> +{
> +	struct xfs_attri_log_item       *attrip;
> +	int                             error;
> +
> +	/*
> +	 * Skip ATTRIs that we've already processed.
> +	 */
> +	attrip = container_of(lip, struct xfs_attri_log_item, item);
> +	if (test_bit(XFS_ATTRI_RECOVERED, &attrip->flags))
> +		return 0;
> +
> +	spin_unlock(&ailp->xa_lock);
> +	error = xfs_attri_recover(mp, attrip);
> +	spin_lock(&ailp->xa_lock);
> +
> +	return error;
> +}
> +
> +/* Release the ATTRI since we're cancelling everything. */
> +STATIC void
> +xlog_recover_cancel_attri(
> +	struct xfs_mount                *mp,
> +	struct xfs_ail                  *ailp,
> +	struct xfs_log_item             *lip)
> +{
> +	struct xfs_attri_log_item         *attrip;
> +
> +	attrip = container_of(lip, struct xfs_attri_log_item, item);
> +
> +	spin_unlock(&ailp->xa_lock);
> +	xfs_attri_release(attrip);
> +	spin_lock(&ailp->xa_lock);
> +}
> +
> +
> +
> +
>  /* Recover the RUI if necessary. */
>  STATIC int
>  xlog_recover_process_rui(
> @@ -4861,6 +4997,10 @@ xlog_recover_process_intents(
>  		case XFS_LI_EFI:
>  			error = xlog_recover_process_efi(log->l_mp, ailp, lip);
>  			break;
> +		case XFS_LI_ATTRI:
> +			error = xlog_recover_process_attri(log->l_mp,
> +							   ailp, lip);

FWIW you're allowed (in xfs land only) to use the double-indent
convention:

error = xlog_recover_process_attri(log->l_mp,
		ailp, lip);

Instead of spending time lining things up with the '('.

> +			break;
>  		case XFS_LI_RUI:
>  			error = xlog_recover_process_rui(log->l_mp, ailp, lip);
>  			break;
> diff --git a/fs/xfs/xfs_super.c b/fs/xfs/xfs_super.c
> index 584cf2d..046ced4 100644
> --- a/fs/xfs/xfs_super.c
> +++ b/fs/xfs/xfs_super.c
> @@ -2024,6 +2024,7 @@ init_xfs_fs(void)
>  	xfs_rmap_update_init_defer_op();
>  	xfs_refcount_update_init_defer_op();
>  	xfs_bmap_update_init_defer_op();
> +	xfs_attr_init_defer_op();
>  
>  	xfs_dir_startup();
>  
> diff --git a/fs/xfs/xfs_trans.h b/fs/xfs/xfs_trans.h
> index 815b53d2..d003637 100644
> --- a/fs/xfs/xfs_trans.h
> +++ b/fs/xfs/xfs_trans.h
> @@ -40,6 +40,9 @@ struct xfs_cud_log_item;
>  struct xfs_defer_ops;
>  struct xfs_bui_log_item;
>  struct xfs_bud_log_item;
> +struct xfs_attrd_log_item;
> +struct xfs_attri_log_item;
> +
>  
>  typedef struct xfs_log_item {
>  	struct list_head		li_ail;		/* AIL pointers */
> @@ -223,12 +226,22 @@ void		xfs_trans_dirty_buf(struct xfs_trans *, struct xfs_buf *);
>  void		xfs_trans_log_inode(xfs_trans_t *, struct xfs_inode *, uint);
>  
>  void		xfs_extent_free_init_defer_op(void);
> +void            xfs_attr_init_defer_op(void);
> +
>  struct xfs_efd_log_item	*xfs_trans_get_efd(struct xfs_trans *,
>  				  struct xfs_efi_log_item *,
>  				  uint);
>  int		xfs_trans_free_extent(struct xfs_trans *,
>  				      struct xfs_efd_log_item *, xfs_fsblock_t,
>  				      xfs_extlen_t, struct xfs_owner_info *);
> +struct xfs_attrd_log_item *
> +xfs_trans_get_attrd(struct xfs_trans *tp,
> +		    struct xfs_attri_log_item *attrip);
> +int xfs_trans_attr(struct xfs_trans *tp, struct xfs_attrd_log_item *attrdp,
> +			xfs_ino_t ino, uint32_t attr_op_flags, uint32_t flags,
> +			uint32_t name_len, uint32_t value_len,
> +			char *name, char *value);
> +
>  int		xfs_trans_commit(struct xfs_trans *);
>  int		xfs_trans_roll(struct xfs_trans **);
>  int		xfs_trans_roll_inode(struct xfs_trans **, struct xfs_inode *);
> diff --git a/fs/xfs/xfs_trans_attr.c b/fs/xfs/xfs_trans_attr.c
> new file mode 100644
> index 0000000..39eb18d
> --- /dev/null
> +++ b/fs/xfs/xfs_trans_attr.c
> @@ -0,0 +1,286 @@
> +/*
> + * Copyright (c) 2017, Oracle Inc.
> + * All Rights Reserved.
> + *
> + * This program is free software; you can redistribute it and/or
> + * modify it under the terms of the GNU General Public License as
> + * published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope that it would be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + *
> + * You should have received a copy of the GNU General Public License
> + * along with this program; if not, write the Free Software Foundation Inc.
> + */
> +#include "xfs.h"
> +#include "xfs_fs.h"
> +#include "xfs_shared.h"
> +#include "xfs_format.h"
> +#include "xfs_log_format.h"
> +#include "xfs_trans_resv.h"
> +#include "xfs_bit.h"
> +#include "xfs_mount.h"
> +#include "xfs_defer.h"
> +#include "xfs_trans.h"
> +#include "xfs_trans_priv.h"
> +#include "xfs_attr_item.h"
> +#include "xfs_alloc.h"
> +#include "xfs_bmap.h"
> +#include "xfs_trace.h"
> +#include "libxfs/xfs_da_format.h"
> +#include "xfs_da_btree.h"
> +#include "xfs_attr.h"
> +#include "xfs_inode.h"
> +#include "xfs_icache.h"
> +
> +/*
> + * This routine is called to allocate an "extent free done"
> + * log item that will hold nextents worth of extents.  The
> + * caller must use all nextents extents, because we are not
> + * flexible about this at all.
> + */
> +struct xfs_attrd_log_item *
> +xfs_trans_get_attrd(struct xfs_trans		*tp,
> +		  struct xfs_attri_log_item	*attrip)
> +{
> +	struct xfs_attrd_log_item			*attrdp;
> +
> +	ASSERT(tp != NULL);
> +
> +	attrdp = xfs_attrd_init(tp->t_mountp, attrip);
> +	ASSERT(attrdp != NULL);
> +
> +	/*
> +	 * Get a log_item_desc to point at the new item.
> +	 */
> +	xfs_trans_add_item(tp, &attrdp->item);
> +	return attrdp;
> +}
> +
> +/*
> + * Delete an attr and log it to the ATTRD. Note that the transaction is marked
> + * dirty regardless of whether the attr delete succeeds or fails to support the
> + * ATTRI/ATTRD lifecycle rules.
> + */
> +int
> +xfs_trans_attr(
> +	struct xfs_trans	*tp,
> +	struct xfs_attrd_log_item	*attrdp,
> +	xfs_ino_t		ino,
> +	uint32_t		op_flags,
> +	uint32_t                flags,
> +	uint32_t		name_len,
> +	uint32_t		value_len,
> +	char			*name,
> +	char			*value)
> +{
> +	uint			next_attr;
> +	struct xfs_attr_log_format *attrp;
> +	int			error;
> +	int                     local;
> +	struct xfs_da_args      args;
> +	struct xfs_inode	*dp;
> +	struct xfs_defer_ops    dfops;
> +	xfs_fsblock_t		firstblock = NULLFSBLOCK;
> +	struct xfs_mount	*mp = tp->t_mountp;
> +
> +	error = xfs_iget(mp, tp, ino, flags, 0, &dp);
> +	if (error)
> +		return error;
> +
> +	ASSERT(XFS_IFORK_Q((dp)));
> +	tp->t_flags |= XFS_TRANS_RESERVE;
> +
> +	error = xfs_attr_args_init(&args, dp, name, flags);
> +	if (error)
> +		return error;
> +
> +	args.name = name;
> +	args.namelen = name_len;
> +	args.hashval = xfs_da_hashname(args.name, args.namelen);
> +	args.value = value;
> +	args.valuelen = value_len;
> +	args.dfops = &dfops;

dfops needs an xfs_defer_init().

Oh, the initialization of dfops is after where we start using it.
Please move it up (or the args.dfops assignment down).

> +	args.firstblock = &firstblock;
> +	args.op_flags = XFS_DA_OP_OKNOENT;
> +	args.total = xfs_attr_calc_size(&args, &local);
> +	args.trans = tp;
> +	ASSERT(local);
> +
> +	xfs_ilock(dp, XFS_ILOCK_EXCL);
> +	xfs_defer_init(args.dfops, args.firstblock);
> +
> +	if (op_flags & ATTR_OP_FLAGS_SET) {

switch (op_flags & XFS_ATTR_OP_FLAGS_TYPE_MASK) {
case XFS_ATTR_OP_FLAGS_SET:
	...
case XFS_ATTR_OP_FLAGS_REMOVE:
	...
default:
	...
}

> +		args.op_flags |= XFS_DA_OP_ADDNAME;
> +		error = xfs_attr_set_args(&args, flags, false);
> +	} else if (op_flags & ATTR_OP_FLAGS_REMOVE) {
> +		error = xfs_attr_remove_args(&args, flags);
> +	} else {
> +		ASSERT(0);

We're reading and processing log items off the disk, so cleaning up and
returning -EFSCORRUPTED is more appropriate here.

> +	}
> +
> +	if (error)
> +		xfs_defer_cancel(&dfops);
> +
> +	xfs_iunlock(dp, XFS_ILOCK_EXCL);
> +
> +	/*
> +	 * Mark the transaction dirty, even on error. This ensures the
> +	 * transaction is aborted, which:
> +	 *
> +	 * 1.) releases the ATTRI and frees the ATTRD
> +	 * 2.) shuts down the filesystem
> +	 */
> +	tp->t_flags |= XFS_TRANS_DIRTY;
> +	attrdp->item.li_desc->lid_flags |= XFS_LID_DIRTY;
> +
> +	next_attr = attrdp->next_attr;
> +	attrp = &(attrdp->format);
> +	attrp->ino = ino;
> +	attrp->op_flags = op_flags;
> +	attrp->value_len = value_len;
> +	attrp->name_len = name_len;
> +	attrp->attr_flags = flags;
> +
> +	attrdp->name = name;
> +	attrdp->value = value;
> +	attrdp->name_len = name_len;
> +	attrdp->value_len = value_len;
> +	attrdp->next_attr++;
> +
> +	return error;
> +}
> +
> +static int
> +xfs_attr_diff_items(
> +	void				*priv,
> +	struct list_head		*a,
> +	struct list_head		*b)
> +{
> +	return 0;
> +}
> +
> +/* Get an ATTRI. */
> +STATIC void *
> +xfs_attr_create_intent(
> +	struct xfs_trans		*tp,
> +	unsigned int			count)
> +{
> +	struct xfs_attri_log_item		*attrip;
> +
> +	ASSERT(tp != NULL);
> +	ASSERT(count > 0);
> +
> +	attrip = xfs_attri_init(tp->t_mountp);
> +	ASSERT(attrip != NULL);
> +
> +	/*
> +	 * Get a log_item_desc to point at the new item.
> +	 */
> +	xfs_trans_add_item(tp, &attrip->item);
> +	return attrip;
> +}
> +
> +/* Log an attr to the intent item. */
> +STATIC void
> +xfs_attr_log_item(
> +	struct xfs_trans		*tp,
> +	void				*intent,
> +	struct list_head		*item)
> +{
> +	struct xfs_attri_log_item	*attrip = intent;
> +	struct xfs_attr_item		*free;
> +	struct xfs_attr_log_format	*attrp;
> +
> +	free = container_of(item, struct xfs_attr_item, xattri_list);
> +
> +	tp->t_flags |= XFS_TRANS_DIRTY;
> +	attrip->item.li_desc->lid_flags |= XFS_LID_DIRTY;
> +
> +	attrp = &attrip->format;
> +	attrp->ino = free->xattri_ino;
> +	attrp->op_flags = free->xattri_op_flags;
> +	attrp->value_len = free->xattri_value_len;
> +	attrp->name_len = free->xattri_name_len;
> +	attrp->attr_flags = free->xattri_flags;
> +
> +	attrip->name = &(free->xattri_name[0]);
> +	attrip->value = &(free->xattri_value[0]);
> +	attrip->name_len = free->xattri_name_len;
> +	attrip->value_len = free->xattri_value_len;
> +}
> +
> +/* Get an ATTRD so we can process all the attrs. */
> +STATIC void *
> +xfs_attr_create_done(
> +	struct xfs_trans		*tp,
> +	void				*intent,
> +	unsigned int			count)
> +{
> +	return xfs_trans_get_attrd(tp, intent);
> +}
> +
> +/* Process an attr. */
> +STATIC int
> +xfs_attr_finish_item(
> +	struct xfs_trans		*tp,
> +	struct xfs_defer_ops		*dop,
> +	struct list_head		*item,
> +	void				*done_item,
> +	void				**state)
> +{
> +	struct xfs_attr_item	*free;
> +	int				error;
> +
> +	free = container_of(item, struct xfs_attr_item, xattri_list);
> +	error = xfs_trans_attr(tp, done_item,
> +			free->xattri_ino,
> +			free->xattri_op_flags,
> +			free->xattri_flags,
> +			free->xattri_name_len,
> +			free->xattri_value_len,
> +			free->xattri_name,
> +			free->xattri_value);
> +	kmem_free(free);
> +	return error;
> +}
> +
> +/* Abort all pending EFIs. */

EFIs?

--D

> +STATIC void
> +xfs_attr_abort_intent(
> +	void				*intent)
> +{
> +	xfs_attri_release(intent);
> +}
> +
> +/* Cancel an attr */
> +STATIC void
> +xfs_attr_cancel_item(
> +	struct list_head		*item)
> +{
> +	struct xfs_attr_item	*free;
> +
> +	free = container_of(item, struct xfs_attr_item, xattri_list);
> +	kmem_free(free);
> +}
> +
> +static const struct xfs_defer_op_type xfs_attr_defer_type = {
> +	.type		= XFS_DEFER_OPS_TYPE_ATTR,
> +	.max_items	= XFS_ATTRI_MAX_FAST_ATTRS,
> +	.diff_items	= xfs_attr_diff_items,
> +	.create_intent	= xfs_attr_create_intent,
> +	.abort_intent	= xfs_attr_abort_intent,
> +	.log_item	= xfs_attr_log_item,
> +	.create_done	= xfs_attr_create_done,
> +	.finish_item	= xfs_attr_finish_item,
> +	.cancel_item	= xfs_attr_cancel_item,
> +};
> +
> +/* Register the deferred op type. */
> +void
> +xfs_attr_init_defer_op(void)
> +{
> +	xfs_defer_init_op_type(&xfs_attr_defer_type);
> +}
> -- 
> 2.7.4
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH 03/17] Add xfs_attr_set_defered and xfs_attr_remove_defered
  2017-10-18 22:55 ` [PATCH 03/17] Add xfs_attr_set_defered and xfs_attr_remove_defered Allison Henderson
@ 2017-10-19 19:13   ` Darrick J. Wong
  2017-10-21  1:08     ` Allison Henderson
  0 siblings, 1 reply; 66+ messages in thread
From: Darrick J. Wong @ 2017-10-19 19:13 UTC (permalink / raw)
  To: Allison Henderson; +Cc: linux-xfs

On Wed, Oct 18, 2017 at 03:55:19PM -0700, Allison Henderson wrote:
> These routines set up set and start a new deferred attribute
> operation.  These functions are meant to be called by other
> code needing to initiate a deferred attribute operation.  We
> will use these routines later in the parent pointer patches.
> 
> Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
> ---
>  fs/xfs/libxfs/xfs_attr.c | 58 ++++++++++++++++++++++++++++++++++++++++++++++++
>  fs/xfs/xfs_attr.h        |  7 ++++++
>  2 files changed, 65 insertions(+)
> 
> diff --git a/fs/xfs/libxfs/xfs_attr.c b/fs/xfs/libxfs/xfs_attr.c
> index 5325ec2..59f3502 100644
> --- a/fs/xfs/libxfs/xfs_attr.c
> +++ b/fs/xfs/libxfs/xfs_attr.c
> @@ -458,6 +458,37 @@ xfs_attr_set(
>  	return error;
>  }
>  
> +int
> +xfs_attr_set_deferred(
> +	struct xfs_inode	*dp,
> +	struct xfs_defer_ops    *dfops,
> +	const unsigned char	*name,
> +	unsigned int		namelen,
> +	unsigned char		*value,
> +	unsigned int		valuelen,
> +	int			flags)
> +{
> +
> +	struct xfs_attr_item     *new;
> +
> +	ASSERT(namelen != 0);
> +	ASSERT(valuelen != 0);
> +
> +	new = kmem_alloc(sizeof(struct xfs_attr_item), KM_SLEEP|KM_NOFS);

Yikes, this is a 66,000 byte allocation.  I wouldn't take it for granted
that we can ask for more than a page's worth of memory.

Seeing as we already know namelen and valuelen, let's go for the
zero length array at the end of the struct approach:

struct xfs_attr_item {
	...
	uint16_t		xattri_namelen;
	uint8_t			xattri_namevalue[0];
};

#define XFS_ATTR_ITEM_SIZEOF(namelen, valuelen)	\
	(sizeof(struct xfs_attr_item) + (namelen) + (valuelen))

new = kmem_alloc(XFS_ATTR_ITEM_SIZEOF(namelen, valuelen), KM...);
if (!new)
	return -ENOMEM;
memcpy(new->xattri_namevalue, name, namelen);
new->xattri_namelen = namelen;
memcpy(&new->xattri_namevalue[namelen], value, valuelen);
...

Assuming there isn't a way to attach the caller's buffers to the attr
item without copying anything(?)  (It would be nice if we could, but
between the defer_add and the defer_finish a lot can happen w.r.t.
variable scope so that might be a bad idea.)

> +	memset(new, 0, sizeof(struct xfs_attr_item));
> +	new->xattri_ino = dp->i_ino;
> +	new->xattri_op_flags = ATTR_OP_FLAGS_SET;
> +	new->xattri_name_len = namelen;
> +	new->xattri_value_len = valuelen;
> +	new->xattri_flags = flags;
> +	memcpy(new->xattri_name, name, namelen);
> +	memcpy(&new->xattri_value, value, valuelen);
> +
> +	xfs_defer_add(dfops, XFS_DEFER_OPS_TYPE_ATTR, &new->xattri_list);
> +
> +	return 0;
> +}
> +
>  /*
>   * Generic handler routine to remove a name from an attribute list.
>   * Transitions attribute list from Btree to shortform as necessary.
> @@ -531,6 +562,33 @@ xfs_attr_remove(
>  	return error;
>  }
>  
> +int
> +xfs_attr_remove_deferred(
> +	struct xfs_inode        *dp,
> +	struct xfs_defer_ops    *dfops,
> +	const unsigned char     *name,
> +	unsigned int		namelen,
> +	int                     flags)
> +{
> +
> +	struct xfs_attr_item     *new;
> +
> +	ASSERT(namelen != 0);
> +
> +	new = kmem_alloc(sizeof(struct xfs_attr_item), KM_SLEEP|KM_NOFS);
> +	memset(new, 0, sizeof(struct xfs_attr_item));
> +	new->xattri_ino = dp->i_ino;
> +	new->xattri_op_flags = ATTR_OP_FLAGS_REMOVE;
> +	new->xattri_name_len = namelen;
> +	new->xattri_value_len = 0;
> +	new->xattri_flags = flags;
> +	memcpy(new->xattri_name, name, namelen);
> +
> +	xfs_defer_add(dfops, XFS_DEFER_OPS_TYPE_ATTR, &new->xattri_list);
> +
> +	return 0;
> +}
> +
>  /*========================================================================
>   * External routines when attribute list is inside the inode
>   *========================================================================*/
> diff --git a/fs/xfs/xfs_attr.h b/fs/xfs/xfs_attr.h
> index 34bb4cb..f4a53fd 100644
> --- a/fs/xfs/xfs_attr.h
> +++ b/fs/xfs/xfs_attr.h
> @@ -173,5 +173,12 @@ int xfs_attr_list(struct xfs_inode *dp, char *buffer, int bufsize,
>  int xfs_attr_args_init(struct xfs_da_args *args, struct xfs_inode *dp,
>  		       const unsigned char *name, int flags);
>  int xfs_attr_calc_size(struct xfs_da_args *args, int *local);
> +int xfs_attr_set_deferred(struct xfs_inode *dp, struct xfs_defer_ops *dfops,
> +			  const unsigned char *name, unsigned int name_len,
> +			  unsigned char *value, unsigned int valuelen,
> +			  int flags);
> +int xfs_attr_remove_deferred(struct xfs_inode *dp, struct xfs_defer_ops *dfops,
> +			    const unsigned char *name, unsigned int namelen,
> +			    int flags);
>  
>  #endif	/* __XFS_ATTR_H__ */
> -- 
> 2.7.4
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH 04/17] Remove all strlen calls in all xfs_attr_* functions for attr names.
  2017-10-18 22:55 ` [PATCH 04/17] Remove all strlen calls in all xfs_attr_* functions for attr names Allison Henderson
@ 2017-10-19 19:15   ` Darrick J. Wong
  2017-10-21  1:10     ` Allison Henderson
  0 siblings, 1 reply; 66+ messages in thread
From: Darrick J. Wong @ 2017-10-19 19:15 UTC (permalink / raw)
  To: Allison Henderson; +Cc: linux-xfs

On Wed, Oct 18, 2017 at 03:55:20PM -0700, Allison Henderson wrote:
> Parent pointer attributes use a binary name, so strlen will not work.
> Calling functions will need to pass in the name length
> 
> Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
> ---
>  fs/xfs/libxfs/xfs_attr.c | 12 ++++++++----
>  fs/xfs/xfs_acl.c         | 12 +++++++-----
>  fs/xfs/xfs_attr.h        |  9 +++++----
>  fs/xfs/xfs_ioctl.c       | 13 ++++++++++---
>  fs/xfs/xfs_iops.c        |  6 ++++--
>  fs/xfs/xfs_trans_attr.c  |  2 +-
>  fs/xfs/xfs_xattr.c       | 10 +++++++---
>  7 files changed, 42 insertions(+), 22 deletions(-)
> 
> diff --git a/fs/xfs/libxfs/xfs_attr.c b/fs/xfs/libxfs/xfs_attr.c
> index 59f3502..b94f0cd 100644
> --- a/fs/xfs/libxfs/xfs_attr.c
> +++ b/fs/xfs/libxfs/xfs_attr.c
> @@ -79,6 +79,7 @@ xfs_attr_args_init(
>  	struct xfs_da_args	*args,
>  	struct xfs_inode	*dp,
>  	const unsigned char	*name,
> +	int			namelen,

I think these should be size_t since they describe memory buffer sizes,
and that's what strlen() returns.

At least change it to 'unsigned int' since negative size makes no sense here...

--D

>  	int			flags)
>  {
>  
> @@ -91,7 +92,7 @@ xfs_attr_args_init(
>  	args->dp = dp;
>  	args->flags = flags;
>  	args->name = name;
> -	args->namelen = strlen((const char *)name);
> +	args->namelen = namelen;
>  	if (args->namelen >= MAXNAMELEN)
>  		return -EFAULT;		/* match IRIX behaviour */
>  
> @@ -137,6 +138,7 @@ int
>  xfs_attr_get(
>  	struct xfs_inode	*ip,
>  	const unsigned char	*name,
> +	int			namelen,
>  	unsigned char		*value,
>  	int			*valuelenp,
>  	int			flags)
> @@ -150,7 +152,7 @@ xfs_attr_get(
>  	if (XFS_FORCED_SHUTDOWN(ip->i_mount))
>  		return -EIO;
>  
> -	error = xfs_attr_args_init(&args, ip, name, flags);
> +	error = xfs_attr_args_init(&args, ip, name, namelen, flags);
>  	if (error)
>  		return error;
>  
> @@ -386,6 +388,7 @@ int
>  xfs_attr_set(
>  	struct xfs_inode	*dp,
>  	const unsigned char	*name,
> +	int			namelen,
>  	unsigned char		*value,
>  	int			valuelen,
>  	int			flags)
> @@ -402,7 +405,7 @@ xfs_attr_set(
>  	if (XFS_FORCED_SHUTDOWN(dp->i_mount))
>  		return -EIO;
>  
> -	error = xfs_attr_args_init(&args, dp, name, flags);
> +	error = xfs_attr_args_init(&args, dp, name, namelen, flags);
>  	if (error)
>  		return error;
>  
> @@ -497,6 +500,7 @@ int
>  xfs_attr_remove(
>  	struct xfs_inode	*dp,
>  	const unsigned char	*name,
> +	int			namelen,
>  	int			flags)
>  {
>  	struct xfs_mount	*mp = dp->i_mount;
> @@ -510,7 +514,7 @@ xfs_attr_remove(
>  	if (XFS_FORCED_SHUTDOWN(dp->i_mount))
>  		return -EIO;
>  
> -	error = xfs_attr_args_init(&args, dp, name, flags);
> +	error = xfs_attr_args_init(&args, dp, name, namelen, flags);
>  	if (error)
>  		return error;
>  
> diff --git a/fs/xfs/xfs_acl.c b/fs/xfs/xfs_acl.c
> index 7034e17..72eca24 100644
> --- a/fs/xfs/xfs_acl.c
> +++ b/fs/xfs/xfs_acl.c
> @@ -153,8 +153,8 @@ xfs_get_acl(struct inode *inode, int type)
>  	if (!xfs_acl)
>  		return ERR_PTR(-ENOMEM);
>  
> -	error = xfs_attr_get(ip, ea_name, (unsigned char *)xfs_acl,
> -							&len, ATTR_ROOT);
> +	error = xfs_attr_get(ip, ea_name, strlen((const char *)ea_name),
> +			     (unsigned char *)xfs_acl, &len, ATTR_ROOT);
>  	if (error) {
>  		/*
>  		 * If the attribute doesn't exist make sure we have a negative
> @@ -204,15 +204,17 @@ __xfs_set_acl(struct inode *inode, struct posix_acl *acl, int type)
>  		len -= sizeof(struct xfs_acl_entry) *
>  			 (XFS_ACL_MAX_ENTRIES(ip->i_mount) - acl->a_count);
>  
> -		error = xfs_attr_set(ip, ea_name, (unsigned char *)xfs_acl,
> -				len, ATTR_ROOT);
> +		error = xfs_attr_set(ip, ea_name, strlen((const char *)ea_name),
> +				     (unsigned char *)xfs_acl, len, ATTR_ROOT);
>  
>  		kmem_free(xfs_acl);
>  	} else {
>  		/*
>  		 * A NULL ACL argument means we want to remove the ACL.
>  		 */
> -		error = xfs_attr_remove(ip, ea_name, ATTR_ROOT);
> +		error = xfs_attr_remove(ip, ea_name,
> +					strlen((const char *)ea_name),
> +					ATTR_ROOT);
>  
>  		/*
>  		 * If the attribute didn't exist to start with that's fine.
> diff --git a/fs/xfs/xfs_attr.h b/fs/xfs/xfs_attr.h
> index f4a53fd..532567e 100644
> --- a/fs/xfs/xfs_attr.h
> +++ b/fs/xfs/xfs_attr.h
> @@ -161,17 +161,18 @@ int xfs_attr_list_int_ilocked(struct xfs_attr_list_context *);
>  int xfs_attr_list_int(struct xfs_attr_list_context *);
>  int xfs_inode_hasattr(struct xfs_inode *ip);
>  int xfs_attr_get_ilocked(struct xfs_inode *ip, struct xfs_da_args *args);
> -int xfs_attr_get(struct xfs_inode *ip, const unsigned char *name,
> +int xfs_attr_get(struct xfs_inode *ip, const unsigned char *name, int namelen,
>  		 unsigned char *value, int *valuelenp, int flags);
> -int xfs_attr_set(struct xfs_inode *dp, const unsigned char *name,
> +int xfs_attr_set(struct xfs_inode *dp, const unsigned char *name, int namelen,
>  		 unsigned char *value, int valuelen, int flags);
>  int xfs_attr_set_args(struct xfs_da_args *args, int flags, bool roll_trans);
> -int xfs_attr_remove(struct xfs_inode *dp, const unsigned char *name, int flags);
> +int xfs_attr_remove(struct xfs_inode *dp, const unsigned char *name,
> +		    int namelen, int flags);
>  int xfs_attr_remove_args(struct xfs_da_args *args, int flags);
>  int xfs_attr_list(struct xfs_inode *dp, char *buffer, int bufsize,
>  		  int flags, struct attrlist_cursor_kern *cursor);
>  int xfs_attr_args_init(struct xfs_da_args *args, struct xfs_inode *dp,
> -		       const unsigned char *name, int flags);
> +		       const unsigned char *name, int namelen, int flags);
>  int xfs_attr_calc_size(struct xfs_da_args *args, int *local);
>  int xfs_attr_set_deferred(struct xfs_inode *dp, struct xfs_defer_ops *dfops,
>  			  const unsigned char *name, unsigned int name_len,
> diff --git a/fs/xfs/xfs_ioctl.c b/fs/xfs/xfs_ioctl.c
> index aa75389..1c9f813 100644
> --- a/fs/xfs/xfs_ioctl.c
> +++ b/fs/xfs/xfs_ioctl.c
> @@ -448,6 +448,7 @@ xfs_attrmulti_attr_get(
>  {
>  	unsigned char		*kbuf;
>  	int			error = -EFAULT;
> +	int			namelen;
>  
>  	if (*len > XFS_XATTR_SIZE_MAX)
>  		return -EINVAL;
> @@ -455,7 +456,9 @@ xfs_attrmulti_attr_get(
>  	if (!kbuf)
>  		return -ENOMEM;
>  
> -	error = xfs_attr_get(XFS_I(inode), name, kbuf, (int *)len, flags);
> +	namelen = strlen((const char *)name);
> +	error = xfs_attr_get(XFS_I(inode), name, namelen,
> +			     kbuf, (int *)len, flags);
>  	if (error)
>  		goto out_kfree;
>  
> @@ -477,6 +480,7 @@ xfs_attrmulti_attr_set(
>  {
>  	unsigned char		*kbuf;
>  	int			error;
> +	int			namelen;
>  
>  	if (IS_IMMUTABLE(inode) || IS_APPEND(inode))
>  		return -EPERM;
> @@ -487,7 +491,8 @@ xfs_attrmulti_attr_set(
>  	if (IS_ERR(kbuf))
>  		return PTR_ERR(kbuf);
>  
> -	error = xfs_attr_set(XFS_I(inode), name, kbuf, len, flags);
> +	namelen = strlen((const char *)name);
> +	error = xfs_attr_set(XFS_I(inode), name, namelen, kbuf, len, flags);
>  	if (!error)
>  		xfs_forget_acl(inode, name, flags);
>  	kfree(kbuf);
> @@ -501,10 +506,12 @@ xfs_attrmulti_attr_remove(
>  	uint32_t		flags)
>  {
>  	int			error;
> +	int			namelen;
>  
>  	if (IS_IMMUTABLE(inode) || IS_APPEND(inode))
>  		return -EPERM;
> -	error = xfs_attr_remove(XFS_I(inode), name, flags);
> +	namelen = strlen((const char *)name);
> +	error = xfs_attr_remove(XFS_I(inode), name, namelen, flags);
>  	if (!error)
>  		xfs_forget_acl(inode, name, flags);
>  	return error;
> diff --git a/fs/xfs/xfs_iops.c b/fs/xfs/xfs_iops.c
> index 17081c7..5247bfc 100644
> --- a/fs/xfs/xfs_iops.c
> +++ b/fs/xfs/xfs_iops.c
> @@ -70,8 +70,10 @@ xfs_initxattrs(
>  	int			error = 0;
>  
>  	for (xattr = xattr_array; xattr->name != NULL; xattr++) {
> -		error = xfs_attr_set(ip, xattr->name, xattr->value,
> -				      xattr->value_len, ATTR_SECURE);
> +		error = xfs_attr_set(ip, xattr->name,
> +				     strlen((const char *)xattr->name),
> +				     xattr->value, xattr->value_len,
> +				     ATTR_SECURE);
>  		if (error < 0)
>  			break;
>  	}
> diff --git a/fs/xfs/xfs_trans_attr.c b/fs/xfs/xfs_trans_attr.c
> index 39eb18d..a45e9d0 100644
> --- a/fs/xfs/xfs_trans_attr.c
> +++ b/fs/xfs/xfs_trans_attr.c
> @@ -93,7 +93,7 @@ xfs_trans_attr(
>  	ASSERT(XFS_IFORK_Q((dp)));
>  	tp->t_flags |= XFS_TRANS_RESERVE;
>  
> -	error = xfs_attr_args_init(&args, dp, name, flags);
> +	error = xfs_attr_args_init(&args, dp, name, name_len, flags);
>  	if (error)
>  		return error;
>  
> diff --git a/fs/xfs/xfs_xattr.c b/fs/xfs/xfs_xattr.c
> index 0594db4..4ef09c4 100644
> --- a/fs/xfs/xfs_xattr.c
> +++ b/fs/xfs/xfs_xattr.c
> @@ -38,6 +38,7 @@ xfs_xattr_get(const struct xattr_handler *handler, struct dentry *unused,
>  	int xflags = handler->flags;
>  	struct xfs_inode *ip = XFS_I(inode);
>  	int error, asize = size;
> +	int namelen = strlen((const char *)name);
>  
>  	/* Convert Linux syscall to XFS internal ATTR flags */
>  	if (!size) {
> @@ -45,7 +46,8 @@ xfs_xattr_get(const struct xattr_handler *handler, struct dentry *unused,
>  		value = NULL;
>  	}
>  
> -	error = xfs_attr_get(ip, (unsigned char *)name, value, &asize, xflags);
> +	error = xfs_attr_get(ip, (unsigned char *)name, namelen, value,
> +			     &asize, xflags);
>  	if (error)
>  		return error;
>  	return asize;
> @@ -81,6 +83,7 @@ xfs_xattr_set(const struct xattr_handler *handler, struct dentry *unused,
>  	int			xflags = handler->flags;
>  	struct xfs_inode	*ip = XFS_I(inode);
>  	int			error;
> +	int			namelen = strlen((const char *)name);
>  
>  	/* Convert Linux syscall to XFS internal ATTR flags */
>  	if (flags & XATTR_CREATE)
> @@ -89,8 +92,9 @@ xfs_xattr_set(const struct xattr_handler *handler, struct dentry *unused,
>  		xflags |= ATTR_REPLACE;
>  
>  	if (!value)
> -		return xfs_attr_remove(ip, (unsigned char *)name, xflags);
> -	error = xfs_attr_set(ip, (unsigned char *)name,
> +		return xfs_attr_remove(ip, (unsigned char *)name,
> +				       namelen, xflags);
> +	error = xfs_attr_set(ip, (unsigned char *)name, namelen,
>  				(void *)value, size, xflags);
>  	if (!error)
>  		xfs_forget_acl(inode, name, xflags);
> -- 
> 2.7.4
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH 06/17] xfs: get directory offset when removing directory name
  2017-10-18 22:55 ` [PATCH 06/17] xfs: get directory offset when removing " Allison Henderson
@ 2017-10-19 19:17   ` Darrick J. Wong
  2017-10-21  1:11     ` Allison Henderson
  0 siblings, 1 reply; 66+ messages in thread
From: Darrick J. Wong @ 2017-10-19 19:17 UTC (permalink / raw)
  To: Allison Henderson; +Cc: linux-xfs, Mark Tinguely, Dave Chinner

On Wed, Oct 18, 2017 at 03:55:22PM -0700, Allison Henderson wrote:
> From: Mark Tinguely <tinguely@sgi.com>
> 
> Return the directory offset information when removing an entry to the
> directory.
> 
> This offset will be used as the parent pointer offset in xfs_remove.
> 
> [dchinner: forward ported and cleaned up]
> [achender: rebased, changed __unint32_t to xfs_dir2_dataptr_t]
> 
> Signed-off-by: Mark Tinguely <tinguely@sgi.com>
> Signed-off-by: Dave Chinner <dchinner@redhat.com>
> Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
> ---
> v2: Changed typedefs to raw struct types
> 
> Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
> ---
>  fs/xfs/libxfs/xfs_dir2.c       | 15 +++++++++------
>  fs/xfs/libxfs/xfs_dir2.h       |  4 +++-
>  fs/xfs/libxfs/xfs_dir2_block.c |  4 ++--
>  fs/xfs/libxfs/xfs_dir2_leaf.c  |  5 +++--
>  fs/xfs/libxfs/xfs_dir2_node.c  |  5 +++--
>  fs/xfs/libxfs/xfs_dir2_sf.c    |  2 ++
>  fs/xfs/xfs_inode.c             |  7 ++++---
>  7 files changed, 26 insertions(+), 16 deletions(-)
> 
> diff --git a/fs/xfs/libxfs/xfs_dir2.c b/fs/xfs/libxfs/xfs_dir2.c
> index a1ca460..0511eb9 100644
> --- a/fs/xfs/libxfs/xfs_dir2.c
> +++ b/fs/xfs/libxfs/xfs_dir2.c
> @@ -443,13 +443,14 @@ xfs_dir_lookup(
>   */
>  int
>  xfs_dir_removename(
> -	xfs_trans_t	*tp,
> -	xfs_inode_t	*dp,
> -	struct xfs_name	*name,
> -	xfs_ino_t	ino,
> -	xfs_fsblock_t	*first,		/* bmap's firstblock */
> +	struct xfs_trans	*tp,
> +	struct xfs_inode	*dp,
> +	struct xfs_name		*name,
> +	xfs_ino_t		ino,
> +	xfs_fsblock_t	*first,			/* bmap's firstblock */

Indentation problem?

--D

>  	struct xfs_defer_ops	*dfops,		/* bmap's freeblock list */
> -	xfs_extlen_t	total)		/* bmap's total block count */
> +	xfs_extlen_t		total,		/* bmap's total block count */
> +	xfs_dir2_dataptr_t	*offset)	/* OUT: offset in directory */
>  {
>  	struct xfs_da_args *args;
>  	int		rval;
> @@ -495,6 +496,8 @@ xfs_dir_removename(
>  		rval = xfs_dir2_leaf_removename(args);
>  	else
>  		rval = xfs_dir2_node_removename(args);
> +	if (offset)
> +		*offset = args->offset;
>  out_free:
>  	kmem_free(args);
>  	return rval;
> diff --git a/fs/xfs/libxfs/xfs_dir2.h b/fs/xfs/libxfs/xfs_dir2.h
> index e349900..e1bd05d 100644
> --- a/fs/xfs/libxfs/xfs_dir2.h
> +++ b/fs/xfs/libxfs/xfs_dir2.h
> @@ -139,7 +139,9 @@ extern int xfs_dir_lookup(struct xfs_trans *tp, struct xfs_inode *dp,
>  extern int xfs_dir_removename(struct xfs_trans *tp, struct xfs_inode *dp,
>  				struct xfs_name *name, xfs_ino_t ino,
>  				xfs_fsblock_t *first,
> -				struct xfs_defer_ops *dfops, xfs_extlen_t tot);
> +				struct xfs_defer_ops *dfops,
> +				xfs_extlen_t tot,
> +				xfs_dir2_dataptr_t *offset);
>  extern int xfs_dir_replace(struct xfs_trans *tp, struct xfs_inode *dp,
>  				struct xfs_name *name, xfs_ino_t inum,
>  				xfs_fsblock_t *first,
> diff --git a/fs/xfs/libxfs/xfs_dir2_block.c b/fs/xfs/libxfs/xfs_dir2_block.c
> index 79684d5..4dbe2fc 100644
> --- a/fs/xfs/libxfs/xfs_dir2_block.c
> +++ b/fs/xfs/libxfs/xfs_dir2_block.c
> @@ -791,9 +791,9 @@ xfs_dir2_block_removename(
>  	/*
>  	 * Point to the data entry using the leaf entry.
>  	 */
> +	args->offset = be32_to_cpu(blp[ent].address);
>  	dep = (xfs_dir2_data_entry_t *)((char *)hdr +
> -			xfs_dir2_dataptr_to_off(args->geo,
> -						be32_to_cpu(blp[ent].address)));
> +			xfs_dir2_dataptr_to_off(args->geo, args->offset));
>  	/*
>  	 * Mark the data entry's space free.
>  	 */
> diff --git a/fs/xfs/libxfs/xfs_dir2_leaf.c b/fs/xfs/libxfs/xfs_dir2_leaf.c
> index 2ac7a7e..197e627 100644
> --- a/fs/xfs/libxfs/xfs_dir2_leaf.c
> +++ b/fs/xfs/libxfs/xfs_dir2_leaf.c
> @@ -1383,9 +1383,10 @@ xfs_dir2_leaf_removename(
>  	 * Point to the leaf entry, use that to point to the data entry.
>  	 */
>  	lep = &ents[index];
> -	db = xfs_dir2_dataptr_to_db(args->geo, be32_to_cpu(lep->address));
> +	args->offset = be32_to_cpu(lep->address);
> +	db = xfs_dir2_dataptr_to_db(args->geo, args->offset);
>  	dep = (xfs_dir2_data_entry_t *)((char *)hdr +
> -		xfs_dir2_dataptr_to_off(args->geo, be32_to_cpu(lep->address)));
> +		xfs_dir2_dataptr_to_off(args->geo, args->offset));
>  	needscan = needlog = 0;
>  	oldbest = be16_to_cpu(bf[0].length);
>  	ltp = xfs_dir2_leaf_tail_p(args->geo, leaf);
> diff --git a/fs/xfs/libxfs/xfs_dir2_node.c b/fs/xfs/libxfs/xfs_dir2_node.c
> index 8bc91f8..13d5244 100644
> --- a/fs/xfs/libxfs/xfs_dir2_node.c
> +++ b/fs/xfs/libxfs/xfs_dir2_node.c
> @@ -1238,9 +1238,10 @@ xfs_dir2_leafn_remove(
>  	/*
>  	 * Extract the data block and offset from the entry.
>  	 */
> -	db = xfs_dir2_dataptr_to_db(args->geo, be32_to_cpu(lep->address));
> +	args->offset = be32_to_cpu(lep->address);
> +	db = xfs_dir2_dataptr_to_db(args->geo, args->offset);
>  	ASSERT(dblk->blkno == db);
> -	off = xfs_dir2_dataptr_to_off(args->geo, be32_to_cpu(lep->address));
> +	off = xfs_dir2_dataptr_to_off(args->geo, args->offset);
>  	ASSERT(dblk->index == off);
>  
>  	/*
> diff --git a/fs/xfs/libxfs/xfs_dir2_sf.c b/fs/xfs/libxfs/xfs_dir2_sf.c
> index 489bdef..9e90c22 100644
> --- a/fs/xfs/libxfs/xfs_dir2_sf.c
> +++ b/fs/xfs/libxfs/xfs_dir2_sf.c
> @@ -919,6 +919,8 @@ xfs_dir2_sf_removename(
>  								XFS_CMP_EXACT) {
>  			ASSERT(dp->d_ops->sf_get_ino(sfp, sfep) ==
>  			       args->inumber);
> +			args->offset = xfs_dir2_byte_to_dataptr(
> +						xfs_dir2_sf_get_offset(sfep));
>  			break;
>  		}
>  	}
> diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c
> index 3abcb17..358a98a 100644
> --- a/fs/xfs/xfs_inode.c
> +++ b/fs/xfs/xfs_inode.c
> @@ -2639,8 +2639,8 @@ xfs_remove(
>  		goto out_trans_cancel;
>  
>  	xfs_defer_init(&dfops, &first_block);
> -	error = xfs_dir_removename(tp, dp, name, ip->i_ino,
> -					&first_block, &dfops, resblks);
> +	error = xfs_dir_removename(tp, dp, name, ip->i_ino, &first_block,
> +				   &dfops, resblks, NULL);
>  	if (error) {
>  		ASSERT(error != -ENOENT);
>  		goto out_bmap_cancel;
> @@ -3150,7 +3150,8 @@ xfs_rename(
>  					&first_block, &dfops, spaceres);
>  	} else
>  		error = xfs_dir_removename(tp, src_dp, src_name, src_ip->i_ino,
> -					   &first_block, &dfops, spaceres);
> +					   &first_block, &dfops, spaceres,
> +					   NULL);
>  	if (error)
>  		goto out_bmap_cancel;
>  
> -- 
> 2.7.4
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH 12/17] xfs: parent pointer attribute creation
  2017-10-18 22:55 ` [PATCH 12/17] xfs: parent pointer attribute creation Allison Henderson
@ 2017-10-19 19:36   ` Darrick J. Wong
       [not found]     ` <9185d3e8-4b41-b2d8-294b-934f7d3409f0@oracle.com>
  2017-10-21  1:11     ` Allison Henderson
  0 siblings, 2 replies; 66+ messages in thread
From: Darrick J. Wong @ 2017-10-19 19:36 UTC (permalink / raw)
  To: Allison Henderson; +Cc: linux-xfs, Dave Chinner

On Wed, Oct 18, 2017 at 03:55:28PM -0700, Allison Henderson wrote:
> From: Dave Chinner <dchinner@redhat.com>
> 
> [bfoster: rebase, use VFS inode generation]
> [achender: rebased, changed __unint32_t to xfs_dir2_dataptr_t,
> 	   fixed some null pointer bugs]
> 
> Signed-off-by: Dave Chinner <dchinner@redhat.com>
> Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
> ---
> v2: remove unnecessary ENOSPC handling in xfs_attr_set_first_parent
> 
> Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
> ---
>  fs/xfs/Makefile            |  1 +
>  fs/xfs/libxfs/xfs_attr.c   | 71 ++++++++++++++++++++++++++++++---
>  fs/xfs/libxfs/xfs_bmap.c   | 51 ++++++++++++++----------
>  fs/xfs/libxfs/xfs_bmap.h   |  1 +
>  fs/xfs/libxfs/xfs_parent.c | 98 ++++++++++++++++++++++++++++++++++++++++++++++
>  fs/xfs/xfs_attr.h          | 15 ++++++-
>  fs/xfs/xfs_inode.c         | 16 +++++++-
>  7 files changed, 225 insertions(+), 28 deletions(-)
> 
> diff --git a/fs/xfs/Makefile b/fs/xfs/Makefile
> index ec6486b..3015bca 100644
> --- a/fs/xfs/Makefile
> +++ b/fs/xfs/Makefile
> @@ -52,6 +52,7 @@ xfs-y				+= $(addprefix libxfs/, \
>  				   xfs_inode_fork.o \
>  				   xfs_inode_buf.o \
>  				   xfs_log_rlimit.o \
> +				   xfs_parent.o \
>  				   xfs_ag_resv.o \
>  				   xfs_rmap.o \
>  				   xfs_rmap_btree.o \
> diff --git a/fs/xfs/libxfs/xfs_attr.c b/fs/xfs/libxfs/xfs_attr.c
> index 8f8bfff9..8aad242 100644
> --- a/fs/xfs/libxfs/xfs_attr.c
> +++ b/fs/xfs/libxfs/xfs_attr.c
> @@ -91,12 +91,14 @@ xfs_attr_args_init(
>  	args->whichfork = XFS_ATTR_FORK;
>  	args->dp = dp;
>  	args->flags = flags;
> -	args->name = name;
> -	args->namelen = namelen;
> -	if (args->namelen >= MAXNAMELEN)
> -		return -EFAULT;		/* match IRIX behaviour */
> +	if (name) {

When do we have a NULL name?

> +		args->name = name;
> +		args->namelen = namelen;
> +		if (args->namelen >= MAXNAMELEN)
> +			return -EFAULT;		/* match IRIX behaviour */
>  
> -	args->hashval = xfs_da_hashname(args->name, args->namelen);
> +		args->hashval = xfs_da_hashname(args->name, args->namelen);
> +	}
>  	return 0;
>  }
>  
> @@ -206,6 +208,65 @@ xfs_attr_calc_size(
>  }
>  
>  /*
> + * Add the initial parent pointer attribute.
> + *
> + * Inode must be locked and completely empty as we are adding the attribute
> + * fork to the inode. This open codes bits of xfs_bmap_add_attrfork() and
> + * xfs_attr_set() because we know the inode is completely empty at this point

Hrmm... in general I don't like opencoding bits of other functions
without a good justification.

> + * and so don't need to handle all the different combinations of fork
> + * configurations here.
> + */
> +int
> +xfs_attr_set_first_parent(
> +	struct xfs_trans	*tp,
> +	struct xfs_inode	*ip,
> +	struct xfs_parent_name_rec *rec,
> +	int			reclen,
> +	const char		*value,
> +	int			valuelen,
> +	struct xfs_defer_ops	*dfops,
> +	xfs_fsblock_t		*firstblock)

These all need one more level of indentation due to struct xfs_parent_name_rec.

> +{
> +	struct xfs_da_args	args;
> +	int			flags = ATTR_PARENT;
> +	int			local;
> +	int			sf_size;
> +	int			error;
> +
> +	tp->t_flags |= XFS_TRANS_RESERVE;
> +
> +	error = xfs_attr_args_init(&args, ip, (char *)rec, reclen, flags);
> +	if (error)
> +		return error;
> +
> +	args.name = (char *)rec;
> +	args.namelen = reclen;
> +	args.hashval = xfs_da_hashname(args.name, args.namelen);

Aren't these already set by xfs_attr_args_init?

> +	args.value = (char *)value;
> +	args.valuelen = valuelen;
> +	args.firstblock = firstblock;
> +	args.dfops = dfops;
> +	args.op_flags = XFS_DA_OP_ADDNAME | XFS_DA_OP_OKNOENT;
> +	args.total = xfs_attr_calc_size(&args, &local);
> +	args.trans = tp;
> +	ASSERT(local);
> +
> +	/* set the attribute fork appropriately */
> +	sf_size = sizeof(struct xfs_attr_sf_hdr) +
> +			XFS_ATTR_SF_ENTSIZE_BYNAME(reclen, valuelen);
> +	xfs_bmap_set_attrforkoff(ip, sf_size, NULL);
> +	ip->i_afp = kmem_zone_zalloc(xfs_ifork_zone, KM_SLEEP);
> +	ip->i_afp->if_flags = XFS_IFEXTENTS;
> +
> +
> +	/* Try to add the attr to the attribute list in the inode. */
> +	xfs_attr_shortform_create(&args);

Are we sure that we'll always be able to cram the parent attribute into
the shortform area?  Minimum inode size is 512 bytes, core size is
currently 176 bytes, max parent attribute size is ~280 bytes... I guess
that works.

But I wouldn't want this to blow up some day when the inode core gets
bigger and this no longer fits.  Will using the regular xfs_attr_set
function cover all these sizing cases?  What's the benefit to all this
short circuiting?

> +	error = xfs_attr_shortform_addname(&args);
> +
> +	return error;
> +}
> +
> +/*
>   * set the attribute specified in @args. In the case of the parent attribute
>   * being set, we do not want to roll the transaction on shortform-to-leaf
>   * conversion, as the attribute must be added in the same transaction as the
> diff --git a/fs/xfs/libxfs/xfs_bmap.c b/fs/xfs/libxfs/xfs_bmap.c
> index 044a363..7ee98be 100644
> --- a/fs/xfs/libxfs/xfs_bmap.c
> +++ b/fs/xfs/libxfs/xfs_bmap.c
> @@ -1066,6 +1066,35 @@ xfs_bmap_add_attrfork_local(
>  	return -EFSCORRUPTED;
>  }
>  
> +int
> +xfs_bmap_set_attrforkoff(
> +	struct xfs_inode	*ip,
> +	int			size,
> +	int			*version)
> +{
> +	switch (ip->i_d.di_format) {
> +	case XFS_DINODE_FMT_DEV:
> +		ip->i_d.di_forkoff = roundup(sizeof(xfs_dev_t), 8) >> 3;
> +		break;
> +	case XFS_DINODE_FMT_UUID:
> +		ip->i_d.di_forkoff = roundup(sizeof(uuid_t), 8) >> 3;
> +		break;
> +	case XFS_DINODE_FMT_LOCAL:
> +	case XFS_DINODE_FMT_EXTENTS:
> +	case XFS_DINODE_FMT_BTREE:
> +		ip->i_d.di_forkoff = xfs_attr_shortform_bytesfit(ip, size);
> +		if (!ip->i_d.di_forkoff)
> +			ip->i_d.di_forkoff = xfs_default_attroffset(ip) >> 3;
> +		else if ((ip->i_mount->m_flags & XFS_MOUNT_ATTR2) && version)
> +			*version = 2;
> +		break;
> +	default:
> +		ASSERT(0);
> +		return -EINVAL;
> +	}
> +	return 0;
> +}
> +
>  /*
>   * Convert inode from non-attributed to attributed.
>   * Must not be in a transaction, ip must not be locked.
> @@ -1120,27 +1149,7 @@ xfs_bmap_add_attrfork(
>  	xfs_trans_ijoin(tp, ip, 0);
>  	xfs_trans_log_inode(tp, ip, XFS_ILOG_CORE);
>  
> -	switch (ip->i_d.di_format) {
> -	case XFS_DINODE_FMT_DEV:
> -		ip->i_d.di_forkoff = roundup(sizeof(xfs_dev_t), 8) >> 3;
> -		break;
> -	case XFS_DINODE_FMT_UUID:
> -		ip->i_d.di_forkoff = roundup(sizeof(uuid_t), 8) >> 3;
> -		break;
> -	case XFS_DINODE_FMT_LOCAL:
> -	case XFS_DINODE_FMT_EXTENTS:
> -	case XFS_DINODE_FMT_BTREE:
> -		ip->i_d.di_forkoff = xfs_attr_shortform_bytesfit(ip, size);
> -		if (!ip->i_d.di_forkoff)
> -			ip->i_d.di_forkoff = xfs_default_attroffset(ip) >> 3;
> -		else if (mp->m_flags & XFS_MOUNT_ATTR2)
> -			version = 2;
> -		break;
> -	default:
> -		ASSERT(0);
> -		error = -EINVAL;
> -		goto trans_cancel;
> -	}
> +	xfs_bmap_set_attrforkoff(ip, size, &version);
>  
>  	ASSERT(ip->i_afp == NULL);
>  	ip->i_afp = kmem_zone_zalloc(xfs_ifork_zone, KM_SLEEP);
> diff --git a/fs/xfs/libxfs/xfs_bmap.h b/fs/xfs/libxfs/xfs_bmap.h
> index 851982a..533f40f 100644
> --- a/fs/xfs/libxfs/xfs_bmap.h
> +++ b/fs/xfs/libxfs/xfs_bmap.h
> @@ -209,6 +209,7 @@ void	xfs_bmap_trace_exlist(struct xfs_inode *ip, xfs_extnum_t cnt,
>  void	xfs_trim_extent(struct xfs_bmbt_irec *irec, xfs_fileoff_t bno,
>  		xfs_filblks_t len);
>  int	xfs_bmap_add_attrfork(struct xfs_inode *ip, int size, int rsvd);
> +int	xfs_bmap_set_attrforkoff(struct xfs_inode *ip, int size, int *version);
>  void	xfs_bmap_local_to_extents_empty(struct xfs_inode *ip, int whichfork);
>  void	xfs_bmap_add_free(struct xfs_mount *mp, struct xfs_defer_ops *dfops,
>  			  xfs_fsblock_t bno, xfs_filblks_t len,
> diff --git a/fs/xfs/libxfs/xfs_parent.c b/fs/xfs/libxfs/xfs_parent.c
> new file mode 100644
> index 0000000..88f7edc
> --- /dev/null
> +++ b/fs/xfs/libxfs/xfs_parent.c
> @@ -0,0 +1,98 @@
> +/*
> + * Copyright (c) 2015 Red Hat, Inc.
> + * All rights reserved.
> + *
> + * This program is free software; you can redistribute it and/or
> + * modify it under the terms of the GNU General Public License as
> + * published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope that it would be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + *
> + * You should have received a copy of the GNU General Public License
> + * along with this program; if not, write the Free Software Foundation
> + */
> +#include "xfs.h"
> +#include "xfs_fs.h"
> +#include "xfs_format.h"
> +#include "xfs_log_format.h"
> +#include "xfs_shared.h"
> +#include "xfs_trans_resv.h"
> +#include "xfs_mount.h"
> +#include "xfs_bmap_btree.h"
> +#include "xfs_inode.h"
> +#include "xfs_error.h"
> +#include "xfs_trace.h"
> +#include "xfs_trans.h"
> +#include "xfs_attr.h"
> +
> +/*
> + * Parent pointer attribute handling.
> + *
> + * Because the attribute value is a filename component, it will never be longer
> + * than 255 bytes. This means the attribute will always be a local format
> + * attribute as it is xfs_attr_leaf_entsize_local_max() for v5 filesystems will
> + * always be larger than this (max is 75% of block size).
> + *
> + * Creating a new parent attribute will always create a new attribute - there
> + * should never, ever be an existing attribute in the tree for a new inode.
> + * ENOSPC behaviour is problematic - creating the inode without the parent
> + * pointer is effectively a corruption, so we allow parent attribute creation
> + * to dip into the reserve block pool to avoid unexpected ENOSPC errors from
> + * occurring.
> + */
> +
> +/*
> + * Create the initial parent attribute.
> + *
> + * The initial attribute creation also needs to be atomic w.r.t the parent
> + * directory modification. Hence it needs to run in the same transaction and the
> + * transaction committed by the caller.  Because the attribute created is
> + * guaranteed to be a local attribute and is always going to be the first
> + * attribute in the attribute fork, we can do this safely in the single
> + * transaction context as it is impossible for an overwrite to occur and hence
> + * we'll never have a rolling overwrite transaction occurring here. Hence we
> + * can short-cut a lot of the normal xfs_attr_set() code paths that are needed
> + * to handle the generic cases.

Is there some other part of inode creation (ACL propagation?) that
thinks it could be the creator of the first attribute and will react
negatively to this?

> + */
> +static int
> +xfs_parent_create_nrec(
> +	struct xfs_trans	*tp,
> +	struct xfs_inode	*child,
> +	struct xfs_parent_name_irec *nrec,
> +	struct xfs_defer_ops	*dfops,
> +	xfs_fsblock_t		*firstblock)
> +{
> +	struct xfs_parent_name_rec rec;
> +
> +	rec.p_ino = cpu_to_be64(nrec->p_ino);
> +	rec.p_gen = cpu_to_be32(nrec->p_gen);
> +	rec.p_diroffset = cpu_to_be32(nrec->p_diroffset);

The disk->header and header->disk converters should be their own
functions so that later when I add parent pointer iterators I can pass
the irec to the iterator function directly.

(Granted I could just as easily do that later in my own patch...)

> +
> +	return xfs_attr_set_first_parent(tp, child, &rec, sizeof(rec),
> +				   nrec->p_name, nrec->p_namelen,
> +				   dfops, firstblock);
> +}
> +
> +int
> +xfs_parent_create(

What's this function do?  (Needs comment.)

--D

> +	struct xfs_trans	*tp,
> +	struct xfs_inode	*parent,
> +	struct xfs_inode	*child,
> +	struct xfs_name		*child_name,
> +	xfs_dir2_dataptr_t	diroffset,
> +	struct xfs_defer_ops	*dfops,
> +	xfs_fsblock_t		*firstblock)
> +{
> +	struct xfs_parent_name_irec nrec;
> +
> +	nrec.p_ino = parent->i_ino;
> +	nrec.p_gen = VFS_I(parent)->i_generation;
> +	nrec.p_diroffset = diroffset;
> +	nrec.p_name = child_name->name;
> +	nrec.p_namelen = child_name->len;
> +
> +	return xfs_parent_create_nrec(tp, child, &nrec, dfops, firstblock);
> +}
> diff --git a/fs/xfs/xfs_attr.h b/fs/xfs/xfs_attr.h
> index 7901c3b..b48e31b 100644
> --- a/fs/xfs/xfs_attr.h
> +++ b/fs/xfs/xfs_attr.h
> @@ -19,6 +19,8 @@
>  #define	__XFS_ATTR_H__
>  
>  #include "libxfs/xfs_defer.h"
> +#include "libxfs/xfs_da_format.h"
> +#include "libxfs/xfs_format.h"
>  
>  struct xfs_inode;
>  struct xfs_da_args;
> @@ -183,5 +185,16 @@ int xfs_attr_set_deferred(struct xfs_inode *dp, struct xfs_defer_ops *dfops,
>  int xfs_attr_remove_deferred(struct xfs_inode *dp, struct xfs_defer_ops *dfops,
>  			    const unsigned char *name, unsigned int namelen,
>  			    int flags);
> -
> +/*
> + * Parent pointer attribute prototypes
> + */
> +int xfs_parent_create(struct xfs_trans *tp, struct xfs_inode *parent,
> +		      struct xfs_inode *child, struct xfs_name *child_name,
> +		      xfs_dir2_dataptr_t diroffset, struct xfs_defer_ops *dfops,
> +		      xfs_fsblock_t *firstblock);
> +int xfs_attr_set_first_parent(struct xfs_trans *tp, struct xfs_inode *ip,
> +			      struct xfs_parent_name_rec *rec, int reclen,
> +			      const char *value, int valuelen,
> +			      struct xfs_defer_ops *dfops,
> +			      xfs_fsblock_t *firstblock);
>  #endif	/* __XFS_ATTR_H__ */
> diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c
> index f7986d8..4396561 100644
> --- a/fs/xfs/xfs_inode.c
> +++ b/fs/xfs/xfs_inode.c
> @@ -1164,6 +1164,7 @@ xfs_create(
>  	struct xfs_dquot	*pdqp = NULL;
>  	struct xfs_trans_res	*tres;
>  	uint			resblks;
> +	xfs_dir2_dataptr_t	diroffset;
>  
>  	trace_xfs_create(dp, name);
>  
> @@ -1253,7 +1254,7 @@ xfs_create(
>  	error = xfs_dir_createname(tp, dp, name, ip->i_ino,
>  					&first_block, &dfops, resblks ?
>  					resblks - XFS_IALLOC_SPACE_RES(mp) : 0,
> -					NULL);
> +					&diroffset);
>  	if (error) {
>  		ASSERT(error != -ENOSPC);
>  		goto out_trans_cancel;
> @@ -1272,6 +1273,19 @@ xfs_create(
>  	}
>  
>  	/*
> +	 * If we have parent pointers, we need to add the attribute containing
> +	 * the parent information now. This must be done within the same
> +	 * transaction the directory entry is created, while the new inode
> +	 * contains nothing in the inode literal area.
> +	 */
> +	if (xfs_sb_version_hasparent(&mp->m_sb)) {
> +		error = xfs_parent_create(tp, dp, ip, name, diroffset,
> +					  &dfops, &first_block);
> +		if (error)
> +			goto out_bmap_cancel;
> +	}
> +
> +	/*
>  	 * If this is a synchronous mount, make sure that the
>  	 * create transaction goes to disk before returning to
>  	 * the user.
> -- 
> 2.7.4
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH 13/17] xfs: add parent attributes to link
  2017-10-18 22:55 ` [PATCH 13/17] xfs: add parent attributes to link Allison Henderson
@ 2017-10-19 19:40   ` Darrick J. Wong
  2017-10-21  1:12     ` Allison Henderson
  0 siblings, 1 reply; 66+ messages in thread
From: Darrick J. Wong @ 2017-10-19 19:40 UTC (permalink / raw)
  To: Allison Henderson; +Cc: linux-xfs, Dave Chinner

On Wed, Oct 18, 2017 at 03:55:29PM -0700, Allison Henderson wrote:
> From: Dave Chinner <dchinner@redhat.com>

Needs a description of what w're doing and maybe why...

> [bfoster: rebase, use VFS inode fields, fix xfs_bmap_finish() usage]
> [achender: rebased, changed __unint32_t to xfs_dir2_dataptr_t,
> 	   fixed null pointer bugs]
> 
> Signed-off-by: Dave Chinner <dchinner@redhat.com>
> Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
> ---
>  fs/xfs/libxfs/xfs_attr.c   | 20 +++++++++++++-
>  fs/xfs/libxfs/xfs_parent.c | 43 ++++++++++++++++++++++++++++++
>  fs/xfs/xfs_attr.h          | 10 +++++++
>  fs/xfs/xfs_inode.c         | 66 ++++++++++++++++++++++++++++++++++++----------
>  4 files changed, 124 insertions(+), 15 deletions(-)
> 
> diff --git a/fs/xfs/libxfs/xfs_attr.c b/fs/xfs/libxfs/xfs_attr.c
> index 8aad242..e7692ef 100644
> --- a/fs/xfs/libxfs/xfs_attr.c
> +++ b/fs/xfs/libxfs/xfs_attr.c
> @@ -35,6 +35,7 @@
>  #include "xfs_bmap_util.h"
>  #include "xfs_bmap_btree.h"
>  #include "xfs_attr.h"
> +#include "xfs_attr_sf.h"
>  #include "xfs_attr_leaf.h"
>  #include "xfs_attr_remote.h"
>  #include "xfs_error.h"
> @@ -266,6 +267,23 @@ xfs_attr_set_first_parent(
>  	return error;
>  }
>  
> +int
> +xfs_attr_set_parent(
> +	struct xfs_trans	*tp,
> +	struct xfs_inode	*ip,
> +	struct xfs_parent_name_rec *rec,
> +	int			reclen,
> +	const char		*value,
> +	int			valuelen,
> +	struct xfs_defer_ops	*dfops,
> +	xfs_fsblock_t		*firstblock)
> +{
> +	int                     flags = ATTR_PARENT;
> +
> +	return xfs_attr_set_deferred(ip, dfops, (char *)rec, reclen,
> +				    (char *)value, valuelen, flags);
> +}
> +
>  /*
>   * set the attribute specified in @args. In the case of the parent attribute
>   * being set, we do not want to roll the transaction on shortform-to-leaf
> @@ -512,8 +530,8 @@ xfs_attr_set(
>  	 */
>  	xfs_trans_log_inode(args.trans, dp, XFS_ILOG_CORE);
>  	error = xfs_trans_commit(args.trans);
> -	xfs_iunlock(dp, XFS_ILOCK_EXCL);
>  
> +	xfs_iunlock(dp, XFS_ILOCK_EXCL);
>  	return error;
>  
>  out_defer_cancel:
> diff --git a/fs/xfs/libxfs/xfs_parent.c b/fs/xfs/libxfs/xfs_parent.c
> index 88f7edc..0707336 100644
> --- a/fs/xfs/libxfs/xfs_parent.c
> +++ b/fs/xfs/libxfs/xfs_parent.c
> @@ -96,3 +96,46 @@ xfs_parent_create(
>  
>  	return xfs_parent_create_nrec(tp, child, &nrec, dfops, firstblock);
>  }
> +
> +static int
> +xfs_parent_add_nrec(
> +	struct xfs_trans	*tp,
> +	struct xfs_inode	*child,
> +	struct xfs_parent_name_irec *nrec,
> +	struct xfs_defer_ops	*dfops,
> +	xfs_fsblock_t		*firstblock)
> +{
> +	struct xfs_parent_name_rec rec;
> +
> +	rec.p_ino = cpu_to_be64(nrec->p_ino);
> +	rec.p_gen = cpu_to_be32(nrec->p_gen);
> +	rec.p_diroffset = cpu_to_be32(nrec->p_diroffset);
> +
> +	return xfs_attr_set_parent(tp, child, &rec, sizeof(rec),
> +				   nrec->p_name, nrec->p_namelen,
> +				   dfops, firstblock);
> +}
> +
> +/*
> + * Add a parent record to an inode with existing parent records.
> + */
> +int
> +xfs_parent_add(
> +	struct xfs_trans	*tp,
> +	struct xfs_inode	*parent,
> +	struct xfs_inode	*child,
> +	struct xfs_name		*child_name,
> +	uint32_t		diroffset,
> +	struct xfs_defer_ops	*dfops,
> +	xfs_fsblock_t		*firstblock)
> +{
> +	struct xfs_parent_name_irec nrec;
> +
> +	nrec.p_ino = parent->i_ino;
> +	nrec.p_gen = VFS_I(parent)->i_generation;
> +	nrec.p_diroffset = diroffset;
> +	nrec.p_name = child_name->name;
> +	nrec.p_namelen = child_name->len;
> +
> +	return xfs_parent_add_nrec(tp, child, &nrec, dfops, firstblock);
> +}
> diff --git a/fs/xfs/xfs_attr.h b/fs/xfs/xfs_attr.h
> index b48e31b..acb6157 100644
> --- a/fs/xfs/xfs_attr.h
> +++ b/fs/xfs/xfs_attr.h
> @@ -197,4 +197,14 @@ int xfs_attr_set_first_parent(struct xfs_trans *tp, struct xfs_inode *ip,
>  			      const char *value, int valuelen,
>  			      struct xfs_defer_ops *dfops,
>  			      xfs_fsblock_t *firstblock);
> +
> +int xfs_parent_add(struct xfs_trans *tp, struct xfs_inode *parent,
> +		   struct xfs_inode *child, struct xfs_name *child_name,
> +		   xfs_dir2_dataptr_t diroffset, struct xfs_defer_ops *dfops,
> +		   xfs_fsblock_t *firstblock);
> +int xfs_attr_set_parent(struct xfs_trans *tp, struct xfs_inode *ip,
> +			struct xfs_parent_name_rec *rec, int reclen,
> +			const char *value, int valuelen,
> +			struct xfs_defer_ops *dfops, xfs_fsblock_t *firstblock);
> +
>  #endif	/* __XFS_ATTR_H__ */
> diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c
> index 4396561..51b623b 100644
> --- a/fs/xfs/xfs_inode.c
> +++ b/fs/xfs/xfs_inode.c
> @@ -1451,6 +1451,8 @@ xfs_link(
>  	struct xfs_defer_ops	dfops;
>  	xfs_fsblock_t           first_block;
>  	int			resblks;
> +	uint32_t		diroffset;
> +	bool			first_parent = false;
>  
>  	trace_xfs_link(tdp, target_name);
>  
> @@ -1467,6 +1469,25 @@ xfs_link(
>  	if (error)
>  		goto std_return;
>  
> +	/*
> +	 * If we have parent pointers and there is no attribute fork (i.e. we
> +	 * are linking in a O_TMPFILE created inode) we need to add the
> +	 * attribute fork to the inode. Because we may have an existing data
> +	 * fork, we do this before we start the link transaction as adding an
> +	 * attribute fork requires it's own transaction.

Ok, so an inode that isn't pointed to by any directory will have zero
parent link attributes and possibly not even an attr fork.  Got it.

--D

> +	 */
> +	if (xfs_sb_version_hasparent(&mp->m_sb) && !xfs_inode_hasattr(sip)) {
> +		int sf_size = sizeof(struct xfs_attr_sf_hdr) +
> +				XFS_ATTR_SF_ENTSIZE_BYNAME(
> +					sizeof(struct xfs_parent_name_rec),
> +					target_name->len);
> +		ASSERT(VFS_I(sip)->i_nlink == 0);
> +		error = xfs_bmap_add_attrfork(sip, sf_size, 0);
> +		if (error)
> +			goto std_return;
> +		first_parent = true;
> +	}
> +
>  	resblks = XFS_LINK_SPACE_RES(mp, target_name->len);
>  	error = xfs_trans_alloc(mp, &M_RES(mp)->tr_link, resblks, 0, 0, &tp);
>  	if (error == -ENOSPC) {
> @@ -1498,8 +1519,6 @@ xfs_link(
>  			goto error_return;
>  	}
>  
> -	xfs_defer_init(&dfops, &first_block);
> -
>  	/*
>  	 * Handle initial link state of O_TMPFILE inode
>  	 */
> @@ -1509,36 +1528,55 @@ xfs_link(
>  			goto error_return;
>  	}
>  
> +	xfs_defer_init(&dfops, &first_block);
>  	error = xfs_dir_createname(tp, tdp, target_name, sip->i_ino,
> -				   &first_block, &dfops, resblks, NULL);
> +				   &first_block, &dfops, resblks, &diroffset);
>  	if (error)
> -		goto error_return;
> +		goto out_defer_cancel;
>  	xfs_trans_ichgtime(tp, tdp, XFS_ICHGTIME_MOD | XFS_ICHGTIME_CHG);
>  	xfs_trans_log_inode(tp, tdp, XFS_ILOG_CORE);
>  
>  	error = xfs_bumplink(tp, sip);
>  	if (error)
> -		goto error_return;
> +		goto out_defer_cancel;
>  
>  	/*
> -	 * If this is a synchronous mount, make sure that the
> -	 * link transaction goes to disk before returning to
> -	 * the user.
> +	 * If we have parent pointers, we now need to add the parent record to
> +	 * the attribute fork of the inode. If this is the initial parent
> +	 * atribute, we need to create it correctly, otherwise we can just add
> +	 * the parent to the inode.
> +	 */
> +	if (xfs_sb_version_hasparent(&mp->m_sb)) {
> +		if (first_parent)
> +			error = xfs_parent_create(tp, tdp, sip, target_name,
> +						  diroffset, &dfops,
> +						  &first_block);
> +		else
> +			error = xfs_parent_add(tp, tdp, sip, target_name,
> +					       diroffset, &dfops,
> +					       &first_block);
> +		if (error)
> +			goto out_defer_cancel;
> +	}
> +
> +	/*
> +	 * If this is a synchronous mount, make sure that the link transaction
> +	 * goes to disk before returning to the user.
>  	 */
>  	if (mp->m_flags & (XFS_MOUNT_WSYNC|XFS_MOUNT_DIRSYNC))
>  		xfs_trans_set_sync(tp);
>  
>  	error = xfs_defer_finish(&tp, &dfops);
> -	if (error) {
> -		xfs_defer_cancel(&dfops);
> -		goto error_return;
> -	}
> +	if (error)
> +		goto out_defer_cancel;
>  
>  	return xfs_trans_commit(tp);
>  
> - error_return:
> +out_defer_cancel:
> +	xfs_defer_cancel(&dfops);
> +error_return:
>  	xfs_trans_cancel(tp);
> - std_return:
> +std_return:
>  	return error;
>  }
>  
> -- 
> 2.7.4
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH 14/17] xfs: remove parent pointers in unlink
  2017-10-18 22:55 ` [PATCH 14/17] xfs: remove parent pointers in unlink Allison Henderson
@ 2017-10-19 19:43   ` Darrick J. Wong
  2017-10-21  1:12     ` Allison Henderson
  0 siblings, 1 reply; 66+ messages in thread
From: Darrick J. Wong @ 2017-10-19 19:43 UTC (permalink / raw)
  To: Allison Henderson; +Cc: linux-xfs, Dave Chinner

On Wed, Oct 18, 2017 at 03:55:30PM -0700, Allison Henderson wrote:
> From: Dave Chinner <dchinner@redhat.com>
> 
> [bfoster: rebase, use VFS inode generation]
> [achender: rebased, changed __unint32_t to xfs_dir2_dataptr_t
> 	   implemented xfs_attr_remove_parent]
> 
> Signed-off-by: Dave Chinner <dchinner@redhat.com>
> Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
> ---
>  fs/xfs/libxfs/xfs_attr.c   | 15 +++++++++++++++
>  fs/xfs/libxfs/xfs_parent.c | 22 ++++++++++++++++++++++
>  fs/xfs/xfs_attr.h          |  7 +++++++
>  fs/xfs/xfs_inode.c         | 10 +++++++++-
>  fs/xfs/xfs_qm.c            |  2 +-
>  fs/xfs/xfs_qm.h            |  1 +
>  6 files changed, 55 insertions(+), 2 deletions(-)
> 
> diff --git a/fs/xfs/libxfs/xfs_attr.c b/fs/xfs/libxfs/xfs_attr.c
> index e7692ef..7547eb7 100644
> --- a/fs/xfs/libxfs/xfs_attr.c
> +++ b/fs/xfs/libxfs/xfs_attr.c
> @@ -42,6 +42,7 @@
>  #include "xfs_quota.h"
>  #include "xfs_trans_space.h"
>  #include "xfs_trace.h"
> +#include "xfs_qm.h"
>  
>  /*
>   * xfs_attr.c
> @@ -571,6 +572,20 @@ xfs_attr_set_deferred(
>  	return 0;
>  }
>  
> +int
> +xfs_attr_remove_parent(
> +	struct xfs_trans		*tp,
> +	struct xfs_inode		*dp,
> +	struct xfs_parent_name_rec	*rec,
> +	int				reclen,
> +	struct xfs_defer_ops		*dfops,
> +	xfs_fsblock_t			*firstblock)
> +{
> +	int flags = ATTR_PARENT;
> +
> +	return xfs_attr_remove_deferred(dp, dfops, (char *) rec, reclen, flags);

return xfs_attr_remove_deferred(dp, dfops, (char *) rec, reclen, ATTR_PARENT); ?

What do you think of changing these prototypes to take (void *) so you
don't have to cast so much?  The name and value are (more or less) a
dumb array of bytes to the filesystem.

Also kinda wondering if the corresponding routines in xfs_parent.c could
just call the xfs_attr functions directly instead of jumping through
single line helpers...

> +}
> +
>  /*
>   * Generic handler routine to remove a name from an attribute list.
>   * Transitions attribute list from Btree to shortform as necessary.
> diff --git a/fs/xfs/libxfs/xfs_parent.c b/fs/xfs/libxfs/xfs_parent.c
> index 0707336..ca695c4 100644
> --- a/fs/xfs/libxfs/xfs_parent.c
> +++ b/fs/xfs/libxfs/xfs_parent.c
> @@ -139,3 +139,25 @@ xfs_parent_add(
>  
>  	return xfs_parent_add_nrec(tp, child, &nrec, dfops, firstblock);
>  }
> +
> +/*
> + * Remove a parent record from a child inode.
> + */
> +int
> +xfs_parent_remove(
> +	struct xfs_trans	*tp,
> +	struct xfs_inode	*parent,
> +	struct xfs_inode	*child,
> +	xfs_dir2_dataptr_t	diroffset,
> +	struct xfs_defer_ops	*dfops,
> +	xfs_fsblock_t		*firstblock)
> +{
> +	struct xfs_parent_name_rec rec;
> +
> +	rec.p_ino = cpu_to_be64(parent->i_ino);
> +	rec.p_gen = cpu_to_be32(VFS_I(parent)->i_generation);
> +	rec.p_diroffset = cpu_to_be32(diroffset);
> +
> +	return xfs_attr_remove_parent(tp, child, &rec, sizeof(rec),
> +				      dfops, firstblock);
> +}
> diff --git a/fs/xfs/xfs_attr.h b/fs/xfs/xfs_attr.h
> index acb6157..7a3bf8b 100644
> --- a/fs/xfs/xfs_attr.h
> +++ b/fs/xfs/xfs_attr.h
> @@ -207,4 +207,11 @@ int xfs_attr_set_parent(struct xfs_trans *tp, struct xfs_inode *ip,
>  			const char *value, int valuelen,
>  			struct xfs_defer_ops *dfops, xfs_fsblock_t *firstblock);
>  
> +int xfs_parent_remove(struct xfs_trans *tp, struct xfs_inode *parent,
> +		      struct xfs_inode *child, xfs_dir2_dataptr_t diroffset,
> +		      struct xfs_defer_ops *dfops, xfs_fsblock_t *firstblock);
> +int xfs_attr_remove_parent(struct xfs_trans *tp, struct xfs_inode *ip,
> +			struct xfs_parent_name_rec *rec, int reclen,
> +			struct xfs_defer_ops *dfops, xfs_fsblock_t *firstblock);
> +
>  #endif	/* __XFS_ATTR_H__ */
> diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c
> index 51b623b..a360c3d 100644
> --- a/fs/xfs/xfs_inode.c
> +++ b/fs/xfs/xfs_inode.c
> @@ -2612,6 +2612,7 @@ xfs_remove(
>  	struct xfs_defer_ops	dfops;
>  	xfs_fsblock_t           first_block;
>  	uint			resblks;
> +	uint32_t		dir_offset;
>  
>  	trace_xfs_remove(dp, name);
>  
> @@ -2692,12 +2693,19 @@ xfs_remove(
>  
>  	xfs_defer_init(&dfops, &first_block);
>  	error = xfs_dir_removename(tp, dp, name, ip->i_ino, &first_block,
> -				   &dfops, resblks, NULL);
> +				   &dfops, resblks, &dir_offset);
>  	if (error) {
>  		ASSERT(error != -ENOENT);
>  		goto out_bmap_cancel;
>  	}
>  
> +	if (xfs_sb_version_hasparent(&mp->m_sb)) {
> +		error = xfs_parent_remove(tp, dp, ip, dir_offset, &dfops,
> +					  &first_block);
> +		if (error)
> +			goto out_bmap_cancel;
> +	}
> +
>  	/*
>  	 * If this is a synchronous mount, make sure that the
>  	 * remove transaction goes to disk before returning to
> diff --git a/fs/xfs/xfs_qm.c b/fs/xfs/xfs_qm.c
> index 010a13a..a047f0f 100644
> --- a/fs/xfs/xfs_qm.c
> +++ b/fs/xfs/xfs_qm.c
> @@ -307,7 +307,7 @@ xfs_qm_dqattach_one(
>  	return 0;
>  }
>  
> -static bool
> +bool
>  xfs_qm_need_dqattach(
>  	struct xfs_inode	*ip)
>  {
> diff --git a/fs/xfs/xfs_qm.h b/fs/xfs/xfs_qm.h
> index 2975a82..9976369 100644
> --- a/fs/xfs/xfs_qm.h
> +++ b/fs/xfs/xfs_qm.h
> @@ -176,6 +176,7 @@ extern int		xfs_qm_scall_setqlim(struct xfs_mount *, xfs_dqid_t, uint,
>  					struct qc_dqblk *);
>  extern int		xfs_qm_scall_quotaon(struct xfs_mount *, uint);
>  extern int		xfs_qm_scall_quotaoff(struct xfs_mount *, uint);
> +extern bool		xfs_qm_need_dqattach(struct xfs_inode *ip);

Huh?

--D

>  
>  static inline struct xfs_def_quota *
>  xfs_get_defquota(struct xfs_dquot *dqp, struct xfs_quotainfo *qi)
> -- 
> 2.7.4
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH 15/17] xfs_bmap_add_attrfork(): re-add error handling from set_attrforkoff() call
  2017-10-18 22:55 ` [PATCH 15/17] xfs_bmap_add_attrfork(): re-add error handling from set_attrforkoff() call Allison Henderson
@ 2017-10-19 19:43   ` Darrick J. Wong
  2017-10-21  1:13     ` Allison Henderson
  0 siblings, 1 reply; 66+ messages in thread
From: Darrick J. Wong @ 2017-10-19 19:43 UTC (permalink / raw)
  To: Allison Henderson; +Cc: linux-xfs, Brian Foster

On Wed, Oct 18, 2017 at 03:55:31PM -0700, Allison Henderson wrote:
> From: Brian Foster <bfoster@redhat.com>
> 
> - fix for "xfs: parent pointer attribute creation"
> 
> [achender: rebased]

Please fold this into that patch.

--D

> 
> Signed-off-by: Brian Foster <bfoster@redhat.com>
> Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
> ---
>  fs/xfs/libxfs/xfs_bmap.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/fs/xfs/libxfs/xfs_bmap.c b/fs/xfs/libxfs/xfs_bmap.c
> index 7ee98be..a631fe1 100644
> --- a/fs/xfs/libxfs/xfs_bmap.c
> +++ b/fs/xfs/libxfs/xfs_bmap.c
> @@ -1149,7 +1149,9 @@ xfs_bmap_add_attrfork(
>  	xfs_trans_ijoin(tp, ip, 0);
>  	xfs_trans_log_inode(tp, ip, XFS_ILOG_CORE);
>  
> -	xfs_bmap_set_attrforkoff(ip, size, &version);
> +	error = xfs_bmap_set_attrforkoff(ip, size, &version);
> +	if (error)
> +		goto trans_cancel;
>  
>  	ASSERT(ip->i_afp == NULL);
>  	ip->i_afp = kmem_zone_zalloc(xfs_ifork_zone, KM_SLEEP);
> -- 
> 2.7.4
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH 17/17] Add the parent pointer support to the superblock version 5.
  2017-10-18 22:55 ` [PATCH 17/17] Add the parent pointer support to the superblock version 5 Allison Henderson
  2017-10-19  3:57   ` Amir Goldstein
@ 2017-10-19 19:45   ` Darrick J. Wong
  2017-10-21  1:13     ` Allison Henderson
  1 sibling, 1 reply; 66+ messages in thread
From: Darrick J. Wong @ 2017-10-19 19:45 UTC (permalink / raw)
  To: Allison Henderson; +Cc: linux-xfs, Mark Tinguely, Dave Chinner

On Wed, Oct 18, 2017 at 03:55:33PM -0700, Allison Henderson wrote:
> [dchinner: forward ported and cleaned up]
> [achender: rebased and added parent pointer attribute to
>            compatible attributes mask]
> 
> Signed-off-by: Mark Tinguely <tinguely@sgi.com>
> Signed-off-by: Dave Chinner <dchinner@redhat.com>
> Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
> ---
> v2: remove unrelated type clean up in xfs_format.h
> 
> Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
> ---
>  fs/xfs/libxfs/xfs_format.h | 7 +++++--
>  fs/xfs/libxfs/xfs_fs.h     | 1 +
>  fs/xfs/xfs_fsops.c         | 4 +++-
>  3 files changed, 9 insertions(+), 3 deletions(-)
> 
> diff --git a/fs/xfs/libxfs/xfs_format.h b/fs/xfs/libxfs/xfs_format.h
> index 121862a..f3e3132 100644
> --- a/fs/xfs/libxfs/xfs_format.h
> +++ b/fs/xfs/libxfs/xfs_format.h
> @@ -459,10 +459,12 @@ xfs_sb_has_compat_feature(
>  #define XFS_SB_FEAT_RO_COMPAT_FINOBT   (1 << 0)		/* free inode btree */
>  #define XFS_SB_FEAT_RO_COMPAT_RMAPBT   (1 << 1)		/* reverse map btree */
>  #define XFS_SB_FEAT_RO_COMPAT_REFLINK  (1 << 2)		/* reflinked files */
> +#define XFS_SB_FEAT_RO_COMPAT_PARENT	(1 << 3)	/* parent inode ptr */
>  #define XFS_SB_FEAT_RO_COMPAT_ALL \
>  		(XFS_SB_FEAT_RO_COMPAT_FINOBT | \
>  		 XFS_SB_FEAT_RO_COMPAT_RMAPBT | \
> -		 XFS_SB_FEAT_RO_COMPAT_REFLINK)
> +		 XFS_SB_FEAT_RO_COMPAT_REFLINK| \
> +		 XFS_SB_FEAT_RO_COMPAT_PARENT)
>  #define XFS_SB_FEAT_RO_COMPAT_UNKNOWN	~XFS_SB_FEAT_RO_COMPAT_ALL
>  static inline bool
>  xfs_sb_has_ro_compat_feature(
> @@ -558,7 +560,8 @@ static inline bool xfs_sb_version_hasreflink(struct xfs_sb *sbp)
>  
>  static inline bool xfs_sb_version_hasparent(struct xfs_sb *sbp)
>  {
> -	return false; /* We'll enable this at the end of the set */
> +	return (XFS_SB_VERSION_NUM(sbp) == XFS_SB_VERSION_5 &&
> +		(sbp->sb_features_ro_compat & XFS_SB_FEAT_RO_COMPAT_PARENT));
>  }
>  
>  /*
> diff --git a/fs/xfs/libxfs/xfs_fs.h b/fs/xfs/libxfs/xfs_fs.h
> index 8c61f21..b8108f8 100644
> --- a/fs/xfs/libxfs/xfs_fs.h
> +++ b/fs/xfs/libxfs/xfs_fs.h
> @@ -222,6 +222,7 @@ typedef struct xfs_fsop_resblks {
>  #define XFS_FSOP_GEOM_FLAGS_SPINODES	0x40000	/* sparse inode chunks	*/
>  #define XFS_FSOP_GEOM_FLAGS_RMAPBT	0x80000	/* reverse mapping btree */
>  #define XFS_FSOP_GEOM_FLAGS_REFLINK	0x100000 /* files can share blocks */
> +#define XFS_FSOP_GEOM_FLAGS_PARENT	0x200000 /* parent pointers */
>  
>  /*
>   * Minimum and maximum sizes need for growth checks.
> diff --git a/fs/xfs/xfs_fsops.c b/fs/xfs/xfs_fsops.c
> index 8f22fc5..9a0ce52 100644
> --- a/fs/xfs/xfs_fsops.c
> +++ b/fs/xfs/xfs_fsops.c
> @@ -111,7 +111,9 @@ xfs_fs_geometry(
>  			(xfs_sb_version_hasrmapbt(&mp->m_sb) ?
>  				XFS_FSOP_GEOM_FLAGS_RMAPBT : 0) |
>  			(xfs_sb_version_hasreflink(&mp->m_sb) ?
> -				XFS_FSOP_GEOM_FLAGS_REFLINK : 0);
> +				XFS_FSOP_GEOM_FLAGS_REFLINK : 0) |
> +			(xfs_sb_version_hasparent(&mp->m_sb) ?
> +				XFS_FSOP_GEOM_FLAGS_PARENT : 0);
>  		geo->logsectsize = xfs_sb_version_hassector(&mp->m_sb) ?
>  				mp->m_sb.sb_logsectsize : BBSIZE;
>  		geo->rtsectsize = mp->m_sb.sb_blocksize;

xfs_fs_fill_super ought to have a warning about parent pointers being
experimental.

--D

> -- 
> 2.7.4
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH 01/17] Add helper functions xfs_attr_set_args and xfs_attr_remove_args
  2017-10-18 22:55 ` [PATCH 01/17] Add helper functions xfs_attr_set_args and xfs_attr_remove_args Allison Henderson
@ 2017-10-19 20:03   ` Darrick J. Wong
  2017-10-21  1:14     ` Allison Henderson
  0 siblings, 1 reply; 66+ messages in thread
From: Darrick J. Wong @ 2017-10-19 20:03 UTC (permalink / raw)
  To: Allison Henderson; +Cc: linux-xfs

On Wed, Oct 18, 2017 at 03:55:17PM -0700, Allison Henderson wrote:
> These sub-routines set or remove the attributes specified in
> @args. We will use this later for setting parent pointers as a
> deferred attribute operation.
> 
> Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
> ---
>  fs/xfs/libxfs/xfs_attr.c | 322 +++++++++++++++++++++++++++--------------------
>  fs/xfs/xfs_attr.h        |   2 +
>  2 files changed, 189 insertions(+), 135 deletions(-)
> 
> diff --git a/fs/xfs/libxfs/xfs_attr.c b/fs/xfs/libxfs/xfs_attr.c
> index 6249c92..b00ec1f 100644
> --- a/fs/xfs/libxfs/xfs_attr.c
> +++ b/fs/xfs/libxfs/xfs_attr.c
> @@ -203,6 +203,185 @@ xfs_attr_calc_size(
>  	return nblks;
>  }
>  
> +/*
> + * set the attribute specified in @args. In the case of the parent attribute
> + * being set, we do not want to roll the transaction on shortform-to-leaf
> + * conversion, as the attribute must be added in the same transaction as the
> + * parent directory modifications. Hence @roll_trans needs to be set
> + * appropriately to control whether the transaction is committed during this
> + * function.

Hmm... shouldn't the deferred attribute set code take care of all the
conversions/attr fork expansions/whatever is necessary to cram in the
parent pointer?  So in theory all the parent pointer updates should look
like this:

xfs_defer_init(&dfops...);
xfs_some_name_creating_operation(tp, ...);

if (hasparent) {
	xfs_parent_set(tp, ip);
}
xfs_defer_finish(&tp, &dfops);
xfs_trans_commit(tp);

xfs_parent_set() would then be:

if (fits in inode) {
	xfs_attr_set_sf(ip, key, value);
	xfs_log_attr_area();
	return;
}

xfs_attr_do_conversions_if_needed(dfops);
xfs_attr_set(dfops);

...and if there's a really good reason to try to cram things in, we can
add those later?  As I scanned the series, that was what kept coming up
in my head -- just tell the xfs_parent.c code to set a parent pointer;
it can figure out if there's sufficient space to put it directly into
the inode and log that, or we need something else that actually requires
the deferred ops mechanism then do that.

(Maybe I'm just fixating on xfs_parent_create...)

Higher level questions about robustness: if we try to set a parent
ptr and the attr name already exists, do we error out?  If we try to
remove a parent ptr and there's no attr name, do we error out?  I think
you've enough here to start thinking what changes need to be made to
xfs_repair to validate & fix the parent pointers, and what test cases
(and possibly ioctls) will need to be constructed to verify that we
actually get the parent pointers we want.

--D

> + */
> +int
> +xfs_attr_set_args(
> +	struct xfs_da_args	*args,
> +	int			flags,
> +	bool			roll_trans)
> +{
> +	struct xfs_inode	*dp = args->dp;
> +	struct xfs_mount        *mp = dp->i_mount;
> +	struct xfs_trans_res    tres;
> +	int			rsvd = 0;
> +	int			error = 0;
> +
> +	tres.tr_logres = M_RES(mp)->tr_attrsetm.tr_logres +
> +			 M_RES(mp)->tr_attrsetrt.tr_logres * args->total;
> +	tres.tr_logcount = XFS_ATTRSET_LOG_COUNT;
> +	tres.tr_logflags = XFS_TRANS_PERM_LOG_RES;
> +
> +	/*
> +	 * Root fork attributes can use reserved data blocks for this
> +	 * operation if necessary
> +	 */
> +	error = xfs_trans_alloc(mp, &tres, args->total, 0,
> +				rsvd ? XFS_TRANS_RESERVE : 0, &args->trans);
> +	if (error)
> +		goto out;
> +
> +	error = xfs_trans_reserve_quota_nblks(args->trans, dp, args->total, 0,
> +					      rsvd ? XFS_QMOPT_RES_REGBLKS |
> +						     XFS_QMOPT_FORCE_RES :
> +						     XFS_QMOPT_RES_REGBLKS);
> +	if (error)
> +		goto out;
> +
> +	xfs_trans_ijoin(args->trans, dp, 0);
> +	/*
> +	 * If the attribute list is non-existent or a shortform list,
> +	 * upgrade it to a single-leaf-block attribute list.
> +	 */
> +	if (dp->i_d.di_aformat == XFS_DINODE_FMT_LOCAL ||
> +	    (dp->i_d.di_aformat == XFS_DINODE_FMT_EXTENTS &&
> +	     dp->i_d.di_anextents == 0)) {
> +
> +		/*
> +		 * Build initial attribute list (if required).
> +		 */
> +		if (dp->i_d.di_aformat == XFS_DINODE_FMT_EXTENTS)
> +			xfs_attr_shortform_create(args);
> +
> +		/*
> +		 * Try to add the attr to the attribute list in the inode.
> +		 */
> +		error = xfs_attr_shortform_addname(args);
> +		if (error != -ENOSPC) {
> +			ASSERT(args->trans);
> +			if (!error && (flags & ATTR_KERNOTIME) == 0)
> +				xfs_trans_ichgtime(args->trans, dp,
> +						   XFS_ICHGTIME_CHG);
> +			goto out;
> +		}
> +
> +		/*
> +		 * It won't fit in the shortform, transform to a leaf block.
> +		 * GROT: another possible req'mt for a double-split btree op.
> +		 */
> +		error = xfs_attr_shortform_to_leaf(args);
> +		if (error)
> +			goto out;
> +		xfs_defer_ijoin(args->dfops, dp);
> +		if (roll_trans) {
> +			error = xfs_defer_finish(&args->trans, args->dfops);
> +			if (error) {
> +				args->trans = NULL;
> +				goto out;
> +			}
> +
> +			/*
> +			 * Commit the leaf transformation.  We'll need another
> +			 * (linked) transaction to add the new attribute to the
> +			 * leaf.
> +			 */
> +			error = xfs_trans_roll_inode(&args->trans, dp);
> +			if (error)
> +				goto out;
> +		}
> +	}
> +
> +	if (xfs_bmap_one_block(dp, XFS_ATTR_FORK))
> +		error = xfs_attr_leaf_addname(args);
> +	else
> +		error = xfs_attr_node_addname(args);
> +	if (error)
> +		goto out;
> +
> +	if ((flags & ATTR_KERNOTIME) == 0)
> +		xfs_trans_ichgtime(args->trans, dp, XFS_ICHGTIME_CHG);
> +
> +	xfs_trans_log_inode(args->trans, dp, XFS_ILOG_CORE);
> +out:
> +	return error;
> +}
> +
> +/*
> + * Remove the attribute specified in @args.
> + */
> +int
> +xfs_attr_remove_args(
> +	struct xfs_da_args      *args,
> +	int			flags)
> +{
> +	struct xfs_inode	*dp = args->dp;
> +	struct xfs_mount	*mp = dp->i_mount;
> +	int			error;
> +	int                     rsvd = 0;
> +
> +	error = xfs_qm_dqattach_locked(dp, 0);
> +	if (error)
> +		return error;
> +
> +	/*
> +	 * Root fork attributes can use reserved data blocks for this
> +	 * operation if necessary
> +	 */
> +	if (flags & ATTR_ROOT)
> +		rsvd = XFS_TRANS_RESERVE;
> +	error = xfs_trans_alloc(mp, &M_RES(mp)->tr_attrrm,
> +		XFS_ATTRRM_SPACE_RES(mp), 0, rsvd, &args->trans);
> +
> +	if (error)
> +		goto out;
> +
> +	/*
> +	 * No need to make quota reservations here. We expect to release some
> +	 * blocks not allocate in the common case.
> +	 */
> +	xfs_trans_ijoin(args->trans, dp, 0);
> +
> +	if (!xfs_inode_hasattr(dp)) {
> +		error = -ENOATTR;
> +	} else if (dp->i_d.di_aformat == XFS_DINODE_FMT_LOCAL) {
> +		ASSERT(dp->i_afp->if_flags & XFS_IFINLINE);
> +		error = xfs_attr_shortform_remove(args);
> +	} else if (xfs_bmap_one_block(dp, XFS_ATTR_FORK)) {
> +		error = xfs_attr_leaf_removename(args);
> +	} else {
> +		error = xfs_attr_node_removename(args);
> +	}
> +
> +	if (error)
> +		goto out;
> +
> +	/*
> +	 * If this is a synchronous mount, make sure that the
> +	 * transaction goes to disk before returning to the user.
> +	 */
> +	if (mp->m_flags & XFS_MOUNT_WSYNC)
> +		xfs_trans_set_sync(args->trans);
> +
> +	if ((flags & ATTR_KERNOTIME) == 0)
> +		xfs_trans_ichgtime(args->trans, dp, XFS_ICHGTIME_CHG);
> +
> +	xfs_trans_log_inode(args->trans, dp, XFS_ILOG_CORE);
> +
> +	return error;
> +
> +out:
> +	if (args->trans)
> +		xfs_trans_cancel(args->trans);
> +
> +	return error;
> +}
> +
>  int
>  xfs_attr_set(
>  	struct xfs_inode	*dp,
> @@ -214,10 +393,9 @@ xfs_attr_set(
>  	struct xfs_mount	*mp = dp->i_mount;
>  	struct xfs_da_args	args;
>  	struct xfs_defer_ops	dfops;
> -	struct xfs_trans_res	tres;
>  	xfs_fsblock_t		firstblock;
>  	int			rsvd = (flags & ATTR_ROOT) != 0;
> -	int			error, err2, local;
> +	int			error, local;
>  
>  	XFS_STATS_INC(mp, xs_attr_set);
>  
> @@ -252,106 +430,11 @@ xfs_attr_set(
>  			return error;
>  	}
>  
> -	tres.tr_logres = M_RES(mp)->tr_attrsetm.tr_logres +
> -			 M_RES(mp)->tr_attrsetrt.tr_logres * args.total;
> -	tres.tr_logcount = XFS_ATTRSET_LOG_COUNT;
> -	tres.tr_logflags = XFS_TRANS_PERM_LOG_RES;
> -
> -	/*
> -	 * Root fork attributes can use reserved data blocks for this
> -	 * operation if necessary
> -	 */
> -	error = xfs_trans_alloc(mp, &tres, args.total, 0,
> -			rsvd ? XFS_TRANS_RESERVE : 0, &args.trans);
> -	if (error)
> -		return error;
> -
>  	xfs_ilock(dp, XFS_ILOCK_EXCL);
> -	error = xfs_trans_reserve_quota_nblks(args.trans, dp, args.total, 0,
> -				rsvd ? XFS_QMOPT_RES_REGBLKS | XFS_QMOPT_FORCE_RES :
> -				       XFS_QMOPT_RES_REGBLKS);
> -	if (error) {
> -		xfs_iunlock(dp, XFS_ILOCK_EXCL);
> -		xfs_trans_cancel(args.trans);
> -		return error;
> -	}
> -
> -	xfs_trans_ijoin(args.trans, dp, 0);
> -
> -	/*
> -	 * If the attribute list is non-existent or a shortform list,
> -	 * upgrade it to a single-leaf-block attribute list.
> -	 */
> -	if (dp->i_d.di_aformat == XFS_DINODE_FMT_LOCAL ||
> -	    (dp->i_d.di_aformat == XFS_DINODE_FMT_EXTENTS &&
> -	     dp->i_d.di_anextents == 0)) {
> -
> -		/*
> -		 * Build initial attribute list (if required).
> -		 */
> -		if (dp->i_d.di_aformat == XFS_DINODE_FMT_EXTENTS)
> -			xfs_attr_shortform_create(&args);
> -
> -		/*
> -		 * Try to add the attr to the attribute list in
> -		 * the inode.
> -		 */
> -		error = xfs_attr_shortform_addname(&args);
> -		if (error != -ENOSPC) {
> -			/*
> -			 * Commit the shortform mods, and we're done.
> -			 * NOTE: this is also the error path (EEXIST, etc).
> -			 */
> -			ASSERT(args.trans != NULL);
> -
> -			/*
> -			 * If this is a synchronous mount, make sure that
> -			 * the transaction goes to disk before returning
> -			 * to the user.
> -			 */
> -			if (mp->m_flags & XFS_MOUNT_WSYNC)
> -				xfs_trans_set_sync(args.trans);
> -
> -			if (!error && (flags & ATTR_KERNOTIME) == 0) {
> -				xfs_trans_ichgtime(args.trans, dp,
> -							XFS_ICHGTIME_CHG);
> -			}
> -			err2 = xfs_trans_commit(args.trans);
> -			xfs_iunlock(dp, XFS_ILOCK_EXCL);
> -
> -			return error ? error : err2;
> -		}
> -
> -		/*
> -		 * It won't fit in the shortform, transform to a leaf block.
> -		 * GROT: another possible req'mt for a double-split btree op.
> -		 */
> -		xfs_defer_init(args.dfops, args.firstblock);
> -		error = xfs_attr_shortform_to_leaf(&args);
> -		if (error)
> -			goto out_defer_cancel;
> -		xfs_defer_ijoin(args.dfops, dp);
> -		error = xfs_defer_finish(&args.trans, args.dfops);
> -		if (error)
> -			goto out_defer_cancel;
> -
> -		/*
> -		 * Commit the leaf transformation.  We'll need another (linked)
> -		 * transaction to add the new attribute to the leaf.
> -		 */
> -
> -		error = xfs_trans_roll_inode(&args.trans, dp);
> -		if (error)
> -			goto out;
> -
> -	}
> -
> -	if (xfs_bmap_one_block(dp, XFS_ATTR_FORK))
> -		error = xfs_attr_leaf_addname(&args);
> -	else
> -		error = xfs_attr_node_addname(&args);
> +	xfs_defer_init(args.dfops, args.firstblock);
> +	error = xfs_attr_set_args(&args, flags, true);
>  	if (error)
> -		goto out;
> +		goto out_defer_cancel;
>  
>  	/*
>  	 * If this is a synchronous mount, make sure that the
> @@ -360,9 +443,6 @@ xfs_attr_set(
>  	if (mp->m_flags & XFS_MOUNT_WSYNC)
>  		xfs_trans_set_sync(args.trans);
>  
> -	if ((flags & ATTR_KERNOTIME) == 0)
> -		xfs_trans_ichgtime(args.trans, dp, XFS_ICHGTIME_CHG);
> -
>  	/*
>  	 * Commit the last in the sequence of transactions.
>  	 */
> @@ -374,10 +454,6 @@ xfs_attr_set(
>  
>  out_defer_cancel:
>  	xfs_defer_cancel(&dfops);
> -	args.trans = NULL;
> -out:
> -	if (args.trans)
> -		xfs_trans_cancel(args.trans);
>  	xfs_iunlock(dp, XFS_ILOCK_EXCL);
>  	return error;
>  }
> @@ -417,38 +493,15 @@ xfs_attr_remove(
>  	 */
>  	args.op_flags = XFS_DA_OP_OKNOENT;
>  
> -	error = xfs_qm_dqattach(dp, 0);
> -	if (error)
> -		return error;
> -
> -	/*
> -	 * Root fork attributes can use reserved data blocks for this
> -	 * operation if necessary
> -	 */
> -	error = xfs_trans_alloc(mp, &M_RES(mp)->tr_attrrm,
> -			XFS_ATTRRM_SPACE_RES(mp), 0,
> -			(flags & ATTR_ROOT) ? XFS_TRANS_RESERVE : 0,
> -			&args.trans);
> -	if (error)
> -		return error;
> -
>  	xfs_ilock(dp, XFS_ILOCK_EXCL);
>  	/*
>  	 * No need to make quota reservations here. We expect to release some
>  	 * blocks not allocate in the common case.
>  	 */
>  	xfs_trans_ijoin(args.trans, dp, 0);
> +	xfs_defer_init(args.dfops, args.firstblock);
>  
> -	if (!xfs_inode_hasattr(dp)) {
> -		error = -ENOATTR;
> -	} else if (dp->i_d.di_aformat == XFS_DINODE_FMT_LOCAL) {
> -		ASSERT(dp->i_afp->if_flags & XFS_IFINLINE);
> -		error = xfs_attr_shortform_remove(&args);
> -	} else if (xfs_bmap_one_block(dp, XFS_ATTR_FORK)) {
> -		error = xfs_attr_leaf_removename(&args);
> -	} else {
> -		error = xfs_attr_node_removename(&args);
> -	}
> +	error = xfs_attr_remove_args(&args, flags);
>  
>  	if (error)
>  		goto out;
> @@ -460,9 +513,6 @@ xfs_attr_remove(
>  	if (mp->m_flags & XFS_MOUNT_WSYNC)
>  		xfs_trans_set_sync(args.trans);
>  
> -	if ((flags & ATTR_KERNOTIME) == 0)
> -		xfs_trans_ichgtime(args.trans, dp, XFS_ICHGTIME_CHG);
> -
>  	/*
>  	 * Commit the last in the sequence of transactions.
>  	 */
> @@ -473,6 +523,8 @@ xfs_attr_remove(
>  	return error;
>  
>  out:
> +	xfs_defer_cancel(&dfops);
> +
>  	if (args.trans)
>  		xfs_trans_cancel(args.trans);
>  	xfs_iunlock(dp, XFS_ILOCK_EXCL);
> diff --git a/fs/xfs/xfs_attr.h b/fs/xfs/xfs_attr.h
> index 5d5a5e2..8542606 100644
> --- a/fs/xfs/xfs_attr.h
> +++ b/fs/xfs/xfs_attr.h
> @@ -149,7 +149,9 @@ int xfs_attr_get(struct xfs_inode *ip, const unsigned char *name,
>  		 unsigned char *value, int *valuelenp, int flags);
>  int xfs_attr_set(struct xfs_inode *dp, const unsigned char *name,
>  		 unsigned char *value, int valuelen, int flags);
> +int xfs_attr_set_args(struct xfs_da_args *args, int flags, bool roll_trans);
>  int xfs_attr_remove(struct xfs_inode *dp, const unsigned char *name, int flags);
> +int xfs_attr_remove_args(struct xfs_da_args *args, int flags);
>  int xfs_attr_list(struct xfs_inode *dp, char *buffer, int bufsize,
>  		  int flags, struct attrlist_cursor_kern *cursor);
>  
> -- 
> 2.7.4
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH 17/17] Add the parent pointer support to the superblock version 5.
  2017-10-19  3:57   ` Amir Goldstein
@ 2017-10-19 20:06     ` Darrick J. Wong
  2017-10-20  3:18       ` Amir Goldstein
  0 siblings, 1 reply; 66+ messages in thread
From: Darrick J. Wong @ 2017-10-19 20:06 UTC (permalink / raw)
  To: Amir Goldstein; +Cc: Allison Henderson, linux-xfs, Mark Tinguely, Dave Chinner

On Thu, Oct 19, 2017 at 06:57:04AM +0300, Amir Goldstein wrote:
> On Thu, Oct 19, 2017 at 1:55 AM, Allison Henderson
> <allison.henderson@oracle.com> wrote:
> > [dchinner: forward ported and cleaned up]
> > [achender: rebased and added parent pointer attribute to
> >            compatible attributes mask]
> >
> > Signed-off-by: Mark Tinguely <tinguely@sgi.com>
> > Signed-off-by: Dave Chinner <dchinner@redhat.com>
> > Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
> > ---
> > v2: remove unrelated type clean up in xfs_format.h
> 
> I'm curious how XFS_SB_VERSION2_PARENTBIT fits into the picture?
> old relic?

A feature bit most probably used by the SGI XFS parent pointer
implementation.  We don't want to reuse that bit and thereby crash into
their code.

--D

> >
> > Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
> > ---
> >  fs/xfs/libxfs/xfs_format.h | 7 +++++--
> >  fs/xfs/libxfs/xfs_fs.h     | 1 +
> >  fs/xfs/xfs_fsops.c         | 4 +++-
> >  3 files changed, 9 insertions(+), 3 deletions(-)
> >
> > diff --git a/fs/xfs/libxfs/xfs_format.h b/fs/xfs/libxfs/xfs_format.h
> > index 121862a..f3e3132 100644
> > --- a/fs/xfs/libxfs/xfs_format.h
> > +++ b/fs/xfs/libxfs/xfs_format.h
> > @@ -459,10 +459,12 @@ xfs_sb_has_compat_feature(
> >  #define XFS_SB_FEAT_RO_COMPAT_FINOBT   (1 << 0)                /* free inode btree */
> >  #define XFS_SB_FEAT_RO_COMPAT_RMAPBT   (1 << 1)                /* reverse map btree */
> >  #define XFS_SB_FEAT_RO_COMPAT_REFLINK  (1 << 2)                /* reflinked files */
> > +#define XFS_SB_FEAT_RO_COMPAT_PARENT   (1 << 3)        /* parent inode ptr */
> >  #define XFS_SB_FEAT_RO_COMPAT_ALL \
> >                 (XFS_SB_FEAT_RO_COMPAT_FINOBT | \
> >                  XFS_SB_FEAT_RO_COMPAT_RMAPBT | \
> > -                XFS_SB_FEAT_RO_COMPAT_REFLINK)
> > +                XFS_SB_FEAT_RO_COMPAT_REFLINK| \
> > +                XFS_SB_FEAT_RO_COMPAT_PARENT)
> >  #define XFS_SB_FEAT_RO_COMPAT_UNKNOWN  ~XFS_SB_FEAT_RO_COMPAT_ALL
> >  static inline bool
> >  xfs_sb_has_ro_compat_feature(
> > @@ -558,7 +560,8 @@ static inline bool xfs_sb_version_hasreflink(struct xfs_sb *sbp)
> >
> >  static inline bool xfs_sb_version_hasparent(struct xfs_sb *sbp)
> >  {
> > -       return false; /* We'll enable this at the end of the set */
> > +       return (XFS_SB_VERSION_NUM(sbp) == XFS_SB_VERSION_5 &&
> > +               (sbp->sb_features_ro_compat & XFS_SB_FEAT_RO_COMPAT_PARENT));
> >  }
> >
> >  /*
> > diff --git a/fs/xfs/libxfs/xfs_fs.h b/fs/xfs/libxfs/xfs_fs.h
> > index 8c61f21..b8108f8 100644
> > --- a/fs/xfs/libxfs/xfs_fs.h
> > +++ b/fs/xfs/libxfs/xfs_fs.h
> > @@ -222,6 +222,7 @@ typedef struct xfs_fsop_resblks {
> >  #define XFS_FSOP_GEOM_FLAGS_SPINODES   0x40000 /* sparse inode chunks  */
> >  #define XFS_FSOP_GEOM_FLAGS_RMAPBT     0x80000 /* reverse mapping btree */
> >  #define XFS_FSOP_GEOM_FLAGS_REFLINK    0x100000 /* files can share blocks */
> > +#define XFS_FSOP_GEOM_FLAGS_PARENT     0x200000 /* parent pointers */
> >
> >  /*
> >   * Minimum and maximum sizes need for growth checks.
> > diff --git a/fs/xfs/xfs_fsops.c b/fs/xfs/xfs_fsops.c
> > index 8f22fc5..9a0ce52 100644
> > --- a/fs/xfs/xfs_fsops.c
> > +++ b/fs/xfs/xfs_fsops.c
> > @@ -111,7 +111,9 @@ xfs_fs_geometry(
> >                         (xfs_sb_version_hasrmapbt(&mp->m_sb) ?
> >                                 XFS_FSOP_GEOM_FLAGS_RMAPBT : 0) |
> >                         (xfs_sb_version_hasreflink(&mp->m_sb) ?
> > -                               XFS_FSOP_GEOM_FLAGS_REFLINK : 0);
> > +                               XFS_FSOP_GEOM_FLAGS_REFLINK : 0) |
> > +                       (xfs_sb_version_hasparent(&mp->m_sb) ?
> > +                               XFS_FSOP_GEOM_FLAGS_PARENT : 0);
> >                 geo->logsectsize = xfs_sb_version_hassector(&mp->m_sb) ?
> >                                 mp->m_sb.sb_logsectsize : BBSIZE;
> >                 geo->rtsectsize = mp->m_sb.sb_blocksize;
> > --
> > 2.7.4
> >
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> > the body of a message to majordomo@vger.kernel.org
> > More majordomo info at  http://vger.kernel.org/majordomo-info.html
> --
> To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH 17/17] Add the parent pointer support to the superblock version 5.
  2017-10-19 20:06     ` Darrick J. Wong
@ 2017-10-20  3:18       ` Amir Goldstein
  0 siblings, 0 replies; 66+ messages in thread
From: Amir Goldstein @ 2017-10-20  3:18 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: Allison Henderson, linux-xfs, Mark Tinguely, Dave Chinner

On Thu, Oct 19, 2017 at 11:06 PM, Darrick J. Wong
<darrick.wong@oracle.com> wrote:
> On Thu, Oct 19, 2017 at 06:57:04AM +0300, Amir Goldstein wrote:
>> On Thu, Oct 19, 2017 at 1:55 AM, Allison Henderson
>> <allison.henderson@oracle.com> wrote:
>> > [dchinner: forward ported and cleaned up]
>> > [achender: rebased and added parent pointer attribute to
>> >            compatible attributes mask]
>> >
>> > Signed-off-by: Mark Tinguely <tinguely@sgi.com>
>> > Signed-off-by: Dave Chinner <dchinner@redhat.com>
>> > Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
>> > ---
>> > v2: remove unrelated type clean up in xfs_format.h
>>
>> I'm curious how XFS_SB_VERSION2_PARENTBIT fits into the picture?
>> old relic?
>
> A feature bit most probably used by the SGI XFS parent pointer
> implementation.  We don't want to reuse that bit and thereby crash into
> their code.
>

Ah, I was not aware that parallel universe exists.
Perhaps worth a comment near definition of XFS_SB_VERSION2_PARENTBIT
so it won't be confused with Linux parent pointers.

Amir.

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH 00/17] Parent Pointers V3
  2017-10-19  4:11 ` [PATCH 00/17] Parent Pointers V3 Amir Goldstein
@ 2017-10-20  3:22   ` Amir Goldstein
  2017-10-21  1:06     ` Allison Henderson
  2017-10-20 22:41   ` Dave Chinner
  1 sibling, 1 reply; 66+ messages in thread
From: Amir Goldstein @ 2017-10-20  3:22 UTC (permalink / raw)
  To: Allison Henderson; +Cc: linux-xfs

On Thu, Oct 19, 2017 at 7:11 AM, Amir Goldstein <amir73il@gmail.com> wrote:
> On Thu, Oct 19, 2017 at 1:55 AM, Allison Henderson
> <allison.henderson@oracle.com> wrote:
>> Hi all,
>>
>> This is the third version of parent pointer attributes for xfs.
>> I've integrated the suggestions made since v2, mostly moving the
>> attr buffers in the xfs_attr_log_item to pointers that point to
>> xfs_attr_item. I've also implementing the recovery routines for
>> the xfs_attr_log_format.  If I missed anything please point it
>> out.  As always, comments and feedback are appreciated.  Thank
>> you!
>>
>
> A minor comment about the cover letter.
> All designated reviewers must know exactly what "parent pointers" are for,
> but it could be useful to add some context in the cover letter about the purpose
> of this work for the sake of other readers on the list. Useful to refer to the
> upcoming scrub support patches.
>
> BTW, not sure if this was mentioned in the previous lifetime of those
> patches, but parent pointers can be used to implement exportfs operation
> xfs_fs_fh_to_parent() for "non-connectable" file handles (FILEID_INO32_GEN)
> and to implement xfs_fs_get_name(), which would make reconnect_path()
> *much* more efficient.
>
> Also, you may want to use git format-patch -v3 for V3
> makes it easier to browse old versions of patches on the list.
>
> Cheers,
> Amir.
>
>> Allison Henderson (7):
>>   Add helper functions xfs_attr_set_args and xfs_attr_remove_args
>>   Set up infastructure for deferred attribute operations
>>   Add xfs_attr_set_defered and xfs_attr_remove_defered
>>   Remove all strlen calls in all xfs_attr_* functions for attr names.
>>   Add the extra space requirements for parent pointer attributes when
>>     calculating the minimum log size during mkfs
>>   Add parent pointers to rename
>>   Add the parent pointer support to the superblock version 5.
>>
>> Brian Foster (1):
>>   xfs_bmap_add_attrfork(): re-add error handling from set_attrforkoff()
>>     call
>>
>> Dave Chinner (5):
>>   xfs: define parent pointer xattr format
>>   :xfs: extent transaction reservations for parent attributes

You must've already noticed - just pointing out the :xfs: typo in that commit
subject (easier to comment on that here then on patch itself)

Amir.

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH 00/17] Parent Pointers V3
  2017-10-19  4:11 ` [PATCH 00/17] Parent Pointers V3 Amir Goldstein
  2017-10-20  3:22   ` Amir Goldstein
@ 2017-10-20 22:41   ` Dave Chinner
  2017-10-21  7:34     ` Amir Goldstein
  1 sibling, 1 reply; 66+ messages in thread
From: Dave Chinner @ 2017-10-20 22:41 UTC (permalink / raw)
  To: Amir Goldstein; +Cc: Allison Henderson, linux-xfs

On Thu, Oct 19, 2017 at 07:11:50AM +0300, Amir Goldstein wrote:
> On Thu, Oct 19, 2017 at 1:55 AM, Allison Henderson
> <allison.henderson@oracle.com> wrote:
> > Hi all,
> >
> > This is the third version of parent pointer attributes for xfs.
> > I've integrated the suggestions made since v2, mostly moving the
> > attr buffers in the xfs_attr_log_item to pointers that point to
> > xfs_attr_item. I've also implementing the recovery routines for
> > the xfs_attr_log_format.  If I missed anything please point it
> > out.  As always, comments and feedback are appreciated.  Thank
> > you!
> >
> 
> A minor comment about the cover letter.
> All designated reviewers must know exactly what "parent pointers" are for,
> but it could be useful to add some context in the cover letter about the purpose
> of this work for the sake of other readers on the list. Useful to refer to the
> upcoming scrub support patches.
> 
> BTW, not sure if this was mentioned in the previous lifetime of those
> patches, but parent pointers can be used to implement exportfs operation
> xfs_fs_fh_to_parent() for "non-connectable" file handles (FILEID_INO32_GEN)
> and to implement xfs_fs_get_name(), which would make reconnect_path()
> *much* more efficient.

However, XFS only uses FILEID_INO32_GEN for directories
because they have known parents. For them, we implement ->get_parent()
and that means reconnect_path just does ->lookup("..") to find the
parents and doesn't need anything special.

We use FILEID_INO32_GEN_PARENT for all other types of files to
encode the ino # + generation of the parent inode into the handle.
That means for any non-dir file handle, our implemention of
->fh_to_parent  will get us the parent info as efficiently as
possible.

IOWs, parent pointers won't actually speed up filehandle ->
dentry reconnection on XFS at all because we already encode parent
pointers into the filehandles that need them....

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH 10/17] :xfs: extent transaction reservations for parent attributes
       [not found]     ` <8680e0c1-ada8-06e3-e397-61a5076030be@oracle.com>
@ 2017-10-20 23:45       ` Darrick J. Wong
  2017-10-21  0:12         ` Allison Henderson
  0 siblings, 1 reply; 66+ messages in thread
From: Darrick J. Wong @ 2017-10-20 23:45 UTC (permalink / raw)
  To: Allison Henderson; +Cc: linux-xfs, Dave Chinner

On Fri, Oct 20, 2017 at 04:34:38PM -0700, Allison Henderson wrote:
> On 10/19/2017 11:24 AM, Darrick J. Wong wrote:
> 
> >On Wed, Oct 18, 2017 at 03:55:26PM -0700, Allison Henderson wrote:
> >>From: Dave Chinner<dchinner@redhat.com>
> >>
> >>We need to add, remove or modify parent pointer attributes during
> >>create/link/unlink/rename operations atomically with the dirents in the parent
> >>directories being modified. This means they need to be modified in the same
> >>transaction as the parent directories, and so we need to add the required
> >>space for the attribute modifications to the transaction reservations.
> >>
> >>[achender: rebased, added xfs_sb_version_hasparent stub]
> >>
> >>Signed-off-by: Dave Chinner<dchinner@redhat.com>
> >>Signed-off-by: Allison Henderson<allison.henderson@oracle.com>
> >>---
> >>  fs/xfs/libxfs/xfs_format.h     |   5 ++
> >>  fs/xfs/libxfs/xfs_trans_resv.c | 103 ++++++++++++++++++++++++++++++++---------
> >>  2 files changed, 85 insertions(+), 23 deletions(-)
> >>
> >>diff --git a/fs/xfs/libxfs/xfs_format.h b/fs/xfs/libxfs/xfs_format.h
> >>index b9ea5bf..121862a 100644
> >>--- a/fs/xfs/libxfs/xfs_format.h
> >>+++ b/fs/xfs/libxfs/xfs_format.h
> >>@@ -556,6 +556,11 @@ static inline bool xfs_sb_version_hasreflink(struct xfs_sb *sbp)
> >>  		(sbp->sb_features_ro_compat & XFS_SB_FEAT_RO_COMPAT_REFLINK);
> >>  }
> >>+static inline bool xfs_sb_version_hasparent(struct xfs_sb *sbp)
> >>+{
> >>+	return false; /* We'll enable this at the end of the set */
> >I think this chunk should just add the proper testing code here.
> >
> >You only add RO_COMPAT_PARENT to XFS_SB_FEAT_RO_COMPAT_ALL at the end of
> >the patch series, so anyone bisecting their way through the series won't
> >be able to mount such an fs.
> Ok, there really isn't much more to add in here once we have the feature
> flags defined.  Maybe we could just do something like:
> 
> return (XFS_SB_VERSION_NUM(sbp) == XFS_SB_VERSION_5 &&
>         (sbp->sb_features_ro_compat & 0));
> /* We will turn the 0 into XFS_SB_FEAT_RO_COMPAT_PARENT at the end of the
> set */
> 
> 
> Is that something like what you meant?

I was talking about defining the function once and for all:

static inline bool xfs_sb_version_hasparent(struct xfs_sb *sbp)
{
	return (XFS_SB_VERSION_NUM(sbp) == XFS_SB_VERSION_5 &&
		(sbp->sb_features_ro_compat & XFS_SB_FEAT_RO_COMPAT_PARENT));
}

Since we can't mount any filesystem with XFS_SB_FEAT_RO_COMPAT_PARENT
set until COMPAT_PARENT gets added to XFS_SB_FEAT_RO_COMPAT_ALL.  The
practical effect is that hasparent will never return true, but with less
code changes between patches.

(Or put another way, please minimize the type of patch series churn
wherein one adds a function in one patch and then rewrites it a
subseqeuent patch.)

--D

> 
> >>+}
> >>+
> >>  /*
> >>   * end of superblock version macros
> >>   */
> >>diff --git a/fs/xfs/libxfs/xfs_trans_resv.c b/fs/xfs/libxfs/xfs_trans_resv.c
> >>index 6bd916b..54399e2 100644
> >>--- a/fs/xfs/libxfs/xfs_trans_resv.c
> >>+++ b/fs/xfs/libxfs/xfs_trans_resv.c
> >>@@ -802,29 +802,30 @@ xfs_calc_sb_reservation(
> >>  	return xfs_calc_buf_res(1, mp->m_sb.sb_sectsize);
> >>  }
> >>+/*
> >>+ * Namespace reservations.
> >>+ *
> >>+ * These get tricky when parent pointers are enabled as we have attribute
> >>+ * modifications occurring from within these transactions. Rather than confuse
> >>+ * each of these reservation calculations with the conditional attribute
> >>+ * reservations, add them here in a clear and concise manner. This assumes that
> >>+ * the attribute reservations have already been calculated.
> >>+ *
> >>+ * Note that we only include the static attribute reservation here; the runtime
> >>+ * reservation will have to be modified by the size of the attributes being
> >>+ * added/removed/modified. See the comments on the attribute reservation
> >>+ * calculations for more details.
> >I don't know that we can properly use a different runtime reservations
> >than what we statically reserve here, since the static reservations are
> >used to ensure that the log is of sufficient size given the fs geometry.
> >
> ><shrug> Maybe we can figure out how much extra space is allowable given
> >the actual size of the log?  Or perhaps in the end we'll just end up
> >restricting the maximum size of what we can log through intents?  Or
> >just set the reservation to 64k I guess.... :)
> >
> >--D
> >
> >>+ * Note for rename: rename will vastly overestimate requirements. This will be
> >>+ * addressed later when modifications are made to ensure parent attribute
> >>+ * modifications can be done atomically with the rename operation.
> >>+ */
> >>  void
> >>-xfs_trans_resv_calc(
> >>+xfs_calc_namespace_reservations(
> >>  	struct xfs_mount	*mp,
> >>  	struct xfs_trans_resv	*resp)
> >>  {
> >>-	/*
> >>-	 * The following transactions are logged in physical format and
> >>-	 * require a permanent reservation on space.
> >>-	 */
> >>-	resp->tr_write.tr_logres = xfs_calc_write_reservation(mp);
> >>-	if (xfs_sb_version_hasreflink(&mp->m_sb))
> >>-		resp->tr_write.tr_logcount = XFS_WRITE_LOG_COUNT_REFLINK;
> >>-	else
> >>-		resp->tr_write.tr_logcount = XFS_WRITE_LOG_COUNT;
> >>-	resp->tr_write.tr_logflags |= XFS_TRANS_PERM_LOG_RES;
> >>-
> >>-	resp->tr_itruncate.tr_logres = xfs_calc_itruncate_reservation(mp);
> >>-	if (xfs_sb_version_hasreflink(&mp->m_sb))
> >>-		resp->tr_itruncate.tr_logcount =
> >>-				XFS_ITRUNCATE_LOG_COUNT_REFLINK;
> >>-	else
> >>-		resp->tr_itruncate.tr_logcount = XFS_ITRUNCATE_LOG_COUNT;
> >>-	resp->tr_itruncate.tr_logflags |= XFS_TRANS_PERM_LOG_RES;
> >>+	ASSERT(resp->tr_attrsetm.tr_logres > 0);
> >>  	resp->tr_rename.tr_logres = xfs_calc_rename_reservation(mp);
> >>  	resp->tr_rename.tr_logcount = XFS_RENAME_LOG_COUNT;
> >>@@ -846,15 +847,69 @@ xfs_trans_resv_calc(
> >>  	resp->tr_create.tr_logcount = XFS_CREATE_LOG_COUNT;
> >>  	resp->tr_create.tr_logflags |= XFS_TRANS_PERM_LOG_RES;
> >>+	resp->tr_mkdir.tr_logres = xfs_calc_mkdir_reservation(mp);
> >>+	resp->tr_mkdir.tr_logcount = XFS_MKDIR_LOG_COUNT;
> >>+	resp->tr_mkdir.tr_logflags |= XFS_TRANS_PERM_LOG_RES;
> >>+
> >>+	if (!xfs_sb_version_hasparent(&mp->m_sb))
> >>+		return;
> >>+
> >>+	/* rename can add/remove/modify 2 parent attributes */
> >>+	resp->tr_rename.tr_logres += 2 * max(resp->tr_attrsetm.tr_logres,
> >>+					     resp->tr_attrrm.tr_logres);
> >>+	resp->tr_rename.tr_logcount += 2 * max(resp->tr_attrsetm.tr_logcount,
> >>+					       resp->tr_attrrm.tr_logcount);
> >>+
> >>+	/* create will add 1 parent attribute */
> >>+	resp->tr_create.tr_logres += resp->tr_attrsetm.tr_logres;
> >>+	resp->tr_create.tr_logcount += resp->tr_attrsetm.tr_logcount;
> >>+
> >>+	/* mkdir will add 1 parent attribute */
> >>+	resp->tr_mkdir.tr_logres += resp->tr_attrsetm.tr_logres;
> >>+	resp->tr_mkdir.tr_logcount += resp->tr_attrsetm.tr_logcount;
> >>+
> >>+	/* link will add 1 parent attribute */
> >>+	resp->tr_link.tr_logres += resp->tr_attrsetm.tr_logres;
> >>+	resp->tr_link.tr_logcount += resp->tr_attrsetm.tr_logcount;
> >>+
> >>+	/* symlink will add 1 parent attribute */
> >>+	resp->tr_symlink.tr_logres += resp->tr_attrsetm.tr_logres;
> >>+	resp->tr_symlink.tr_logcount += resp->tr_attrsetm.tr_logcount;
> >>+
> >>+	/* remove will remove 1 parent attribute */
> >>+	resp->tr_remove.tr_logres += resp->tr_attrrm.tr_logres;
> >>+	resp->tr_remove.tr_logcount = resp->tr_attrrm.tr_logcount;
> >>+}
> >>+
> >>+void
> >>+xfs_trans_resv_calc(
> >>+	struct xfs_mount	*mp,
> >>+	struct xfs_trans_resv	*resp)
> >>+{
> >>+	/*
> >>+	 * The following transactions are logged in physical format and
> >>+	 * require a permanent reservation on space.
> >>+	 */
> >>+	resp->tr_write.tr_logres = xfs_calc_write_reservation(mp);
> >>+	if (xfs_sb_version_hasreflink(&mp->m_sb))
> >>+		resp->tr_write.tr_logcount = XFS_WRITE_LOG_COUNT_REFLINK;
> >>+	else
> >>+		resp->tr_write.tr_logcount = XFS_WRITE_LOG_COUNT;
> >>+	resp->tr_write.tr_logflags |= XFS_TRANS_PERM_LOG_RES;
> >>+
> >>+	resp->tr_itruncate.tr_logres = xfs_calc_itruncate_reservation(mp);
> >>+	if (xfs_sb_version_hasreflink(&mp->m_sb))
> >>+		resp->tr_itruncate.tr_logcount =
> >>+				XFS_ITRUNCATE_LOG_COUNT_REFLINK;
> >>+	else
> >>+		resp->tr_itruncate.tr_logcount = XFS_ITRUNCATE_LOG_COUNT;
> >>+	resp->tr_itruncate.tr_logflags |= XFS_TRANS_PERM_LOG_RES;
> >>+
> >>  	resp->tr_create_tmpfile.tr_logres =
> >>  			xfs_calc_create_tmpfile_reservation(mp);
> >>  	resp->tr_create_tmpfile.tr_logcount = XFS_CREATE_TMPFILE_LOG_COUNT;
> >>  	resp->tr_create_tmpfile.tr_logflags |= XFS_TRANS_PERM_LOG_RES;
> >>-	resp->tr_mkdir.tr_logres = xfs_calc_mkdir_reservation(mp);
> >>-	resp->tr_mkdir.tr_logcount = XFS_MKDIR_LOG_COUNT;
> >>-	resp->tr_mkdir.tr_logflags |= XFS_TRANS_PERM_LOG_RES;
> >>-
> >>  	resp->tr_ifree.tr_logres = xfs_calc_ifree_reservation(mp);
> >>  	resp->tr_ifree.tr_logcount = XFS_INACTIVE_LOG_COUNT;
> >>  	resp->tr_ifree.tr_logflags |= XFS_TRANS_PERM_LOG_RES;
> >>@@ -886,6 +941,8 @@ xfs_trans_resv_calc(
> >>  		resp->tr_qm_dqalloc.tr_logcount = XFS_WRITE_LOG_COUNT;
> >>  	resp->tr_qm_dqalloc.tr_logflags |= XFS_TRANS_PERM_LOG_RES;
> >>+	xfs_calc_namespace_reservations(mp, resp);
> >>+
> >>  	/*
> >>  	 * The following transactions are logged in logical format with
> >>  	 * a default log count.
> >>-- 
> >>2.7.4
> >>
> >>--
> >>To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> >>the body of a message tomajordomo@vger.kernel.org
> >>More majordomo info athttp://vger.kernel.org/majordomo-info.html
> 

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH 12/17] xfs: parent pointer attribute creation
       [not found]     ` <9185d3e8-4b41-b2d8-294b-934f7d3409f0@oracle.com>
@ 2017-10-21  0:03       ` Darrick J. Wong
  0 siblings, 0 replies; 66+ messages in thread
From: Darrick J. Wong @ 2017-10-21  0:03 UTC (permalink / raw)
  To: Allison Henderson; +Cc: linux-xfs, Dave Chinner

On Fri, Oct 20, 2017 at 04:41:49PM -0700, Allison Henderson wrote:
> On 10/19/2017 12:36 PM, Darrick J. Wong wrote:
> >On Wed, Oct 18, 2017 at 03:55:28PM -0700, Allison Henderson wrote:
> >>From: Dave Chinner<dchinner@redhat.com>
> >>
> >>[bfoster: rebase, use VFS inode generation]
> >>[achender: rebased, changed __unint32_t to xfs_dir2_dataptr_t,
> >>	   fixed some null pointer bugs]
> >>
> >>Signed-off-by: Dave Chinner<dchinner@redhat.com>
> >>Signed-off-by: Allison Henderson<allison.henderson@oracle.com>
> >>---
> >>v2: remove unnecessary ENOSPC handling in xfs_attr_set_first_parent
> >>
> >>Signed-off-by: Allison Henderson<allison.henderson@oracle.com>
> >>---
> >>  fs/xfs/Makefile            |  1 +
> >>  fs/xfs/libxfs/xfs_attr.c   | 71 ++++++++++++++++++++++++++++++---
> >>  fs/xfs/libxfs/xfs_bmap.c   | 51 ++++++++++++++----------
> >>  fs/xfs/libxfs/xfs_bmap.h   |  1 +
> >>  fs/xfs/libxfs/xfs_parent.c | 98 ++++++++++++++++++++++++++++++++++++++++++++++
> >>  fs/xfs/xfs_attr.h          | 15 ++++++-
> >>  fs/xfs/xfs_inode.c         | 16 +++++++-
> >>  7 files changed, 225 insertions(+), 28 deletions(-)
> >>
> >>diff --git a/fs/xfs/Makefile b/fs/xfs/Makefile
> >>index ec6486b..3015bca 100644
> >>--- a/fs/xfs/Makefile
> >>+++ b/fs/xfs/Makefile
> >>@@ -52,6 +52,7 @@ xfs-y				+= $(addprefix libxfs/, \
> >>  				   xfs_inode_fork.o \
> >>  				   xfs_inode_buf.o \
> >>  				   xfs_log_rlimit.o \
> >>+				   xfs_parent.o \
> >>  				   xfs_ag_resv.o \
> >>  				   xfs_rmap.o \
> >>  				   xfs_rmap_btree.o \
> >>diff --git a/fs/xfs/libxfs/xfs_attr.c b/fs/xfs/libxfs/xfs_attr.c
> >>index 8f8bfff9..8aad242 100644
> >>--- a/fs/xfs/libxfs/xfs_attr.c
> >>+++ b/fs/xfs/libxfs/xfs_attr.c
> >>@@ -91,12 +91,14 @@ xfs_attr_args_init(
> >>  	args->whichfork = XFS_ATTR_FORK;
> >>  	args->dp = dp;
> >>  	args->flags = flags;
> >>-	args->name = name;
> >>-	args->namelen = namelen;
> >>-	if (args->namelen >= MAXNAMELEN)
> >>-		return -EFAULT;		/* match IRIX behaviour */
> >>+	if (name) {
> >When do we have a NULL name?
> 
> Ideally we shouldn't, though on a remove we should have a NULL value, since
> we only need the name.  I suppose I'm still in the habit of coding
> defensively
> though it may make since to generate the oops, or even add an assert if it
> happens.
> Thx!

ASSERT(name != NULL);

at the top of the function if you think it's particularly likely to happen
or if it's likely that tracing an oops back to the source will be difficult.

(The ASSERTs are useful if you hand off work to a workqueue or any other
process such that the call stack is interrupted.)

> 
> >>+		args->name = name;
> >>+		args->namelen = namelen;
> >>+		if (args->namelen >= MAXNAMELEN)
> >>+			return -EFAULT;		/* match IRIX behaviour */
> >>-	args->hashval = xfs_da_hashname(args->name, args->namelen);
> >>+		args->hashval = xfs_da_hashname(args->name, args->namelen);
> >>+	}
> >>  	return 0;
> >>  }
> >>@@ -206,6 +208,65 @@ xfs_attr_calc_size(
> >>  }
> >>  /*
> >>+ * Add the initial parent pointer attribute.
> >>+ *
> >>+ * Inode must be locked and completely empty as we are adding the attribute
> >>+ * fork to the inode. This open codes bits of xfs_bmap_add_attrfork() and
> >>+ * xfs_attr_set() because we know the inode is completely empty at this point
> >Hrmm... in general I don't like opencoding bits of other functions
> >without a good justification.
> >
> >>+ * and so don't need to handle all the different combinations of fork
> >>+ * configurations here.
> >>+ */
> >>+int
> >>+xfs_attr_set_first_parent(
> >>+	struct xfs_trans	*tp,
> >>+	struct xfs_inode	*ip,
> >>+	struct xfs_parent_name_rec *rec,
> >>+	int			reclen,
> >>+	const char		*value,
> >>+	int			valuelen,
> >>+	struct xfs_defer_ops	*dfops,
> >>+	xfs_fsblock_t		*firstblock)
> >These all need one more level of indentation due to struct xfs_parent_name_rec.
> Sure, I will push those out a level
> >>+{
> >>+	struct xfs_da_args	args;
> >>+	int			flags = ATTR_PARENT;
> >>+	int			local;
> >>+	int			sf_size;
> >>+	int			error;
> >>+
> >>+	tp->t_flags |= XFS_TRANS_RESERVE;
> >>+
> >>+	error = xfs_attr_args_init(&args, ip, (char *)rec, reclen, flags);
> >>+	if (error)
> >>+		return error;
> >>+
> >>+	args.name = (char *)rec;
> >>+	args.namelen = reclen;
> >>+	args.hashval = xfs_da_hashname(args.name, args.namelen);
> >Aren't these already set by xfs_attr_args_init?
> Some of them are: name, namelen, hashval, dp, and flags.
> But not firstblock dfops, op_flags, total, or trans.
> 
> I guess I kind of liked seeing things initialized all in one spot rather
> than split up like that. But it shouldn't hurt anything to remove the
> re-inits if that is not preferable.

Me too, but so long as we /do/ have a partial initialization function,
there's no need to set fields twice.

> >>+	args.value = (char *)value;
> >>+	args.valuelen = valuelen;
> >>+	args.firstblock = firstblock;
> >>+	args.dfops = dfops;
> >>+	args.op_flags = XFS_DA_OP_ADDNAME | XFS_DA_OP_OKNOENT;
> >>+	args.total = xfs_attr_calc_size(&args, &local);
> >>+	args.trans = tp;
> >>+	ASSERT(local);
> >>+
> >>+	/* set the attribute fork appropriately */
> >>+	sf_size = sizeof(struct xfs_attr_sf_hdr) +
> >>+			XFS_ATTR_SF_ENTSIZE_BYNAME(reclen, valuelen);
> >>+	xfs_bmap_set_attrforkoff(ip, sf_size, NULL);
> >>+	ip->i_afp = kmem_zone_zalloc(xfs_ifork_zone, KM_SLEEP);
> >>+	ip->i_afp->if_flags = XFS_IFEXTENTS;
> >>+
> >>+
> >>+	/* Try to add the attr to the attribute list in the inode. */
> >>+	xfs_attr_shortform_create(&args);
> >Are we sure that we'll always be able to cram the parent attribute into
> >the shortform area?  Minimum inode size is 512 bytes, core size is
> >currently 176 bytes, max parent attribute size is ~280 bytes... I guess
> >that works.
> >
> >But I wouldn't want this to blow up some day when the inode core gets
> >bigger and this no longer fits.  Will using the regular xfs_attr_set
> >function cover all these sizing cases?  What's the benefit to all this
> >short circuiting?
> 
> Hmm, I'm going to speculate that the original intent was to optimize
> on the current conditions of the inode and the attrs fitting in just
> right?  (Dave may need to correct me if that's not right....).

I guess the attraction here is that so long as the attr fits, we can
initialize the inode and link it into a directory in a single
transaction without having to resort to defer_ops and other heavier
machinery.

Hmm.  xfs_bmap_add_attrfork allocates its own transaction, as does
xfs_attr_set.  If I may make a suggestion: a pair of functions that
takes an existing transaction context and tries to set up the attr fork,
erroring out if the attr fork is already set up and not in LOCAL format;
and a second function that also takes an existing transaction context
and tries to add a shortform attr, erroring out if there's no room.
Then your xfs_parent_create function can try to use both functions, and
if they don't succeed resort to the heavier defer_ops versions.

--D

> You make good points though.  Unless someone has an objection, I
> can put in the normal xfs_attr_set
> 
> >>+	error = xfs_attr_shortform_addname(&args);
> >>+
> >>+	return error;
> >>+}
> >>+
> >>+/*
> >>   * set the attribute specified in @args. In the case of the parent attribute
> >>   * being set, we do not want to roll the transaction on shortform-to-leaf
> >>   * conversion, as the attribute must be added in the same transaction as the
> >>diff --git a/fs/xfs/libxfs/xfs_bmap.c b/fs/xfs/libxfs/xfs_bmap.c
> >>index 044a363..7ee98be 100644
> >>--- a/fs/xfs/libxfs/xfs_bmap.c
> >>+++ b/fs/xfs/libxfs/xfs_bmap.c
> >>@@ -1066,6 +1066,35 @@ xfs_bmap_add_attrfork_local(
> >>  	return -EFSCORRUPTED;
> >>  }
> >>+int
> >>+xfs_bmap_set_attrforkoff(
> >>+	struct xfs_inode	*ip,
> >>+	int			size,
> >>+	int			*version)
> >>+{
> >>+	switch (ip->i_d.di_format) {
> >>+	case XFS_DINODE_FMT_DEV:
> >>+		ip->i_d.di_forkoff = roundup(sizeof(xfs_dev_t), 8) >> 3;
> >>+		break;
> >>+	case XFS_DINODE_FMT_UUID:
> >>+		ip->i_d.di_forkoff = roundup(sizeof(uuid_t), 8) >> 3;
> >>+		break;
> >>+	case XFS_DINODE_FMT_LOCAL:
> >>+	case XFS_DINODE_FMT_EXTENTS:
> >>+	case XFS_DINODE_FMT_BTREE:
> >>+		ip->i_d.di_forkoff = xfs_attr_shortform_bytesfit(ip, size);
> >>+		if (!ip->i_d.di_forkoff)
> >>+			ip->i_d.di_forkoff = xfs_default_attroffset(ip) >> 3;
> >>+		else if ((ip->i_mount->m_flags & XFS_MOUNT_ATTR2) && version)
> >>+			*version = 2;
> >>+		break;
> >>+	default:
> >>+		ASSERT(0);
> >>+		return -EINVAL;
> >>+	}
> >>+	return 0;
> >>+}
> >>+
> >>  /*
> >>   * Convert inode from non-attributed to attributed.
> >>   * Must not be in a transaction, ip must not be locked.
> >>@@ -1120,27 +1149,7 @@ xfs_bmap_add_attrfork(
> >>  	xfs_trans_ijoin(tp, ip, 0);
> >>  	xfs_trans_log_inode(tp, ip, XFS_ILOG_CORE);
> >>-	switch (ip->i_d.di_format) {
> >>-	case XFS_DINODE_FMT_DEV:
> >>-		ip->i_d.di_forkoff = roundup(sizeof(xfs_dev_t), 8) >> 3;
> >>-		break;
> >>-	case XFS_DINODE_FMT_UUID:
> >>-		ip->i_d.di_forkoff = roundup(sizeof(uuid_t), 8) >> 3;
> >>-		break;
> >>-	case XFS_DINODE_FMT_LOCAL:
> >>-	case XFS_DINODE_FMT_EXTENTS:
> >>-	case XFS_DINODE_FMT_BTREE:
> >>-		ip->i_d.di_forkoff = xfs_attr_shortform_bytesfit(ip, size);
> >>-		if (!ip->i_d.di_forkoff)
> >>-			ip->i_d.di_forkoff = xfs_default_attroffset(ip) >> 3;
> >>-		else if (mp->m_flags & XFS_MOUNT_ATTR2)
> >>-			version = 2;
> >>-		break;
> >>-	default:
> >>-		ASSERT(0);
> >>-		error = -EINVAL;
> >>-		goto trans_cancel;
> >>-	}
> >>+	xfs_bmap_set_attrforkoff(ip, size, &version);
> >>  	ASSERT(ip->i_afp == NULL);
> >>  	ip->i_afp = kmem_zone_zalloc(xfs_ifork_zone, KM_SLEEP);
> >>diff --git a/fs/xfs/libxfs/xfs_bmap.h b/fs/xfs/libxfs/xfs_bmap.h
> >>index 851982a..533f40f 100644
> >>--- a/fs/xfs/libxfs/xfs_bmap.h
> >>+++ b/fs/xfs/libxfs/xfs_bmap.h
> >>@@ -209,6 +209,7 @@ void	xfs_bmap_trace_exlist(struct xfs_inode *ip, xfs_extnum_t cnt,
> >>  void	xfs_trim_extent(struct xfs_bmbt_irec *irec, xfs_fileoff_t bno,
> >>  		xfs_filblks_t len);
> >>  int	xfs_bmap_add_attrfork(struct xfs_inode *ip, int size, int rsvd);
> >>+int	xfs_bmap_set_attrforkoff(struct xfs_inode *ip, int size, int *version);
> >>  void	xfs_bmap_local_to_extents_empty(struct xfs_inode *ip, int whichfork);
> >>  void	xfs_bmap_add_free(struct xfs_mount *mp, struct xfs_defer_ops *dfops,
> >>  			  xfs_fsblock_t bno, xfs_filblks_t len,
> >>diff --git a/fs/xfs/libxfs/xfs_parent.c b/fs/xfs/libxfs/xfs_parent.c
> >>new file mode 100644
> >>index 0000000..88f7edc
> >>--- /dev/null
> >>+++ b/fs/xfs/libxfs/xfs_parent.c
> >>@@ -0,0 +1,98 @@
> >>+/*
> >>+ * Copyright (c) 2015 Red Hat, Inc.
> >>+ * All rights reserved.
> >>+ *
> >>+ * This program is free software; you can redistribute it and/or
> >>+ * modify it under the terms of the GNU General Public License as
> >>+ * published by the Free Software Foundation.
> >>+ *
> >>+ * This program is distributed in the hope that it would be useful,
> >>+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
> >>+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> >>+ * GNU General Public License for more details.
> >>+ *
> >>+ * You should have received a copy of the GNU General Public License
> >>+ * along with this program; if not, write the Free Software Foundation
> >>+ */
> >>+#include "xfs.h"
> >>+#include "xfs_fs.h"
> >>+#include "xfs_format.h"
> >>+#include "xfs_log_format.h"
> >>+#include "xfs_shared.h"
> >>+#include "xfs_trans_resv.h"
> >>+#include "xfs_mount.h"
> >>+#include "xfs_bmap_btree.h"
> >>+#include "xfs_inode.h"
> >>+#include "xfs_error.h"
> >>+#include "xfs_trace.h"
> >>+#include "xfs_trans.h"
> >>+#include "xfs_attr.h"
> >>+
> >>+/*
> >>+ * Parent pointer attribute handling.
> >>+ *
> >>+ * Because the attribute value is a filename component, it will never be longer
> >>+ * than 255 bytes. This means the attribute will always be a local format
> >>+ * attribute as it is xfs_attr_leaf_entsize_local_max() for v5 filesystems will
> >>+ * always be larger than this (max is 75% of block size).
> >>+ *
> >>+ * Creating a new parent attribute will always create a new attribute - there
> >>+ * should never, ever be an existing attribute in the tree for a new inode.
> >>+ * ENOSPC behaviour is problematic - creating the inode without the parent
> >>+ * pointer is effectively a corruption, so we allow parent attribute creation
> >>+ * to dip into the reserve block pool to avoid unexpected ENOSPC errors from
> >>+ * occurring.
> >>+ */
> >>+
> >>+/*
> >>+ * Create the initial parent attribute.
> >>+ *
> >>+ * The initial attribute creation also needs to be atomic w.r.t the parent
> >>+ * directory modification. Hence it needs to run in the same transaction and the
> >>+ * transaction committed by the caller.  Because the attribute created is
> >>+ * guaranteed to be a local attribute and is always going to be the first
> >>+ * attribute in the attribute fork, we can do this safely in the single
> >>+ * transaction context as it is impossible for an overwrite to occur and hence
> >>+ * we'll never have a rolling overwrite transaction occurring here. Hence we
> >>+ * can short-cut a lot of the normal xfs_attr_set() code paths that are needed
> >>+ * to handle the generic cases.
> >Is there some other part of inode creation (ACL propagation?) that
> >thinks it could be the creator of the first attribute and will react
> >negatively to this?
> Hmm, not that I can think of, but I wonder if there was at the time?
> >>+ */
> >>+static int
> >>+xfs_parent_create_nrec(
> >>+	struct xfs_trans	*tp,
> >>+	struct xfs_inode	*child,
> >>+	struct xfs_parent_name_irec *nrec,
> >>+	struct xfs_defer_ops	*dfops,
> >>+	xfs_fsblock_t		*firstblock)
> >>+{
> >>+	struct xfs_parent_name_rec rec;
> >>+
> >>+	rec.p_ino = cpu_to_be64(nrec->p_ino);
> >>+	rec.p_gen = cpu_to_be32(nrec->p_gen);
> >>+	rec.p_diroffset = cpu_to_be32(nrec->p_diroffset);
> >The disk->header and header->disk converters should be their own
> >functions so that later when I add parent pointer iterators I can pass
> >the irec to the iterator function directly.
> >
> >(Granted I could just as easily do that later in my own patch...)
> >
> I don't mind adding here if we're already have a need for it.  Saves time
> changing it later :-)
> >>+
> >>+	return xfs_attr_set_first_parent(tp, child, &rec, sizeof(rec),
> >>+				   nrec->p_name, nrec->p_namelen,
> >>+				   dfops, firstblock);
> >>+}
> >>+
> >>+int
> >>+xfs_parent_create(
> >What's this function do?  (Needs comment.)
> >
> >--D
> This is the subroutine that we use during creation, but I think you
> pointed out some issues with it in your later reviews, since this should
> probably be part of the deferred operation code. I will add comments
> when I revise it though.  Thx!
> 
> >>+	struct xfs_trans	*tp,
> >>+	struct xfs_inode	*parent,
> >>+	struct xfs_inode	*child,
> >>+	struct xfs_name		*child_name,
> >>+	xfs_dir2_dataptr_t	diroffset,
> >>+	struct xfs_defer_ops	*dfops,
> >>+	xfs_fsblock_t		*firstblock)
> >>+{
> >>+	struct xfs_parent_name_irec nrec;
> >>+
> >>+	nrec.p_ino = parent->i_ino;
> >>+	nrec.p_gen = VFS_I(parent)->i_generation;
> >>+	nrec.p_diroffset = diroffset;
> >>+	nrec.p_name = child_name->name;
> >>+	nrec.p_namelen = child_name->len;
> >>+
> >>+	return xfs_parent_create_nrec(tp, child, &nrec, dfops, firstblock);
> >>+}
> >>diff --git a/fs/xfs/xfs_attr.h b/fs/xfs/xfs_attr.h
> >>index 7901c3b..b48e31b 100644
> >>--- a/fs/xfs/xfs_attr.h
> >>+++ b/fs/xfs/xfs_attr.h
> >>@@ -19,6 +19,8 @@
> >>  #define	__XFS_ATTR_H__
> >>  #include "libxfs/xfs_defer.h"
> >>+#include "libxfs/xfs_da_format.h"
> >>+#include "libxfs/xfs_format.h"
> >>  struct xfs_inode;
> >>  struct xfs_da_args;
> >>@@ -183,5 +185,16 @@ int xfs_attr_set_deferred(struct xfs_inode *dp, struct xfs_defer_ops *dfops,
> >>  int xfs_attr_remove_deferred(struct xfs_inode *dp, struct xfs_defer_ops *dfops,
> >>  			    const unsigned char *name, unsigned int namelen,
> >>  			    int flags);
> >>-
> >>+/*
> >>+ * Parent pointer attribute prototypes
> >>+ */
> >>+int xfs_parent_create(struct xfs_trans *tp, struct xfs_inode *parent,
> >>+		      struct xfs_inode *child, struct xfs_name *child_name,
> >>+		      xfs_dir2_dataptr_t diroffset, struct xfs_defer_ops *dfops,
> >>+		      xfs_fsblock_t *firstblock);
> >>+int xfs_attr_set_first_parent(struct xfs_trans *tp, struct xfs_inode *ip,
> >>+			      struct xfs_parent_name_rec *rec, int reclen,
> >>+			      const char *value, int valuelen,
> >>+			      struct xfs_defer_ops *dfops,
> >>+			      xfs_fsblock_t *firstblock);
> >>  #endif	/* __XFS_ATTR_H__ */
> >>diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c
> >>index f7986d8..4396561 100644
> >>--- a/fs/xfs/xfs_inode.c
> >>+++ b/fs/xfs/xfs_inode.c
> >>@@ -1164,6 +1164,7 @@ xfs_create(
> >>  	struct xfs_dquot	*pdqp = NULL;
> >>  	struct xfs_trans_res	*tres;
> >>  	uint			resblks;
> >>+	xfs_dir2_dataptr_t	diroffset;
> >>  	trace_xfs_create(dp, name);
> >>@@ -1253,7 +1254,7 @@ xfs_create(
> >>  	error = xfs_dir_createname(tp, dp, name, ip->i_ino,
> >>  					&first_block, &dfops, resblks ?
> >>  					resblks - XFS_IALLOC_SPACE_RES(mp) : 0,
> >>-					NULL);
> >>+					&diroffset);
> >>  	if (error) {
> >>  		ASSERT(error != -ENOSPC);
> >>  		goto out_trans_cancel;
> >>@@ -1272,6 +1273,19 @@ xfs_create(
> >>  	}
> >>  	/*
> >>+	 * If we have parent pointers, we need to add the attribute containing
> >>+	 * the parent information now. This must be done within the same
> >>+	 * transaction the directory entry is created, while the new inode
> >>+	 * contains nothing in the inode literal area.
> >>+	 */
> >>+	if (xfs_sb_version_hasparent(&mp->m_sb)) {
> >>+		error = xfs_parent_create(tp, dp, ip, name, diroffset,
> >>+					  &dfops, &first_block);
> >>+		if (error)
> >>+			goto out_bmap_cancel;
> >>+	}
> >>+
> >>+	/*
> >>  	 * If this is a synchronous mount, make sure that the
> >>  	 * create transaction goes to disk before returning to
> >>  	 * the user.
> >>-- 
> >>2.7.4
> >>
> >>--
> >>To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> >>the body of a message tomajordomo@vger.kernel.org
> >>More majordomo info athttp://vger.kernel.org/majordomo-info.html
> 

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH 10/17] :xfs: extent transaction reservations for parent attributes
  2017-10-20 23:45       ` Darrick J. Wong
@ 2017-10-21  0:12         ` Allison Henderson
  0 siblings, 0 replies; 66+ messages in thread
From: Allison Henderson @ 2017-10-21  0:12 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs, Dave Chinner

On 10/20/2017 4:45 PM, Darrick J. Wong wrote:
> On Fri, Oct 20, 2017 at 04:34:38PM -0700, Allison Henderson wrote:
>> On 10/19/2017 11:24 AM, Darrick J. Wong wrote:
>>
>>> On Wed, Oct 18, 2017 at 03:55:26PM -0700, Allison Henderson wrote:
>>>> From: Dave Chinner<dchinner@redhat.com>
>>>>
>>>> We need to add, remove or modify parent pointer attributes during
>>>> create/link/unlink/rename operations atomically with the dirents in the parent
>>>> directories being modified. This means they need to be modified in the same
>>>> transaction as the parent directories, and so we need to add the required
>>>> space for the attribute modifications to the transaction reservations.
>>>>
>>>> [achender: rebased, added xfs_sb_version_hasparent stub]
>>>>
>>>> Signed-off-by: Dave Chinner<dchinner@redhat.com>
>>>> Signed-off-by: Allison Henderson<allison.henderson@oracle.com>
>>>> ---
>>>>   fs/xfs/libxfs/xfs_format.h     |   5 ++
>>>>   fs/xfs/libxfs/xfs_trans_resv.c | 103 ++++++++++++++++++++++++++++++++---------
>>>>   2 files changed, 85 insertions(+), 23 deletions(-)
>>>>
>>>> diff --git a/fs/xfs/libxfs/xfs_format.h b/fs/xfs/libxfs/xfs_format.h
>>>> index b9ea5bf..121862a 100644
>>>> --- a/fs/xfs/libxfs/xfs_format.h
>>>> +++ b/fs/xfs/libxfs/xfs_format.h
>>>> @@ -556,6 +556,11 @@ static inline bool xfs_sb_version_hasreflink(struct xfs_sb *sbp)
>>>>   		(sbp->sb_features_ro_compat & XFS_SB_FEAT_RO_COMPAT_REFLINK);
>>>>   }
>>>> +static inline bool xfs_sb_version_hasparent(struct xfs_sb *sbp)
>>>> +{
>>>> +	return false; /* We'll enable this at the end of the set */
>>> I think this chunk should just add the proper testing code here.
>>>
>>> You only add RO_COMPAT_PARENT to XFS_SB_FEAT_RO_COMPAT_ALL at the end of
>>> the patch series, so anyone bisecting their way through the series won't
>>> be able to mount such an fs.
>> Ok, there really isn't much more to add in here once we have the feature
>> flags defined.  Maybe we could just do something like:
>>
>> return (XFS_SB_VERSION_NUM(sbp) == XFS_SB_VERSION_5 &&
>>          (sbp->sb_features_ro_compat & 0));
>> /* We will turn the 0 into XFS_SB_FEAT_RO_COMPAT_PARENT at the end of the
>> set */
>>
>>
>> Is that something like what you meant?
> 
> I was talking about defining the function once and for all:
> 
> static inline bool xfs_sb_version_hasparent(struct xfs_sb *sbp)
> {
> 	return (XFS_SB_VERSION_NUM(sbp) == XFS_SB_VERSION_5 &&
> 		(sbp->sb_features_ro_compat & XFS_SB_FEAT_RO_COMPAT_PARENT));
> }
> 
> Since we can't mount any filesystem with XFS_SB_FEAT_RO_COMPAT_PARENT
> set until COMPAT_PARENT gets added to XFS_SB_FEAT_RO_COMPAT_ALL.  The
> practical effect is that hasparent will never return true, but with less
> code changes between patches.
> 
> (Or put another way, please minimize the type of patch series churn
> wherein one adds a function in one patch and then rewrites it a
> subseqeuent patch.)
> 
> --D
Alrighty, got it.  Thank you!

> 
>>
>>>> +}
>>>> +
>>>>   /*
>>>>    * end of superblock version macros
>>>>    */
>>>> diff --git a/fs/xfs/libxfs/xfs_trans_resv.c b/fs/xfs/libxfs/xfs_trans_resv.c
>>>> index 6bd916b..54399e2 100644
>>>> --- a/fs/xfs/libxfs/xfs_trans_resv.c
>>>> +++ b/fs/xfs/libxfs/xfs_trans_resv.c
>>>> @@ -802,29 +802,30 @@ xfs_calc_sb_reservation(
>>>>   	return xfs_calc_buf_res(1, mp->m_sb.sb_sectsize);
>>>>   }
>>>> +/*
>>>> + * Namespace reservations.
>>>> + *
>>>> + * These get tricky when parent pointers are enabled as we have attribute
>>>> + * modifications occurring from within these transactions. Rather than confuse
>>>> + * each of these reservation calculations with the conditional attribute
>>>> + * reservations, add them here in a clear and concise manner. This assumes that
>>>> + * the attribute reservations have already been calculated.
>>>> + *
>>>> + * Note that we only include the static attribute reservation here; the runtime
>>>> + * reservation will have to be modified by the size of the attributes being
>>>> + * added/removed/modified. See the comments on the attribute reservation
>>>> + * calculations for more details.
>>> I don't know that we can properly use a different runtime reservations
>>> than what we statically reserve here, since the static reservations are
>>> used to ensure that the log is of sufficient size given the fs geometry.
>>>
>>> <shrug> Maybe we can figure out how much extra space is allowable given
>>> the actual size of the log?  Or perhaps in the end we'll just end up
>>> restricting the maximum size of what we can log through intents?  Or
>>> just set the reservation to 64k I guess.... :)
>>>
>>> --D
>>>
>>>> + * Note for rename: rename will vastly overestimate requirements. This will be
>>>> + * addressed later when modifications are made to ensure parent attribute
>>>> + * modifications can be done atomically with the rename operation.
>>>> + */
>>>>   void
>>>> -xfs_trans_resv_calc(
>>>> +xfs_calc_namespace_reservations(
>>>>   	struct xfs_mount	*mp,
>>>>   	struct xfs_trans_resv	*resp)
>>>>   {
>>>> -	/*
>>>> -	 * The following transactions are logged in physical format and
>>>> -	 * require a permanent reservation on space.
>>>> -	 */
>>>> -	resp->tr_write.tr_logres = xfs_calc_write_reservation(mp);
>>>> -	if (xfs_sb_version_hasreflink(&mp->m_sb))
>>>> -		resp->tr_write.tr_logcount = XFS_WRITE_LOG_COUNT_REFLINK;
>>>> -	else
>>>> -		resp->tr_write.tr_logcount = XFS_WRITE_LOG_COUNT;
>>>> -	resp->tr_write.tr_logflags |= XFS_TRANS_PERM_LOG_RES;
>>>> -
>>>> -	resp->tr_itruncate.tr_logres = xfs_calc_itruncate_reservation(mp);
>>>> -	if (xfs_sb_version_hasreflink(&mp->m_sb))
>>>> -		resp->tr_itruncate.tr_logcount =
>>>> -				XFS_ITRUNCATE_LOG_COUNT_REFLINK;
>>>> -	else
>>>> -		resp->tr_itruncate.tr_logcount = XFS_ITRUNCATE_LOG_COUNT;
>>>> -	resp->tr_itruncate.tr_logflags |= XFS_TRANS_PERM_LOG_RES;
>>>> +	ASSERT(resp->tr_attrsetm.tr_logres > 0);
>>>>   	resp->tr_rename.tr_logres = xfs_calc_rename_reservation(mp);
>>>>   	resp->tr_rename.tr_logcount = XFS_RENAME_LOG_COUNT;
>>>> @@ -846,15 +847,69 @@ xfs_trans_resv_calc(
>>>>   	resp->tr_create.tr_logcount = XFS_CREATE_LOG_COUNT;
>>>>   	resp->tr_create.tr_logflags |= XFS_TRANS_PERM_LOG_RES;
>>>> +	resp->tr_mkdir.tr_logres = xfs_calc_mkdir_reservation(mp);
>>>> +	resp->tr_mkdir.tr_logcount = XFS_MKDIR_LOG_COUNT;
>>>> +	resp->tr_mkdir.tr_logflags |= XFS_TRANS_PERM_LOG_RES;
>>>> +
>>>> +	if (!xfs_sb_version_hasparent(&mp->m_sb))
>>>> +		return;
>>>> +
>>>> +	/* rename can add/remove/modify 2 parent attributes */
>>>> +	resp->tr_rename.tr_logres += 2 * max(resp->tr_attrsetm.tr_logres,
>>>> +					     resp->tr_attrrm.tr_logres);
>>>> +	resp->tr_rename.tr_logcount += 2 * max(resp->tr_attrsetm.tr_logcount,
>>>> +					       resp->tr_attrrm.tr_logcount);
>>>> +
>>>> +	/* create will add 1 parent attribute */
>>>> +	resp->tr_create.tr_logres += resp->tr_attrsetm.tr_logres;
>>>> +	resp->tr_create.tr_logcount += resp->tr_attrsetm.tr_logcount;
>>>> +
>>>> +	/* mkdir will add 1 parent attribute */
>>>> +	resp->tr_mkdir.tr_logres += resp->tr_attrsetm.tr_logres;
>>>> +	resp->tr_mkdir.tr_logcount += resp->tr_attrsetm.tr_logcount;
>>>> +
>>>> +	/* link will add 1 parent attribute */
>>>> +	resp->tr_link.tr_logres += resp->tr_attrsetm.tr_logres;
>>>> +	resp->tr_link.tr_logcount += resp->tr_attrsetm.tr_logcount;
>>>> +
>>>> +	/* symlink will add 1 parent attribute */
>>>> +	resp->tr_symlink.tr_logres += resp->tr_attrsetm.tr_logres;
>>>> +	resp->tr_symlink.tr_logcount += resp->tr_attrsetm.tr_logcount;
>>>> +
>>>> +	/* remove will remove 1 parent attribute */
>>>> +	resp->tr_remove.tr_logres += resp->tr_attrrm.tr_logres;
>>>> +	resp->tr_remove.tr_logcount = resp->tr_attrrm.tr_logcount;
>>>> +}
>>>> +
>>>> +void
>>>> +xfs_trans_resv_calc(
>>>> +	struct xfs_mount	*mp,
>>>> +	struct xfs_trans_resv	*resp)
>>>> +{
>>>> +	/*
>>>> +	 * The following transactions are logged in physical format and
>>>> +	 * require a permanent reservation on space.
>>>> +	 */
>>>> +	resp->tr_write.tr_logres = xfs_calc_write_reservation(mp);
>>>> +	if (xfs_sb_version_hasreflink(&mp->m_sb))
>>>> +		resp->tr_write.tr_logcount = XFS_WRITE_LOG_COUNT_REFLINK;
>>>> +	else
>>>> +		resp->tr_write.tr_logcount = XFS_WRITE_LOG_COUNT;
>>>> +	resp->tr_write.tr_logflags |= XFS_TRANS_PERM_LOG_RES;
>>>> +
>>>> +	resp->tr_itruncate.tr_logres = xfs_calc_itruncate_reservation(mp);
>>>> +	if (xfs_sb_version_hasreflink(&mp->m_sb))
>>>> +		resp->tr_itruncate.tr_logcount =
>>>> +				XFS_ITRUNCATE_LOG_COUNT_REFLINK;
>>>> +	else
>>>> +		resp->tr_itruncate.tr_logcount = XFS_ITRUNCATE_LOG_COUNT;
>>>> +	resp->tr_itruncate.tr_logflags |= XFS_TRANS_PERM_LOG_RES;
>>>> +
>>>>   	resp->tr_create_tmpfile.tr_logres =
>>>>   			xfs_calc_create_tmpfile_reservation(mp);
>>>>   	resp->tr_create_tmpfile.tr_logcount = XFS_CREATE_TMPFILE_LOG_COUNT;
>>>>   	resp->tr_create_tmpfile.tr_logflags |= XFS_TRANS_PERM_LOG_RES;
>>>> -	resp->tr_mkdir.tr_logres = xfs_calc_mkdir_reservation(mp);
>>>> -	resp->tr_mkdir.tr_logcount = XFS_MKDIR_LOG_COUNT;
>>>> -	resp->tr_mkdir.tr_logflags |= XFS_TRANS_PERM_LOG_RES;
>>>> -
>>>>   	resp->tr_ifree.tr_logres = xfs_calc_ifree_reservation(mp);
>>>>   	resp->tr_ifree.tr_logcount = XFS_INACTIVE_LOG_COUNT;
>>>>   	resp->tr_ifree.tr_logflags |= XFS_TRANS_PERM_LOG_RES;
>>>> @@ -886,6 +941,8 @@ xfs_trans_resv_calc(
>>>>   		resp->tr_qm_dqalloc.tr_logcount = XFS_WRITE_LOG_COUNT;
>>>>   	resp->tr_qm_dqalloc.tr_logflags |= XFS_TRANS_PERM_LOG_RES;
>>>> +	xfs_calc_namespace_reservations(mp, resp);
>>>> +
>>>>   	/*
>>>>   	 * The following transactions are logged in logical format with
>>>>   	 * a default log count.
>>>> -- 
>>>> 2.7.4
>>>>
>>>> --
>>>> To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
>>>> the body of a message tomajordomo@vger.kernel.org
>>>> More majordomo info athttp://vger.kernel.org/majordomo-info.html
>>

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH 00/17] Parent Pointers V3
  2017-10-20  3:22   ` Amir Goldstein
@ 2017-10-21  1:06     ` Allison Henderson
  0 siblings, 0 replies; 66+ messages in thread
From: Allison Henderson @ 2017-10-21  1:06 UTC (permalink / raw)
  Cc: linux-xfs

On 10/19/2017 08:22 PM, Amir Goldstein wrote:

> On Thu, Oct 19, 2017 at 7:11 AM, Amir Goldstein<amir73il@gmail.com>  wrote:
>> On Thu, Oct 19, 2017 at 1:55 AM, Allison Henderson
>> <allison.henderson@oracle.com>  wrote:
>>> Hi all,
>>>
>>> This is the third version of parent pointer attributes for xfs.
>>> I've integrated the suggestions made since v2, mostly moving the
>>> attr buffers in the xfs_attr_log_item to pointers that point to
>>> xfs_attr_item. I've also implementing the recovery routines for
>>> the xfs_attr_log_format.  If I missed anything please point it
>>> out.  As always, comments and feedback are appreciated.  Thank
>>> you!
>>>
>> A minor comment about the cover letter.
>> All designated reviewers must know exactly what "parent pointers" are for,
>> but it could be useful to add some context in the cover letter about the purpose
>> of this work for the sake of other readers on the list. Useful to refer to the
>> upcoming scrub support patches.
>>
>> BTW, not sure if this was mentioned in the previous lifetime of those
>> patches, but parent pointers can be used to implement exportfs operation
>> xfs_fs_fh_to_parent() for "non-connectable" file handles (FILEID_INO32_GEN)
>> and to implement xfs_fs_get_name(), which would make reconnect_path()
>> *much* more efficient.
>>
>> Also, you may want to use git format-patch -v3 for V3
>> makes it easier to browse old versions of patches on the list.
>>
>> Cheers,
>> Amir.
>>
>>> Allison Henderson (7):
>>>    Add helper functions xfs_attr_set_args and xfs_attr_remove_args
>>>    Set up infastructure for deferred attribute operations
>>>    Add xfs_attr_set_defered and xfs_attr_remove_defered
>>>    Remove all strlen calls in all xfs_attr_* functions for attr names.
>>>    Add the extra space requirements for parent pointer attributes when
>>>      calculating the minimum log size during mkfs
>>>    Add parent pointers to rename
>>>    Add the parent pointer support to the superblock version 5.
>>>
>>> Brian Foster (1):
>>>    xfs_bmap_add_attrfork(): re-add error handling from set_attrforkoff()
>>>      call
>>>
>>> Dave Chinner (5):
>>>    xfs: define parent pointer xattr format
>>>    :xfs: extent transaction reservations for parent attributes
> You must've already noticed - just pointing out the :xfs: typo in that commit
> subject (easier to comment on that here then on patch itself)
>
> Amir.
Yep, I noticed it after I sent it.  I will fix that and expand on the 
cover letter a bit.  I think there's a few things that can leverage 
parent pointers once they are in place.  I will add the -v3 flag too.  Thx!


^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH 11/17] Add the extra space requirements for parent pointer attributes when calculating the minimum log size during mkfs
  2017-10-19 18:13   ` Darrick J. Wong
@ 2017-10-21  1:07     ` Allison Henderson
  0 siblings, 0 replies; 66+ messages in thread
From: Allison Henderson @ 2017-10-21  1:07 UTC (permalink / raw)
  Cc: linux-xfs

On 10/19/2017 11:13 AM, Darrick J. Wong wrote:

> On Wed, Oct 18, 2017 at 03:55:27PM -0700, Allison Henderson wrote:
>> Signed-off-by: Allison Henderson<allison.henderson@oracle.com>
>> ---
>>   fs/xfs/libxfs/xfs_log_rlimit.c | 34 ++++++++++++++++++++++++++++++++++
>>   1 file changed, 34 insertions(+)
>>
>> diff --git a/fs/xfs/libxfs/xfs_log_rlimit.c b/fs/xfs/libxfs/xfs_log_rlimit.c
>> index c105979..beec9bf 100644
>> --- a/fs/xfs/libxfs/xfs_log_rlimit.c
>> +++ b/fs/xfs/libxfs/xfs_log_rlimit.c
>> @@ -39,6 +39,40 @@ xfs_log_calc_max_attrsetm_res(
>>   {
>>   	int			size;
>>   	int			nblks;
>> +	struct xfs_trans_resv   *resp = M_RES(mp);
>> +
>> +	/* Calculate extra space needed for parent pointer attributes */
>> +	if (!xfs_sb_version_hasparent(&mp->m_sb)) {
> Aren't we supposed to be enlarging tr_log{res,count} if hasparent is true?
>
>> +
>> +		/* rename can add/remove/modify 2 parent attributes */
>> +		resp->tr_rename.tr_logres +=
>> +			2 * max(resp->tr_attrsetm.tr_logres,
>> +				resp->tr_attrrm.tr_logres);
>> +		resp->tr_rename.tr_logcount +=
>> +			2 * max(resp->tr_attrsetm.tr_logcount,
>> +				resp->tr_attrrm.tr_logcount);
>> +
>> +		/* create will add 1 parent attribute */
>> +		resp->tr_create.tr_logres += resp->tr_attrsetm.tr_logres;
>> +		resp->tr_create.tr_logcount += resp->tr_attrsetm.tr_logcount;
>> +
>> +		/* mkdir will add 1 parent attribute */
>> +		resp->tr_mkdir.tr_logres += resp->tr_attrsetm.tr_logres;
>> +		resp->tr_mkdir.tr_logcount += resp->tr_attrsetm.tr_logcount;
>> +
>> +		/* link will add 1 parent attribute */
>> +		resp->tr_link.tr_logres += resp->tr_attrsetm.tr_logres;
>> +		resp->tr_link.tr_logcount += resp->tr_attrsetm.tr_logcount;
>> +
>> +		/* symlink will add 1 parent attribute */
>> +		resp->tr_symlink.tr_logres += resp->tr_attrsetm.tr_logres;
>> +		resp->tr_symlink.tr_logcount += resp->tr_attrsetm.tr_logcount;
>> +
>> +		/* remove will remove 1 parent attribute */
>> +		resp->tr_remove.tr_logres += resp->tr_attrrm.tr_logres;
>> +		resp->tr_remove.tr_logcount = resp->tr_attrrm.tr_logcount;
> += ?
>
> --D
I think you're right.  Initially I was having trouble getting it to 
mount because not enough log space was reserved during mkfs time, and I 
had borrowed this code from xfs_calc_namespace_reservations in the 
previous patch.  So we might need the same fix there too.  I will get it 
corrected.  Thx!


>> +	}
>> +
>>   
>>   	size = xfs_attr_leaf_entsize_local_max(mp->m_attr_geo->blksize) -
>>   	       MAXNAMELEN - 1;
>> -- 
>> 2.7.4
>>
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
>> the body of a message tomajordomo@vger.kernel.org
>> More majordomo info athttp://vger.kernel.org/majordomo-info.html


^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH 10/17] :xfs: extent transaction reservations for parent attributes
  2017-10-19 18:24   ` Darrick J. Wong
       [not found]     ` <8680e0c1-ada8-06e3-e397-61a5076030be@oracle.com>
@ 2017-10-21  1:07     ` Allison Henderson
  1 sibling, 0 replies; 66+ messages in thread
From: Allison Henderson @ 2017-10-21  1:07 UTC (permalink / raw)
  Cc: linux-xfs, Dave Chinner

On 10/19/2017 11:24 AM, Darrick J. Wong wrote:

> On Wed, Oct 18, 2017 at 03:55:26PM -0700, Allison Henderson wrote:
>> From: Dave Chinner<dchinner@redhat.com>
>>
>> We need to add, remove or modify parent pointer attributes during
>> create/link/unlink/rename operations atomically with the dirents in the parent
>> directories being modified. This means they need to be modified in the same
>> transaction as the parent directories, and so we need to add the required
>> space for the attribute modifications to the transaction reservations.
>>
>> [achender: rebased, added xfs_sb_version_hasparent stub]
>>
>> Signed-off-by: Dave Chinner<dchinner@redhat.com>
>> Signed-off-by: Allison Henderson<allison.henderson@oracle.com>
>> ---
>>   fs/xfs/libxfs/xfs_format.h     |   5 ++
>>   fs/xfs/libxfs/xfs_trans_resv.c | 103 ++++++++++++++++++++++++++++++++---------
>>   2 files changed, 85 insertions(+), 23 deletions(-)
>>
>> diff --git a/fs/xfs/libxfs/xfs_format.h b/fs/xfs/libxfs/xfs_format.h
>> index b9ea5bf..121862a 100644
>> --- a/fs/xfs/libxfs/xfs_format.h
>> +++ b/fs/xfs/libxfs/xfs_format.h
>> @@ -556,6 +556,11 @@ static inline bool xfs_sb_version_hasreflink(struct xfs_sb *sbp)
>>   		(sbp->sb_features_ro_compat & XFS_SB_FEAT_RO_COMPAT_REFLINK);
>>   }
>>   
>> +static inline bool xfs_sb_version_hasparent(struct xfs_sb *sbp)
>> +{
>> +	return false; /* We'll enable this at the end of the set */
> I think this chunk should just add the proper testing code here.
>
> You only add RO_COMPAT_PARENT to XFS_SB_FEAT_RO_COMPAT_ALL at the end of
> the patch series, so anyone bisecting their way through the series won't
> be able to mount such an fs.
Ok, there really isn't much more to add in here once we have the feature
flags defined.  Maybe we could just do something like:

return (XFS_SB_VERSION_NUM(sbp) == XFS_SB_VERSION_5 &&
         (sbp->sb_features_ro_compat & 0));
/* We will turn the 0 into XFS_SB_FEAT_RO_COMPAT_PARENT at the end of 
the set */


Is that something like what you meant?

>> +}
>> +
>>   /*
>>    * end of superblock version macros
>>    */
>> diff --git a/fs/xfs/libxfs/xfs_trans_resv.c b/fs/xfs/libxfs/xfs_trans_resv.c
>> index 6bd916b..54399e2 100644
>> --- a/fs/xfs/libxfs/xfs_trans_resv.c
>> +++ b/fs/xfs/libxfs/xfs_trans_resv.c
>> @@ -802,29 +802,30 @@ xfs_calc_sb_reservation(
>>   	return xfs_calc_buf_res(1, mp->m_sb.sb_sectsize);
>>   }
>>   
>> +/*
>> + * Namespace reservations.
>> + *
>> + * These get tricky when parent pointers are enabled as we have attribute
>> + * modifications occurring from within these transactions. Rather than confuse
>> + * each of these reservation calculations with the conditional attribute
>> + * reservations, add them here in a clear and concise manner. This assumes that
>> + * the attribute reservations have already been calculated.
>> + *
>> + * Note that we only include the static attribute reservation here; the runtime
>> + * reservation will have to be modified by the size of the attributes being
>> + * added/removed/modified. See the comments on the attribute reservation
>> + * calculations for more details.
> I don't know that we can properly use a different runtime reservations
> than what we statically reserve here, since the static reservations are
> used to ensure that the log is of sufficient size given the fs geometry.
>
> <shrug> Maybe we can figure out how much extra space is allowable given
> the actual size of the log?  Or perhaps in the end we'll just end up
> restricting the maximum size of what we can log through intents?  Or
> just set the reservation to 64k I guess.... :)
>
> --D
>
>> + * Note for rename: rename will vastly overestimate requirements. This will be
>> + * addressed later when modifications are made to ensure parent attribute
>> + * modifications can be done atomically with the rename operation.
>> + */
>>   void
>> -xfs_trans_resv_calc(
>> +xfs_calc_namespace_reservations(
>>   	struct xfs_mount	*mp,
>>   	struct xfs_trans_resv	*resp)
>>   {
>> -	/*
>> -	 * The following transactions are logged in physical format and
>> -	 * require a permanent reservation on space.
>> -	 */
>> -	resp->tr_write.tr_logres = xfs_calc_write_reservation(mp);
>> -	if (xfs_sb_version_hasreflink(&mp->m_sb))
>> -		resp->tr_write.tr_logcount = XFS_WRITE_LOG_COUNT_REFLINK;
>> -	else
>> -		resp->tr_write.tr_logcount = XFS_WRITE_LOG_COUNT;
>> -	resp->tr_write.tr_logflags |= XFS_TRANS_PERM_LOG_RES;
>> -
>> -	resp->tr_itruncate.tr_logres = xfs_calc_itruncate_reservation(mp);
>> -	if (xfs_sb_version_hasreflink(&mp->m_sb))
>> -		resp->tr_itruncate.tr_logcount =
>> -				XFS_ITRUNCATE_LOG_COUNT_REFLINK;
>> -	else
>> -		resp->tr_itruncate.tr_logcount = XFS_ITRUNCATE_LOG_COUNT;
>> -	resp->tr_itruncate.tr_logflags |= XFS_TRANS_PERM_LOG_RES;
>> +	ASSERT(resp->tr_attrsetm.tr_logres > 0);
>>   
>>   	resp->tr_rename.tr_logres = xfs_calc_rename_reservation(mp);
>>   	resp->tr_rename.tr_logcount = XFS_RENAME_LOG_COUNT;
>> @@ -846,15 +847,69 @@ xfs_trans_resv_calc(
>>   	resp->tr_create.tr_logcount = XFS_CREATE_LOG_COUNT;
>>   	resp->tr_create.tr_logflags |= XFS_TRANS_PERM_LOG_RES;
>>   
>> +	resp->tr_mkdir.tr_logres = xfs_calc_mkdir_reservation(mp);
>> +	resp->tr_mkdir.tr_logcount = XFS_MKDIR_LOG_COUNT;
>> +	resp->tr_mkdir.tr_logflags |= XFS_TRANS_PERM_LOG_RES;
>> +
>> +	if (!xfs_sb_version_hasparent(&mp->m_sb))
>> +		return;
>> +
>> +	/* rename can add/remove/modify 2 parent attributes */
>> +	resp->tr_rename.tr_logres += 2 * max(resp->tr_attrsetm.tr_logres,
>> +					     resp->tr_attrrm.tr_logres);
>> +	resp->tr_rename.tr_logcount += 2 * max(resp->tr_attrsetm.tr_logcount,
>> +					       resp->tr_attrrm.tr_logcount);
>> +
>> +	/* create will add 1 parent attribute */
>> +	resp->tr_create.tr_logres += resp->tr_attrsetm.tr_logres;
>> +	resp->tr_create.tr_logcount += resp->tr_attrsetm.tr_logcount;
>> +
>> +	/* mkdir will add 1 parent attribute */
>> +	resp->tr_mkdir.tr_logres += resp->tr_attrsetm.tr_logres;
>> +	resp->tr_mkdir.tr_logcount += resp->tr_attrsetm.tr_logcount;
>> +
>> +	/* link will add 1 parent attribute */
>> +	resp->tr_link.tr_logres += resp->tr_attrsetm.tr_logres;
>> +	resp->tr_link.tr_logcount += resp->tr_attrsetm.tr_logcount;
>> +
>> +	/* symlink will add 1 parent attribute */
>> +	resp->tr_symlink.tr_logres += resp->tr_attrsetm.tr_logres;
>> +	resp->tr_symlink.tr_logcount += resp->tr_attrsetm.tr_logcount;
>> +
>> +	/* remove will remove 1 parent attribute */
>> +	resp->tr_remove.tr_logres += resp->tr_attrrm.tr_logres;
>> +	resp->tr_remove.tr_logcount = resp->tr_attrrm.tr_logcount;
>> +}
>> +
>> +void
>> +xfs_trans_resv_calc(
>> +	struct xfs_mount	*mp,
>> +	struct xfs_trans_resv	*resp)
>> +{
>> +	/*
>> +	 * The following transactions are logged in physical format and
>> +	 * require a permanent reservation on space.
>> +	 */
>> +	resp->tr_write.tr_logres = xfs_calc_write_reservation(mp);
>> +	if (xfs_sb_version_hasreflink(&mp->m_sb))
>> +		resp->tr_write.tr_logcount = XFS_WRITE_LOG_COUNT_REFLINK;
>> +	else
>> +		resp->tr_write.tr_logcount = XFS_WRITE_LOG_COUNT;
>> +	resp->tr_write.tr_logflags |= XFS_TRANS_PERM_LOG_RES;
>> +
>> +	resp->tr_itruncate.tr_logres = xfs_calc_itruncate_reservation(mp);
>> +	if (xfs_sb_version_hasreflink(&mp->m_sb))
>> +		resp->tr_itruncate.tr_logcount =
>> +				XFS_ITRUNCATE_LOG_COUNT_REFLINK;
>> +	else
>> +		resp->tr_itruncate.tr_logcount = XFS_ITRUNCATE_LOG_COUNT;
>> +	resp->tr_itruncate.tr_logflags |= XFS_TRANS_PERM_LOG_RES;
>> +
>>   	resp->tr_create_tmpfile.tr_logres =
>>   			xfs_calc_create_tmpfile_reservation(mp);
>>   	resp->tr_create_tmpfile.tr_logcount = XFS_CREATE_TMPFILE_LOG_COUNT;
>>   	resp->tr_create_tmpfile.tr_logflags |= XFS_TRANS_PERM_LOG_RES;
>>   
>> -	resp->tr_mkdir.tr_logres = xfs_calc_mkdir_reservation(mp);
>> -	resp->tr_mkdir.tr_logcount = XFS_MKDIR_LOG_COUNT;
>> -	resp->tr_mkdir.tr_logflags |= XFS_TRANS_PERM_LOG_RES;
>> -
>>   	resp->tr_ifree.tr_logres = xfs_calc_ifree_reservation(mp);
>>   	resp->tr_ifree.tr_logcount = XFS_INACTIVE_LOG_COUNT;
>>   	resp->tr_ifree.tr_logflags |= XFS_TRANS_PERM_LOG_RES;
>> @@ -886,6 +941,8 @@ xfs_trans_resv_calc(
>>   		resp->tr_qm_dqalloc.tr_logcount = XFS_WRITE_LOG_COUNT;
>>   	resp->tr_qm_dqalloc.tr_logflags |= XFS_TRANS_PERM_LOG_RES;
>>   
>> +	xfs_calc_namespace_reservations(mp, resp);
>> +
>>   	/*
>>   	 * The following transactions are logged in logical format with
>>   	 * a default log count.
>> -- 
>> 2.7.4
>>
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
>> the body of a message tomajordomo@vger.kernel.org
>> More majordomo info athttp://vger.kernel.org/majordomo-info.html


^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH 02/17] Set up infastructure for deferred attribute operations
  2017-10-19 19:02   ` Darrick J. Wong
@ 2017-10-21  1:08     ` Allison Henderson
  0 siblings, 0 replies; 66+ messages in thread
From: Allison Henderson @ 2017-10-21  1:08 UTC (permalink / raw)
  Cc: linux-xfs

On 10/19/2017 12:02 PM, Darrick J. Wong wrote:

> On Wed, Oct 18, 2017 at 03:55:18PM -0700, Allison Henderson wrote:
>> This patch adds two new log item types for setting or
>> removing attributes as deferred operations.  The
>> xfs_attri_log_item logs an intent to set or remove an
>> attribute.  The corresponding xfs_attrd_log_item holds
>> a reference to the xfs_attri_log_item and is freed once
>> the transaction is done.  Both log items use a generic
>> xfs_attr_log_format structure that contains the attribute
>> name, value, flags, inode, and an op_flag that indicates
>> if the operations is a set or remove.
>>
>> At the moment, this feature will only be used by the parent
>> pointer patch set which uses attributes to store information
>> about an inodes parent.
>>
>> Signed-off-by: Allison Henderson<allison.henderson@oracle.com>
>> ---
>>   fs/xfs/Makefile                |   2 +
>>   fs/xfs/libxfs/xfs_attr.c       |   2 +-
>>   fs/xfs/libxfs/xfs_defer.h      |   1 +
>>   fs/xfs/libxfs/xfs_log_format.h |  36 ++-
>>   fs/xfs/libxfs/xfs_types.h      |   1 +
>>   fs/xfs/xfs_attr.h              |  20 +-
>>   fs/xfs/xfs_attr_item.c         | 512 +++++++++++++++++++++++++++++++++++++++++
>>   fs/xfs/xfs_attr_item.h         | 111 +++++++++
>>   fs/xfs/xfs_log_recover.c       | 140 +++++++++++
>>   fs/xfs/xfs_super.c             |   1 +
>>   fs/xfs/xfs_trans.h             |  13 ++
>>   fs/xfs/xfs_trans_attr.c        | 286 +++++++++++++++++++++++
>>   12 files changed, 1121 insertions(+), 4 deletions(-)
>>
>> diff --git a/fs/xfs/Makefile b/fs/xfs/Makefile
>> index a6e955b..ec6486b 100644
>> --- a/fs/xfs/Makefile
>> +++ b/fs/xfs/Makefile
>> @@ -106,6 +106,7 @@ xfs-y				+= xfs_log.o \
>>   				   xfs_bmap_item.o \
>>   				   xfs_buf_item.o \
>>   				   xfs_extfree_item.o \
>> +				   xfs_attr_item.o \
>>   				   xfs_icreate_item.o \
>>   				   xfs_inode_item.o \
>>   				   xfs_refcount_item.o \
>> @@ -115,6 +116,7 @@ xfs-y				+= xfs_log.o \
>>   				   xfs_trans_bmap.o \
>>   				   xfs_trans_buf.o \
>>   				   xfs_trans_extfree.o \
>> +				   xfs_trans_attr.o \
>>   				   xfs_trans_inode.o \
>>   				   xfs_trans_refcount.o \
>>   				   xfs_trans_rmap.o \
>> diff --git a/fs/xfs/libxfs/xfs_attr.c b/fs/xfs/libxfs/xfs_attr.c
>> index b00ec1f..5325ec2 100644
>> --- a/fs/xfs/libxfs/xfs_attr.c
>> +++ b/fs/xfs/libxfs/xfs_attr.c
>> @@ -74,7 +74,7 @@ STATIC int xfs_attr_fillstate(xfs_da_state_t *state);
>>   STATIC int xfs_attr_refillstate(xfs_da_state_t *state);
>>   
>>   
>> -STATIC int
>> +int
>>   xfs_attr_args_init(
>>   	struct xfs_da_args	*args,
>>   	struct xfs_inode	*dp,
>> diff --git a/fs/xfs/libxfs/xfs_defer.h b/fs/xfs/libxfs/xfs_defer.h
>> index d4f046d..ef0f8bf 100644
>> --- a/fs/xfs/libxfs/xfs_defer.h
>> +++ b/fs/xfs/libxfs/xfs_defer.h
>> @@ -55,6 +55,7 @@ enum xfs_defer_ops_type {
>>   	XFS_DEFER_OPS_TYPE_REFCOUNT,
>>   	XFS_DEFER_OPS_TYPE_RMAP,
>>   	XFS_DEFER_OPS_TYPE_FREE,
>> +	XFS_DEFER_OPS_TYPE_ATTR,
>>   	XFS_DEFER_OPS_TYPE_MAX,
>>   };
>>   
>> diff --git a/fs/xfs/libxfs/xfs_log_format.h b/fs/xfs/libxfs/xfs_log_format.h
>> index 8372e9b..b0ce87e 100644
>> --- a/fs/xfs/libxfs/xfs_log_format.h
>> +++ b/fs/xfs/libxfs/xfs_log_format.h
>> @@ -18,6 +18,8 @@
>>   #ifndef	__XFS_LOG_FORMAT_H__
>>   #define __XFS_LOG_FORMAT_H__
>>   
>> +#include "xfs_attr.h"
>> +
>>   struct xfs_mount;
>>   struct xfs_trans_res;
>>   
>> @@ -116,7 +118,12 @@ static inline uint xlog_get_cycle(char *ptr)
>>   #define XLOG_REG_TYPE_CUD_FORMAT	24
>>   #define XLOG_REG_TYPE_BUI_FORMAT	25
>>   #define XLOG_REG_TYPE_BUD_FORMAT	26
>> -#define XLOG_REG_TYPE_MAX		26
>> +#define XLOG_REG_TYPE_ATTRI_FORMAT	27
>> +#define XLOG_REG_TYPE_ATTRD_FORMAT	28
>> +#define XLOG_REG_TYPE_ATTR_NAME		29
>> +#define XLOG_REG_TYPE_ATTR_VALUE	30
>> +#define XLOG_REG_TYPE_MAX		31
>> +
>>   
>>   /*
>>    * Flags to log operation header
>> @@ -239,6 +246,8 @@ typedef struct xfs_trans_header {
>>   #define	XFS_LI_CUD		0x1243
>>   #define	XFS_LI_BUI		0x1244	/* bmbt update intent */
>>   #define	XFS_LI_BUD		0x1245
>> +#define	XFS_LI_ATTRI		0x1246  /* attr set/remove intent*/
>> +#define	XFS_LI_ATTRD		0x1247  /* attr set/remove done */
>>   
>>   #define XFS_LI_TYPE_DESC \
>>   	{ XFS_LI_EFI,		"XFS_LI_EFI" }, \
>> @@ -254,7 +263,9 @@ typedef struct xfs_trans_header {
>>   	{ XFS_LI_CUI,		"XFS_LI_CUI" }, \
>>   	{ XFS_LI_CUD,		"XFS_LI_CUD" }, \
>>   	{ XFS_LI_BUI,		"XFS_LI_BUI" }, \
>> -	{ XFS_LI_BUD,		"XFS_LI_BUD" }
>> +	{ XFS_LI_BUD,		"XFS_LI_BUD" }, \
>> +	{ XFS_LI_ATTRI,		"XFS_LI_ATTRI" }, \
>> +	{ XFS_LI_ATTRD,		"XFS_LI_ATTRD" }
>>   
>>   /*
>>    * Inode Log Item Format definitions.
>> @@ -863,4 +874,25 @@ struct xfs_icreate_log {
>>   	__be32		icl_gen;	/* inode generation number to use */
>>   };
>>   
>> +/* Flags for deferred attribute operations */
>> +#define ATTR_OP_FLAGS_SET	0x01	/* Set the attribute */
> The names need to have "XFS_" prefixed here because this is a public
> header file, e.g. XFS_ATTR_OP_FLAGS_SET.
OK, will fix

>> +#define ATTR_OP_FLAGS_REMOVE	0x02	/* Remove the attribute */
>> +#define ATTR_OP_FLAGS_MAX	0x02	/* Max flags */
> I would be a little more explicit about which bits of op_flags are
> actual bit flags and which parts are mutually exclusive type codes.
> IOWs, from just these definitions here it's not clear that _SET and
> _REMOVE can't both be set at the same time.
>
> /* upper bits are flags, lower byte is type code */
> #define XFS_ATTR_OP_FLAGS_SET		1
> #define XFS_ATTR_OP_FLAGS_REMOVE	2
> #define XFS_ATTR_OP_FLAGS_TYPE_MASK	0xFF
>
> (TBH this opcode part could be an enum defined elsewhere like what RUI
> pe_flags does...)
>
> #define XFS_ATTR_OP_FLAGS_AFLAG		(1U << 31)
> #define XFS_ATTR_OP_FLAGS_SOMEOTHERFLAG	(1U << 30)
>
Sure that makes sense.  Will update.
>> +
>> +/*
>> + * This is the structure used to lay out an attr log item in the
>> + * log.
>> + */
>> +struct xfs_attr_log_format {
>> +	uint64_t	id;		/* attri identifier */
>> +	xfs_ino_t       ino;		/* the inode for this attr operation */
>> +	uint32_t        op_flags;	/* marks the op as a set or remove */
>> +	uint32_t        name_len;	/* attr name length */
>> +	uint32_t        value_len;	/* attr value length */
>> +	uint32_t        attr_flags;	/* attr flags */
>> +	uint16_t	type;		/* attri log item type */
>> +	uint16_t	size;		/* size of this item */
>> +	uint32_t	pad;		/* pad to 64 bit aligned */
> The names ought to have structure prefixes because this is a public header.
>
> uint64_t	alf_id;
> uint64_t	alf_ino;
> ...
> uint32_t	alf_pad;
>
Alrighty, will do
>> +};
>> +
>>   #endif /* __XFS_LOG_FORMAT_H__ */
>> diff --git a/fs/xfs/libxfs/xfs_types.h b/fs/xfs/libxfs/xfs_types.h
>> index 0220159..5372063 100644
>> --- a/fs/xfs/libxfs/xfs_types.h
>> +++ b/fs/xfs/libxfs/xfs_types.h
>> @@ -23,6 +23,7 @@ typedef uint32_t	prid_t;		/* project ID */
>>   typedef uint32_t	xfs_agblock_t;	/* blockno in alloc. group */
>>   typedef uint32_t	xfs_agino_t;	/* inode # within allocation grp */
>>   typedef uint32_t	xfs_extlen_t;	/* extent length in blocks */
>> +typedef uint32_t	xfs_attrlen_t;	/* attr length */
>>   typedef uint32_t	xfs_agnumber_t;	/* allocation group number */
>>   typedef int32_t		xfs_extnum_t;	/* # of extents in a file */
>>   typedef int16_t		xfs_aextnum_t;	/* # extents in an attribute fork */
>> diff --git a/fs/xfs/xfs_attr.h b/fs/xfs/xfs_attr.h
>> index 8542606..34bb4cb 100644
>> --- a/fs/xfs/xfs_attr.h
>> +++ b/fs/xfs/xfs_attr.h
>> @@ -18,6 +18,8 @@
>>   #ifndef __XFS_ATTR_H__
>>   #define	__XFS_ATTR_H__
>>   
>> +#include "libxfs/xfs_defer.h"
> What does this header file need from xfs_defer.h?
>
>>   struct xfs_inode;
>>   struct xfs_da_args;
>>   struct xfs_attr_list_context;
>> @@ -87,6 +89,20 @@ typedef struct attrlist_ent {	/* data from attr_list() */
>>   } attrlist_ent_t;
>>   
>>   /*
>> + * List of attrs to commit later.
>> + */
>> +struct xfs_attr_item {
>> +	xfs_ino_t	  xattri_ino;
>> +	uint32_t	  xattri_op_flags;
>> +	uint32_t	  xattri_value_len;   /* length of name and val */
>> +	uint32_t	  xattri_name_len;    /* length of name */
>> +	uint32_t	  xattri_flags;       /* attr flags */
>> +	char		  xattri_name[XATTR_NAME_MAX];
>> +	char              xattri_value[XATTR_SIZE_MAX];
> MAXNAMELEN and XFS_XATTR_SIZE_MAX ?
>
> (Ugh, really, we should just clean that up to XFS_MAXNAMELEN...)
>
Yeah, I think you suggest using a "xattri_namevalue[0];" here in the
next patch review.  I'll just pull all the array stuff out....
>> +	struct list_head  xattri_list;
>> +};
>> +
>> +/*
>>    * Given a pointer to the (char*) buffer containing the attr_list() result,
>>    * and an index, return a pointer to the indicated attribute in the buffer.
>>    */
>> @@ -154,6 +170,8 @@ int xfs_attr_remove(struct xfs_inode *dp, const unsigned char *name, int flags);
>>   int xfs_attr_remove_args(struct xfs_da_args *args, int flags);
>>   int xfs_attr_list(struct xfs_inode *dp, char *buffer, int bufsize,
>>   		  int flags, struct attrlist_cursor_kern *cursor);
>> -
>> +int xfs_attr_args_init(struct xfs_da_args *args, struct xfs_inode *dp,
>> +		       const unsigned char *name, int flags);
>> +int xfs_attr_calc_size(struct xfs_da_args *args, int *local);
>>   
>>   #endif	/* __XFS_ATTR_H__ */
>> diff --git a/fs/xfs/xfs_attr_item.c b/fs/xfs/xfs_attr_item.c
>> new file mode 100644
>> index 0000000..8cbe9b0
>> --- /dev/null
>> +++ b/fs/xfs/xfs_attr_item.c
>> @@ -0,0 +1,512 @@
>> +/*
>> + * Copyright (c) 2017 Oracle, Inc.
>> + * All Rights Reserved.
>> + *
>> + * This program is free software; you can redistribute it and/or
>> + * modify it under the terms of the GNU General Public License as
>> + * published by the Free Software Foundation.
>> + *
>> + * This program is distributed in the hope that it would be useful,
>> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
>> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
>> + * GNU General Public License for more details.
>> + *
>> + * You should have received a copy of the GNU General Public License
>> + * along with this program; if not, write the Free Software Foundation Inc.
>> + */
>> +#include "xfs.h"
>> +#include "xfs_fs.h"
>> +#include "xfs_format.h"
>> +#include "xfs_log_format.h"
>> +#include "xfs_trans_resv.h"
>> +#include "xfs_bit.h"
>> +#include "xfs_mount.h"
>> +#include "xfs_trans.h"
>> +#include "xfs_trans_priv.h"
>> +#include "xfs_buf_item.h"
>> +#include "xfs_attr_item.h"
>> +#include "xfs_log.h"
>> +#include "xfs_btree.h"
>> +#include "xfs_rmap.h"
>> +
>> +
>> +static inline struct xfs_attri_log_item *ATTRI_ITEM(struct xfs_log_item *lip)
>> +{
>> +	return container_of(lip, struct xfs_attri_log_item, item);
>> +}
>> +
>> +void
>> +xfs_attri_item_free(
>> +	struct xfs_attri_log_item	*attrip)
>> +{
>> +	kmem_free(attrip->item.li_lv_shadow);
>> +	kmem_free(attrip);
>> +}
>> +
>> +/*
>> + * This returns the number of iovecs needed to log the given attri item.
>> + * We only need 1 iovec for an attri item.  It just logs the attr_log_format
>> + * structure.
>> + */
>> +static inline int
>> +xfs_attri_item_sizeof(
>> +	struct xfs_attri_log_item *attrip)
>> +{
>> +	return sizeof(struct xfs_attr_log_format);
>> +}
>> +
>> +STATIC void
>> +xfs_attri_item_size(
>> +	struct xfs_log_item	*lip,
>> +	int			*nvecs,
>> +	int			*nbytes)
>> +{
>> +	struct xfs_attri_log_item       *attrip = ATTRI_ITEM(lip);
>> +
>> +	*nvecs += 1;
>> +	*nbytes += xfs_attri_item_sizeof(attrip);
>> +
>> +	if (attrip->name_len > 0) {
>> +		*nvecs += 1;
>> +		nbytes += attrip->name_len;
>> +	}
>> +
>> +	if (attrip->value_len > 0) {
>> +		*nvecs += 1;
>> +		nbytes += attrip->value_len;
>> +	}
>> +}
>> +
>> +/*
>> + * This is called to fill in the vector of log iovecs for the
>> + * given attri log item. We use only 1 iovec, and we point that
>> + * at the attri_log_format structure embedded in the attri item.
>> + * It is at this point that we assert that all of the attr
>> + * slots in the attri item have been filled.
>> + */
>> +STATIC void
>> +xfs_attri_item_format(
>> +	struct xfs_log_item	*lip,
>> +	struct xfs_log_vec	*lv)
>> +{
>> +	struct xfs_attri_log_item	*attrip = ATTRI_ITEM(lip);
>> +	struct xfs_log_iovec	*vecp = NULL;
>> +
>> +	attrip->format.type = XFS_LI_ATTRI;
>> +	attrip->format.size = 1;
>> +
>> +	xlog_copy_iovec(lv, &vecp, XLOG_REG_TYPE_ATTRI_FORMAT,
>> +			&attrip->format,
>> +			xfs_attri_item_sizeof(attrip));
>> +	if (attrip->name_len > 0)
>> +		xlog_copy_iovec(lv, &vecp, XLOG_REG_TYPE_ATTR_NAME,
>> +				attrip->name, attrip->name_len);
>> +
>> +	if (attrip->value_len > 0)
>> +		xlog_copy_iovec(lv, &vecp, XLOG_REG_TYPE_ATTR_VALUE,
>> +				attrip->value, attrip->value_len);
>> +}
>> +
>> +
>> +/*
>> + * Pinning has no meaning for an attri item, so just return.
>> + */
>> +STATIC void
>> +xfs_attri_item_pin(
>> +	struct xfs_log_item	*lip)
>> +{
>> +}
>> +
>> +/*
>> + * The unpin operation is the last place an ATTRI is manipulated in the log. It
>> + * is either inserted in the AIL or aborted in the event of a log I/O error. In
>> + * either case, the EFI transaction has been successfully committed to make it
> EFI?
Typo left over from extent free intent code that I used as a template....
Will fix :-)
>> + * this far. Therefore, we expect whoever committed the ATTRI to either
>> + * construct and commit the ATTRD or drop the ATTRD's reference in the event of
>> + * error. Simply drop the log's ATTRI reference now that the log is done with
>> + * it.
>> + */
>> +STATIC void
>> +xfs_attri_item_unpin(
>> +	struct xfs_log_item	*lip,
>> +	int			remove)
>> +{
>> +	struct xfs_attri_log_item	*attrip = ATTRI_ITEM(lip);
>> +
>> +	xfs_attri_release(attrip);
>> +}
>> +
>> +/*
>> + * attri items have no locking or pushing.  However, since ATTRIs are pulled
>> + * from the AIL when their corresponding ATTRDs are committed to disk, their
>> + * situation is very similar to being pinned.  Return XFS_ITEM_PINNED so that
>> + * the caller will eventually flush the log.  This should help in getting the
>> + * ATTRI out of the AIL.
>> + */
>> +STATIC uint
>> +xfs_attri_item_push(
>> +	struct xfs_log_item	*lip,
>> +	struct list_head	*buffer_list)
>> +{
>> +	return XFS_ITEM_PINNED;
>> +}
>> +
>> +/*
>> + * The ATTRI has been either committed or aborted if the transaction has been
>> + * cancelled. If the transaction was cancelled, an ATTRD isn't going to be
>> + * constructed and thus we free the ATTRI here directly.
>> + */
>> +STATIC void
>> +xfs_attri_item_unlock(
>> +	struct xfs_log_item	*lip)
>> +{
>> +	if (lip->li_flags & XFS_LI_ABORTED)
>> +		xfs_attri_item_free(ATTRI_ITEM(lip));
>> +}
>> +
>> +/*
>> + * The ATTRI is logged only once and cannot be moved in the log, so simply
>> + * return the lsn at which it's been logged.
>> + */
>> +STATIC xfs_lsn_t
>> +xfs_attri_item_committed(
>> +	struct xfs_log_item	*lip,
>> +	xfs_lsn_t		lsn)
>> +{
>> +	return lsn;
>> +}
>> +
>> +STATIC void
>> +xfs_attri_item_committing(
>> +	struct xfs_log_item	*lip,
>> +	xfs_lsn_t		lsn)
>> +{
>> +}
>> +
>> +/*
>> + * This is the ops vector shared by all attri log items.
>> + */
>> +static const struct xfs_item_ops xfs_attri_item_ops = {
>> +	.iop_size	= xfs_attri_item_size,
>> +	.iop_format	= xfs_attri_item_format,
>> +	.iop_pin	= xfs_attri_item_pin,
>> +	.iop_unpin	= xfs_attri_item_unpin,
>> +	.iop_unlock	= xfs_attri_item_unlock,
>> +	.iop_committed	= xfs_attri_item_committed,
>> +	.iop_push	= xfs_attri_item_push,
>> +	.iop_committing = xfs_attri_item_committing
>> +};
>> +
>> +
>> +/*
>> + * Allocate and initialize an attri item
>> + */
>> +struct xfs_attri_log_item *
>> +xfs_attri_init(
>> +	struct xfs_mount	*mp)
>> +
>> +{
>> +	struct xfs_attri_log_item	*attrip;
>> +	uint			size;
>> +
>> +	size = (uint)(sizeof(struct xfs_attri_log_item));
>> +	attrip = kmem_zalloc(size, KM_SLEEP);
>> +
>> +	xfs_log_item_init(mp, &(attrip->item), XFS_LI_ATTRI,
>> +			  &xfs_attri_item_ops);
>> +	attrip->format.id = (uintptr_t)(void *)attrip;
>> +	atomic_set(&attrip->refcount, 2);
>> +
>> +	return attrip;
>> +}
>> +
>> +/*
>> + * Copy an attr format buffer from the given buf, and into the destination
>> + * attr format structure.
>> + */
>> +int
>> +xfs_attr_copy_format(struct xfs_log_iovec *buf,
>> +		      struct xfs_attr_log_format *dst_attr_fmt)
>> +{
>> +	struct xfs_attr_log_format *src_attr_fmt = buf->i_addr;
>> +	uint len = sizeof(struct xfs_attr_log_format);
>> +
>> +	if (buf->i_len == len) {
>> +		memcpy((char *)dst_attr_fmt, (char *)src_attr_fmt, len);
>> +		return 0;
>> +	}
>> +	return -EFSCORRUPTED;
>> +}
>> +
>> +/*
>> + * Freeing the attri requires that we remove it from the AIL if it has already
>> + * been placed there. However, the ATTRI may not yet have been placed in the
>> + * AIL when called by xfs_attri_release() from ATTRD processing due to the
>> + * ordering of committed vs unpin operations in bulk insert operations. Hence
>> + * the reference count to ensure only the last caller frees the ATTRI.
>> + */
>> +void
>> +xfs_attri_release(
>> +	struct xfs_attri_log_item	*attrip)
>> +{
>> +	ASSERT(atomic_read(&attrip->refcount) > 0);
>> +	if (atomic_dec_and_test(&attrip->refcount)) {
>> +		xfs_trans_ail_remove(&attrip->item,
>> +				     SHUTDOWN_LOG_IO_ERROR);
>> +		xfs_attri_item_free(attrip);
>> +	}
>> +}
>> +
>> +static inline struct xfs_attrd_log_item *ATTRD_ITEM(struct xfs_log_item *lip)
>> +{
>> +	return container_of(lip, struct xfs_attrd_log_item, item);
>> +}
>> +
>> +STATIC void
>> +xfs_attrd_item_free(struct xfs_attrd_log_item *attrdp)
>> +{
>> +	kmem_free(attrdp->item.li_lv_shadow);
>> +	kmem_free(attrdp);
>> +}
>> +
>> +/*
>> + * This returns the number of iovecs needed to log the given attrd item.
>> + * We only need 1 iovec for an attrd item.  It just logs the attr_log_format
>> + * structure.
>> + */
>> +static inline int
>> +xfs_attrd_item_sizeof(
>> +	struct xfs_attrd_log_item *attrdp)
>> +{
>> +	return sizeof(struct xfs_attr_log_format);
>> +}
>> +
>> +STATIC void
>> +xfs_attrd_item_size(
>> +	struct xfs_log_item	*lip,
>> +	int			*nvecs,
>> +	int			*nbytes)
>> +{
>> +	struct xfs_attrd_log_item	*attrdp = ATTRD_ITEM(lip);
>> +	*nvecs += 1;
>> +	*nbytes += xfs_attrd_item_sizeof(attrdp);
>> +
>> +	if (attrdp->name_len > 0) {
>> +		*nvecs += 1;
>> +		nbytes += attrdp->name_len;
>> +	}
>> +
>> +	if (attrdp->value_len > 0) {
>> +		*nvecs += 1;
>> +		nbytes += attrdp->value_len;
>> +	}
>> +}
>> +
>> +/*
>> + * This is called to fill in the vector of log iovecs for the
>> + * given attrd log item. We use only 1 iovec, and we point that
>> + * at the attr_log_format structure embedded in the attrd item.
>> + * It is at this point that we assert that all of the attr
>> + * slots in the attrd item have been filled.
>> + */
>> +STATIC void
>> +xfs_attrd_item_format(
>> +	struct xfs_log_item	*lip,
>> +	struct xfs_log_vec	*lv)
>> +{
>> +	struct xfs_attrd_log_item	*attrdp = ATTRD_ITEM(lip);
>> +	struct xfs_log_iovec	*vecp = NULL;
>> +
>> +	attrdp->format.type = XFS_LI_ATTRD;
>> +	attrdp->format.size = 1;
>> +
>> +	xlog_copy_iovec(lv, &vecp, XLOG_REG_TYPE_ATTRD_FORMAT,
>> +			&attrdp->format,
>> +			xfs_attrd_item_sizeof(attrdp));
>> +
>> +	if (attrdp->name_len > 0)
>> +		xlog_copy_iovec(lv, &vecp, XLOG_REG_TYPE_ATTR_NAME,
>> +				attrdp->name, attrdp->name_len);
>> +
>> +	if (attrdp->value_len > 0)
>> +		xlog_copy_iovec(lv, &vecp, XLOG_REG_TYPE_ATTR_VALUE,
>> +				attrdp->value, attrdp->value_len);
> No need to log name/value for ATTRD items, since we only care about
> matching alf_id between ATTRI and ATTRD items.
Alrighty
>> +}
>> +
>> +/*
>> + * Pinning has no meaning for an attrd item, so just return.
>> + */
>> +STATIC void
>> +xfs_attrd_item_pin(
>> +	struct xfs_log_item	*lip)
>> +{
>> +}
>> +
>> +/*
>> + * Since pinning has no meaning for an attrd item, unpinning does
>> + * not either.
>> + */
>> +STATIC void
>> +xfs_attrd_item_unpin(
>> +	struct xfs_log_item	*lip,
>> +	int			remove)
>> +{
>> +}
>> +
>> +/*
>> + * There isn't much you can do to push on an attrd item.  It is simply stuck
>> + * waiting for the log to be flushed to disk.
>> + */
>> +STATIC uint
>> +xfs_attrd_item_push(
>> +	struct xfs_log_item	*lip,
>> +	struct list_head	*buffer_list)
>> +{
>> +	return XFS_ITEM_PINNED;
>> +}
>> +
>> +/*
>> + * The ATTRD is either committed or aborted if the transaction is cancelled. If
>> + * the transaction is cancelled, drop our reference to the ATTRI and free the
>> + * ATTRD.
>> + */
>> +STATIC void
>> +xfs_attrd_item_unlock(
>> +	struct xfs_log_item	*lip)
>> +{
>> +	struct xfs_attrd_log_item	*attrdp = ATTRD_ITEM(lip);
>> +
>> +	if (lip->li_flags & XFS_LI_ABORTED) {
>> +		xfs_attri_release(attrdp->attrip);
>> +		xfs_attrd_item_free(attrdp);
>> +	}
>> +}
>> +
>> +/*
>> + * When the attrd item is committed to disk, all we need to do is delete our
>> + * reference to our partner attri item and then free ourselves. Since we're
>> + * freeing ourselves we must return -1 to keep the transaction code from
>> + * further referencing this item.
>> + */
>> +STATIC xfs_lsn_t
>> +xfs_attrd_item_committed(
>> +	struct xfs_log_item	*lip,
>> +	xfs_lsn_t		lsn)
>> +{
>> +	struct xfs_attrd_log_item	*attrdp = ATTRD_ITEM(lip);
>> +
>> +	/*
>> +	 * Drop the ATTRI reference regardless of whether the ATTRD has been
>> +	 * aborted. Once the ATTRD transaction is constructed, it is the sole
>> +	 * responsibility of the ATTRD to release the ATTRI (even if the ATTRI
>> +	 * is aborted due to log I/O error).
>> +	 */
>> +	xfs_attri_release(attrdp->attrip);
>> +	xfs_attrd_item_free(attrdp);
>> +
>> +	return (xfs_lsn_t)-1;
>> +}
>> +
>> +STATIC void
>> +xfs_attrd_item_committing(
>> +	struct xfs_log_item	*lip,
>> +	xfs_lsn_t		lsn)
>> +{
>> +}
>> +
>> +/*
>> + * This is the ops vector shared by all attrd log items.
>> + */
>> +static const struct xfs_item_ops xfs_attrd_item_ops = {
>> +	.iop_size	= xfs_attrd_item_size,
>> +	.iop_format	= xfs_attrd_item_format,
>> +	.iop_pin	= xfs_attrd_item_pin,
>> +	.iop_unpin	= xfs_attrd_item_unpin,
>> +	.iop_unlock	= xfs_attrd_item_unlock,
>> +	.iop_committed	= xfs_attrd_item_committed,
>> +	.iop_push	= xfs_attrd_item_push,
>> +	.iop_committing = xfs_attrd_item_committing
>> +};
>> +
>> +/*
>> + * Allocate and initialize an attrd item
>> + */
>> +struct xfs_attrd_log_item *
>> +xfs_attrd_init(
>> +	struct xfs_mount	*mp,
>> +	struct xfs_attri_log_item	*attrip)
>> +
>> +{
>> +	struct xfs_attrd_log_item	*attrdp;
>> +	uint			size;
>> +
>> +	size = (uint)(sizeof(struct xfs_attrd_log_item));
>> +	attrdp = kmem_zalloc(size, KM_SLEEP);
>> +
>> +	xfs_log_item_init(mp, &attrdp->item, XFS_LI_ATTRD,
>> +			  &xfs_attrd_item_ops);
>> +	attrdp->attrip = attrip;
>> +	attrdp->format.id = attrip->format.id;
>> +
>> +	return attrdp;
>> +}
>> +
>> +/*
>> + * Process an attr intent item that was recovered from
>> + * the log.  We need to delete the attr that it describes.
>> + */
>> +int
>> +xfs_attri_recover(
>> +	struct xfs_mount	*mp,
>> +	struct xfs_attri_log_item	*attrip)
>> +{
>> +	struct xfs_attrd_log_item	*attrdp;
>> +	struct xfs_trans	*tp;
>> +	int			error = 0;
>> +	struct xfs_attr_log_format	*attrp;
>> +
>> +	ASSERT(!test_bit(XFS_ATTRI_RECOVERED, &attrip->flags));
>> +
>> +	/*
>> +	 * First check the validity of the attr described by the
>> +	 * ATTRI.  If any are bad, then assume that all are bad and
>> +	 * just toss the ATTRI.
>> +	 */
>> +	attrp = &attrip->format;
>> +	if (attrp->value_len == 0 ||
>> +	    attrp->name_len == 0 ||
>> +	    attrp->op_flags > ATTR_OP_FLAGS_MAX) {
>> +		/*
>> +		 * This will pull the ATTRI from the AIL and
>> +		 * free the memory associated with it.
>> +		 */
>> +		set_bit(XFS_ATTRI_RECOVERED, &attrip->flags);
>> +		xfs_attri_release(attrip);
>> +		return -EIO;
>> +	}
>> +
>> +	error = xfs_trans_alloc(mp, &M_RES(mp)->tr_itruncate, 0, 0, 0, &tp);
>> +	if (error)
>> +		return error;
>> +	attrdp = xfs_trans_get_attrd(tp, attrip);
>> +	attrp = &attrip->format;
>> +
>> +	error = xfs_trans_attr(tp, attrdp, attrp->ino,
>> +				attrp->op_flags,
>> +				attrp->attr_flags,
>> +				attrp->name_len,
>> +				attrp->value_len,
>> +				attrip->name,
>> +				attrip->value);
>> +	if (error)
>> +		goto abort_error;
>> +
>> +
>> +	set_bit(XFS_ATTRI_RECOVERED, &attrip->flags);
>> +	error = xfs_trans_commit(tp);
>> +	return error;
>> +
>> +abort_error:
>> +	xfs_trans_cancel(tp);
>> +	return error;
>> +}
>> diff --git a/fs/xfs/xfs_attr_item.h b/fs/xfs/xfs_attr_item.h
>> new file mode 100644
>> index 0000000..023675d
>> --- /dev/null
>> +++ b/fs/xfs/xfs_attr_item.h
>> @@ -0,0 +1,111 @@
>> +/*
>> + * Copyright (c) 2017 Oracle, Inc.
>> + * All Rights Reserved.
>> + *
>> + * This program is free software; you can redistribute it and/or
>> + * modify it under the terms of the GNU General Public License as
>> + * published by the Free Software Foundation.
>> + *
>> + * This program is distributed in the hope that it would be useful,
>> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
>> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
>> + * GNU General Public License for more details.
>> + *
>> + * You should have received a copy of the GNU General Public License
>> + * along with this program; if not, write the Free Software Foundation Inc.
>> + */
>> +#ifndef	__XFS_ATTR_ITEM_H__
>> +#define	__XFS_ATTR_ITEM_H__
>> +
>> +/* kernel only ATTRI/ATTRD definitions */
>> +
>> +struct xfs_mount;
>> +struct kmem_zone;
>> +
>> +/*
>> + * Max number of attrs in fast allocation path.
>> + */
>> +#define XFS_ATTRI_MAX_FAST_ATTRS        16
> Daaaang, we can cram *sixteen* different attribute name/value updates in
> a single defer_ops item? :)
>
> Each defer_ops item gets its own transaction to make changes and log the
> done item.  Sixteen attr updates is a /lot/ to be pushing through one
> transaction since (AFAICT) the static log reservations provide worst
> case space for one update.  I think this ought to be 1, especially since
> the actual attr log item structure only appears to have space to store
> one name and one value.
Ok, that sounds reasonable.  :-)  I likely forgot to come back and update it
from what extent free had it set to.  Thx for the catch!

>> +
>> +
>> +/*
>> + * Define ATTR flag bits. Manipulated by set/clear/test_bit operators.
>> + */
>> +#define	XFS_ATTRI_RECOVERED	1
>> +
>> +/*
>> + * This is the "attr intention" log item.  It is used to log the fact
>> + * that some attrs need to be processed.  It is used in conjunction with the
>> + * "attr done" log item described below.
>> + *
>> + * The ATTRI is reference counted so that it is not freed prior to both the
>> + * ATTRI and ATTRD being committed and unpinned. This ensures the ATTRI is
>> + * inserted into the AIL even in the event of out of order ATTRI/ATTRD
>> + * processing. In other words, an ATTRI is born with two references:
>> + *
>> + *      1.) an ATTRI held reference to track ATTRI AIL insertion
>> + *      2.) an ATTRD held reference to track ATTRD commit
>> + *
>> + * On allocation, both references are the responsibility of the caller. Once
>> + * the ATTRI is added to and dirtied in a transaction, ownership of reference
>> + * one transfers to the transaction. The reference is dropped once the ATTRI is
>> + * inserted to the AIL or in the event of failure along the way (e.g., commit
>> + * failure, log I/O error, etc.). Note that the caller remains responsible for
>> + * the ATTRD reference under all circumstances to this point. The caller has no
>> + * means to detect failure once the transaction is committed, however.
>> + * Therefore, an ATTRD is required after this point, even in the event of
>> + * unrelated failure.
>> + *
>> + * Once an ATTRD is allocated and dirtied in a transaction, reference two
>> + * transfers to the transaction. The ATTRD reference is dropped once it reaches
>> + * the unpin handler. Similar to the ATTRI, the reference also drops in the
>> + * event of commit failure or log I/O errors. Note that the ATTRD is not
>> + * inserted in the AIL, so at this point both the ATTI and ATTRD are freed.
>> + */
>> +struct xfs_attri_log_item {
>> +	xfs_log_item_t			item;
>> +	atomic_t			refcount;
>> +	unsigned long			flags;	/* misc flags */
>> +	int				name_len;
>> +	void				*name;
>> +	int				value_len;
>> +	void				*value;
>> +	struct xfs_attr_log_format	format;
>> +};
>> +
>> +/*
>> + * This is the "attr done" log item.  It is used to log
>> + * the fact that some attrs earlier mentioned in an attri item
>> + * have been freed.
>> + */
>> +struct xfs_attrd_log_item {
>> +	struct xfs_log_item		item;
>> +	struct xfs_attri_log_item	*attrip;
>> +	uint				next_attr;
>> +	int				name_len;
>> +	void				*name;
>> +	int				value_len;
>> +	void				*value;
>> +	struct xfs_attr_log_format	format;
>> +};
>> +
>> +/*
>> + * Max number of attrs in fast allocation path.
>> + */
>> +#define	XFS_ATTRD_MAX_FAST_ATTRS	16
>> +
>> +extern struct kmem_zone	*xfs_attri_zone;
>> +extern struct kmem_zone	*xfs_attrd_zone;
>> +
>> +struct xfs_attri_log_item	*xfs_attri_init(struct xfs_mount *mp);
>> +struct xfs_attrd_log_item	*xfs_attrd_init(struct xfs_mount *mp,
>> +					struct xfs_attri_log_item *attrip);
>> +int xfs_attr_copy_format(struct xfs_log_iovec *buf,
>> +			 struct xfs_attr_log_format *dst_attri_fmt);
>> +void			xfs_attri_item_free(struct xfs_attri_log_item *attrip);
>> +void			xfs_attri_release(struct xfs_attri_log_item *attrip);
>> +
>> +int			xfs_attri_recover(struct xfs_mount *mp,
>> +					struct xfs_attri_log_item *attrip);
>> +
>> +#endif	/* __XFS_ATTR_ITEM_H__ */
>> diff --git a/fs/xfs/xfs_log_recover.c b/fs/xfs/xfs_log_recover.c
>> index ee34899..8326f56 100644
>> --- a/fs/xfs/xfs_log_recover.c
>> +++ b/fs/xfs/xfs_log_recover.c
>> @@ -33,6 +33,7 @@
>>   #include "xfs_log_recover.h"
>>   #include "xfs_inode_item.h"
>>   #include "xfs_extfree_item.h"
>> +#include "xfs_attr_item.h"
>>   #include "xfs_trans_priv.h"
>>   #include "xfs_alloc.h"
>>   #include "xfs_ialloc.h"
>> @@ -1956,6 +1957,8 @@ xlog_recover_reorder_trans(
>>   		case XFS_LI_CUD:
>>   		case XFS_LI_BUI:
>>   		case XFS_LI_BUD:
>> +		case XFS_LI_ATTRI:
>> +		case XFS_LI_ATTRD:
>>   			trace_xfs_log_recover_item_reorder_tail(log,
>>   							trans, item, pass);
>>   			list_move_tail(&item->ri_list, &inode_list);
>> @@ -3489,6 +3492,92 @@ xlog_recover_efd_pass2(
>>   	return 0;
>>   }
>>   
>> +STATIC int
>> +xlog_recover_attri_pass2(
>> +	struct xlog                     *log,
>> +	struct xlog_recover_item        *item,
>> +	xfs_lsn_t                       lsn)
>> +{
>> +	int                             error;
>> +	struct xfs_mount                *mp = log->l_mp;
>> +	struct xfs_attri_log_item       *attrip;
>> +	struct xfs_attr_log_format     *attri_formatp;
>> +
>> +	attri_formatp = item->ri_buf[0].i_addr;
>> +
>> +	attrip = xfs_attri_init(mp);
>> +	error = xfs_attr_copy_format(&item->ri_buf[0], &attrip->format);
>> +	if (error) {
>> +		xfs_attri_item_free(attrip);
>> +		return error;
>> +	}
>> +
>> +	spin_lock(&log->l_ailp->xa_lock);
>> +	/*
>> +	 * The ATTRI has two references. One for the ATTRD and one for ATTRI to
>> +	 * ensure it makes it into the AIL. Insert the ATTRI into the AIL
>> +	 * directly and drop the ATTRI reference. Note that
>> +	 * xfs_trans_ail_update() drops the AIL lock.
>> +	 */
>> +	xfs_trans_ail_update(log->l_ailp, &attrip->item, lsn);
>> +	xfs_attri_release(attrip);
>> +	return 0;
>> +}
>> +
>> +
>> +/*
>> + * This routine is called when an ATTRD format structure is found in a committed
>> + * transaction in the log. Its purpose is to cancel the corresponding ATTRI if
>> + * it was still in the log. To do this it searches the AIL for the ATTRI with
>> + * an id equal to that in the ATTRD format structure. If we find it we drop
>> + * the ATTRD reference, which removes the ATTRI from the AIL and frees it.
>> + */
>> +STATIC int
>> +xlog_recover_attrd_pass2(
>> +	struct xlog                     *log,
>> +	struct xlog_recover_item        *item)
>> +{
>> +	struct xfs_attr_log_format    *attrd_formatp;
>> +	struct xfs_attri_log_item      *attrip = NULL;
>> +	struct xfs_log_item          *lip;
>> +	uint64_t                attri_id;
>> +	struct xfs_ail_cursor   cur;
>> +	struct xfs_ail          *ailp = log->l_ailp;
>> +
>> +	attrd_formatp = item->ri_buf[0].i_addr;
>> +	ASSERT((item->ri_buf[0].i_len ==
>> +				(sizeof(struct xfs_attr_log_format))));
>> +	attri_id = attrd_formatp->id;
>> +
>> +	/*
>> +	 * Search for the ATTRI with the id in the ATTRD format structure in the
>> +	 * AIL.
>> +	 */
>> +	spin_lock(&ailp->xa_lock);
>> +	lip = xfs_trans_ail_cursor_first(ailp, &cur, 0);
>> +	while (lip != NULL) {
>> +		if (lip->li_type == XFS_LI_ATTRI) {
>> +			attrip = (struct xfs_attri_log_item *)lip;
>> +			if (attrip->format.id == attri_id) {
>> +				/*
>> +				 * Drop the ATTRD reference to the ATTRI. This
>> +				 * removes the ATTRI from the AIL and frees it.
>> +				 */
>> +				spin_unlock(&ailp->xa_lock);
>> +				xfs_attri_release(attrip);
>> +				spin_lock(&ailp->xa_lock);
>> +				break;
>> +			}
>> +		}
>> +		lip = xfs_trans_ail_cursor_next(ailp, &cur);
>> +	}
>> +
>> +	xfs_trans_ail_cursor_done(&cur);
>> +	spin_unlock(&ailp->xa_lock);
>> +
>> +	return 0;
>> +}
>> +
>>   /*
>>    * This routine is called to create an in-core extent rmap update
>>    * item from the rui format structure which was logged on disk.
>> @@ -4108,6 +4197,10 @@ xlog_recover_commit_pass2(
>>   		return xlog_recover_efi_pass2(log, item, trans->r_lsn);
>>   	case XFS_LI_EFD:
>>   		return xlog_recover_efd_pass2(log, item);
>> +	case XFS_LI_ATTRI:
>> +		return xlog_recover_attri_pass2(log, item, trans->r_lsn);
>> +	case XFS_LI_ATTRD:
>> +		return xlog_recover_attrd_pass2(log, item);
>>   	case XFS_LI_RUI:
>>   		return xlog_recover_rui_pass2(log, item, trans->r_lsn);
>>   	case XFS_LI_RUD:
>> @@ -4669,6 +4762,49 @@ xlog_recover_cancel_efi(
>>   	spin_lock(&ailp->xa_lock);
>>   }
>>   
>> +/* Recover the ATTRI if necessary. */
>> +STATIC int
>> +xlog_recover_process_attri(
>> +	struct xfs_mount                *mp,
>> +	struct xfs_ail                  *ailp,
>> +	struct xfs_log_item             *lip)
>> +{
>> +	struct xfs_attri_log_item       *attrip;
>> +	int                             error;
>> +
>> +	/*
>> +	 * Skip ATTRIs that we've already processed.
>> +	 */
>> +	attrip = container_of(lip, struct xfs_attri_log_item, item);
>> +	if (test_bit(XFS_ATTRI_RECOVERED, &attrip->flags))
>> +		return 0;
>> +
>> +	spin_unlock(&ailp->xa_lock);
>> +	error = xfs_attri_recover(mp, attrip);
>> +	spin_lock(&ailp->xa_lock);
>> +
>> +	return error;
>> +}
>> +
>> +/* Release the ATTRI since we're cancelling everything. */
>> +STATIC void
>> +xlog_recover_cancel_attri(
>> +	struct xfs_mount                *mp,
>> +	struct xfs_ail                  *ailp,
>> +	struct xfs_log_item             *lip)
>> +{
>> +	struct xfs_attri_log_item         *attrip;
>> +
>> +	attrip = container_of(lip, struct xfs_attri_log_item, item);
>> +
>> +	spin_unlock(&ailp->xa_lock);
>> +	xfs_attri_release(attrip);
>> +	spin_lock(&ailp->xa_lock);
>> +}
>> +
>> +
>> +
>> +
>>   /* Recover the RUI if necessary. */
>>   STATIC int
>>   xlog_recover_process_rui(
>> @@ -4861,6 +4997,10 @@ xlog_recover_process_intents(
>>   		case XFS_LI_EFI:
>>   			error = xlog_recover_process_efi(log->l_mp, ailp, lip);
>>   			break;
>> +		case XFS_LI_ATTRI:
>> +			error = xlog_recover_process_attri(log->l_mp,
>> +							   ailp, lip);
> FWIW you're allowed (in xfs land only) to use the double-indent
> convention:
>
> error = xlog_recover_process_attri(log->l_mp,
> 		ailp, lip);
>
> Instead of spending time lining things up with the '('.
>
Ok, from here out I'll use the double indent then
>> +			break;
>>   		case XFS_LI_RUI:
>>   			error = xlog_recover_process_rui(log->l_mp, ailp, lip);
>>   			break;
>> diff --git a/fs/xfs/xfs_super.c b/fs/xfs/xfs_super.c
>> index 584cf2d..046ced4 100644
>> --- a/fs/xfs/xfs_super.c
>> +++ b/fs/xfs/xfs_super.c
>> @@ -2024,6 +2024,7 @@ init_xfs_fs(void)
>>   	xfs_rmap_update_init_defer_op();
>>   	xfs_refcount_update_init_defer_op();
>>   	xfs_bmap_update_init_defer_op();
>> +	xfs_attr_init_defer_op();
>>   
>>   	xfs_dir_startup();
>>   
>> diff --git a/fs/xfs/xfs_trans.h b/fs/xfs/xfs_trans.h
>> index 815b53d2..d003637 100644
>> --- a/fs/xfs/xfs_trans.h
>> +++ b/fs/xfs/xfs_trans.h
>> @@ -40,6 +40,9 @@ struct xfs_cud_log_item;
>>   struct xfs_defer_ops;
>>   struct xfs_bui_log_item;
>>   struct xfs_bud_log_item;
>> +struct xfs_attrd_log_item;
>> +struct xfs_attri_log_item;
>> +
>>   
>>   typedef struct xfs_log_item {
>>   	struct list_head		li_ail;		/* AIL pointers */
>> @@ -223,12 +226,22 @@ void		xfs_trans_dirty_buf(struct xfs_trans *, struct xfs_buf *);
>>   void		xfs_trans_log_inode(xfs_trans_t *, struct xfs_inode *, uint);
>>   
>>   void		xfs_extent_free_init_defer_op(void);
>> +void            xfs_attr_init_defer_op(void);
>> +
>>   struct xfs_efd_log_item	*xfs_trans_get_efd(struct xfs_trans *,
>>   				  struct xfs_efi_log_item *,
>>   				  uint);
>>   int		xfs_trans_free_extent(struct xfs_trans *,
>>   				      struct xfs_efd_log_item *, xfs_fsblock_t,
>>   				      xfs_extlen_t, struct xfs_owner_info *);
>> +struct xfs_attrd_log_item *
>> +xfs_trans_get_attrd(struct xfs_trans *tp,
>> +		    struct xfs_attri_log_item *attrip);
>> +int xfs_trans_attr(struct xfs_trans *tp, struct xfs_attrd_log_item *attrdp,
>> +			xfs_ino_t ino, uint32_t attr_op_flags, uint32_t flags,
>> +			uint32_t name_len, uint32_t value_len,
>> +			char *name, char *value);
>> +
>>   int		xfs_trans_commit(struct xfs_trans *);
>>   int		xfs_trans_roll(struct xfs_trans **);
>>   int		xfs_trans_roll_inode(struct xfs_trans **, struct xfs_inode *);
>> diff --git a/fs/xfs/xfs_trans_attr.c b/fs/xfs/xfs_trans_attr.c
>> new file mode 100644
>> index 0000000..39eb18d
>> --- /dev/null
>> +++ b/fs/xfs/xfs_trans_attr.c
>> @@ -0,0 +1,286 @@
>> +/*
>> + * Copyright (c) 2017, Oracle Inc.
>> + * All Rights Reserved.
>> + *
>> + * This program is free software; you can redistribute it and/or
>> + * modify it under the terms of the GNU General Public License as
>> + * published by the Free Software Foundation.
>> + *
>> + * This program is distributed in the hope that it would be useful,
>> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
>> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
>> + * GNU General Public License for more details.
>> + *
>> + * You should have received a copy of the GNU General Public License
>> + * along with this program; if not, write the Free Software Foundation Inc.
>> + */
>> +#include "xfs.h"
>> +#include "xfs_fs.h"
>> +#include "xfs_shared.h"
>> +#include "xfs_format.h"
>> +#include "xfs_log_format.h"
>> +#include "xfs_trans_resv.h"
>> +#include "xfs_bit.h"
>> +#include "xfs_mount.h"
>> +#include "xfs_defer.h"
>> +#include "xfs_trans.h"
>> +#include "xfs_trans_priv.h"
>> +#include "xfs_attr_item.h"
>> +#include "xfs_alloc.h"
>> +#include "xfs_bmap.h"
>> +#include "xfs_trace.h"
>> +#include "libxfs/xfs_da_format.h"
>> +#include "xfs_da_btree.h"
>> +#include "xfs_attr.h"
>> +#include "xfs_inode.h"
>> +#include "xfs_icache.h"
>> +
>> +/*
>> + * This routine is called to allocate an "extent free done"
>> + * log item that will hold nextents worth of extents.  The
>> + * caller must use all nextents extents, because we are not
>> + * flexible about this at all.
>> + */
>> +struct xfs_attrd_log_item *
>> +xfs_trans_get_attrd(struct xfs_trans		*tp,
>> +		  struct xfs_attri_log_item	*attrip)
>> +{
>> +	struct xfs_attrd_log_item			*attrdp;
>> +
>> +	ASSERT(tp != NULL);
>> +
>> +	attrdp = xfs_attrd_init(tp->t_mountp, attrip);
>> +	ASSERT(attrdp != NULL);
>> +
>> +	/*
>> +	 * Get a log_item_desc to point at the new item.
>> +	 */
>> +	xfs_trans_add_item(tp, &attrdp->item);
>> +	return attrdp;
>> +}
>> +
>> +/*
>> + * Delete an attr and log it to the ATTRD. Note that the transaction is marked
>> + * dirty regardless of whether the attr delete succeeds or fails to support the
>> + * ATTRI/ATTRD lifecycle rules.
>> + */
>> +int
>> +xfs_trans_attr(
>> +	struct xfs_trans	*tp,
>> +	struct xfs_attrd_log_item	*attrdp,
>> +	xfs_ino_t		ino,
>> +	uint32_t		op_flags,
>> +	uint32_t                flags,
>> +	uint32_t		name_len,
>> +	uint32_t		value_len,
>> +	char			*name,
>> +	char			*value)
>> +{
>> +	uint			next_attr;
>> +	struct xfs_attr_log_format *attrp;
>> +	int			error;
>> +	int                     local;
>> +	struct xfs_da_args      args;
>> +	struct xfs_inode	*dp;
>> +	struct xfs_defer_ops    dfops;
>> +	xfs_fsblock_t		firstblock = NULLFSBLOCK;
>> +	struct xfs_mount	*mp = tp->t_mountp;
>> +
>> +	error = xfs_iget(mp, tp, ino, flags, 0, &dp);
>> +	if (error)
>> +		return error;
>> +
>> +	ASSERT(XFS_IFORK_Q((dp)));
>> +	tp->t_flags |= XFS_TRANS_RESERVE;
>> +
>> +	error = xfs_attr_args_init(&args, dp, name, flags);
>> +	if (error)
>> +		return error;
>> +
>> +	args.name = name;
>> +	args.namelen = name_len;
>> +	args.hashval = xfs_da_hashname(args.name, args.namelen);
>> +	args.value = value;
>> +	args.valuelen = value_len;
>> +	args.dfops = &dfops;
> dfops needs an xfs_defer_init().
>
> Oh, the initialization of dfops is after where we start using it.
> Please move it up (or the args.dfops assignment down).
Good catch, will fix :-)
>> +	args.firstblock = &firstblock;
>> +	args.op_flags = XFS_DA_OP_OKNOENT;
>> +	args.total = xfs_attr_calc_size(&args, &local);
>> +	args.trans = tp;
>> +	ASSERT(local);
>> +
>> +	xfs_ilock(dp, XFS_ILOCK_EXCL);
>> +	xfs_defer_init(args.dfops, args.firstblock);
>> +
>> +	if (op_flags & ATTR_OP_FLAGS_SET) {
> switch (op_flags & XFS_ATTR_OP_FLAGS_TYPE_MASK) {
> case XFS_ATTR_OP_FLAGS_SET:
> 	...
> case XFS_ATTR_OP_FLAGS_REMOVE:
> 	...
> default:
> 	...
> }
Alrighty, will update
>> +		args.op_flags |= XFS_DA_OP_ADDNAME;
>> +		error = xfs_attr_set_args(&args, flags, false);
>> +	} else if (op_flags & ATTR_OP_FLAGS_REMOVE) {
>> +		error = xfs_attr_remove_args(&args, flags);
>> +	} else {
>> +		ASSERT(0);
> We're reading and processing log items off the disk, so cleaning up and
> returning -EFSCORRUPTED is more appropriate here.
>
Ok then, will fix
>> +	}
>> +
>> +	if (error)
>> +		xfs_defer_cancel(&dfops);
>> +
>> +	xfs_iunlock(dp, XFS_ILOCK_EXCL);
>> +
>> +	/*
>> +	 * Mark the transaction dirty, even on error. This ensures the
>> +	 * transaction is aborted, which:
>> +	 *
>> +	 * 1.) releases the ATTRI and frees the ATTRD
>> +	 * 2.) shuts down the filesystem
>> +	 */
>> +	tp->t_flags |= XFS_TRANS_DIRTY;
>> +	attrdp->item.li_desc->lid_flags |= XFS_LID_DIRTY;
>> +
>> +	next_attr = attrdp->next_attr;
>> +	attrp = &(attrdp->format);
>> +	attrp->ino = ino;
>> +	attrp->op_flags = op_flags;
>> +	attrp->value_len = value_len;
>> +	attrp->name_len = name_len;
>> +	attrp->attr_flags = flags;
>> +
>> +	attrdp->name = name;
>> +	attrdp->value = value;
>> +	attrdp->name_len = name_len;
>> +	attrdp->value_len = value_len;
>> +	attrdp->next_attr++;
>> +
>> +	return error;
>> +}
>> +
>> +static int
>> +xfs_attr_diff_items(
>> +	void				*priv,
>> +	struct list_head		*a,
>> +	struct list_head		*b)
>> +{
>> +	return 0;
>> +}
>> +
>> +/* Get an ATTRI. */
>> +STATIC void *
>> +xfs_attr_create_intent(
>> +	struct xfs_trans		*tp,
>> +	unsigned int			count)
>> +{
>> +	struct xfs_attri_log_item		*attrip;
>> +
>> +	ASSERT(tp != NULL);
>> +	ASSERT(count > 0);
>> +
>> +	attrip = xfs_attri_init(tp->t_mountp);
>> +	ASSERT(attrip != NULL);
>> +
>> +	/*
>> +	 * Get a log_item_desc to point at the new item.
>> +	 */
>> +	xfs_trans_add_item(tp, &attrip->item);
>> +	return attrip;
>> +}
>> +
>> +/* Log an attr to the intent item. */
>> +STATIC void
>> +xfs_attr_log_item(
>> +	struct xfs_trans		*tp,
>> +	void				*intent,
>> +	struct list_head		*item)
>> +{
>> +	struct xfs_attri_log_item	*attrip = intent;
>> +	struct xfs_attr_item		*free;
>> +	struct xfs_attr_log_format	*attrp;
>> +
>> +	free = container_of(item, struct xfs_attr_item, xattri_list);
>> +
>> +	tp->t_flags |= XFS_TRANS_DIRTY;
>> +	attrip->item.li_desc->lid_flags |= XFS_LID_DIRTY;
>> +
>> +	attrp = &attrip->format;
>> +	attrp->ino = free->xattri_ino;
>> +	attrp->op_flags = free->xattri_op_flags;
>> +	attrp->value_len = free->xattri_value_len;
>> +	attrp->name_len = free->xattri_name_len;
>> +	attrp->attr_flags = free->xattri_flags;
>> +
>> +	attrip->name = &(free->xattri_name[0]);
>> +	attrip->value = &(free->xattri_value[0]);
>> +	attrip->name_len = free->xattri_name_len;
>> +	attrip->value_len = free->xattri_value_len;
>> +}
>> +
>> +/* Get an ATTRD so we can process all the attrs. */
>> +STATIC void *
>> +xfs_attr_create_done(
>> +	struct xfs_trans		*tp,
>> +	void				*intent,
>> +	unsigned int			count)
>> +{
>> +	return xfs_trans_get_attrd(tp, intent);
>> +}
>> +
>> +/* Process an attr. */
>> +STATIC int
>> +xfs_attr_finish_item(
>> +	struct xfs_trans		*tp,
>> +	struct xfs_defer_ops		*dop,
>> +	struct list_head		*item,
>> +	void				*done_item,
>> +	void				**state)
>> +{
>> +	struct xfs_attr_item	*free;
>> +	int				error;
>> +
>> +	free = container_of(item, struct xfs_attr_item, xattri_list);
>> +	error = xfs_trans_attr(tp, done_item,
>> +			free->xattri_ino,
>> +			free->xattri_op_flags,
>> +			free->xattri_flags,
>> +			free->xattri_name_len,
>> +			free->xattri_value_len,
>> +			free->xattri_name,
>> +			free->xattri_value);
>> +	kmem_free(free);
>> +	return error;
>> +}
>> +
>> +/* Abort all pending EFIs. */
> EFIs?
>
> --D
I'll do a search for the EFI's and weed them out.  Thank you!! :-)
>> +STATIC void
>> +xfs_attr_abort_intent(
>> +	void				*intent)
>> +{
>> +	xfs_attri_release(intent);
>> +}
>> +
>> +/* Cancel an attr */
>> +STATIC void
>> +xfs_attr_cancel_item(
>> +	struct list_head		*item)
>> +{
>> +	struct xfs_attr_item	*free;
>> +
>> +	free = container_of(item, struct xfs_attr_item, xattri_list);
>> +	kmem_free(free);
>> +}
>> +
>> +static const struct xfs_defer_op_type xfs_attr_defer_type = {
>> +	.type		= XFS_DEFER_OPS_TYPE_ATTR,
>> +	.max_items	= XFS_ATTRI_MAX_FAST_ATTRS,
>> +	.diff_items	= xfs_attr_diff_items,
>> +	.create_intent	= xfs_attr_create_intent,
>> +	.abort_intent	= xfs_attr_abort_intent,
>> +	.log_item	= xfs_attr_log_item,
>> +	.create_done	= xfs_attr_create_done,
>> +	.finish_item	= xfs_attr_finish_item,
>> +	.cancel_item	= xfs_attr_cancel_item,
>> +};
>> +
>> +/* Register the deferred op type. */
>> +void
>> +xfs_attr_init_defer_op(void)
>> +{
>> +	xfs_defer_init_op_type(&xfs_attr_defer_type);
>> +}
>> -- 
>> 2.7.4
>>
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
>> the body of a message tomajordomo@vger.kernel.org
>> More majordomo info athttp://vger.kernel.org/majordomo-info.html


^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH 03/17] Add xfs_attr_set_defered and xfs_attr_remove_defered
  2017-10-19 19:13   ` Darrick J. Wong
@ 2017-10-21  1:08     ` Allison Henderson
  0 siblings, 0 replies; 66+ messages in thread
From: Allison Henderson @ 2017-10-21  1:08 UTC (permalink / raw)
  Cc: linux-xfs

On 10/19/2017 12:13 PM, Darrick J. Wong wrote:
> On Wed, Oct 18, 2017 at 03:55:19PM -0700, Allison Henderson wrote:
>> These routines set up set and start a new deferred attribute
>> operation.  These functions are meant to be called by other
>> code needing to initiate a deferred attribute operation.  We
>> will use these routines later in the parent pointer patches.
>>
>> Signed-off-by: Allison Henderson<allison.henderson@oracle.com>
>> ---
>>   fs/xfs/libxfs/xfs_attr.c | 58 ++++++++++++++++++++++++++++++++++++++++++++++++
>>   fs/xfs/xfs_attr.h        |  7 ++++++
>>   2 files changed, 65 insertions(+)
>>
>> diff --git a/fs/xfs/libxfs/xfs_attr.c b/fs/xfs/libxfs/xfs_attr.c
>> index 5325ec2..59f3502 100644
>> --- a/fs/xfs/libxfs/xfs_attr.c
>> +++ b/fs/xfs/libxfs/xfs_attr.c
>> @@ -458,6 +458,37 @@ xfs_attr_set(
>>   	return error;
>>   }
>>   
>> +int
>> +xfs_attr_set_deferred(
>> +	struct xfs_inode	*dp,
>> +	struct xfs_defer_ops    *dfops,
>> +	const unsigned char	*name,
>> +	unsigned int		namelen,
>> +	unsigned char		*value,
>> +	unsigned int		valuelen,
>> +	int			flags)
>> +{
>> +
>> +	struct xfs_attr_item     *new;
>> +
>> +	ASSERT(namelen != 0);
>> +	ASSERT(valuelen != 0);
>> +
>> +	new = kmem_alloc(sizeof(struct xfs_attr_item), KM_SLEEP|KM_NOFS);
> Yikes, this is a 66,000 byte allocation.  I wouldn't take it for granted
> that we can ask for more than a page's worth of memory.
>
> Seeing as we already know namelen and valuelen, let's go for the
> zero length array at the end of the struct approach:
>
> struct xfs_attr_item {
> 	...
> 	uint16_t		xattri_namelen;
> 	uint8_t			xattri_namevalue[0];
> };
>
> #define XFS_ATTR_ITEM_SIZEOF(namelen, valuelen)	\
> 	(sizeof(struct xfs_attr_item) + (namelen) + (valuelen))
>
> new = kmem_alloc(XFS_ATTR_ITEM_SIZEOF(namelen, valuelen), KM...);
> if (!new)
> 	return -ENOMEM;
> memcpy(new->xattri_namevalue, name, namelen);
> new->xattri_namelen = namelen;
> memcpy(&new->xattri_namevalue[namelen], value, valuelen);
> ...
Ok, yeah that makes sense.  Will do.
> Assuming there isn't a way to attach the caller's buffers to the attr
> item without copying anything(?)  (It would be nice if we could, but
> between the defer_add and the defer_finish a lot can happen w.r.t.
> variable scope so that might be a bad idea.)
>
>> +	memset(new, 0, sizeof(struct xfs_attr_item));
>> +	new->xattri_ino = dp->i_ino;
>> +	new->xattri_op_flags = ATTR_OP_FLAGS_SET;
>> +	new->xattri_name_len = namelen;
>> +	new->xattri_value_len = valuelen;
>> +	new->xattri_flags = flags;
>> +	memcpy(new->xattri_name, name, namelen);
>> +	memcpy(&new->xattri_value, value, valuelen);
>> +
>> +	xfs_defer_add(dfops, XFS_DEFER_OPS_TYPE_ATTR, &new->xattri_list);
>> +
>> +	return 0;
>> +}
>> +
>>   /*
>>    * Generic handler routine to remove a name from an attribute list.
>>    * Transitions attribute list from Btree to shortform as necessary.
>> @@ -531,6 +562,33 @@ xfs_attr_remove(
>>   	return error;
>>   }
>>   
>> +int
>> +xfs_attr_remove_deferred(
>> +	struct xfs_inode        *dp,
>> +	struct xfs_defer_ops    *dfops,
>> +	const unsigned char     *name,
>> +	unsigned int		namelen,
>> +	int                     flags)
>> +{
>> +
>> +	struct xfs_attr_item     *new;
>> +
>> +	ASSERT(namelen != 0);
>> +
>> +	new = kmem_alloc(sizeof(struct xfs_attr_item), KM_SLEEP|KM_NOFS);
>> +	memset(new, 0, sizeof(struct xfs_attr_item));
>> +	new->xattri_ino = dp->i_ino;
>> +	new->xattri_op_flags = ATTR_OP_FLAGS_REMOVE;
>> +	new->xattri_name_len = namelen;
>> +	new->xattri_value_len = 0;
>> +	new->xattri_flags = flags;
>> +	memcpy(new->xattri_name, name, namelen);
>> +
>> +	xfs_defer_add(dfops, XFS_DEFER_OPS_TYPE_ATTR, &new->xattri_list);
>> +
>> +	return 0;
>> +}
>> +
>>   /*========================================================================
>>    * External routines when attribute list is inside the inode
>>    *========================================================================*/
>> diff --git a/fs/xfs/xfs_attr.h b/fs/xfs/xfs_attr.h
>> index 34bb4cb..f4a53fd 100644
>> --- a/fs/xfs/xfs_attr.h
>> +++ b/fs/xfs/xfs_attr.h
>> @@ -173,5 +173,12 @@ int xfs_attr_list(struct xfs_inode *dp, char *buffer, int bufsize,
>>   int xfs_attr_args_init(struct xfs_da_args *args, struct xfs_inode *dp,
>>   		       const unsigned char *name, int flags);
>>   int xfs_attr_calc_size(struct xfs_da_args *args, int *local);
>> +int xfs_attr_set_deferred(struct xfs_inode *dp, struct xfs_defer_ops *dfops,
>> +			  const unsigned char *name, unsigned int name_len,
>> +			  unsigned char *value, unsigned int valuelen,
>> +			  int flags);
>> +int xfs_attr_remove_deferred(struct xfs_inode *dp, struct xfs_defer_ops *dfops,
>> +			    const unsigned char *name, unsigned int namelen,
>> +			    int flags);
>>   
>>   #endif	/* __XFS_ATTR_H__ */
>> -- 
>> 2.7.4
>>
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
>> the body of a message tomajordomo@vger.kernel.org
>> More majordomo info athttp://vger.kernel.org/majordomo-info.html


^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH 04/17] Remove all strlen calls in all xfs_attr_* functions for attr names.
  2017-10-19 19:15   ` Darrick J. Wong
@ 2017-10-21  1:10     ` Allison Henderson
  0 siblings, 0 replies; 66+ messages in thread
From: Allison Henderson @ 2017-10-21  1:10 UTC (permalink / raw)
  Cc: linux-xfs

On 10/19/2017 12:15 PM, Darrick J. Wong wrote:
> On Wed, Oct 18, 2017 at 03:55:20PM -0700, Allison Henderson wrote:
>> Parent pointer attributes use a binary name, so strlen will not work.
>> Calling functions will need to pass in the name length
>>
>> Signed-off-by: Allison Henderson<allison.henderson@oracle.com>
>> ---
>>   fs/xfs/libxfs/xfs_attr.c | 12 ++++++++----
>>   fs/xfs/xfs_acl.c         | 12 +++++++-----
>>   fs/xfs/xfs_attr.h        |  9 +++++----
>>   fs/xfs/xfs_ioctl.c       | 13 ++++++++++---
>>   fs/xfs/xfs_iops.c        |  6 ++++--
>>   fs/xfs/xfs_trans_attr.c  |  2 +-
>>   fs/xfs/xfs_xattr.c       | 10 +++++++---
>>   7 files changed, 42 insertions(+), 22 deletions(-)
>>
>> diff --git a/fs/xfs/libxfs/xfs_attr.c b/fs/xfs/libxfs/xfs_attr.c
>> index 59f3502..b94f0cd 100644
>> --- a/fs/xfs/libxfs/xfs_attr.c
>> +++ b/fs/xfs/libxfs/xfs_attr.c
>> @@ -79,6 +79,7 @@ xfs_attr_args_init(
>>   	struct xfs_da_args	*args,
>>   	struct xfs_inode	*dp,
>>   	const unsigned char	*name,
>> +	int			namelen,
> I think these should be size_t since they describe memory buffer sizes,
> and that's what strlen() returns.
>
> At least change it to 'unsigned int' since negative size makes no sense here...
>
> --D
>
Ok, will update.  Thx!
>>   	int			flags)
>>   {
>>   
>> @@ -91,7 +92,7 @@ xfs_attr_args_init(
>>   	args->dp = dp;
>>   	args->flags = flags;
>>   	args->name = name;
>> -	args->namelen = strlen((const char *)name);
>> +	args->namelen = namelen;
>>   	if (args->namelen >= MAXNAMELEN)
>>   		return -EFAULT;		/* match IRIX behaviour */
>>   
>> @@ -137,6 +138,7 @@ int
>>   xfs_attr_get(
>>   	struct xfs_inode	*ip,
>>   	const unsigned char	*name,
>> +	int			namelen,
>>   	unsigned char		*value,
>>   	int			*valuelenp,
>>   	int			flags)
>> @@ -150,7 +152,7 @@ xfs_attr_get(
>>   	if (XFS_FORCED_SHUTDOWN(ip->i_mount))
>>   		return -EIO;
>>   
>> -	error = xfs_attr_args_init(&args, ip, name, flags);
>> +	error = xfs_attr_args_init(&args, ip, name, namelen, flags);
>>   	if (error)
>>   		return error;
>>   
>> @@ -386,6 +388,7 @@ int
>>   xfs_attr_set(
>>   	struct xfs_inode	*dp,
>>   	const unsigned char	*name,
>> +	int			namelen,
>>   	unsigned char		*value,
>>   	int			valuelen,
>>   	int			flags)
>> @@ -402,7 +405,7 @@ xfs_attr_set(
>>   	if (XFS_FORCED_SHUTDOWN(dp->i_mount))
>>   		return -EIO;
>>   
>> -	error = xfs_attr_args_init(&args, dp, name, flags);
>> +	error = xfs_attr_args_init(&args, dp, name, namelen, flags);
>>   	if (error)
>>   		return error;
>>   
>> @@ -497,6 +500,7 @@ int
>>   xfs_attr_remove(
>>   	struct xfs_inode	*dp,
>>   	const unsigned char	*name,
>> +	int			namelen,
>>   	int			flags)
>>   {
>>   	struct xfs_mount	*mp = dp->i_mount;
>> @@ -510,7 +514,7 @@ xfs_attr_remove(
>>   	if (XFS_FORCED_SHUTDOWN(dp->i_mount))
>>   		return -EIO;
>>   
>> -	error = xfs_attr_args_init(&args, dp, name, flags);
>> +	error = xfs_attr_args_init(&args, dp, name, namelen, flags);
>>   	if (error)
>>   		return error;
>>   
>> diff --git a/fs/xfs/xfs_acl.c b/fs/xfs/xfs_acl.c
>> index 7034e17..72eca24 100644
>> --- a/fs/xfs/xfs_acl.c
>> +++ b/fs/xfs/xfs_acl.c
>> @@ -153,8 +153,8 @@ xfs_get_acl(struct inode *inode, int type)
>>   	if (!xfs_acl)
>>   		return ERR_PTR(-ENOMEM);
>>   
>> -	error = xfs_attr_get(ip, ea_name, (unsigned char *)xfs_acl,
>> -							&len, ATTR_ROOT);
>> +	error = xfs_attr_get(ip, ea_name, strlen((const char *)ea_name),
>> +			     (unsigned char *)xfs_acl, &len, ATTR_ROOT);
>>   	if (error) {
>>   		/*
>>   		 * If the attribute doesn't exist make sure we have a negative
>> @@ -204,15 +204,17 @@ __xfs_set_acl(struct inode *inode, struct posix_acl *acl, int type)
>>   		len -= sizeof(struct xfs_acl_entry) *
>>   			 (XFS_ACL_MAX_ENTRIES(ip->i_mount) - acl->a_count);
>>   
>> -		error = xfs_attr_set(ip, ea_name, (unsigned char *)xfs_acl,
>> -				len, ATTR_ROOT);
>> +		error = xfs_attr_set(ip, ea_name, strlen((const char *)ea_name),
>> +				     (unsigned char *)xfs_acl, len, ATTR_ROOT);
>>   
>>   		kmem_free(xfs_acl);
>>   	} else {
>>   		/*
>>   		 * A NULL ACL argument means we want to remove the ACL.
>>   		 */
>> -		error = xfs_attr_remove(ip, ea_name, ATTR_ROOT);
>> +		error = xfs_attr_remove(ip, ea_name,
>> +					strlen((const char *)ea_name),
>> +					ATTR_ROOT);
>>   
>>   		/*
>>   		 * If the attribute didn't exist to start with that's fine.
>> diff --git a/fs/xfs/xfs_attr.h b/fs/xfs/xfs_attr.h
>> index f4a53fd..532567e 100644
>> --- a/fs/xfs/xfs_attr.h
>> +++ b/fs/xfs/xfs_attr.h
>> @@ -161,17 +161,18 @@ int xfs_attr_list_int_ilocked(struct xfs_attr_list_context *);
>>   int xfs_attr_list_int(struct xfs_attr_list_context *);
>>   int xfs_inode_hasattr(struct xfs_inode *ip);
>>   int xfs_attr_get_ilocked(struct xfs_inode *ip, struct xfs_da_args *args);
>> -int xfs_attr_get(struct xfs_inode *ip, const unsigned char *name,
>> +int xfs_attr_get(struct xfs_inode *ip, const unsigned char *name, int namelen,
>>   		 unsigned char *value, int *valuelenp, int flags);
>> -int xfs_attr_set(struct xfs_inode *dp, const unsigned char *name,
>> +int xfs_attr_set(struct xfs_inode *dp, const unsigned char *name, int namelen,
>>   		 unsigned char *value, int valuelen, int flags);
>>   int xfs_attr_set_args(struct xfs_da_args *args, int flags, bool roll_trans);
>> -int xfs_attr_remove(struct xfs_inode *dp, const unsigned char *name, int flags);
>> +int xfs_attr_remove(struct xfs_inode *dp, const unsigned char *name,
>> +		    int namelen, int flags);
>>   int xfs_attr_remove_args(struct xfs_da_args *args, int flags);
>>   int xfs_attr_list(struct xfs_inode *dp, char *buffer, int bufsize,
>>   		  int flags, struct attrlist_cursor_kern *cursor);
>>   int xfs_attr_args_init(struct xfs_da_args *args, struct xfs_inode *dp,
>> -		       const unsigned char *name, int flags);
>> +		       const unsigned char *name, int namelen, int flags);
>>   int xfs_attr_calc_size(struct xfs_da_args *args, int *local);
>>   int xfs_attr_set_deferred(struct xfs_inode *dp, struct xfs_defer_ops *dfops,
>>   			  const unsigned char *name, unsigned int name_len,
>> diff --git a/fs/xfs/xfs_ioctl.c b/fs/xfs/xfs_ioctl.c
>> index aa75389..1c9f813 100644
>> --- a/fs/xfs/xfs_ioctl.c
>> +++ b/fs/xfs/xfs_ioctl.c
>> @@ -448,6 +448,7 @@ xfs_attrmulti_attr_get(
>>   {
>>   	unsigned char		*kbuf;
>>   	int			error = -EFAULT;
>> +	int			namelen;
>>   
>>   	if (*len > XFS_XATTR_SIZE_MAX)
>>   		return -EINVAL;
>> @@ -455,7 +456,9 @@ xfs_attrmulti_attr_get(
>>   	if (!kbuf)
>>   		return -ENOMEM;
>>   
>> -	error = xfs_attr_get(XFS_I(inode), name, kbuf, (int *)len, flags);
>> +	namelen = strlen((const char *)name);
>> +	error = xfs_attr_get(XFS_I(inode), name, namelen,
>> +			     kbuf, (int *)len, flags);
>>   	if (error)
>>   		goto out_kfree;
>>   
>> @@ -477,6 +480,7 @@ xfs_attrmulti_attr_set(
>>   {
>>   	unsigned char		*kbuf;
>>   	int			error;
>> +	int			namelen;
>>   
>>   	if (IS_IMMUTABLE(inode) || IS_APPEND(inode))
>>   		return -EPERM;
>> @@ -487,7 +491,8 @@ xfs_attrmulti_attr_set(
>>   	if (IS_ERR(kbuf))
>>   		return PTR_ERR(kbuf);
>>   
>> -	error = xfs_attr_set(XFS_I(inode), name, kbuf, len, flags);
>> +	namelen = strlen((const char *)name);
>> +	error = xfs_attr_set(XFS_I(inode), name, namelen, kbuf, len, flags);
>>   	if (!error)
>>   		xfs_forget_acl(inode, name, flags);
>>   	kfree(kbuf);
>> @@ -501,10 +506,12 @@ xfs_attrmulti_attr_remove(
>>   	uint32_t		flags)
>>   {
>>   	int			error;
>> +	int			namelen;
>>   
>>   	if (IS_IMMUTABLE(inode) || IS_APPEND(inode))
>>   		return -EPERM;
>> -	error = xfs_attr_remove(XFS_I(inode), name, flags);
>> +	namelen = strlen((const char *)name);
>> +	error = xfs_attr_remove(XFS_I(inode), name, namelen, flags);
>>   	if (!error)
>>   		xfs_forget_acl(inode, name, flags);
>>   	return error;
>> diff --git a/fs/xfs/xfs_iops.c b/fs/xfs/xfs_iops.c
>> index 17081c7..5247bfc 100644
>> --- a/fs/xfs/xfs_iops.c
>> +++ b/fs/xfs/xfs_iops.c
>> @@ -70,8 +70,10 @@ xfs_initxattrs(
>>   	int			error = 0;
>>   
>>   	for (xattr = xattr_array; xattr->name != NULL; xattr++) {
>> -		error = xfs_attr_set(ip, xattr->name, xattr->value,
>> -				      xattr->value_len, ATTR_SECURE);
>> +		error = xfs_attr_set(ip, xattr->name,
>> +				     strlen((const char *)xattr->name),
>> +				     xattr->value, xattr->value_len,
>> +				     ATTR_SECURE);
>>   		if (error < 0)
>>   			break;
>>   	}
>> diff --git a/fs/xfs/xfs_trans_attr.c b/fs/xfs/xfs_trans_attr.c
>> index 39eb18d..a45e9d0 100644
>> --- a/fs/xfs/xfs_trans_attr.c
>> +++ b/fs/xfs/xfs_trans_attr.c
>> @@ -93,7 +93,7 @@ xfs_trans_attr(
>>   	ASSERT(XFS_IFORK_Q((dp)));
>>   	tp->t_flags |= XFS_TRANS_RESERVE;
>>   
>> -	error = xfs_attr_args_init(&args, dp, name, flags);
>> +	error = xfs_attr_args_init(&args, dp, name, name_len, flags);
>>   	if (error)
>>   		return error;
>>   
>> diff --git a/fs/xfs/xfs_xattr.c b/fs/xfs/xfs_xattr.c
>> index 0594db4..4ef09c4 100644
>> --- a/fs/xfs/xfs_xattr.c
>> +++ b/fs/xfs/xfs_xattr.c
>> @@ -38,6 +38,7 @@ xfs_xattr_get(const struct xattr_handler *handler, struct dentry *unused,
>>   	int xflags = handler->flags;
>>   	struct xfs_inode *ip = XFS_I(inode);
>>   	int error, asize = size;
>> +	int namelen = strlen((const char *)name);
>>   
>>   	/* Convert Linux syscall to XFS internal ATTR flags */
>>   	if (!size) {
>> @@ -45,7 +46,8 @@ xfs_xattr_get(const struct xattr_handler *handler, struct dentry *unused,
>>   		value = NULL;
>>   	}
>>   
>> -	error = xfs_attr_get(ip, (unsigned char *)name, value, &asize, xflags);
>> +	error = xfs_attr_get(ip, (unsigned char *)name, namelen, value,
>> +			     &asize, xflags);
>>   	if (error)
>>   		return error;
>>   	return asize;
>> @@ -81,6 +83,7 @@ xfs_xattr_set(const struct xattr_handler *handler, struct dentry *unused,
>>   	int			xflags = handler->flags;
>>   	struct xfs_inode	*ip = XFS_I(inode);
>>   	int			error;
>> +	int			namelen = strlen((const char *)name);
>>   
>>   	/* Convert Linux syscall to XFS internal ATTR flags */
>>   	if (flags & XATTR_CREATE)
>> @@ -89,8 +92,9 @@ xfs_xattr_set(const struct xattr_handler *handler, struct dentry *unused,
>>   		xflags |= ATTR_REPLACE;
>>   
>>   	if (!value)
>> -		return xfs_attr_remove(ip, (unsigned char *)name, xflags);
>> -	error = xfs_attr_set(ip, (unsigned char *)name,
>> +		return xfs_attr_remove(ip, (unsigned char *)name,
>> +				       namelen, xflags);
>> +	error = xfs_attr_set(ip, (unsigned char *)name, namelen,
>>   				(void *)value, size, xflags);
>>   	if (!error)
>>   		xfs_forget_acl(inode, name, xflags);
>> -- 
>> 2.7.4
>>
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
>> the body of a message tomajordomo@vger.kernel.org
>> More majordomo info athttp://vger.kernel.org/majordomo-info.html


^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH 06/17] xfs: get directory offset when removing directory name
  2017-10-19 19:17   ` Darrick J. Wong
@ 2017-10-21  1:11     ` Allison Henderson
  0 siblings, 0 replies; 66+ messages in thread
From: Allison Henderson @ 2017-10-21  1:11 UTC (permalink / raw)
  Cc: linux-xfs, Dave Chinner

On 10/19/2017 12:17 PM, Darrick J. Wong wrote:
> On Wed, Oct 18, 2017 at 03:55:22PM -0700, Allison Henderson wrote:
>> From: Mark Tinguely<tinguely@sgi.com>
>>
>> Return the directory offset information when removing an entry to the
>> directory.
>>
>> This offset will be used as the parent pointer offset in xfs_remove.
>>
>> [dchinner: forward ported and cleaned up]
>> [achender: rebased, changed __unint32_t to xfs_dir2_dataptr_t]
>>
>> Signed-off-by: Mark Tinguely<tinguely@sgi.com>
>> Signed-off-by: Dave Chinner<dchinner@redhat.com>
>> Signed-off-by: Allison Henderson<allison.henderson@oracle.com>
>> ---
>> v2: Changed typedefs to raw struct types
>>
>> Signed-off-by: Allison Henderson<allison.henderson@oracle.com>
>> ---
>>   fs/xfs/libxfs/xfs_dir2.c       | 15 +++++++++------
>>   fs/xfs/libxfs/xfs_dir2.h       |  4 +++-
>>   fs/xfs/libxfs/xfs_dir2_block.c |  4 ++--
>>   fs/xfs/libxfs/xfs_dir2_leaf.c  |  5 +++--
>>   fs/xfs/libxfs/xfs_dir2_node.c  |  5 +++--
>>   fs/xfs/libxfs/xfs_dir2_sf.c    |  2 ++
>>   fs/xfs/xfs_inode.c             |  7 ++++---
>>   7 files changed, 26 insertions(+), 16 deletions(-)
>>
>> diff --git a/fs/xfs/libxfs/xfs_dir2.c b/fs/xfs/libxfs/xfs_dir2.c
>> index a1ca460..0511eb9 100644
>> --- a/fs/xfs/libxfs/xfs_dir2.c
>> +++ b/fs/xfs/libxfs/xfs_dir2.c
>> @@ -443,13 +443,14 @@ xfs_dir_lookup(
>>    */
>>   int
>>   xfs_dir_removename(
>> -	xfs_trans_t	*tp,
>> -	xfs_inode_t	*dp,
>> -	struct xfs_name	*name,
>> -	xfs_ino_t	ino,
>> -	xfs_fsblock_t	*first,		/* bmap's firstblock */
>> +	struct xfs_trans	*tp,
>> +	struct xfs_inode	*dp,
>> +	struct xfs_name		*name,
>> +	xfs_ino_t		ino,
>> +	xfs_fsblock_t	*first,			/* bmap's firstblock */
> Indentation problem?
>
> --D
Ok, will fix.  It looked right in the editor, maybe my tabs are not set 
quite right.
>>   	struct xfs_defer_ops	*dfops,		/* bmap's freeblock list */
>> -	xfs_extlen_t	total)		/* bmap's total block count */
>> +	xfs_extlen_t		total,		/* bmap's total block count */
>> +	xfs_dir2_dataptr_t	*offset)	/* OUT: offset in directory */
>>   {
>>   	struct xfs_da_args *args;
>>   	int		rval;
>> @@ -495,6 +496,8 @@ xfs_dir_removename(
>>   		rval = xfs_dir2_leaf_removename(args);
>>   	else
>>   		rval = xfs_dir2_node_removename(args);
>> +	if (offset)
>> +		*offset = args->offset;
>>   out_free:
>>   	kmem_free(args);
>>   	return rval;
>> diff --git a/fs/xfs/libxfs/xfs_dir2.h b/fs/xfs/libxfs/xfs_dir2.h
>> index e349900..e1bd05d 100644
>> --- a/fs/xfs/libxfs/xfs_dir2.h
>> +++ b/fs/xfs/libxfs/xfs_dir2.h
>> @@ -139,7 +139,9 @@ extern int xfs_dir_lookup(struct xfs_trans *tp, struct xfs_inode *dp,
>>   extern int xfs_dir_removename(struct xfs_trans *tp, struct xfs_inode *dp,
>>   				struct xfs_name *name, xfs_ino_t ino,
>>   				xfs_fsblock_t *first,
>> -				struct xfs_defer_ops *dfops, xfs_extlen_t tot);
>> +				struct xfs_defer_ops *dfops,
>> +				xfs_extlen_t tot,
>> +				xfs_dir2_dataptr_t *offset);
>>   extern int xfs_dir_replace(struct xfs_trans *tp, struct xfs_inode *dp,
>>   				struct xfs_name *name, xfs_ino_t inum,
>>   				xfs_fsblock_t *first,
>> diff --git a/fs/xfs/libxfs/xfs_dir2_block.c b/fs/xfs/libxfs/xfs_dir2_block.c
>> index 79684d5..4dbe2fc 100644
>> --- a/fs/xfs/libxfs/xfs_dir2_block.c
>> +++ b/fs/xfs/libxfs/xfs_dir2_block.c
>> @@ -791,9 +791,9 @@ xfs_dir2_block_removename(
>>   	/*
>>   	 * Point to the data entry using the leaf entry.
>>   	 */
>> +	args->offset = be32_to_cpu(blp[ent].address);
>>   	dep = (xfs_dir2_data_entry_t *)((char *)hdr +
>> -			xfs_dir2_dataptr_to_off(args->geo,
>> -						be32_to_cpu(blp[ent].address)));
>> +			xfs_dir2_dataptr_to_off(args->geo, args->offset));
>>   	/*
>>   	 * Mark the data entry's space free.
>>   	 */
>> diff --git a/fs/xfs/libxfs/xfs_dir2_leaf.c b/fs/xfs/libxfs/xfs_dir2_leaf.c
>> index 2ac7a7e..197e627 100644
>> --- a/fs/xfs/libxfs/xfs_dir2_leaf.c
>> +++ b/fs/xfs/libxfs/xfs_dir2_leaf.c
>> @@ -1383,9 +1383,10 @@ xfs_dir2_leaf_removename(
>>   	 * Point to the leaf entry, use that to point to the data entry.
>>   	 */
>>   	lep = &ents[index];
>> -	db = xfs_dir2_dataptr_to_db(args->geo, be32_to_cpu(lep->address));
>> +	args->offset = be32_to_cpu(lep->address);
>> +	db = xfs_dir2_dataptr_to_db(args->geo, args->offset);
>>   	dep = (xfs_dir2_data_entry_t *)((char *)hdr +
>> -		xfs_dir2_dataptr_to_off(args->geo, be32_to_cpu(lep->address)));
>> +		xfs_dir2_dataptr_to_off(args->geo, args->offset));
>>   	needscan = needlog = 0;
>>   	oldbest = be16_to_cpu(bf[0].length);
>>   	ltp = xfs_dir2_leaf_tail_p(args->geo, leaf);
>> diff --git a/fs/xfs/libxfs/xfs_dir2_node.c b/fs/xfs/libxfs/xfs_dir2_node.c
>> index 8bc91f8..13d5244 100644
>> --- a/fs/xfs/libxfs/xfs_dir2_node.c
>> +++ b/fs/xfs/libxfs/xfs_dir2_node.c
>> @@ -1238,9 +1238,10 @@ xfs_dir2_leafn_remove(
>>   	/*
>>   	 * Extract the data block and offset from the entry.
>>   	 */
>> -	db = xfs_dir2_dataptr_to_db(args->geo, be32_to_cpu(lep->address));
>> +	args->offset = be32_to_cpu(lep->address);
>> +	db = xfs_dir2_dataptr_to_db(args->geo, args->offset);
>>   	ASSERT(dblk->blkno == db);
>> -	off = xfs_dir2_dataptr_to_off(args->geo, be32_to_cpu(lep->address));
>> +	off = xfs_dir2_dataptr_to_off(args->geo, args->offset);
>>   	ASSERT(dblk->index == off);
>>   
>>   	/*
>> diff --git a/fs/xfs/libxfs/xfs_dir2_sf.c b/fs/xfs/libxfs/xfs_dir2_sf.c
>> index 489bdef..9e90c22 100644
>> --- a/fs/xfs/libxfs/xfs_dir2_sf.c
>> +++ b/fs/xfs/libxfs/xfs_dir2_sf.c
>> @@ -919,6 +919,8 @@ xfs_dir2_sf_removename(
>>   								XFS_CMP_EXACT) {
>>   			ASSERT(dp->d_ops->sf_get_ino(sfp, sfep) ==
>>   			       args->inumber);
>> +			args->offset = xfs_dir2_byte_to_dataptr(
>> +						xfs_dir2_sf_get_offset(sfep));
>>   			break;
>>   		}
>>   	}
>> diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c
>> index 3abcb17..358a98a 100644
>> --- a/fs/xfs/xfs_inode.c
>> +++ b/fs/xfs/xfs_inode.c
>> @@ -2639,8 +2639,8 @@ xfs_remove(
>>   		goto out_trans_cancel;
>>   
>>   	xfs_defer_init(&dfops, &first_block);
>> -	error = xfs_dir_removename(tp, dp, name, ip->i_ino,
>> -					&first_block, &dfops, resblks);
>> +	error = xfs_dir_removename(tp, dp, name, ip->i_ino, &first_block,
>> +				   &dfops, resblks, NULL);
>>   	if (error) {
>>   		ASSERT(error != -ENOENT);
>>   		goto out_bmap_cancel;
>> @@ -3150,7 +3150,8 @@ xfs_rename(
>>   					&first_block, &dfops, spaceres);
>>   	} else
>>   		error = xfs_dir_removename(tp, src_dp, src_name, src_ip->i_ino,
>> -					   &first_block, &dfops, spaceres);
>> +					   &first_block, &dfops, spaceres,
>> +					   NULL);
>>   	if (error)
>>   		goto out_bmap_cancel;
>>   
>> -- 
>> 2.7.4
>>
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
>> the body of a message tomajordomo@vger.kernel.org
>> More majordomo info athttp://vger.kernel.org/majordomo-info.html


^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH 12/17] xfs: parent pointer attribute creation
  2017-10-19 19:36   ` Darrick J. Wong
       [not found]     ` <9185d3e8-4b41-b2d8-294b-934f7d3409f0@oracle.com>
@ 2017-10-21  1:11     ` Allison Henderson
  1 sibling, 0 replies; 66+ messages in thread
From: Allison Henderson @ 2017-10-21  1:11 UTC (permalink / raw)
  Cc: linux-xfs

On 10/19/2017 12:36 PM, Darrick J. Wong wrote:
> On Wed, Oct 18, 2017 at 03:55:28PM -0700, Allison Henderson wrote:
>> From: Dave Chinner<dchinner@redhat.com>
>>
>> [bfoster: rebase, use VFS inode generation]
>> [achender: rebased, changed __unint32_t to xfs_dir2_dataptr_t,
>> 	   fixed some null pointer bugs]
>>
>> Signed-off-by: Dave Chinner<dchinner@redhat.com>
>> Signed-off-by: Allison Henderson<allison.henderson@oracle.com>
>> ---
>> v2: remove unnecessary ENOSPC handling in xfs_attr_set_first_parent
>>
>> Signed-off-by: Allison Henderson<allison.henderson@oracle.com>
>> ---
>>   fs/xfs/Makefile            |  1 +
>>   fs/xfs/libxfs/xfs_attr.c   | 71 ++++++++++++++++++++++++++++++---
>>   fs/xfs/libxfs/xfs_bmap.c   | 51 ++++++++++++++----------
>>   fs/xfs/libxfs/xfs_bmap.h   |  1 +
>>   fs/xfs/libxfs/xfs_parent.c | 98 ++++++++++++++++++++++++++++++++++++++++++++++
>>   fs/xfs/xfs_attr.h          | 15 ++++++-
>>   fs/xfs/xfs_inode.c         | 16 +++++++-
>>   7 files changed, 225 insertions(+), 28 deletions(-)
>>
>> diff --git a/fs/xfs/Makefile b/fs/xfs/Makefile
>> index ec6486b..3015bca 100644
>> --- a/fs/xfs/Makefile
>> +++ b/fs/xfs/Makefile
>> @@ -52,6 +52,7 @@ xfs-y				+= $(addprefix libxfs/, \
>>   				   xfs_inode_fork.o \
>>   				   xfs_inode_buf.o \
>>   				   xfs_log_rlimit.o \
>> +				   xfs_parent.o \
>>   				   xfs_ag_resv.o \
>>   				   xfs_rmap.o \
>>   				   xfs_rmap_btree.o \
>> diff --git a/fs/xfs/libxfs/xfs_attr.c b/fs/xfs/libxfs/xfs_attr.c
>> index 8f8bfff9..8aad242 100644
>> --- a/fs/xfs/libxfs/xfs_attr.c
>> +++ b/fs/xfs/libxfs/xfs_attr.c
>> @@ -91,12 +91,14 @@ xfs_attr_args_init(
>>   	args->whichfork = XFS_ATTR_FORK;
>>   	args->dp = dp;
>>   	args->flags = flags;
>> -	args->name = name;
>> -	args->namelen = namelen;
>> -	if (args->namelen >= MAXNAMELEN)
>> -		return -EFAULT;		/* match IRIX behaviour */
>> +	if (name) {
> When do we have a NULL name?

Ideally we shouldn't, though on a remove we should have a NULL value, since
we only need the name.  I suppose I'm still in the habit of coding 
defensively
though it may make since to generate the oops, or even add an assert if 
it happens.
Thx!

>> +		args->name = name;
>> +		args->namelen = namelen;
>> +		if (args->namelen >= MAXNAMELEN)
>> +			return -EFAULT;		/* match IRIX behaviour */
>>   
>> -	args->hashval = xfs_da_hashname(args->name, args->namelen);
>> +		args->hashval = xfs_da_hashname(args->name, args->namelen);
>> +	}
>>   	return 0;
>>   }
>>   
>> @@ -206,6 +208,65 @@ xfs_attr_calc_size(
>>   }
>>   
>>   /*
>> + * Add the initial parent pointer attribute.
>> + *
>> + * Inode must be locked and completely empty as we are adding the attribute
>> + * fork to the inode. This open codes bits of xfs_bmap_add_attrfork() and
>> + * xfs_attr_set() because we know the inode is completely empty at this point
> Hrmm... in general I don't like opencoding bits of other functions
> without a good justification.
>
>> + * and so don't need to handle all the different combinations of fork
>> + * configurations here.
>> + */
>> +int
>> +xfs_attr_set_first_parent(
>> +	struct xfs_trans	*tp,
>> +	struct xfs_inode	*ip,
>> +	struct xfs_parent_name_rec *rec,
>> +	int			reclen,
>> +	const char		*value,
>> +	int			valuelen,
>> +	struct xfs_defer_ops	*dfops,
>> +	xfs_fsblock_t		*firstblock)
> These all need one more level of indentation due to struct xfs_parent_name_rec.
Sure, I will push those out a level
>> +{
>> +	struct xfs_da_args	args;
>> +	int			flags = ATTR_PARENT;
>> +	int			local;
>> +	int			sf_size;
>> +	int			error;
>> +
>> +	tp->t_flags |= XFS_TRANS_RESERVE;
>> +
>> +	error = xfs_attr_args_init(&args, ip, (char *)rec, reclen, flags);
>> +	if (error)
>> +		return error;
>> +
>> +	args.name = (char *)rec;
>> +	args.namelen = reclen;
>> +	args.hashval = xfs_da_hashname(args.name, args.namelen);
> Aren't these already set by xfs_attr_args_init?
Some of them are: name, namelen, hashval, dp, and flags.
But not firstblock dfops, op_flags, total, or trans.

I guess I kind of liked seeing things initialized all in one spot rather
than split up like that. But it shouldn't hurt anything to remove the
re-inits if that is not preferable.
>> +	args.value = (char *)value;
>> +	args.valuelen = valuelen;
>> +	args.firstblock = firstblock;
>> +	args.dfops = dfops;
>> +	args.op_flags = XFS_DA_OP_ADDNAME | XFS_DA_OP_OKNOENT;
>> +	args.total = xfs_attr_calc_size(&args, &local);
>> +	args.trans = tp;
>> +	ASSERT(local);
>> +
>> +	/* set the attribute fork appropriately */
>> +	sf_size = sizeof(struct xfs_attr_sf_hdr) +
>> +			XFS_ATTR_SF_ENTSIZE_BYNAME(reclen, valuelen);
>> +	xfs_bmap_set_attrforkoff(ip, sf_size, NULL);
>> +	ip->i_afp = kmem_zone_zalloc(xfs_ifork_zone, KM_SLEEP);
>> +	ip->i_afp->if_flags = XFS_IFEXTENTS;
>> +
>> +
>> +	/* Try to add the attr to the attribute list in the inode. */
>> +	xfs_attr_shortform_create(&args);
> Are we sure that we'll always be able to cram the parent attribute into
> the shortform area?  Minimum inode size is 512 bytes, core size is
> currently 176 bytes, max parent attribute size is ~280 bytes... I guess
> that works.
>
> But I wouldn't want this to blow up some day when the inode core gets
> bigger and this no longer fits.  Will using the regular xfs_attr_set
> function cover all these sizing cases?  What's the benefit to all this
> short circuiting?

Hmm, I'm going to speculate that the original intent was to optimize
on the current conditions of the inode and the attrs fitting in just
right?  (Dave may need to correct me if that's not right....).

You make good points though.  Unless someone has an objection, I
can put in the normal xfs_attr_set

>> +	error = xfs_attr_shortform_addname(&args);
>> +
>> +	return error;
>> +}
>> +
>> +/*
>>    * set the attribute specified in @args. In the case of the parent attribute
>>    * being set, we do not want to roll the transaction on shortform-to-leaf
>>    * conversion, as the attribute must be added in the same transaction as the
>> diff --git a/fs/xfs/libxfs/xfs_bmap.c b/fs/xfs/libxfs/xfs_bmap.c
>> index 044a363..7ee98be 100644
>> --- a/fs/xfs/libxfs/xfs_bmap.c
>> +++ b/fs/xfs/libxfs/xfs_bmap.c
>> @@ -1066,6 +1066,35 @@ xfs_bmap_add_attrfork_local(
>>   	return -EFSCORRUPTED;
>>   }
>>   
>> +int
>> +xfs_bmap_set_attrforkoff(
>> +	struct xfs_inode	*ip,
>> +	int			size,
>> +	int			*version)
>> +{
>> +	switch (ip->i_d.di_format) {
>> +	case XFS_DINODE_FMT_DEV:
>> +		ip->i_d.di_forkoff = roundup(sizeof(xfs_dev_t), 8) >> 3;
>> +		break;
>> +	case XFS_DINODE_FMT_UUID:
>> +		ip->i_d.di_forkoff = roundup(sizeof(uuid_t), 8) >> 3;
>> +		break;
>> +	case XFS_DINODE_FMT_LOCAL:
>> +	case XFS_DINODE_FMT_EXTENTS:
>> +	case XFS_DINODE_FMT_BTREE:
>> +		ip->i_d.di_forkoff = xfs_attr_shortform_bytesfit(ip, size);
>> +		if (!ip->i_d.di_forkoff)
>> +			ip->i_d.di_forkoff = xfs_default_attroffset(ip) >> 3;
>> +		else if ((ip->i_mount->m_flags & XFS_MOUNT_ATTR2) && version)
>> +			*version = 2;
>> +		break;
>> +	default:
>> +		ASSERT(0);
>> +		return -EINVAL;
>> +	}
>> +	return 0;
>> +}
>> +
>>   /*
>>    * Convert inode from non-attributed to attributed.
>>    * Must not be in a transaction, ip must not be locked.
>> @@ -1120,27 +1149,7 @@ xfs_bmap_add_attrfork(
>>   	xfs_trans_ijoin(tp, ip, 0);
>>   	xfs_trans_log_inode(tp, ip, XFS_ILOG_CORE);
>>   
>> -	switch (ip->i_d.di_format) {
>> -	case XFS_DINODE_FMT_DEV:
>> -		ip->i_d.di_forkoff = roundup(sizeof(xfs_dev_t), 8) >> 3;
>> -		break;
>> -	case XFS_DINODE_FMT_UUID:
>> -		ip->i_d.di_forkoff = roundup(sizeof(uuid_t), 8) >> 3;
>> -		break;
>> -	case XFS_DINODE_FMT_LOCAL:
>> -	case XFS_DINODE_FMT_EXTENTS:
>> -	case XFS_DINODE_FMT_BTREE:
>> -		ip->i_d.di_forkoff = xfs_attr_shortform_bytesfit(ip, size);
>> -		if (!ip->i_d.di_forkoff)
>> -			ip->i_d.di_forkoff = xfs_default_attroffset(ip) >> 3;
>> -		else if (mp->m_flags & XFS_MOUNT_ATTR2)
>> -			version = 2;
>> -		break;
>> -	default:
>> -		ASSERT(0);
>> -		error = -EINVAL;
>> -		goto trans_cancel;
>> -	}
>> +	xfs_bmap_set_attrforkoff(ip, size, &version);
>>   
>>   	ASSERT(ip->i_afp == NULL);
>>   	ip->i_afp = kmem_zone_zalloc(xfs_ifork_zone, KM_SLEEP);
>> diff --git a/fs/xfs/libxfs/xfs_bmap.h b/fs/xfs/libxfs/xfs_bmap.h
>> index 851982a..533f40f 100644
>> --- a/fs/xfs/libxfs/xfs_bmap.h
>> +++ b/fs/xfs/libxfs/xfs_bmap.h
>> @@ -209,6 +209,7 @@ void	xfs_bmap_trace_exlist(struct xfs_inode *ip, xfs_extnum_t cnt,
>>   void	xfs_trim_extent(struct xfs_bmbt_irec *irec, xfs_fileoff_t bno,
>>   		xfs_filblks_t len);
>>   int	xfs_bmap_add_attrfork(struct xfs_inode *ip, int size, int rsvd);
>> +int	xfs_bmap_set_attrforkoff(struct xfs_inode *ip, int size, int *version);
>>   void	xfs_bmap_local_to_extents_empty(struct xfs_inode *ip, int whichfork);
>>   void	xfs_bmap_add_free(struct xfs_mount *mp, struct xfs_defer_ops *dfops,
>>   			  xfs_fsblock_t bno, xfs_filblks_t len,
>> diff --git a/fs/xfs/libxfs/xfs_parent.c b/fs/xfs/libxfs/xfs_parent.c
>> new file mode 100644
>> index 0000000..88f7edc
>> --- /dev/null
>> +++ b/fs/xfs/libxfs/xfs_parent.c
>> @@ -0,0 +1,98 @@
>> +/*
>> + * Copyright (c) 2015 Red Hat, Inc.
>> + * All rights reserved.
>> + *
>> + * This program is free software; you can redistribute it and/or
>> + * modify it under the terms of the GNU General Public License as
>> + * published by the Free Software Foundation.
>> + *
>> + * This program is distributed in the hope that it would be useful,
>> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
>> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
>> + * GNU General Public License for more details.
>> + *
>> + * You should have received a copy of the GNU General Public License
>> + * along with this program; if not, write the Free Software Foundation
>> + */
>> +#include "xfs.h"
>> +#include "xfs_fs.h"
>> +#include "xfs_format.h"
>> +#include "xfs_log_format.h"
>> +#include "xfs_shared.h"
>> +#include "xfs_trans_resv.h"
>> +#include "xfs_mount.h"
>> +#include "xfs_bmap_btree.h"
>> +#include "xfs_inode.h"
>> +#include "xfs_error.h"
>> +#include "xfs_trace.h"
>> +#include "xfs_trans.h"
>> +#include "xfs_attr.h"
>> +
>> +/*
>> + * Parent pointer attribute handling.
>> + *
>> + * Because the attribute value is a filename component, it will never be longer
>> + * than 255 bytes. This means the attribute will always be a local format
>> + * attribute as it is xfs_attr_leaf_entsize_local_max() for v5 filesystems will
>> + * always be larger than this (max is 75% of block size).
>> + *
>> + * Creating a new parent attribute will always create a new attribute - there
>> + * should never, ever be an existing attribute in the tree for a new inode.
>> + * ENOSPC behaviour is problematic - creating the inode without the parent
>> + * pointer is effectively a corruption, so we allow parent attribute creation
>> + * to dip into the reserve block pool to avoid unexpected ENOSPC errors from
>> + * occurring.
>> + */
>> +
>> +/*
>> + * Create the initial parent attribute.
>> + *
>> + * The initial attribute creation also needs to be atomic w.r.t the parent
>> + * directory modification. Hence it needs to run in the same transaction and the
>> + * transaction committed by the caller.  Because the attribute created is
>> + * guaranteed to be a local attribute and is always going to be the first
>> + * attribute in the attribute fork, we can do this safely in the single
>> + * transaction context as it is impossible for an overwrite to occur and hence
>> + * we'll never have a rolling overwrite transaction occurring here. Hence we
>> + * can short-cut a lot of the normal xfs_attr_set() code paths that are needed
>> + * to handle the generic cases.
> Is there some other part of inode creation (ACL propagation?) that
> thinks it could be the creator of the first attribute and will react
> negatively to this?
Hmm, not that I can think of, but I wonder if there was at the time?
>> + */
>> +static int
>> +xfs_parent_create_nrec(
>> +	struct xfs_trans	*tp,
>> +	struct xfs_inode	*child,
>> +	struct xfs_parent_name_irec *nrec,
>> +	struct xfs_defer_ops	*dfops,
>> +	xfs_fsblock_t		*firstblock)
>> +{
>> +	struct xfs_parent_name_rec rec;
>> +
>> +	rec.p_ino = cpu_to_be64(nrec->p_ino);
>> +	rec.p_gen = cpu_to_be32(nrec->p_gen);
>> +	rec.p_diroffset = cpu_to_be32(nrec->p_diroffset);
> The disk->header and header->disk converters should be their own
> functions so that later when I add parent pointer iterators I can pass
> the irec to the iterator function directly.
>
> (Granted I could just as easily do that later in my own patch...)
>
I don't mind adding here if we're already have a need for it.  Saves time
changing it later :-)
>> +
>> +	return xfs_attr_set_first_parent(tp, child, &rec, sizeof(rec),
>> +				   nrec->p_name, nrec->p_namelen,
>> +				   dfops, firstblock);
>> +}
>> +
>> +int
>> +xfs_parent_create(
> What's this function do?  (Needs comment.)
>
> --D
This is the subroutine that we use during creation, but I think you
pointed out some issues with it in your later reviews, since this should
probably be part of the deferred operation code. I will add comments
when I revise it though.  Thx!

>> +	struct xfs_trans	*tp,
>> +	struct xfs_inode	*parent,
>> +	struct xfs_inode	*child,
>> +	struct xfs_name		*child_name,
>> +	xfs_dir2_dataptr_t	diroffset,
>> +	struct xfs_defer_ops	*dfops,
>> +	xfs_fsblock_t		*firstblock)
>> +{
>> +	struct xfs_parent_name_irec nrec;
>> +
>> +	nrec.p_ino = parent->i_ino;
>> +	nrec.p_gen = VFS_I(parent)->i_generation;
>> +	nrec.p_diroffset = diroffset;
>> +	nrec.p_name = child_name->name;
>> +	nrec.p_namelen = child_name->len;
>> +
>> +	return xfs_parent_create_nrec(tp, child, &nrec, dfops, firstblock);
>> +}
>> diff --git a/fs/xfs/xfs_attr.h b/fs/xfs/xfs_attr.h
>> index 7901c3b..b48e31b 100644
>> --- a/fs/xfs/xfs_attr.h
>> +++ b/fs/xfs/xfs_attr.h
>> @@ -19,6 +19,8 @@
>>   #define	__XFS_ATTR_H__
>>   
>>   #include "libxfs/xfs_defer.h"
>> +#include "libxfs/xfs_da_format.h"
>> +#include "libxfs/xfs_format.h"
>>   
>>   struct xfs_inode;
>>   struct xfs_da_args;
>> @@ -183,5 +185,16 @@ int xfs_attr_set_deferred(struct xfs_inode *dp, struct xfs_defer_ops *dfops,
>>   int xfs_attr_remove_deferred(struct xfs_inode *dp, struct xfs_defer_ops *dfops,
>>   			    const unsigned char *name, unsigned int namelen,
>>   			    int flags);
>> -
>> +/*
>> + * Parent pointer attribute prototypes
>> + */
>> +int xfs_parent_create(struct xfs_trans *tp, struct xfs_inode *parent,
>> +		      struct xfs_inode *child, struct xfs_name *child_name,
>> +		      xfs_dir2_dataptr_t diroffset, struct xfs_defer_ops *dfops,
>> +		      xfs_fsblock_t *firstblock);
>> +int xfs_attr_set_first_parent(struct xfs_trans *tp, struct xfs_inode *ip,
>> +			      struct xfs_parent_name_rec *rec, int reclen,
>> +			      const char *value, int valuelen,
>> +			      struct xfs_defer_ops *dfops,
>> +			      xfs_fsblock_t *firstblock);
>>   #endif	/* __XFS_ATTR_H__ */
>> diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c
>> index f7986d8..4396561 100644
>> --- a/fs/xfs/xfs_inode.c
>> +++ b/fs/xfs/xfs_inode.c
>> @@ -1164,6 +1164,7 @@ xfs_create(
>>   	struct xfs_dquot	*pdqp = NULL;
>>   	struct xfs_trans_res	*tres;
>>   	uint			resblks;
>> +	xfs_dir2_dataptr_t	diroffset;
>>   
>>   	trace_xfs_create(dp, name);
>>   
>> @@ -1253,7 +1254,7 @@ xfs_create(
>>   	error = xfs_dir_createname(tp, dp, name, ip->i_ino,
>>   					&first_block, &dfops, resblks ?
>>   					resblks - XFS_IALLOC_SPACE_RES(mp) : 0,
>> -					NULL);
>> +					&diroffset);
>>   	if (error) {
>>   		ASSERT(error != -ENOSPC);
>>   		goto out_trans_cancel;
>> @@ -1272,6 +1273,19 @@ xfs_create(
>>   	}
>>   
>>   	/*
>> +	 * If we have parent pointers, we need to add the attribute containing
>> +	 * the parent information now. This must be done within the same
>> +	 * transaction the directory entry is created, while the new inode
>> +	 * contains nothing in the inode literal area.
>> +	 */
>> +	if (xfs_sb_version_hasparent(&mp->m_sb)) {
>> +		error = xfs_parent_create(tp, dp, ip, name, diroffset,
>> +					  &dfops, &first_block);
>> +		if (error)
>> +			goto out_bmap_cancel;
>> +	}
>> +
>> +	/*
>>   	 * If this is a synchronous mount, make sure that the
>>   	 * create transaction goes to disk before returning to
>>   	 * the user.
>> -- 
>> 2.7.4
>>
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
>> the body of a message tomajordomo@vger.kernel.org
>> More majordomo info athttp://vger.kernel.org/majordomo-info.html


^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH 13/17] xfs: add parent attributes to link
  2017-10-19 19:40   ` Darrick J. Wong
@ 2017-10-21  1:12     ` Allison Henderson
  0 siblings, 0 replies; 66+ messages in thread
From: Allison Henderson @ 2017-10-21  1:12 UTC (permalink / raw)
  Cc: linux-xfs

On 10/19/2017 12:40 PM, Darrick J. Wong wrote:

> On Wed, Oct 18, 2017 at 03:55:29PM -0700, Allison Henderson wrote:
>> From: Dave Chinner<dchinner@redhat.com>
> Needs a description of what w're doing and maybe why...
>
Sure, I think initially these had quick one line descriptions that
got formatted into the subject line when I created the patch set

>> [bfoster: rebase, use VFS inode fields, fix xfs_bmap_finish() usage]
>> [achender: rebased, changed __unint32_t to xfs_dir2_dataptr_t,
>> 	   fixed null pointer bugs]
>>
>> Signed-off-by: Dave Chinner<dchinner@redhat.com>
>> Signed-off-by: Allison Henderson<allison.henderson@oracle.com>
>> ---
>>   fs/xfs/libxfs/xfs_attr.c   | 20 +++++++++++++-
>>   fs/xfs/libxfs/xfs_parent.c | 43 ++++++++++++++++++++++++++++++
>>   fs/xfs/xfs_attr.h          | 10 +++++++
>>   fs/xfs/xfs_inode.c         | 66 ++++++++++++++++++++++++++++++++++++----------
>>   4 files changed, 124 insertions(+), 15 deletions(-)
>>
>> diff --git a/fs/xfs/libxfs/xfs_attr.c b/fs/xfs/libxfs/xfs_attr.c
>> index 8aad242..e7692ef 100644
>> --- a/fs/xfs/libxfs/xfs_attr.c
>> +++ b/fs/xfs/libxfs/xfs_attr.c
>> @@ -35,6 +35,7 @@
>>   #include "xfs_bmap_util.h"
>>   #include "xfs_bmap_btree.h"
>>   #include "xfs_attr.h"
>> +#include "xfs_attr_sf.h"
>>   #include "xfs_attr_leaf.h"
>>   #include "xfs_attr_remote.h"
>>   #include "xfs_error.h"
>> @@ -266,6 +267,23 @@ xfs_attr_set_first_parent(
>>   	return error;
>>   }
>>   
>> +int
>> +xfs_attr_set_parent(
>> +	struct xfs_trans	*tp,
>> +	struct xfs_inode	*ip,
>> +	struct xfs_parent_name_rec *rec,
>> +	int			reclen,
>> +	const char		*value,
>> +	int			valuelen,
>> +	struct xfs_defer_ops	*dfops,
>> +	xfs_fsblock_t		*firstblock)
>> +{
>> +	int                     flags = ATTR_PARENT;
>> +
>> +	return xfs_attr_set_deferred(ip, dfops, (char *)rec, reclen,
>> +				    (char *)value, valuelen, flags);
>> +}
>> +
>>   /*
>>    * set the attribute specified in @args. In the case of the parent attribute
>>    * being set, we do not want to roll the transaction on shortform-to-leaf
>> @@ -512,8 +530,8 @@ xfs_attr_set(
>>   	 */
>>   	xfs_trans_log_inode(args.trans, dp, XFS_ILOG_CORE);
>>   	error = xfs_trans_commit(args.trans);
>> -	xfs_iunlock(dp, XFS_ILOCK_EXCL);
>>   
>> +	xfs_iunlock(dp, XFS_ILOCK_EXCL);
>>   	return error;
>>   
>>   out_defer_cancel:
>> diff --git a/fs/xfs/libxfs/xfs_parent.c b/fs/xfs/libxfs/xfs_parent.c
>> index 88f7edc..0707336 100644
>> --- a/fs/xfs/libxfs/xfs_parent.c
>> +++ b/fs/xfs/libxfs/xfs_parent.c
>> @@ -96,3 +96,46 @@ xfs_parent_create(
>>   
>>   	return xfs_parent_create_nrec(tp, child, &nrec, dfops, firstblock);
>>   }
>> +
>> +static int
>> +xfs_parent_add_nrec(
>> +	struct xfs_trans	*tp,
>> +	struct xfs_inode	*child,
>> +	struct xfs_parent_name_irec *nrec,
>> +	struct xfs_defer_ops	*dfops,
>> +	xfs_fsblock_t		*firstblock)
>> +{
>> +	struct xfs_parent_name_rec rec;
>> +
>> +	rec.p_ino = cpu_to_be64(nrec->p_ino);
>> +	rec.p_gen = cpu_to_be32(nrec->p_gen);
>> +	rec.p_diroffset = cpu_to_be32(nrec->p_diroffset);
>> +
>> +	return xfs_attr_set_parent(tp, child, &rec, sizeof(rec),
>> +				   nrec->p_name, nrec->p_namelen,
>> +				   dfops, firstblock);
>> +}
>> +
>> +/*
>> + * Add a parent record to an inode with existing parent records.
>> + */
>> +int
>> +xfs_parent_add(
>> +	struct xfs_trans	*tp,
>> +	struct xfs_inode	*parent,
>> +	struct xfs_inode	*child,
>> +	struct xfs_name		*child_name,
>> +	uint32_t		diroffset,
>> +	struct xfs_defer_ops	*dfops,
>> +	xfs_fsblock_t		*firstblock)
>> +{
>> +	struct xfs_parent_name_irec nrec;
>> +
>> +	nrec.p_ino = parent->i_ino;
>> +	nrec.p_gen = VFS_I(parent)->i_generation;
>> +	nrec.p_diroffset = diroffset;
>> +	nrec.p_name = child_name->name;
>> +	nrec.p_namelen = child_name->len;
>> +
>> +	return xfs_parent_add_nrec(tp, child, &nrec, dfops, firstblock);
>> +}
>> diff --git a/fs/xfs/xfs_attr.h b/fs/xfs/xfs_attr.h
>> index b48e31b..acb6157 100644
>> --- a/fs/xfs/xfs_attr.h
>> +++ b/fs/xfs/xfs_attr.h
>> @@ -197,4 +197,14 @@ int xfs_attr_set_first_parent(struct xfs_trans *tp, struct xfs_inode *ip,
>>   			      const char *value, int valuelen,
>>   			      struct xfs_defer_ops *dfops,
>>   			      xfs_fsblock_t *firstblock);
>> +
>> +int xfs_parent_add(struct xfs_trans *tp, struct xfs_inode *parent,
>> +		   struct xfs_inode *child, struct xfs_name *child_name,
>> +		   xfs_dir2_dataptr_t diroffset, struct xfs_defer_ops *dfops,
>> +		   xfs_fsblock_t *firstblock);
>> +int xfs_attr_set_parent(struct xfs_trans *tp, struct xfs_inode *ip,
>> +			struct xfs_parent_name_rec *rec, int reclen,
>> +			const char *value, int valuelen,
>> +			struct xfs_defer_ops *dfops, xfs_fsblock_t *firstblock);
>> +
>>   #endif	/* __XFS_ATTR_H__ */
>> diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c
>> index 4396561..51b623b 100644
>> --- a/fs/xfs/xfs_inode.c
>> +++ b/fs/xfs/xfs_inode.c
>> @@ -1451,6 +1451,8 @@ xfs_link(
>>   	struct xfs_defer_ops	dfops;
>>   	xfs_fsblock_t           first_block;
>>   	int			resblks;
>> +	uint32_t		diroffset;
>> +	bool			first_parent = false;
>>   
>>   	trace_xfs_link(tdp, target_name);
>>   
>> @@ -1467,6 +1469,25 @@ xfs_link(
>>   	if (error)
>>   		goto std_return;
>>   
>> +	/*
>> +	 * If we have parent pointers and there is no attribute fork (i.e. we
>> +	 * are linking in a O_TMPFILE created inode) we need to add the
>> +	 * attribute fork to the inode. Because we may have an existing data
>> +	 * fork, we do this before we start the link transaction as adding an
>> +	 * attribute fork requires it's own transaction.
> Ok, so an inode that isn't pointed to by any directory will have zero
> parent link attributes and possibly not even an attr fork.  Got it.
>
> --D
>
>> +	 */
>> +	if (xfs_sb_version_hasparent(&mp->m_sb) && !xfs_inode_hasattr(sip)) {
>> +		int sf_size = sizeof(struct xfs_attr_sf_hdr) +
>> +				XFS_ATTR_SF_ENTSIZE_BYNAME(
>> +					sizeof(struct xfs_parent_name_rec),
>> +					target_name->len);
>> +		ASSERT(VFS_I(sip)->i_nlink == 0);
>> +		error = xfs_bmap_add_attrfork(sip, sf_size, 0);
>> +		if (error)
>> +			goto std_return;
>> +		first_parent = true;
>> +	}
>> +
>>   	resblks = XFS_LINK_SPACE_RES(mp, target_name->len);
>>   	error = xfs_trans_alloc(mp, &M_RES(mp)->tr_link, resblks, 0, 0, &tp);
>>   	if (error == -ENOSPC) {
>> @@ -1498,8 +1519,6 @@ xfs_link(
>>   			goto error_return;
>>   	}
>>   
>> -	xfs_defer_init(&dfops, &first_block);
>> -
>>   	/*
>>   	 * Handle initial link state of O_TMPFILE inode
>>   	 */
>> @@ -1509,36 +1528,55 @@ xfs_link(
>>   			goto error_return;
>>   	}
>>   
>> +	xfs_defer_init(&dfops, &first_block);
>>   	error = xfs_dir_createname(tp, tdp, target_name, sip->i_ino,
>> -				   &first_block, &dfops, resblks, NULL);
>> +				   &first_block, &dfops, resblks, &diroffset);
>>   	if (error)
>> -		goto error_return;
>> +		goto out_defer_cancel;
>>   	xfs_trans_ichgtime(tp, tdp, XFS_ICHGTIME_MOD | XFS_ICHGTIME_CHG);
>>   	xfs_trans_log_inode(tp, tdp, XFS_ILOG_CORE);
>>   
>>   	error = xfs_bumplink(tp, sip);
>>   	if (error)
>> -		goto error_return;
>> +		goto out_defer_cancel;
>>   
>>   	/*
>> -	 * If this is a synchronous mount, make sure that the
>> -	 * link transaction goes to disk before returning to
>> -	 * the user.
>> +	 * If we have parent pointers, we now need to add the parent record to
>> +	 * the attribute fork of the inode. If this is the initial parent
>> +	 * atribute, we need to create it correctly, otherwise we can just add
>> +	 * the parent to the inode.
>> +	 */
>> +	if (xfs_sb_version_hasparent(&mp->m_sb)) {
>> +		if (first_parent)
>> +			error = xfs_parent_create(tp, tdp, sip, target_name,
>> +						  diroffset, &dfops,
>> +						  &first_block);
>> +		else
>> +			error = xfs_parent_add(tp, tdp, sip, target_name,
>> +					       diroffset, &dfops,
>> +					       &first_block);
>> +		if (error)
>> +			goto out_defer_cancel;
>> +	}
>> +
>> +	/*
>> +	 * If this is a synchronous mount, make sure that the link transaction
>> +	 * goes to disk before returning to the user.
>>   	 */
>>   	if (mp->m_flags & (XFS_MOUNT_WSYNC|XFS_MOUNT_DIRSYNC))
>>   		xfs_trans_set_sync(tp);
>>   
>>   	error = xfs_defer_finish(&tp, &dfops);
>> -	if (error) {
>> -		xfs_defer_cancel(&dfops);
>> -		goto error_return;
>> -	}
>> +	if (error)
>> +		goto out_defer_cancel;
>>   
>>   	return xfs_trans_commit(tp);
>>   
>> - error_return:
>> +out_defer_cancel:
>> +	xfs_defer_cancel(&dfops);
>> +error_return:
>>   	xfs_trans_cancel(tp);
>> - std_return:
>> +std_return:
>>   	return error;
>>   }
>>   
>> -- 
>> 2.7.4
>>
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
>> the body of a message tomajordomo@vger.kernel.org
>> More majordomo info athttp://vger.kernel.org/majordomo-info.html


^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH 14/17] xfs: remove parent pointers in unlink
  2017-10-19 19:43   ` Darrick J. Wong
@ 2017-10-21  1:12     ` Allison Henderson
  0 siblings, 0 replies; 66+ messages in thread
From: Allison Henderson @ 2017-10-21  1:12 UTC (permalink / raw)
  Cc: linux-xfs

On 10/19/2017 12:43 PM, Darrick J. Wong wrote:
> On Wed, Oct 18, 2017 at 03:55:30PM -0700, Allison Henderson wrote:
>> From: Dave Chinner<dchinner@redhat.com>
>>
>> [bfoster: rebase, use VFS inode generation]
>> [achender: rebased, changed __unint32_t to xfs_dir2_dataptr_t
>> 	   implemented xfs_attr_remove_parent]
>>
>> Signed-off-by: Dave Chinner<dchinner@redhat.com>
>> Signed-off-by: Allison Henderson<allison.henderson@oracle.com>
>> ---
>>   fs/xfs/libxfs/xfs_attr.c   | 15 +++++++++++++++
>>   fs/xfs/libxfs/xfs_parent.c | 22 ++++++++++++++++++++++
>>   fs/xfs/xfs_attr.h          |  7 +++++++
>>   fs/xfs/xfs_inode.c         | 10 +++++++++-
>>   fs/xfs/xfs_qm.c            |  2 +-
>>   fs/xfs/xfs_qm.h            |  1 +
>>   6 files changed, 55 insertions(+), 2 deletions(-)
>>
>> diff --git a/fs/xfs/libxfs/xfs_attr.c b/fs/xfs/libxfs/xfs_attr.c
>> index e7692ef..7547eb7 100644
>> --- a/fs/xfs/libxfs/xfs_attr.c
>> +++ b/fs/xfs/libxfs/xfs_attr.c
>> @@ -42,6 +42,7 @@
>>   #include "xfs_quota.h"
>>   #include "xfs_trans_space.h"
>>   #include "xfs_trace.h"
>> +#include "xfs_qm.h"
>>   
>>   /*
>>    * xfs_attr.c
>> @@ -571,6 +572,20 @@ xfs_attr_set_deferred(
>>   	return 0;
>>   }
>>   
>> +int
>> +xfs_attr_remove_parent(
>> +	struct xfs_trans		*tp,
>> +	struct xfs_inode		*dp,
>> +	struct xfs_parent_name_rec	*rec,
>> +	int				reclen,
>> +	struct xfs_defer_ops		*dfops,
>> +	xfs_fsblock_t			*firstblock)
>> +{
>> +	int flags = ATTR_PARENT;
>> +
>> +	return xfs_attr_remove_deferred(dp, dfops, (char *) rec, reclen, flags);
> return xfs_attr_remove_deferred(dp, dfops, (char *) rec, reclen, ATTR_PARENT); ?
>
> What do you think of changing these prototypes to take (void *) so you
> don't have to cast so much?  The name and value are (more or less) a
> dumb array of bytes to the filesystem.
>
> Also kinda wondering if the corresponding routines in xfs_parent.c could
> just call the xfs_attr functions directly instead of jumping through
> single line helpers...
Ok, sure unless anyone is particularly opinionated about them, I will see
if I can collapse them down and simplify some of it
>> +}
>> +
>>   /*
>>    * Generic handler routine to remove a name from an attribute list.
>>    * Transitions attribute list from Btree to shortform as necessary.
>> diff --git a/fs/xfs/libxfs/xfs_parent.c b/fs/xfs/libxfs/xfs_parent.c
>> index 0707336..ca695c4 100644
>> --- a/fs/xfs/libxfs/xfs_parent.c
>> +++ b/fs/xfs/libxfs/xfs_parent.c
>> @@ -139,3 +139,25 @@ xfs_parent_add(
>>   
>>   	return xfs_parent_add_nrec(tp, child, &nrec, dfops, firstblock);
>>   }
>> +
>> +/*
>> + * Remove a parent record from a child inode.
>> + */
>> +int
>> +xfs_parent_remove(
>> +	struct xfs_trans	*tp,
>> +	struct xfs_inode	*parent,
>> +	struct xfs_inode	*child,
>> +	xfs_dir2_dataptr_t	diroffset,
>> +	struct xfs_defer_ops	*dfops,
>> +	xfs_fsblock_t		*firstblock)
>> +{
>> +	struct xfs_parent_name_rec rec;
>> +
>> +	rec.p_ino = cpu_to_be64(parent->i_ino);
>> +	rec.p_gen = cpu_to_be32(VFS_I(parent)->i_generation);
>> +	rec.p_diroffset = cpu_to_be32(diroffset);
>> +
>> +	return xfs_attr_remove_parent(tp, child, &rec, sizeof(rec),
>> +				      dfops, firstblock);
>> +}
>> diff --git a/fs/xfs/xfs_attr.h b/fs/xfs/xfs_attr.h
>> index acb6157..7a3bf8b 100644
>> --- a/fs/xfs/xfs_attr.h
>> +++ b/fs/xfs/xfs_attr.h
>> @@ -207,4 +207,11 @@ int xfs_attr_set_parent(struct xfs_trans *tp, struct xfs_inode *ip,
>>   			const char *value, int valuelen,
>>   			struct xfs_defer_ops *dfops, xfs_fsblock_t *firstblock);
>>   
>> +int xfs_parent_remove(struct xfs_trans *tp, struct xfs_inode *parent,
>> +		      struct xfs_inode *child, xfs_dir2_dataptr_t diroffset,
>> +		      struct xfs_defer_ops *dfops, xfs_fsblock_t *firstblock);
>> +int xfs_attr_remove_parent(struct xfs_trans *tp, struct xfs_inode *ip,
>> +			struct xfs_parent_name_rec *rec, int reclen,
>> +			struct xfs_defer_ops *dfops, xfs_fsblock_t *firstblock);
>> +
>>   #endif	/* __XFS_ATTR_H__ */
>> diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c
>> index 51b623b..a360c3d 100644
>> --- a/fs/xfs/xfs_inode.c
>> +++ b/fs/xfs/xfs_inode.c
>> @@ -2612,6 +2612,7 @@ xfs_remove(
>>   	struct xfs_defer_ops	dfops;
>>   	xfs_fsblock_t           first_block;
>>   	uint			resblks;
>> +	uint32_t		dir_offset;
>>   
>>   	trace_xfs_remove(dp, name);
>>   
>> @@ -2692,12 +2693,19 @@ xfs_remove(
>>   
>>   	xfs_defer_init(&dfops, &first_block);
>>   	error = xfs_dir_removename(tp, dp, name, ip->i_ino, &first_block,
>> -				   &dfops, resblks, NULL);
>> +				   &dfops, resblks, &dir_offset);
>>   	if (error) {
>>   		ASSERT(error != -ENOENT);
>>   		goto out_bmap_cancel;
>>   	}
>>   
>> +	if (xfs_sb_version_hasparent(&mp->m_sb)) {
>> +		error = xfs_parent_remove(tp, dp, ip, dir_offset, &dfops,
>> +					  &first_block);
>> +		if (error)
>> +			goto out_bmap_cancel;
>> +	}
>> +
>>   	/*
>>   	 * If this is a synchronous mount, make sure that the
>>   	 * remove transaction goes to disk before returning to
>> diff --git a/fs/xfs/xfs_qm.c b/fs/xfs/xfs_qm.c
>> index 010a13a..a047f0f 100644
>> --- a/fs/xfs/xfs_qm.c
>> +++ b/fs/xfs/xfs_qm.c
>> @@ -307,7 +307,7 @@ xfs_qm_dqattach_one(
>>   	return 0;
>>   }
>>   
>> -static bool
>> +bool
>>   xfs_qm_need_dqattach(
>>   	struct xfs_inode	*ip)
>>   {
>> diff --git a/fs/xfs/xfs_qm.h b/fs/xfs/xfs_qm.h
>> index 2975a82..9976369 100644
>> --- a/fs/xfs/xfs_qm.h
>> +++ b/fs/xfs/xfs_qm.h
>> @@ -176,6 +176,7 @@ extern int		xfs_qm_scall_setqlim(struct xfs_mount *, xfs_dqid_t, uint,
>>   					struct qc_dqblk *);
>>   extern int		xfs_qm_scall_quotaon(struct xfs_mount *, uint);
>>   extern int		xfs_qm_scall_quotaoff(struct xfs_mount *, uint);
>> +extern bool		xfs_qm_need_dqattach(struct xfs_inode *ip);
> Huh?
>
> --D
Oops, this was left over from an earlier version of the set.  I will 
clean that out.  Thx!
>>   
>>   static inline struct xfs_def_quota *
>>   xfs_get_defquota(struct xfs_dquot *dqp, struct xfs_quotainfo *qi)
>> -- 
>> 2.7.4
>>
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
>> the body of a message tomajordomo@vger.kernel.org
>> More majordomo info athttps://urldefense.proofpoint.com/v2/url?u=http-3A__vger.kernel.org_majordomo-2Dinfo.html&d=DwIBAg&c=RoP1YumCXCgaWHvlZYR8PQcxBKCX5YTpkKY057SbK10&r=XFp4B05bcXkJ0dhYaFjd3F8telP01COkBp9cI7mKLb4&m=AP3xXRbI1ezDBf_ru9Ay6jkvjAH2QxSpsltqVIvaZq4&s=Y3VZDyZoeTDKYN7na4r13xYv3nG75lVXDukvn8LIDEE&e=  
> --
> To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> the body of a message tomajordomo@vger.kernel.org
> More majordomo info athttps://urldefense.proofpoint.com/v2/url?u=http-3A__vger.kernel.org_majordomo-2Dinfo.html&d=DwIBAg&c=RoP1YumCXCgaWHvlZYR8PQcxBKCX5YTpkKY057SbK10&r=XFp4B05bcXkJ0dhYaFjd3F8telP01COkBp9cI7mKLb4&m=AP3xXRbI1ezDBf_ru9Ay6jkvjAH2QxSpsltqVIvaZq4&s=Y3VZDyZoeTDKYN7na4r13xYv3nG75lVXDukvn8LIDEE&e=  


^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH 15/17] xfs_bmap_add_attrfork(): re-add error handling from set_attrforkoff() call
  2017-10-19 19:43   ` Darrick J. Wong
@ 2017-10-21  1:13     ` Allison Henderson
  0 siblings, 0 replies; 66+ messages in thread
From: Allison Henderson @ 2017-10-21  1:13 UTC (permalink / raw)
  Cc: linux-xfs

On 10/19/2017 12:43 PM, Darrick J. Wong wrote:
> On Wed, Oct 18, 2017 at 03:55:31PM -0700, Allison Henderson wrote:
>> From: Brian Foster<bfoster@redhat.com>
>>
>> - fix for "xfs: parent pointer attribute creation"
>>
>> [achender: rebased]
> Please fold this into that patch.
>
> --D
>
Sure, will do
>> Signed-off-by: Brian Foster<bfoster@redhat.com>
>> Signed-off-by: Allison Henderson<allison.henderson@oracle.com>
>> ---
>>   fs/xfs/libxfs/xfs_bmap.c | 4 +++-
>>   1 file changed, 3 insertions(+), 1 deletion(-)
>>
>> diff --git a/fs/xfs/libxfs/xfs_bmap.c b/fs/xfs/libxfs/xfs_bmap.c
>> index 7ee98be..a631fe1 100644
>> --- a/fs/xfs/libxfs/xfs_bmap.c
>> +++ b/fs/xfs/libxfs/xfs_bmap.c
>> @@ -1149,7 +1149,9 @@ xfs_bmap_add_attrfork(
>>   	xfs_trans_ijoin(tp, ip, 0);
>>   	xfs_trans_log_inode(tp, ip, XFS_ILOG_CORE);
>>   
>> -	xfs_bmap_set_attrforkoff(ip, size, &version);
>> +	error = xfs_bmap_set_attrforkoff(ip, size, &version);
>> +	if (error)
>> +		goto trans_cancel;
>>   
>>   	ASSERT(ip->i_afp == NULL);
>>   	ip->i_afp = kmem_zone_zalloc(xfs_ifork_zone, KM_SLEEP);
>> -- 
>> 2.7.4
>>
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
>> the body of a message tomajordomo@vger.kernel.org
>> More majordomo info athttp://vger.kernel.org/majordomo-info.html


^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH 17/17] Add the parent pointer support to the superblock version 5.
  2017-10-19 19:45   ` Darrick J. Wong
@ 2017-10-21  1:13     ` Allison Henderson
  0 siblings, 0 replies; 66+ messages in thread
From: Allison Henderson @ 2017-10-21  1:13 UTC (permalink / raw)
  Cc: linux-xfs, Dave Chinner

On 10/19/2017 12:45 PM, Darrick J. Wong wrote:
> On Wed, Oct 18, 2017 at 03:55:33PM -0700, Allison Henderson wrote:
>> [dchinner: forward ported and cleaned up]
>> [achender: rebased and added parent pointer attribute to
>>             compatible attributes mask]
>>
>> Signed-off-by: Mark Tinguely<tinguely@sgi.com>
>> Signed-off-by: Dave Chinner<dchinner@redhat.com>
>> Signed-off-by: Allison Henderson<allison.henderson@oracle.com>
>> ---
>> v2: remove unrelated type clean up in xfs_format.h
>>
>> Signed-off-by: Allison Henderson<allison.henderson@oracle.com>
>> ---
>>   fs/xfs/libxfs/xfs_format.h | 7 +++++--
>>   fs/xfs/libxfs/xfs_fs.h     | 1 +
>>   fs/xfs/xfs_fsops.c         | 4 +++-
>>   3 files changed, 9 insertions(+), 3 deletions(-)
>>
>> diff --git a/fs/xfs/libxfs/xfs_format.h b/fs/xfs/libxfs/xfs_format.h
>> index 121862a..f3e3132 100644
>> --- a/fs/xfs/libxfs/xfs_format.h
>> +++ b/fs/xfs/libxfs/xfs_format.h
>> @@ -459,10 +459,12 @@ xfs_sb_has_compat_feature(
>>   #define XFS_SB_FEAT_RO_COMPAT_FINOBT   (1 << 0)		/* free inode btree */
>>   #define XFS_SB_FEAT_RO_COMPAT_RMAPBT   (1 << 1)		/* reverse map btree */
>>   #define XFS_SB_FEAT_RO_COMPAT_REFLINK  (1 << 2)		/* reflinked files */
>> +#define XFS_SB_FEAT_RO_COMPAT_PARENT	(1 << 3)	/* parent inode ptr */
>>   #define XFS_SB_FEAT_RO_COMPAT_ALL \
>>   		(XFS_SB_FEAT_RO_COMPAT_FINOBT | \
>>   		 XFS_SB_FEAT_RO_COMPAT_RMAPBT | \
>> -		 XFS_SB_FEAT_RO_COMPAT_REFLINK)
>> +		 XFS_SB_FEAT_RO_COMPAT_REFLINK| \
>> +		 XFS_SB_FEAT_RO_COMPAT_PARENT)
>>   #define XFS_SB_FEAT_RO_COMPAT_UNKNOWN	~XFS_SB_FEAT_RO_COMPAT_ALL
>>   static inline bool
>>   xfs_sb_has_ro_compat_feature(
>> @@ -558,7 +560,8 @@ static inline bool xfs_sb_version_hasreflink(struct xfs_sb *sbp)
>>   
>>   static inline bool xfs_sb_version_hasparent(struct xfs_sb *sbp)
>>   {
>> -	return false; /* We'll enable this at the end of the set */
>> +	return (XFS_SB_VERSION_NUM(sbp) == XFS_SB_VERSION_5 &&
>> +		(sbp->sb_features_ro_compat & XFS_SB_FEAT_RO_COMPAT_PARENT));
>>   }
>>   
>>   /*
>> diff --git a/fs/xfs/libxfs/xfs_fs.h b/fs/xfs/libxfs/xfs_fs.h
>> index 8c61f21..b8108f8 100644
>> --- a/fs/xfs/libxfs/xfs_fs.h
>> +++ b/fs/xfs/libxfs/xfs_fs.h
>> @@ -222,6 +222,7 @@ typedef struct xfs_fsop_resblks {
>>   #define XFS_FSOP_GEOM_FLAGS_SPINODES	0x40000	/* sparse inode chunks	*/
>>   #define XFS_FSOP_GEOM_FLAGS_RMAPBT	0x80000	/* reverse mapping btree */
>>   #define XFS_FSOP_GEOM_FLAGS_REFLINK	0x100000 /* files can share blocks */
>> +#define XFS_FSOP_GEOM_FLAGS_PARENT	0x200000 /* parent pointers */
>>   
>>   /*
>>    * Minimum and maximum sizes need for growth checks.
>> diff --git a/fs/xfs/xfs_fsops.c b/fs/xfs/xfs_fsops.c
>> index 8f22fc5..9a0ce52 100644
>> --- a/fs/xfs/xfs_fsops.c
>> +++ b/fs/xfs/xfs_fsops.c
>> @@ -111,7 +111,9 @@ xfs_fs_geometry(
>>   			(xfs_sb_version_hasrmapbt(&mp->m_sb) ?
>>   				XFS_FSOP_GEOM_FLAGS_RMAPBT : 0) |
>>   			(xfs_sb_version_hasreflink(&mp->m_sb) ?
>> -				XFS_FSOP_GEOM_FLAGS_REFLINK : 0);
>> +				XFS_FSOP_GEOM_FLAGS_REFLINK : 0) |
>> +			(xfs_sb_version_hasparent(&mp->m_sb) ?
>> +				XFS_FSOP_GEOM_FLAGS_PARENT : 0);
>>   		geo->logsectsize = xfs_sb_version_hassector(&mp->m_sb) ?
>>   				mp->m_sb.sb_logsectsize : BBSIZE;
>>   		geo->rtsectsize = mp->m_sb.sb_blocksize;
> xfs_fs_fill_super ought to have a warning about parent pointers being
> experimental.
>
> --D
Sure, I will add a warning blurb in there
>> -- 
>> 2.7.4
>>
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
>> the body of a message tomajordomo@vger.kernel.org
>> More majordomo info athttp://vger.kernel.org/majordomo-info.html


^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH 01/17] Add helper functions xfs_attr_set_args and xfs_attr_remove_args
  2017-10-19 20:03   ` Darrick J. Wong
@ 2017-10-21  1:14     ` Allison Henderson
  0 siblings, 0 replies; 66+ messages in thread
From: Allison Henderson @ 2017-10-21  1:14 UTC (permalink / raw)
  Cc: linux-xfs

On 10/19/2017 01:03 PM, Darrick J. Wong wrote:

> On Wed, Oct 18, 2017 at 03:55:17PM -0700, Allison Henderson wrote:
>> These sub-routines set or remove the attributes specified in
>> @args. We will use this later for setting parent pointers as a
>> deferred attribute operation.
>>
>> Signed-off-by: Allison Henderson<allison.henderson@oracle.com>
>> ---
>>   fs/xfs/libxfs/xfs_attr.c | 322 +++++++++++++++++++++++++++--------------------
>>   fs/xfs/xfs_attr.h        |   2 +
>>   2 files changed, 189 insertions(+), 135 deletions(-)
>>
>> diff --git a/fs/xfs/libxfs/xfs_attr.c b/fs/xfs/libxfs/xfs_attr.c
>> index 6249c92..b00ec1f 100644
>> --- a/fs/xfs/libxfs/xfs_attr.c
>> +++ b/fs/xfs/libxfs/xfs_attr.c
>> @@ -203,6 +203,185 @@ xfs_attr_calc_size(
>>   	return nblks;
>>   }
>>   
>> +/*
>> + * set the attribute specified in @args. In the case of the parent attribute
>> + * being set, we do not want to roll the transaction on shortform-to-leaf
>> + * conversion, as the attribute must be added in the same transaction as the
>> + * parent directory modifications. Hence @roll_trans needs to be set
>> + * appropriately to control whether the transaction is committed during this
>> + * function.
> Hmm... shouldn't the deferred attribute set code take care of all the
> conversions/attr fork expansions/whatever is necessary to cram in the
> parent pointer?  So in theory all the parent pointer updates should look
> like this:
>
> xfs_defer_init(&dfops...);
> xfs_some_name_creating_operation(tp, ...);
>
> if (hasparent) {
> 	xfs_parent_set(tp, ip);
> }
> xfs_defer_finish(&tp, &dfops);
> xfs_trans_commit(tp);
>
> xfs_parent_set() would then be:
>
> if (fits in inode) {
> 	xfs_attr_set_sf(ip, key, value);
> 	xfs_log_attr_area();
> 	return;
> }
>
> xfs_attr_do_conversions_if_needed(dfops);
> xfs_attr_set(dfops);
>
> ...and if there's a really good reason to try to cram things in, we can
> add those later?  As I scanned the series, that was what kept coming up
> in my head -- just tell the xfs_parent.c code to set a parent pointer;
> it can figure out if there's sufficient space to put it directly into
> the inode and log that, or we need something else that actually requires
> the deferred ops mechanism then do that.
>
> (Maybe I'm just fixating on xfs_parent_create...)

Ok, I think may have caused some confusion by not revising the comments
for these functions.  xfs_attr_remove_args and xfs_attr_set_args are
sort of subsets of code that came out of xfs_attr_remove and xfs_attr_set.
In the first revision of the set, the link/unlink/rename operations called
them directly to set up the parent pointers.  Now these two functions
are called through xfs_trans_attr.

But you've still made very good points because I forgot that xfs_create
path uses xfs_parent_create not xfs_parent_add.  So I think the pseudo
code you are proposing makes sense.  I will see if I can revise the
create path a bit.  Thanks for the catch!
> Higher level questions about robustness: if we try to set a parent
> ptr and the attr name already exists, do we error out?  If we try to
> remove a parent ptr and there's no attr name, do we error out?
Erroring out seems logical sense these these would not be normal
conditions?

> I think
> you've enough here to start thinking what changes need to be made to
> xfs_repair to validate & fix the parent pointers, and what test cases
> (and possibly ioctls) will need to be constructed to verify that we
> actually get the parent pointers we want.
>
> --D
Right, I will put together some basic add/remove/rename tests to
add to xfstests.

Thank you for the very thorough review!!

>> + */
>> +int
>> +xfs_attr_set_args(
>> +	struct xfs_da_args	*args,
>> +	int			flags,
>> +	bool			roll_trans)
>> +{
>> +	struct xfs_inode	*dp = args->dp;
>> +	struct xfs_mount        *mp = dp->i_mount;
>> +	struct xfs_trans_res    tres;
>> +	int			rsvd = 0;
>> +	int			error = 0;
>> +
>> +	tres.tr_logres = M_RES(mp)->tr_attrsetm.tr_logres +
>> +			 M_RES(mp)->tr_attrsetrt.tr_logres * args->total;
>> +	tres.tr_logcount = XFS_ATTRSET_LOG_COUNT;
>> +	tres.tr_logflags = XFS_TRANS_PERM_LOG_RES;
>> +
>> +	/*
>> +	 * Root fork attributes can use reserved data blocks for this
>> +	 * operation if necessary
>> +	 */
>> +	error = xfs_trans_alloc(mp, &tres, args->total, 0,
>> +				rsvd ? XFS_TRANS_RESERVE : 0, &args->trans);
>> +	if (error)
>> +		goto out;
>> +
>> +	error = xfs_trans_reserve_quota_nblks(args->trans, dp, args->total, 0,
>> +					      rsvd ? XFS_QMOPT_RES_REGBLKS |
>> +						     XFS_QMOPT_FORCE_RES :
>> +						     XFS_QMOPT_RES_REGBLKS);
>> +	if (error)
>> +		goto out;
>> +
>> +	xfs_trans_ijoin(args->trans, dp, 0);
>> +	/*
>> +	 * If the attribute list is non-existent or a shortform list,
>> +	 * upgrade it to a single-leaf-block attribute list.
>> +	 */
>> +	if (dp->i_d.di_aformat == XFS_DINODE_FMT_LOCAL ||
>> +	    (dp->i_d.di_aformat == XFS_DINODE_FMT_EXTENTS &&
>> +	     dp->i_d.di_anextents == 0)) {
>> +
>> +		/*
>> +		 * Build initial attribute list (if required).
>> +		 */
>> +		if (dp->i_d.di_aformat == XFS_DINODE_FMT_EXTENTS)
>> +			xfs_attr_shortform_create(args);
>> +
>> +		/*
>> +		 * Try to add the attr to the attribute list in the inode.
>> +		 */
>> +		error = xfs_attr_shortform_addname(args);
>> +		if (error != -ENOSPC) {
>> +			ASSERT(args->trans);
>> +			if (!error && (flags & ATTR_KERNOTIME) == 0)
>> +				xfs_trans_ichgtime(args->trans, dp,
>> +						   XFS_ICHGTIME_CHG);
>> +			goto out;
>> +		}
>> +
>> +		/*
>> +		 * It won't fit in the shortform, transform to a leaf block.
>> +		 * GROT: another possible req'mt for a double-split btree op.
>> +		 */
>> +		error = xfs_attr_shortform_to_leaf(args);
>> +		if (error)
>> +			goto out;
>> +		xfs_defer_ijoin(args->dfops, dp);
>> +		if (roll_trans) {
>> +			error = xfs_defer_finish(&args->trans, args->dfops);
>> +			if (error) {
>> +				args->trans = NULL;
>> +				goto out;
>> +			}
>> +
>> +			/*
>> +			 * Commit the leaf transformation.  We'll need another
>> +			 * (linked) transaction to add the new attribute to the
>> +			 * leaf.
>> +			 */
>> +			error = xfs_trans_roll_inode(&args->trans, dp);
>> +			if (error)
>> +				goto out;
>> +		}
>> +	}
>> +
>> +	if (xfs_bmap_one_block(dp, XFS_ATTR_FORK))
>> +		error = xfs_attr_leaf_addname(args);
>> +	else
>> +		error = xfs_attr_node_addname(args);
>> +	if (error)
>> +		goto out;
>> +
>> +	if ((flags & ATTR_KERNOTIME) == 0)
>> +		xfs_trans_ichgtime(args->trans, dp, XFS_ICHGTIME_CHG);
>> +
>> +	xfs_trans_log_inode(args->trans, dp, XFS_ILOG_CORE);
>> +out:
>> +	return error;
>> +}
>> +
>> +/*
>> + * Remove the attribute specified in @args.
>> + */
>> +int
>> +xfs_attr_remove_args(
>> +	struct xfs_da_args      *args,
>> +	int			flags)
>> +{
>> +	struct xfs_inode	*dp = args->dp;
>> +	struct xfs_mount	*mp = dp->i_mount;
>> +	int			error;
>> +	int                     rsvd = 0;
>> +
>> +	error = xfs_qm_dqattach_locked(dp, 0);
>> +	if (error)
>> +		return error;
>> +
>> +	/*
>> +	 * Root fork attributes can use reserved data blocks for this
>> +	 * operation if necessary
>> +	 */
>> +	if (flags & ATTR_ROOT)
>> +		rsvd = XFS_TRANS_RESERVE;
>> +	error = xfs_trans_alloc(mp, &M_RES(mp)->tr_attrrm,
>> +		XFS_ATTRRM_SPACE_RES(mp), 0, rsvd, &args->trans);
>> +
>> +	if (error)
>> +		goto out;
>> +
>> +	/*
>> +	 * No need to make quota reservations here. We expect to release some
>> +	 * blocks not allocate in the common case.
>> +	 */
>> +	xfs_trans_ijoin(args->trans, dp, 0);
>> +
>> +	if (!xfs_inode_hasattr(dp)) {
>> +		error = -ENOATTR;
>> +	} else if (dp->i_d.di_aformat == XFS_DINODE_FMT_LOCAL) {
>> +		ASSERT(dp->i_afp->if_flags & XFS_IFINLINE);
>> +		error = xfs_attr_shortform_remove(args);
>> +	} else if (xfs_bmap_one_block(dp, XFS_ATTR_FORK)) {
>> +		error = xfs_attr_leaf_removename(args);
>> +	} else {
>> +		error = xfs_attr_node_removename(args);
>> +	}
>> +
>> +	if (error)
>> +		goto out;
>> +
>> +	/*
>> +	 * If this is a synchronous mount, make sure that the
>> +	 * transaction goes to disk before returning to the user.
>> +	 */
>> +	if (mp->m_flags & XFS_MOUNT_WSYNC)
>> +		xfs_trans_set_sync(args->trans);
>> +
>> +	if ((flags & ATTR_KERNOTIME) == 0)
>> +		xfs_trans_ichgtime(args->trans, dp, XFS_ICHGTIME_CHG);
>> +
>> +	xfs_trans_log_inode(args->trans, dp, XFS_ILOG_CORE);
>> +
>> +	return error;
>> +
>> +out:
>> +	if (args->trans)
>> +		xfs_trans_cancel(args->trans);
>> +
>> +	return error;
>> +}
>> +
>>   int
>>   xfs_attr_set(
>>   	struct xfs_inode	*dp,
>> @@ -214,10 +393,9 @@ xfs_attr_set(
>>   	struct xfs_mount	*mp = dp->i_mount;
>>   	struct xfs_da_args	args;
>>   	struct xfs_defer_ops	dfops;
>> -	struct xfs_trans_res	tres;
>>   	xfs_fsblock_t		firstblock;
>>   	int			rsvd = (flags & ATTR_ROOT) != 0;
>> -	int			error, err2, local;
>> +	int			error, local;
>>   
>>   	XFS_STATS_INC(mp, xs_attr_set);
>>   
>> @@ -252,106 +430,11 @@ xfs_attr_set(
>>   			return error;
>>   	}
>>   
>> -	tres.tr_logres = M_RES(mp)->tr_attrsetm.tr_logres +
>> -			 M_RES(mp)->tr_attrsetrt.tr_logres * args.total;
>> -	tres.tr_logcount = XFS_ATTRSET_LOG_COUNT;
>> -	tres.tr_logflags = XFS_TRANS_PERM_LOG_RES;
>> -
>> -	/*
>> -	 * Root fork attributes can use reserved data blocks for this
>> -	 * operation if necessary
>> -	 */
>> -	error = xfs_trans_alloc(mp, &tres, args.total, 0,
>> -			rsvd ? XFS_TRANS_RESERVE : 0, &args.trans);
>> -	if (error)
>> -		return error;
>> -
>>   	xfs_ilock(dp, XFS_ILOCK_EXCL);
>> -	error = xfs_trans_reserve_quota_nblks(args.trans, dp, args.total, 0,
>> -				rsvd ? XFS_QMOPT_RES_REGBLKS | XFS_QMOPT_FORCE_RES :
>> -				       XFS_QMOPT_RES_REGBLKS);
>> -	if (error) {
>> -		xfs_iunlock(dp, XFS_ILOCK_EXCL);
>> -		xfs_trans_cancel(args.trans);
>> -		return error;
>> -	}
>> -
>> -	xfs_trans_ijoin(args.trans, dp, 0);
>> -
>> -	/*
>> -	 * If the attribute list is non-existent or a shortform list,
>> -	 * upgrade it to a single-leaf-block attribute list.
>> -	 */
>> -	if (dp->i_d.di_aformat == XFS_DINODE_FMT_LOCAL ||
>> -	    (dp->i_d.di_aformat == XFS_DINODE_FMT_EXTENTS &&
>> -	     dp->i_d.di_anextents == 0)) {
>> -
>> -		/*
>> -		 * Build initial attribute list (if required).
>> -		 */
>> -		if (dp->i_d.di_aformat == XFS_DINODE_FMT_EXTENTS)
>> -			xfs_attr_shortform_create(&args);
>> -
>> -		/*
>> -		 * Try to add the attr to the attribute list in
>> -		 * the inode.
>> -		 */
>> -		error = xfs_attr_shortform_addname(&args);
>> -		if (error != -ENOSPC) {
>> -			/*
>> -			 * Commit the shortform mods, and we're done.
>> -			 * NOTE: this is also the error path (EEXIST, etc).
>> -			 */
>> -			ASSERT(args.trans != NULL);
>> -
>> -			/*
>> -			 * If this is a synchronous mount, make sure that
>> -			 * the transaction goes to disk before returning
>> -			 * to the user.
>> -			 */
>> -			if (mp->m_flags & XFS_MOUNT_WSYNC)
>> -				xfs_trans_set_sync(args.trans);
>> -
>> -			if (!error && (flags & ATTR_KERNOTIME) == 0) {
>> -				xfs_trans_ichgtime(args.trans, dp,
>> -							XFS_ICHGTIME_CHG);
>> -			}
>> -			err2 = xfs_trans_commit(args.trans);
>> -			xfs_iunlock(dp, XFS_ILOCK_EXCL);
>> -
>> -			return error ? error : err2;
>> -		}
>> -
>> -		/*
>> -		 * It won't fit in the shortform, transform to a leaf block.
>> -		 * GROT: another possible req'mt for a double-split btree op.
>> -		 */
>> -		xfs_defer_init(args.dfops, args.firstblock);
>> -		error = xfs_attr_shortform_to_leaf(&args);
>> -		if (error)
>> -			goto out_defer_cancel;
>> -		xfs_defer_ijoin(args.dfops, dp);
>> -		error = xfs_defer_finish(&args.trans, args.dfops);
>> -		if (error)
>> -			goto out_defer_cancel;
>> -
>> -		/*
>> -		 * Commit the leaf transformation.  We'll need another (linked)
>> -		 * transaction to add the new attribute to the leaf.
>> -		 */
>> -
>> -		error = xfs_trans_roll_inode(&args.trans, dp);
>> -		if (error)
>> -			goto out;
>> -
>> -	}
>> -
>> -	if (xfs_bmap_one_block(dp, XFS_ATTR_FORK))
>> -		error = xfs_attr_leaf_addname(&args);
>> -	else
>> -		error = xfs_attr_node_addname(&args);
>> +	xfs_defer_init(args.dfops, args.firstblock);
>> +	error = xfs_attr_set_args(&args, flags, true);
>>   	if (error)
>> -		goto out;
>> +		goto out_defer_cancel;
>>   
>>   	/*
>>   	 * If this is a synchronous mount, make sure that the
>> @@ -360,9 +443,6 @@ xfs_attr_set(
>>   	if (mp->m_flags & XFS_MOUNT_WSYNC)
>>   		xfs_trans_set_sync(args.trans);
>>   
>> -	if ((flags & ATTR_KERNOTIME) == 0)
>> -		xfs_trans_ichgtime(args.trans, dp, XFS_ICHGTIME_CHG);
>> -
>>   	/*
>>   	 * Commit the last in the sequence of transactions.
>>   	 */
>> @@ -374,10 +454,6 @@ xfs_attr_set(
>>   
>>   out_defer_cancel:
>>   	xfs_defer_cancel(&dfops);
>> -	args.trans = NULL;
>> -out:
>> -	if (args.trans)
>> -		xfs_trans_cancel(args.trans);
>>   	xfs_iunlock(dp, XFS_ILOCK_EXCL);
>>   	return error;
>>   }
>> @@ -417,38 +493,15 @@ xfs_attr_remove(
>>   	 */
>>   	args.op_flags = XFS_DA_OP_OKNOENT;
>>   
>> -	error = xfs_qm_dqattach(dp, 0);
>> -	if (error)
>> -		return error;
>> -
>> -	/*
>> -	 * Root fork attributes can use reserved data blocks for this
>> -	 * operation if necessary
>> -	 */
>> -	error = xfs_trans_alloc(mp, &M_RES(mp)->tr_attrrm,
>> -			XFS_ATTRRM_SPACE_RES(mp), 0,
>> -			(flags & ATTR_ROOT) ? XFS_TRANS_RESERVE : 0,
>> -			&args.trans);
>> -	if (error)
>> -		return error;
>> -
>>   	xfs_ilock(dp, XFS_ILOCK_EXCL);
>>   	/*
>>   	 * No need to make quota reservations here. We expect to release some
>>   	 * blocks not allocate in the common case.
>>   	 */
>>   	xfs_trans_ijoin(args.trans, dp, 0);
>> +	xfs_defer_init(args.dfops, args.firstblock);
>>   
>> -	if (!xfs_inode_hasattr(dp)) {
>> -		error = -ENOATTR;
>> -	} else if (dp->i_d.di_aformat == XFS_DINODE_FMT_LOCAL) {
>> -		ASSERT(dp->i_afp->if_flags & XFS_IFINLINE);
>> -		error = xfs_attr_shortform_remove(&args);
>> -	} else if (xfs_bmap_one_block(dp, XFS_ATTR_FORK)) {
>> -		error = xfs_attr_leaf_removename(&args);
>> -	} else {
>> -		error = xfs_attr_node_removename(&args);
>> -	}
>> +	error = xfs_attr_remove_args(&args, flags);
>>   
>>   	if (error)
>>   		goto out;
>> @@ -460,9 +513,6 @@ xfs_attr_remove(
>>   	if (mp->m_flags & XFS_MOUNT_WSYNC)
>>   		xfs_trans_set_sync(args.trans);
>>   
>> -	if ((flags & ATTR_KERNOTIME) == 0)
>> -		xfs_trans_ichgtime(args.trans, dp, XFS_ICHGTIME_CHG);
>> -
>>   	/*
>>   	 * Commit the last in the sequence of transactions.
>>   	 */
>> @@ -473,6 +523,8 @@ xfs_attr_remove(
>>   	return error;
>>   
>>   out:
>> +	xfs_defer_cancel(&dfops);
>> +
>>   	if (args.trans)
>>   		xfs_trans_cancel(args.trans);
>>   	xfs_iunlock(dp, XFS_ILOCK_EXCL);
>> diff --git a/fs/xfs/xfs_attr.h b/fs/xfs/xfs_attr.h
>> index 5d5a5e2..8542606 100644
>> --- a/fs/xfs/xfs_attr.h
>> +++ b/fs/xfs/xfs_attr.h
>> @@ -149,7 +149,9 @@ int xfs_attr_get(struct xfs_inode *ip, const unsigned char *name,
>>   		 unsigned char *value, int *valuelenp, int flags);
>>   int xfs_attr_set(struct xfs_inode *dp, const unsigned char *name,
>>   		 unsigned char *value, int valuelen, int flags);
>> +int xfs_attr_set_args(struct xfs_da_args *args, int flags, bool roll_trans);
>>   int xfs_attr_remove(struct xfs_inode *dp, const unsigned char *name, int flags);
>> +int xfs_attr_remove_args(struct xfs_da_args *args, int flags);
>>   int xfs_attr_list(struct xfs_inode *dp, char *buffer, int bufsize,
>>   		  int flags, struct attrlist_cursor_kern *cursor);
>>   
>> -- 
>> 2.7.4
>>
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
>> the body of a message tomajordomo@vger.kernel.org
>> More majordomo info athttps://urldefense.proofpoint.com/v2/url?u=http-3A__vger.kernel.org_majordomo-2Dinfo.html&d=DwIBAg&c=RoP1YumCXCgaWHvlZYR8PQcxBKCX5YTpkKY057SbK10&r=XFp4B05bcXkJ0dhYaFjd3F8telP01COkBp9cI7mKLb4&m=aicitOc3doix8VyB_l-FSSD7nPvIu7TIiw0VVfKnBhc&s=JKZ0ozAg9jwyNpJK-9-53aT_kpKIedFceJDGJpDQh3U&e=  
> --
> To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> the body of a message tomajordomo@vger.kernel.org
> More majordomo info athttps://urldefense.proofpoint.com/v2/url?u=http-3A__vger.kernel.org_majordomo-2Dinfo.html&d=DwIBAg&c=RoP1YumCXCgaWHvlZYR8PQcxBKCX5YTpkKY057SbK10&r=XFp4B05bcXkJ0dhYaFjd3F8telP01COkBp9cI7mKLb4&m=aicitOc3doix8VyB_l-FSSD7nPvIu7TIiw0VVfKnBhc&s=JKZ0ozAg9jwyNpJK-9-53aT_kpKIedFceJDGJpDQh3U&e=  


^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH 00/17] Parent Pointers V3
  2017-10-20 22:41   ` Dave Chinner
@ 2017-10-21  7:34     ` Amir Goldstein
  2017-10-22 23:27       ` Dave Chinner
  0 siblings, 1 reply; 66+ messages in thread
From: Amir Goldstein @ 2017-10-21  7:34 UTC (permalink / raw)
  To: Dave Chinner; +Cc: Allison Henderson, linux-xfs

On Sat, Oct 21, 2017 at 1:41 AM, Dave Chinner <david@fromorbit.com> wrote:
> On Thu, Oct 19, 2017 at 07:11:50AM +0300, Amir Goldstein wrote:
>> On Thu, Oct 19, 2017 at 1:55 AM, Allison Henderson
>> <allison.henderson@oracle.com> wrote:
>> > Hi all,
>> >
>> > This is the third version of parent pointer attributes for xfs.
>> > I've integrated the suggestions made since v2, mostly moving the
>> > attr buffers in the xfs_attr_log_item to pointers that point to
>> > xfs_attr_item. I've also implementing the recovery routines for
>> > the xfs_attr_log_format.  If I missed anything please point it
>> > out.  As always, comments and feedback are appreciated.  Thank
>> > you!
>> >
>>
>> A minor comment about the cover letter.
>> All designated reviewers must know exactly what "parent pointers" are for,
>> but it could be useful to add some context in the cover letter about the purpose
>> of this work for the sake of other readers on the list. Useful to refer to the
>> upcoming scrub support patches.
>>
>> BTW, not sure if this was mentioned in the previous lifetime of those
>> patches, but parent pointers can be used to implement exportfs operation
>> xfs_fs_fh_to_parent() for "non-connectable" file handles (FILEID_INO32_GEN)
>> and to implement xfs_fs_get_name(), which would make reconnect_path()
>> *much* more efficient.
>
> However, XFS only uses FILEID_INO32_GEN for directories
> because they have known parents. For them, we implement ->get_parent()
> and that means reconnect_path just does ->lookup("..") to find the
> parents and doesn't need anything special.
>
> We use FILEID_INO32_GEN_PARENT for all other types of files to
> encode the ino # + generation of the parent inode into the handle.
> That means for any non-dir file handle, our implemention of
> ->fh_to_parent  will get us the parent info as efficiently as
> possible.

We only encode FILEID_INO32_GEN_PARENT when we are asked for
a "connectable" file handle, which is NOT the case with name_to_handle_at(2)
and is the case with nfsd on when NFS share is exported with subtree_check
options, which AFAIK, is the less common case.

The question is, when we encode a non-connectable file handle
(FILEID_INO32_GEN), will nfsd benefit from getting a connected file handle
after decode? (result may always be connected if dentry is already in cache).


>
> IOWs, parent pointers won't actually speed up filehandle ->
> dentry reconnection on XFS at all because we already encode parent
> pointers into the filehandles that need them....

Look at default implementation of ->get_name() in expfs.c and you will
see why I wrote that xfs_fs_get_name() would make reconnect_path()
*much* more efficient, even for directories.

Cheers,
Amir.

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH 00/17] Parent Pointers V3
  2017-10-21  7:34     ` Amir Goldstein
@ 2017-10-22 23:27       ` Dave Chinner
  2017-10-23  4:30         ` Amir Goldstein
  0 siblings, 1 reply; 66+ messages in thread
From: Dave Chinner @ 2017-10-22 23:27 UTC (permalink / raw)
  To: Amir Goldstein; +Cc: Allison Henderson, linux-xfs

On Sat, Oct 21, 2017 at 10:34:30AM +0300, Amir Goldstein wrote:
> On Sat, Oct 21, 2017 at 1:41 AM, Dave Chinner <david@fromorbit.com> wrote:
> > On Thu, Oct 19, 2017 at 07:11:50AM +0300, Amir Goldstein wrote:
> >> On Thu, Oct 19, 2017 at 1:55 AM, Allison Henderson
> >> <allison.henderson@oracle.com> wrote:
> >> > Hi all,
> >> >
> >> > This is the third version of parent pointer attributes for xfs.
> >> > I've integrated the suggestions made since v2, mostly moving the
> >> > attr buffers in the xfs_attr_log_item to pointers that point to
> >> > xfs_attr_item. I've also implementing the recovery routines for
> >> > the xfs_attr_log_format.  If I missed anything please point it
> >> > out.  As always, comments and feedback are appreciated.  Thank
> >> > you!
> >> >
> >>
> >> A minor comment about the cover letter.
> >> All designated reviewers must know exactly what "parent pointers" are for,
> >> but it could be useful to add some context in the cover letter about the purpose
> >> of this work for the sake of other readers on the list. Useful to refer to the
> >> upcoming scrub support patches.
> >>
> >> BTW, not sure if this was mentioned in the previous lifetime of those
> >> patches, but parent pointers can be used to implement exportfs operation
> >> xfs_fs_fh_to_parent() for "non-connectable" file handles (FILEID_INO32_GEN)
> >> and to implement xfs_fs_get_name(), which would make reconnect_path()
> >> *much* more efficient.
> >
> > However, XFS only uses FILEID_INO32_GEN for directories
> > because they have known parents. For them, we implement ->get_parent()
> > and that means reconnect_path just does ->lookup("..") to find the
> > parents and doesn't need anything special.
> >
> > We use FILEID_INO32_GEN_PARENT for all other types of files to
> > encode the ino # + generation of the parent inode into the handle.
> > That means for any non-dir file handle, our implemention of
> > ->fh_to_parent  will get us the parent info as efficiently as
> > possible.
> 
> We only encode FILEID_INO32_GEN_PARENT when we are asked for
> a "connectable" file handle, which is NOT the case with name_to_handle_at(2)
> and is the case with nfsd on when NFS share is exported with subtree_check
> options, which AFAIK, is the less common case.
> 
> The question is, when we encode a non-connectable file handle
> (FILEID_INO32_GEN), will nfsd benefit from getting a connected file handle
> after decode? (result may always be connected if dentry is already in cache).
> 
> > IOWs, parent pointers won't actually speed up filehandle ->
> > dentry reconnection on XFS at all because we already encode parent
> > pointers into the filehandles that need them....
> 
> Look at default implementation of ->get_name() in expfs.c and you will
> see why I wrote that xfs_fs_get_name() would make reconnect_path()
> *much* more efficient, even for directories.

If you think it will be a win, then I'm not going to stop you from
implementing and benchmarking it to demonstrate how much faster it
is. Should be pretty simple to benchmark - stat a whole bunch of
files on an NFS client, drop server caches, stat the files again. If
it's a win, the server CPU and IO load should be lower, and the
client side stat rate should be faster....

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH 00/17] Parent Pointers V3
  2017-10-22 23:27       ` Dave Chinner
@ 2017-10-23  4:30         ` Amir Goldstein
  2017-10-23  5:32           ` Dave Chinner
  0 siblings, 1 reply; 66+ messages in thread
From: Amir Goldstein @ 2017-10-23  4:30 UTC (permalink / raw)
  To: Dave Chinner; +Cc: Allison Henderson, linux-xfs

On Mon, Oct 23, 2017 at 2:27 AM, Dave Chinner <david@fromorbit.com> wrote:
> On Sat, Oct 21, 2017 at 10:34:30AM +0300, Amir Goldstein wrote:
>> On Sat, Oct 21, 2017 at 1:41 AM, Dave Chinner <david@fromorbit.com> wrote:
>> > On Thu, Oct 19, 2017 at 07:11:50AM +0300, Amir Goldstein wrote:
>> >> On Thu, Oct 19, 2017 at 1:55 AM, Allison Henderson
>> >> <allison.henderson@oracle.com> wrote:
>> >> > Hi all,
>> >> >
>> >> > This is the third version of parent pointer attributes for xfs.
>> >> > I've integrated the suggestions made since v2, mostly moving the
>> >> > attr buffers in the xfs_attr_log_item to pointers that point to
>> >> > xfs_attr_item. I've also implementing the recovery routines for
>> >> > the xfs_attr_log_format.  If I missed anything please point it
>> >> > out.  As always, comments and feedback are appreciated.  Thank
>> >> > you!
>> >> >
>> >>
>> >> A minor comment about the cover letter.
>> >> All designated reviewers must know exactly what "parent pointers" are for,
>> >> but it could be useful to add some context in the cover letter about the purpose
>> >> of this work for the sake of other readers on the list. Useful to refer to the
>> >> upcoming scrub support patches.
>> >>
>> >> BTW, not sure if this was mentioned in the previous lifetime of those
>> >> patches, but parent pointers can be used to implement exportfs operation
>> >> xfs_fs_fh_to_parent() for "non-connectable" file handles (FILEID_INO32_GEN)
>> >> and to implement xfs_fs_get_name(), which would make reconnect_path()
>> >> *much* more efficient.
>> >
>> > However, XFS only uses FILEID_INO32_GEN for directories
>> > because they have known parents. For them, we implement ->get_parent()
>> > and that means reconnect_path just does ->lookup("..") to find the
>> > parents and doesn't need anything special.
>> >
>> > We use FILEID_INO32_GEN_PARENT for all other types of files to
>> > encode the ino # + generation of the parent inode into the handle.
>> > That means for any non-dir file handle, our implemention of
>> > ->fh_to_parent  will get us the parent info as efficiently as
>> > possible.
>>
>> We only encode FILEID_INO32_GEN_PARENT when we are asked for
>> a "connectable" file handle, which is NOT the case with name_to_handle_at(2)
>> and is the case with nfsd on when NFS share is exported with subtree_check
>> options, which AFAIK, is the less common case.
>>
>> The question is, when we encode a non-connectable file handle
>> (FILEID_INO32_GEN), will nfsd benefit from getting a connected file handle
>> after decode? (result may always be connected if dentry is already in cache).
>>
>> > IOWs, parent pointers won't actually speed up filehandle ->
>> > dentry reconnection on XFS at all because we already encode parent
>> > pointers into the filehandles that need them....
>>
>> Look at default implementation of ->get_name() in expfs.c and you will
>> see why I wrote that xfs_fs_get_name() would make reconnect_path()
>> *much* more efficient, even for directories.
>
> If you think it will be a win, then I'm not going to stop you from
> implementing and benchmarking it to demonstrate how much faster it
> is. Should be pretty simple to benchmark - stat a whole bunch of
> files on an NFS client, drop server caches, stat the files again. If
> it's a win, the server CPU and IO load should be lower, and the
> client side stat rate should be faster....
>

Sure. In due time.

Mean while I would like to ask if there is a concrete reason for storing
diroffset to parent pointer of directory inodes, besides the obvious reason
of not special casing directory inodes.

If there is no such reason I would like to propose to store parent
pointers of directory inodes with an xattr name that is diroffset agnostic,
namely:

struct xfs_dir_parent_name_rec {
      __be64  p_ino;
      __be32  p_gen;
} __attribute__((packed));

It makes sense for metadata consistency anyway, but the reason I am
proposing this is so xfs_fs_get_name() could be implemented at O(1).

For those who are not familiar, the get_name() operation takes a
parent dentry and disconnected child dentry and returns a name
for the child, so that the disconnected dentry could be reconnected to the
dcache tree.

This operation is called (indirectly) from nfsd when decoding directory
file handles, because directory file handles must be fully connected to
dcache tree before nfsd can work with them.

Currently, the get_name() operation is implemented by few file systems,
but for the majority of file systems, the default implementation in
fs/exportfs/expfs.c is used, which iterates the parent directory entries
looking for a match to child inode number.

In the worst case, the default get_name() implementation is
O(<dir size>*<tree depth>) in metadata IO reads, which could be a lot
for very wide directory trees.

Implementing xfs_fs_get_name() as iteration over the inode's xattr looking
for a parent pointer that matches child ino/gen is a likely-win.
Implementing xfs_fs_get_name() as a single xattr_get of
xfs_dir_parent_name_rec is an doubtful win and I won't mind running
the benchmark to demonstrate it.

Cheers,
Amir.

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH 00/17] Parent Pointers V3
  2017-10-23  4:30         ` Amir Goldstein
@ 2017-10-23  5:32           ` Dave Chinner
  2017-10-23  6:48             ` Amir Goldstein
  0 siblings, 1 reply; 66+ messages in thread
From: Dave Chinner @ 2017-10-23  5:32 UTC (permalink / raw)
  To: Amir Goldstein; +Cc: Allison Henderson, linux-xfs

On Mon, Oct 23, 2017 at 07:30:01AM +0300, Amir Goldstein wrote:
> On Mon, Oct 23, 2017 at 2:27 AM, Dave Chinner <david@fromorbit.com> wrote:
> > On Sat, Oct 21, 2017 at 10:34:30AM +0300, Amir Goldstein wrote:
> >> On Sat, Oct 21, 2017 at 1:41 AM, Dave Chinner <david@fromorbit.com> wrote:
> >> > On Thu, Oct 19, 2017 at 07:11:50AM +0300, Amir Goldstein wrote:
> >> >> On Thu, Oct 19, 2017 at 1:55 AM, Allison Henderson
> >> >> <allison.henderson@oracle.com> wrote:
> >> >> > Hi all,
> >> >> >
> >> >> > This is the third version of parent pointer attributes for xfs.
> >> >> > I've integrated the suggestions made since v2, mostly moving the
> >> >> > attr buffers in the xfs_attr_log_item to pointers that point to
> >> >> > xfs_attr_item. I've also implementing the recovery routines for
> >> >> > the xfs_attr_log_format.  If I missed anything please point it
> >> >> > out.  As always, comments and feedback are appreciated.  Thank
> >> >> > you!
> >> >> >
> >> >>
> >> >> A minor comment about the cover letter.
> >> >> All designated reviewers must know exactly what "parent pointers" are for,
> >> >> but it could be useful to add some context in the cover letter about the purpose
> >> >> of this work for the sake of other readers on the list. Useful to refer to the
> >> >> upcoming scrub support patches.
> >> >>
> >> >> BTW, not sure if this was mentioned in the previous lifetime of those
> >> >> patches, but parent pointers can be used to implement exportfs operation
> >> >> xfs_fs_fh_to_parent() for "non-connectable" file handles (FILEID_INO32_GEN)
> >> >> and to implement xfs_fs_get_name(), which would make reconnect_path()
> >> >> *much* more efficient.
> >> >
> >> > However, XFS only uses FILEID_INO32_GEN for directories
> >> > because they have known parents. For them, we implement ->get_parent()
> >> > and that means reconnect_path just does ->lookup("..") to find the
> >> > parents and doesn't need anything special.
> >> >
> >> > We use FILEID_INO32_GEN_PARENT for all other types of files to
> >> > encode the ino # + generation of the parent inode into the handle.
> >> > That means for any non-dir file handle, our implemention of
> >> > ->fh_to_parent  will get us the parent info as efficiently as
> >> > possible.
> >>
> >> We only encode FILEID_INO32_GEN_PARENT when we are asked for
> >> a "connectable" file handle, which is NOT the case with name_to_handle_at(2)
> >> and is the case with nfsd on when NFS share is exported with subtree_check
> >> options, which AFAIK, is the less common case.
> >>
> >> The question is, when we encode a non-connectable file handle
> >> (FILEID_INO32_GEN), will nfsd benefit from getting a connected file handle
> >> after decode? (result may always be connected if dentry is already in cache).
> >>
> >> > IOWs, parent pointers won't actually speed up filehandle ->
> >> > dentry reconnection on XFS at all because we already encode parent
> >> > pointers into the filehandles that need them....
> >>
> >> Look at default implementation of ->get_name() in expfs.c and you will
> >> see why I wrote that xfs_fs_get_name() would make reconnect_path()
> >> *much* more efficient, even for directories.
> >
> > If you think it will be a win, then I'm not going to stop you from
> > implementing and benchmarking it to demonstrate how much faster it
> > is. Should be pretty simple to benchmark - stat a whole bunch of
> > files on an NFS client, drop server caches, stat the files again. If
> > it's a win, the server CPU and IO load should be lower, and the
> > client side stat rate should be faster....
> >
> 
> Sure. In due time.
> 
> Mean while I would like to ask if there is a concrete reason for storing
> diroffset to parent pointer of directory inodes, besides the obvious reason
> of not special casing directory inodes.

Uniquely identifying each parent pointer in the case of multiple
hard links to a single inode from a single directory.

There's a heap of historic information on the history of the
attribute format in this discussion:

http://oss.sgi.com/archives/xfs/2014-01/msg00224.html

To answer your question more specifically, the last part of this
post summarises the link disambiguation problem and proposes the
use of {ino,gen,diroffset} tuples to uniquely identify parent
directory entries:

http://oss.sgi.com/archives/xfs/2014-01/msg00263.html

> If there is no such reason I would like to propose to store parent
> pointers of directory inodes with an xattr name that is diroffset agnostic,
> namely:
> 
> struct xfs_dir_parent_name_rec {
>       __be64  p_ino;
>       __be32  p_gen;
> } __attribute__((packed));
> 
> It makes sense for metadata consistency anyway, but the reason I am
> proposing this is so xfs_fs_get_name() could be implemented at O(1).

Yup, but now consider unlinking an inode that has thousands of hard
links to it, and many of them exist in the same directory. How do we
find the right xattr to remove without a brute-force search of all
the parent pointers that point back to the same directory? i.e. We
have to retreive each one and compare the stored xattr value (i.e.
the filename) to the dirent we are unlinking. IOWs, we now need to
iterate and fetch attributes rather than just constructing a unique
handle and saying "remove!".

Also, we need unique identifiers for parent pointers so they are
useful for online scrub/repair:

http://oss.sgi.com/archives/xfs/2014-02/msg00090.html

Realistically, we have been designing this under the premise that
reverse name lookups are going to be rare compared to directory
modification operations. If we have to do iterative searches to
determine what xattr to remove, or even if a specific parent xattr
*exists*, then we've got a runtime overhead problem that every user
will see, not just those that use reverse lookups in the corner
cases that they are necessary.

If the filehandle doesn't contain all the info needed to do the
inode->name reverse lookup efficiently, then encode that info into
the filehandle for parent pointer enabled filesystems before handing
it out (e.g. add "diroff" to the ino/gen we encode) so that it's
available when necessary on the decode side....

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH 00/17] Parent Pointers V3
  2017-10-23  5:32           ` Dave Chinner
@ 2017-10-23  6:48             ` Amir Goldstein
  2017-10-23  8:40               ` Dave Chinner
  0 siblings, 1 reply; 66+ messages in thread
From: Amir Goldstein @ 2017-10-23  6:48 UTC (permalink / raw)
  To: Dave Chinner; +Cc: Allison Henderson, linux-xfs

On Mon, Oct 23, 2017 at 8:32 AM, Dave Chinner <david@fromorbit.com> wrote:
> On Mon, Oct 23, 2017 at 07:30:01AM +0300, Amir Goldstein wrote:
>> On Mon, Oct 23, 2017 at 2:27 AM, Dave Chinner <david@fromorbit.com> wrote:
>> > On Sat, Oct 21, 2017 at 10:34:30AM +0300, Amir Goldstein wrote:
>> >> On Sat, Oct 21, 2017 at 1:41 AM, Dave Chinner <david@fromorbit.com> wrote:
>> >> > On Thu, Oct 19, 2017 at 07:11:50AM +0300, Amir Goldstein wrote:
>> >> >> On Thu, Oct 19, 2017 at 1:55 AM, Allison Henderson
>> >> >> <allison.henderson@oracle.com> wrote:
>> >> >> > Hi all,
>> >> >> >
>> >> >> > This is the third version of parent pointer attributes for xfs.
>> >> >> > I've integrated the suggestions made since v2, mostly moving the
>> >> >> > attr buffers in the xfs_attr_log_item to pointers that point to
>> >> >> > xfs_attr_item. I've also implementing the recovery routines for
>> >> >> > the xfs_attr_log_format.  If I missed anything please point it
>> >> >> > out.  As always, comments and feedback are appreciated.  Thank
>> >> >> > you!
>> >> >> >
>> >> >>
>> >> >> A minor comment about the cover letter.
>> >> >> All designated reviewers must know exactly what "parent pointers" are for,
>> >> >> but it could be useful to add some context in the cover letter about the purpose
>> >> >> of this work for the sake of other readers on the list. Useful to refer to the
>> >> >> upcoming scrub support patches.
>> >> >>
>> >> >> BTW, not sure if this was mentioned in the previous lifetime of those
>> >> >> patches, but parent pointers can be used to implement exportfs operation
>> >> >> xfs_fs_fh_to_parent() for "non-connectable" file handles (FILEID_INO32_GEN)
>> >> >> and to implement xfs_fs_get_name(), which would make reconnect_path()
>> >> >> *much* more efficient.
>> >> >
>> >> > However, XFS only uses FILEID_INO32_GEN for directories
>> >> > because they have known parents. For them, we implement ->get_parent()
>> >> > and that means reconnect_path just does ->lookup("..") to find the
>> >> > parents and doesn't need anything special.
>> >> >
>> >> > We use FILEID_INO32_GEN_PARENT for all other types of files to
>> >> > encode the ino # + generation of the parent inode into the handle.
>> >> > That means for any non-dir file handle, our implemention of
>> >> > ->fh_to_parent  will get us the parent info as efficiently as
>> >> > possible.
>> >>
>> >> We only encode FILEID_INO32_GEN_PARENT when we are asked for
>> >> a "connectable" file handle, which is NOT the case with name_to_handle_at(2)
>> >> and is the case with nfsd on when NFS share is exported with subtree_check
>> >> options, which AFAIK, is the less common case.
>> >>
>> >> The question is, when we encode a non-connectable file handle
>> >> (FILEID_INO32_GEN), will nfsd benefit from getting a connected file handle
>> >> after decode? (result may always be connected if dentry is already in cache).
>> >>
>> >> > IOWs, parent pointers won't actually speed up filehandle ->
>> >> > dentry reconnection on XFS at all because we already encode parent
>> >> > pointers into the filehandles that need them....
>> >>
>> >> Look at default implementation of ->get_name() in expfs.c and you will
>> >> see why I wrote that xfs_fs_get_name() would make reconnect_path()
>> >> *much* more efficient, even for directories.
>> >
>> > If you think it will be a win, then I'm not going to stop you from
>> > implementing and benchmarking it to demonstrate how much faster it
>> > is. Should be pretty simple to benchmark - stat a whole bunch of
>> > files on an NFS client, drop server caches, stat the files again. If
>> > it's a win, the server CPU and IO load should be lower, and the
>> > client side stat rate should be faster....
>> >
>>
>> Sure. In due time.
>>
>> Mean while I would like to ask if there is a concrete reason for storing
>> diroffset to parent pointer of directory inodes, besides the obvious reason
>> of not special casing directory inodes.
>
> Uniquely identifying each parent pointer in the case of multiple
> hard links to a single inode from a single directory.
>
> There's a heap of historic information on the history of the
> attribute format in this discussion:
>
> http://oss.sgi.com/archives/xfs/2014-01/msg00224.html
>
> To answer your question more specifically, the last part of this
> post summarises the link disambiguation problem and proposes the
> use of {ino,gen,diroffset} tuples to uniquely identify parent
> directory entries:
>
> http://oss.sgi.com/archives/xfs/2014-01/msg00263.html
>
>> If there is no such reason I would like to propose to store parent
>> pointers of directory inodes with an xattr name that is diroffset agnostic,
>> namely:
>>
>> struct xfs_dir_parent_name_rec {
>>       __be64  p_ino;
>>       __be32  p_gen;
>> } __attribute__((packed));
>>
>> It makes sense for metadata consistency anyway, but the reason I am
>> proposing this is so xfs_fs_get_name() could be implemented at O(1).
>
> Yup, but now consider unlinking an inode that has thousands of hard
> links to it, and many of them exist in the same directory. How do we
> find the right xattr to remove without a brute-force search of all
> the parent pointers that point back to the same directory? i.e. We
> have to retreive each one and compare the stored xattr value (i.e.
> the filename) to the dirent we are unlinking. IOWs, we now need to
> iterate and fetch attributes rather than just constructing a unique
> handle and saying "remove!".
>
> Also, we need unique identifiers for parent pointers so they are
> useful for online scrub/repair:
>
> http://oss.sgi.com/archives/xfs/2014-02/msg00090.html
>
> Realistically, we have been designing this under the premise that
> reverse name lookups are going to be rare compared to directory
> modification operations. If we have to do iterative searches to
> determine what xattr to remove, or even if a specific parent xattr
> *exists*, then we've got a runtime overhead problem that every user
> will see, not just those that use reverse lookups in the corner
> cases that they are necessary.
>
> If the filehandle doesn't contain all the info needed to do the
> inode->name reverse lookup efficiently, then encode that info into
> the filehandle for parent pointer enabled filesystems before handing
> it out (e.g. add "diroff" to the ino/gen we encode) so that it's
> available when necessary on the decode side....
>

That is not possible, because file handle encoding of directories
must be unique identified of the object, meaning encoding cannot change
after rename.

But you missed my question completely. I understand why diroffset
is needed for non-dir. But directories are not allowed to have 2 differnt
parents nor 2 entries in the parent dir, so I was asking if we can
special case parent pointer stored in *directory inodes*.

Am I missing something?

Amir.

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH 00/17] Parent Pointers V3
  2017-10-23  6:48             ` Amir Goldstein
@ 2017-10-23  8:40               ` Dave Chinner
  2017-10-23  9:06                 ` Amir Goldstein
  0 siblings, 1 reply; 66+ messages in thread
From: Dave Chinner @ 2017-10-23  8:40 UTC (permalink / raw)
  To: Amir Goldstein; +Cc: Allison Henderson, linux-xfs

On Mon, Oct 23, 2017 at 09:48:24AM +0300, Amir Goldstein wrote:
> On Mon, Oct 23, 2017 at 8:32 AM, Dave Chinner <david@fromorbit.com> wrote:
> > On Mon, Oct 23, 2017 at 07:30:01AM +0300, Amir Goldstein wrote:
> >> If there is no such reason I would like to propose to store parent
> >> pointers of directory inodes with an xattr name that is diroffset agnostic,
> >> namely:
> >>
> >> struct xfs_dir_parent_name_rec {
> >>       __be64  p_ino;
> >>       __be32  p_gen;
> >> } __attribute__((packed));
> >>
> >> It makes sense for metadata consistency anyway, but the reason I am
> >> proposing this is so xfs_fs_get_name() could be implemented at O(1).

[...]

> But you missed my question completely. I understand why diroffset
> is needed for non-dir. But directories are not allowed to have 2 differnt
> parents nor 2 entries in the parent dir, so I was asking if we can
> special case parent pointer stored in *directory inodes*.

I did answer it - not directly, but the answers are there.

> Am I missing something?

Because you are just looking at it from a "reverse lookup"
perspective, I suspect you're not seeing the "detect and
reconstruct" broken directory structure side of the picture.

The directory offset is redundant information which needed to fully
cross-check and, if necessary, reconstruct broken direcotry
structures. E.g. if a directory is missing a block due to a bad
sector, we can reconstruct that block exactly from the parent
pointer information because we know exactly what inodes had dirents
in the block that was lost, and we know exactly what order they
appeared in the directory block...

If we don't have that info for all child inodes (directory or
regular file), we can't tell the difference between "lost/stale
child that we should ignore" and "child that was referenced by the
block we lost". So even directories need to have the diroffset in
their parent pointer....

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH 00/17] Parent Pointers V3
  2017-10-23  8:40               ` Dave Chinner
@ 2017-10-23  9:06                 ` Amir Goldstein
  2017-10-23 17:14                   ` Darrick J. Wong
  0 siblings, 1 reply; 66+ messages in thread
From: Amir Goldstein @ 2017-10-23  9:06 UTC (permalink / raw)
  To: Dave Chinner; +Cc: Allison Henderson, linux-xfs

On Mon, Oct 23, 2017 at 11:40 AM, Dave Chinner <david@fromorbit.com> wrote:
> On Mon, Oct 23, 2017 at 09:48:24AM +0300, Amir Goldstein wrote:
>> On Mon, Oct 23, 2017 at 8:32 AM, Dave Chinner <david@fromorbit.com> wrote:
>> > On Mon, Oct 23, 2017 at 07:30:01AM +0300, Amir Goldstein wrote:
>> >> If there is no such reason I would like to propose to store parent
>> >> pointers of directory inodes with an xattr name that is diroffset agnostic,
>> >> namely:
>> >>
>> >> struct xfs_dir_parent_name_rec {
>> >>       __be64  p_ino;
>> >>       __be32  p_gen;
>> >> } __attribute__((packed));
>> >>
>> >> It makes sense for metadata consistency anyway, but the reason I am
>> >> proposing this is so xfs_fs_get_name() could be implemented at O(1).
>
> [...]
>
>> But you missed my question completely. I understand why diroffset
>> is needed for non-dir. But directories are not allowed to have 2 differnt
>> parents nor 2 entries in the parent dir, so I was asking if we can
>> special case parent pointer stored in *directory inodes*.
>
> I did answer it - not directly, but the answers are there.
>
>> Am I missing something?
>
> Because you are just looking at it from a "reverse lookup"
> perspective, I suspect you're not seeing the "detect and
> reconstruct" broken directory structure side of the picture.
>
> The directory offset is redundant information which needed to fully
> cross-check and, if necessary, reconstruct broken direcotry
> structures. E.g. if a directory is missing a block due to a bad
> sector, we can reconstruct that block exactly from the parent
> pointer information because we know exactly what inodes had dirents
> in the block that was lost, and we know exactly what order they
> appeared in the directory block...
>
> If we don't have that info for all child inodes (directory or
> regular file), we can't tell the difference between "lost/stale
> child that we should ignore" and "child that was referenced by the
> block we lost". So even directories need to have the diroffset in
> their parent pointer....
>

I see. Thanks for explaining that point.
In that case, I would like to re-phrase my proposal to store parent
diroffset in the value rather than in the key of xattr, i.e.:
        name={parent inode #, parent inode generation}
        value={dirent filename, dirent offset}
OR   value={dirent offset, dirent filename}

I guess you do see the value of my proposal to the "reverse lookup"
workload. The question is what is the cost (in code complexity/
maintainability) of special casing directory parent pointer format
and whether it is worth the benefits of "reverse lookup" performance.

Thanks,
Amir.

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH 00/17] Parent Pointers V3
  2017-10-23  9:06                 ` Amir Goldstein
@ 2017-10-23 17:14                   ` Darrick J. Wong
  2017-10-23 19:20                     ` Amir Goldstein
  0 siblings, 1 reply; 66+ messages in thread
From: Darrick J. Wong @ 2017-10-23 17:14 UTC (permalink / raw)
  To: Amir Goldstein; +Cc: Dave Chinner, Allison Henderson, linux-xfs

On Mon, Oct 23, 2017 at 12:06:20PM +0300, Amir Goldstein wrote:
> On Mon, Oct 23, 2017 at 11:40 AM, Dave Chinner <david@fromorbit.com> wrote:
> > On Mon, Oct 23, 2017 at 09:48:24AM +0300, Amir Goldstein wrote:
> >> On Mon, Oct 23, 2017 at 8:32 AM, Dave Chinner <david@fromorbit.com> wrote:
> >> > On Mon, Oct 23, 2017 at 07:30:01AM +0300, Amir Goldstein wrote:
> >> >> If there is no such reason I would like to propose to store parent
> >> >> pointers of directory inodes with an xattr name that is diroffset agnostic,
> >> >> namely:
> >> >>
> >> >> struct xfs_dir_parent_name_rec {
> >> >>       __be64  p_ino;
> >> >>       __be32  p_gen;
> >> >> } __attribute__((packed));
> >> >>
> >> >> It makes sense for metadata consistency anyway, but the reason I am
> >> >> proposing this is so xfs_fs_get_name() could be implemented at O(1).
> >
> > [...]
> >
> >> But you missed my question completely. I understand why diroffset
> >> is needed for non-dir. But directories are not allowed to have 2 differnt
> >> parents nor 2 entries in the parent dir, so I was asking if we can
> >> special case parent pointer stored in *directory inodes*.
> >
> > I did answer it - not directly, but the answers are there.
> >
> >> Am I missing something?
> >
> > Because you are just looking at it from a "reverse lookup"
> > perspective, I suspect you're not seeing the "detect and
> > reconstruct" broken directory structure side of the picture.
> >
> > The directory offset is redundant information which needed to fully
> > cross-check and, if necessary, reconstruct broken direcotry
> > structures. E.g. if a directory is missing a block due to a bad
> > sector, we can reconstruct that block exactly from the parent
> > pointer information because we know exactly what inodes had dirents
> > in the block that was lost, and we know exactly what order they
> > appeared in the directory block...
> >
> > If we don't have that info for all child inodes (directory or
> > regular file), we can't tell the difference between "lost/stale
> > child that we should ignore" and "child that was referenced by the
> > block we lost". So even directories need to have the diroffset in
> > their parent pointer....
> >
> 
> I see. Thanks for explaining that point.
> In that case, I would like to re-phrase my proposal to store parent
> diroffset in the value rather than in the key of xattr, i.e.:
>         name={parent inode #, parent inode generation}
>         value={dirent filename, dirent offset}
> OR   value={dirent offset, dirent filename}

Not possible, because the attribute name must be unique:

$ cd /root
$ echo hi > a
$ ln a b

Produces in a:

(rootino, rootinogen) -> (0, 'a')
(rootino, rootinogen) -> (1, 'b')

We must store both pptrs, but we're not allowed to have duplicate attr
names.

In reference to where this thread went over the weekend -- I don't look
favorably on the idea of having two different parent pointer disk
formats -- there has to be a very good justification for forcing
everyone to remember which format applies in which cases.

--D

> 
> I guess you do see the value of my proposal to the "reverse lookup"
> workload. The question is what is the cost (in code complexity/
> maintainability) of special casing directory parent pointer format
> and whether it is worth the benefits of "reverse lookup" performance.
> 
> Thanks,
> Amir.
> --
> To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH 00/17] Parent Pointers V3
  2017-10-23 17:14                   ` Darrick J. Wong
@ 2017-10-23 19:20                     ` Amir Goldstein
  0 siblings, 0 replies; 66+ messages in thread
From: Amir Goldstein @ 2017-10-23 19:20 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: Dave Chinner, Allison Henderson, linux-xfs

On Mon, Oct 23, 2017 at 8:14 PM, Darrick J. Wong
<darrick.wong@oracle.com> wrote:
> On Mon, Oct 23, 2017 at 12:06:20PM +0300, Amir Goldstein wrote:
>> On Mon, Oct 23, 2017 at 11:40 AM, Dave Chinner <david@fromorbit.com> wrote:
>> > On Mon, Oct 23, 2017 at 09:48:24AM +0300, Amir Goldstein wrote:
>> >> On Mon, Oct 23, 2017 at 8:32 AM, Dave Chinner <david@fromorbit.com> wrote:
>> >> > On Mon, Oct 23, 2017 at 07:30:01AM +0300, Amir Goldstein wrote:
>> >> >> If there is no such reason I would like to propose to store parent
>> >> >> pointers of directory inodes with an xattr name that is diroffset agnostic,
>> >> >> namely:
>> >> >>
>> >> >> struct xfs_dir_parent_name_rec {
>> >> >>       __be64  p_ino;
>> >> >>       __be32  p_gen;
>> >> >> } __attribute__((packed));
>> >> >>
>> >> >> It makes sense for metadata consistency anyway, but the reason I am
>> >> >> proposing this is so xfs_fs_get_name() could be implemented at O(1).
>> >
>> > [...]
>> >
>> >> But you missed my question completely. I understand why diroffset
>> >> is needed for non-dir. But directories are not allowed to have 2 differnt
>> >> parents nor 2 entries in the parent dir, so I was asking if we can
>> >> special case parent pointer stored in *directory inodes*.
>> >
>> > I did answer it - not directly, but the answers are there.
>> >
>> >> Am I missing something?
>> >
>> > Because you are just looking at it from a "reverse lookup"
>> > perspective, I suspect you're not seeing the "detect and
>> > reconstruct" broken directory structure side of the picture.
>> >
>> > The directory offset is redundant information which needed to fully
>> > cross-check and, if necessary, reconstruct broken direcotry
>> > structures. E.g. if a directory is missing a block due to a bad
>> > sector, we can reconstruct that block exactly from the parent
>> > pointer information because we know exactly what inodes had dirents
>> > in the block that was lost, and we know exactly what order they
>> > appeared in the directory block...
>> >
>> > If we don't have that info for all child inodes (directory or
>> > regular file), we can't tell the difference between "lost/stale
>> > child that we should ignore" and "child that was referenced by the
>> > block we lost". So even directories need to have the diroffset in
>> > their parent pointer....
>> >
>>
>> I see. Thanks for explaining that point.
>> In that case, I would like to re-phrase my proposal to store parent
>> diroffset in the value rather than in the key of xattr, i.e.:
>>         name={parent inode #, parent inode generation}
>>         value={dirent filename, dirent offset}
>> OR   value={dirent offset, dirent filename}
>
> Not possible, because the attribute name must be unique:

Yes, I'm well aware of that, only the uniqueness is guaranties for
directories even without diroffset.

>
> $ cd /root
> $ echo hi > a
> $ ln a b
>
> Produces in a:
>
> (rootino, rootinogen) -> (0, 'a')
> (rootino, rootinogen) -> (1, 'b')
>
> We must store both pptrs, but we're not allowed to have duplicate attr
> names.
>
> In reference to where this thread went over the weekend -- I don't look
> favorably on the idea of having two different parent pointer disk
> formats -- there has to be a very good justification for forcing
> everyone to remember which format applies in which cases.
>

Yes. I can relate to that.

Anyway, I have a workload that may be sensitive to dentry reconnect
performance on large dir trees.
After parent pointers settle down, I'll run a benchmark to see how much
get_name() with xattr iteration improves performance.

Thanks,
Amir.

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH 02/17] Set up infastructure for deferred attribute operations
  2017-10-09 22:51       ` Allison Henderson
@ 2017-10-09 23:27         ` Dave Chinner
  0 siblings, 0 replies; 66+ messages in thread
From: Dave Chinner @ 2017-10-09 23:27 UTC (permalink / raw)
  To: Allison Henderson; +Cc: linux-xfs

On Mon, Oct 09, 2017 at 03:51:42PM -0700, Allison Henderson wrote:
> On 10/09/2017 02:25 PM, Allison Henderson wrote:
> >On 10/8/2017 9:20 PM, Dave Chinner wrote:
> >>On Fri, Oct 06, 2017 at 03:05:33PM -0700, Allison Henderson wrote:
> >>>Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
> >>>    /*
> >>>   * Inode Log Item Format definitions.
> >>>@@ -863,4 +872,45 @@ struct xfs_icreate_log {
> >>>      __be32        icl_gen;    /* inode generation number to use */
> >>>  };
> >>>  +/* Flags for defered attribute operations */
> >>>+#define ATTR_OP_FLAGS_SET    0x01    /* Set the attribute */
> >>>+#define ATTR_OP_FLAGS_REMOVE    0x02    /* Remove the attribute */
> >>>+#define ATTR_OP_FLAGS_MAX    0x02    /* Max flags */
> >>>+
> >>>+/*
> >>>+ * ATTRI/ATTRD log format definitions
> >>>+ */
> >>>+struct xfs_attr {
> >>>+    xfs_ino_t    attr_ino;
> >>>+    uint32_t    attr_op_flags;
> >>>+    uint32_t    attr_nameval_len;
> >>>+    uint32_t    attr_name_len;
> >>>+    uint32_t        attr_flags;
> >>>+    uint8_t        attr_nameval[MAX_NAMEVAL_LEN];
> >>>+};
> >>
> >>"struct xfs_attr" is very generic. This ends up in the log on disk,
> >>right? So it's a log format structure? struct xfs_attr_log_format?
> >>
> >>This also needs padding to ensure it's size is 64bit aligned.
> >>
> Hmm, if this structure is meant to be stored on disk, is it really a
> good idea to put pointers in here as you mentioned in your
> conclusion below?  If we get remounted or rebooted and loose the
> pointer that may not work.  Or am I not understanding what you
> meant?

The log format structure on disk needs to be encoded correctly - it
will contain data, not pointers. The in-memory log item will contain
the pointers to the xattr name/value buffers. i.e.

struct xfs_attr_log_item {
	struct xfs_log_item	item;
	....
	int			namelen;
	void			*name_buf;
	int			valuelen;
	void			*value_buf;
	struct xfs_attr_log_format alf;
};

And the log format structure looks like:

struct xfs_attr_log_format {
	be64		ino;
	be32		gen;
	be32		intent_id;
	be32		op_flags;
	be32		attr_flags;
	be32		namelen;
	be32		valuelen;
	/* name gets copied here into first trailing log iovec */
	/* value gets copied here into second trailing log iovec */
};

note that the intent id is used by recovery to make the intent item
to the done item - that's probably what the "id" variable in the
original structure you had was supposed to be used for. :P

So essentially the ->item_size code does:

	/* log format header */
	nvecs++;
	nbytes += sizeof(struct xfs_attr_log_format);
	if (opflags == remove ||
	    opflags == flipflags)
		return;

	/* add/set */
	nvecs += 2;
	nbytes += namelen;
	nbytes += valuelen;
	return;

And the ->item_format code does something like:

	xlog_copy_iovec(lv, &vecp, XLOG_REG_TYPE_ATTR_FORMAT,
	                &alip->alf, sizeof(struct xfs_attr_log_format));

	if (opflags == remove ||
	    opflags == flipflags)
		return;

	xlog_copy_iovec(lv, &vecp, XLOG_REG_TYPE_ATTR_NAME,
	                alip->name_buf, alip->namelen);
	xlog_copy_iovec(lv, &vecp, XLOG_REG_TYPE_ATTR_VALUE,
	                alip->value_buf, alip->valuelen);

Make a bit more sense now?

I suspect taht because we won't ever relog a "set" intent, we can
clear the name_buf/value_buf pointers from the log item once we've
copied them into the log vector. That means we can use pointers to
the buffers containing the name and value already allocated in the
attr code and so won't need to allocate another set of buffers just
for the log item.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH 02/17] Set up infastructure for deferred attribute operations
  2017-10-09 21:25     ` Allison Henderson
@ 2017-10-09 22:51       ` Allison Henderson
  2017-10-09 23:27         ` Dave Chinner
  0 siblings, 1 reply; 66+ messages in thread
From: Allison Henderson @ 2017-10-09 22:51 UTC (permalink / raw)
  To: Dave Chinner; +Cc: linux-xfs



On 10/09/2017 02:25 PM, Allison Henderson wrote:
>
>
> On 10/8/2017 9:20 PM, Dave Chinner wrote:
>>
>> On Fri, Oct 06, 2017 at 03:05:33PM -0700, Allison Henderson wrote:
>>> Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
>> hi Allison,
>>
>> I'm just having a quick browse, not a complete review. This is a
>> really good start for deferred attributes, but I think there's bits
>> we'll have to redesign slightly for performance reasons.
>>
>> First: needs a commit message to describe the design and structure,
>> so the reviewer is not left to guess. :P
>>
>>> ---
>>> :100644 100644 b3edd66... 8d2c152... M    fs/xfs/Makefile
>>> :100644 100644 b00ec1f... 5325ec2... M fs/xfs/libxfs/xfs_attr.c
>>> :100644 100644 2ea26b1... 38ae64a... M fs/xfs/libxfs/xfs_attr_leaf.c
>>> :100644 100644 d4f046d... ef0f8bf... M fs/xfs/libxfs/xfs_defer.h
>>> :100644 100644 8372e9b... 3778c8e... M fs/xfs/libxfs/xfs_log_format.h
>>> :100644 100644 0220159... 5372063... M fs/xfs/libxfs/xfs_types.h
>>> :100644 100644 8542606... 06c4081... M    fs/xfs/xfs_attr.h
>>> :000000 100644 0000000... 419f90a... A fs/xfs/xfs_attr_item.c
>>> :000000 100644 0000000... aec854f... A fs/xfs/xfs_attr_item.h
>>> :100644 100644 d9a3a55... a206d51... M    fs/xfs/xfs_super.c
>>> :100644 100644 815b53d2.. 66c3c5f... M    fs/xfs/xfs_trans.h
>>> :000000 100644 0000000... 183c841... A fs/xfs/xfs_trans_attr.c
>>
>> This info isn't needed. Diffstat is sufficient.
>>
>>> @@ -254,7 +261,9 @@ typedef struct xfs_trans_header {
>>>       { XFS_LI_CUI,        "XFS_LI_CUI" }, \
>>>       { XFS_LI_CUD,        "XFS_LI_CUD" }, \
>>>       { XFS_LI_BUI,        "XFS_LI_BUI" }, \
>>> -    { XFS_LI_BUD,        "XFS_LI_BUD" }
>>> +    { XFS_LI_BUD,        "XFS_LI_BUD" }, \
>>> +    { XFS_LI_ATTRI,        "XFS_LI_ATTRI" }, \
>>> +    { XFS_LI_ATTRD,        "XFS_LI_ATTRD" }
>>
>> "attr intent", "attr done"?
>>
>> What object/action are we taking here? Set, flip-flags or remove? Or
>> something else?
>>
>
> Yes, "intent" and "done" was the idea I was going for. The actions are 
> set and remove. The info needed for both operations seemed similar 
> enough that it seemed excessive to make another intent/done type.  The 
> xfs_attr struct has an attr_op_flags that marks it as a set or a 
> remove action.  I will add some comments in the code to help clarify.
>
>
>>>     /*
>>>    * Inode Log Item Format definitions.
>>> @@ -863,4 +872,45 @@ struct xfs_icreate_log {
>>>       __be32        icl_gen;    /* inode generation number to use */
>>>   };
>>>   +/* Flags for defered attribute operations */
>>> +#define ATTR_OP_FLAGS_SET    0x01    /* Set the attribute */
>>> +#define ATTR_OP_FLAGS_REMOVE    0x02    /* Remove the attribute */
>>> +#define ATTR_OP_FLAGS_MAX    0x02    /* Max flags */
>>> +
>>> +/*
>>> + * ATTRI/ATTRD log format definitions
>>> + */
>>> +struct xfs_attr {
>>> +    xfs_ino_t    attr_ino;
>>> +    uint32_t    attr_op_flags;
>>> +    uint32_t    attr_nameval_len;
>>> +    uint32_t    attr_name_len;
>>> +    uint32_t        attr_flags;
>>> +    uint8_t        attr_nameval[MAX_NAMEVAL_LEN];
>>> +};
>>
>> "struct xfs_attr" is very generic. This ends up in the log on disk,
>> right? So it's a log format structure? struct xfs_attr_log_format?
>>
>> This also needs padding to ensure it's size is 64bit aligned.
>>
Hmm, if this structure is meant to be stored on disk, is it really a 
good idea to put pointers in here as you mentioned in your conclusion 
below?  If we get remounted or rebooted and loose the pointer that may 
not work.  Or am I not understanding what you meant?

>>> +/*
>>> + * This is the structure used to lay out an attri log item in the
>>> + * log.  The attri_attrs field is a variable size array whose
>>> + * size is given by attri_nattrs.
>>> + */
>>> +struct xfs_attri_log_format {
>>> +    uint16_t        attri_type;    /* attri log item type */
>>> +    uint16_t        attri_size;    /* size of this item */
>>> +    uint64_t        attri_id;    /* attri identifier */
>>> +    struct xfs_attr        attri_attr;    /* attribute */
>>> +};
>>
>> That's got a 4 byte hole in it between attri_size and attri_id,
>> so needs explicit padding. What's attri_id supposed to be and how is
>> it used?
>>
>> Also, i'd drop the "attri" from these, so.....
>
>
> Hmm, I don't think attri_id is used.  I had used the extent free 
> intent code as a sort of template for this and probably missed culling 
> out the id.  I will get this struct cleaned up and padded out.
>
>>
>>> +
>>> +/*
>>> + * This is the structure used to lay out an attrd log item in the
>>> + * log.  The attrd_attrs array is a variable size array whose
>>> + * size is given by attrd_nattrs;
>>> + */
>>> +struct xfs_attrd_log_format {
>>> +    uint16_t        attrd_type;    /* attrd log item type */
>>> +    uint16_t        attrd_size;    /* size of this item */
>>> +    uint64_t        attrd_attri_id;    /* id of corresponding attri */
>>> +    struct xfs_attr        attrd_attr;    /* attribute */
>>> +};
>>
>> .... these can use the same struct xfs_attr_log_format structure.
>>
>
> Alrighty
>
>>>   #endif /* __XFS_LOG_FORMAT_H__ */
>>> diff --git a/fs/xfs/libxfs/xfs_types.h b/fs/xfs/libxfs/xfs_types.h
>>> index 0220159..5372063 100644
>>> --- a/fs/xfs/libxfs/xfs_types.h
>>> +++ b/fs/xfs/libxfs/xfs_types.h
>>> @@ -23,6 +23,7 @@ typedef uint32_t    prid_t;        /* project ID */
>>>   typedef uint32_t    xfs_agblock_t;    /* blockno in alloc. group */
>>>   typedef uint32_t    xfs_agino_t;    /* inode # within allocation 
>>> grp */
>>>   typedef uint32_t    xfs_extlen_t;    /* extent length in blocks */
>>> +typedef uint32_t    xfs_attrlen_t;    /* attr length */
>>>   typedef uint32_t    xfs_agnumber_t;    /* allocation group number */
>>>   typedef int32_t        xfs_extnum_t;    /* # of extents in a file */
>>>   typedef int16_t        xfs_aextnum_t;    /* # extents in an 
>>> attribute fork */
>>> diff --git a/fs/xfs/xfs_attr.h b/fs/xfs/xfs_attr.h
>>> index 8542606..06c4081 100644
>>> --- a/fs/xfs/xfs_attr.h
>>> +++ b/fs/xfs/xfs_attr.h
>>> @@ -18,6 +18,8 @@
>>>   #ifndef __XFS_ATTR_H__
>>>   #define    __XFS_ATTR_H__
>>>   +#include "libxfs/xfs_defer.h"
>>> +
>>>   struct xfs_inode;
>>>   struct xfs_da_args;
>>>   struct xfs_attr_list_context;
>>> @@ -65,6 +67,10 @@ struct xfs_attr_list_context;
>>>    */
>>>   #define    ATTR_MAX_VALUELEN    (64*1024)    /* max length of a 
>>> value */
>>>   +/* Max name length in the xfs_attr_item */
>>> +#define MAX_NAME_LEN        255
>>
>> Should be defined in xfs_da_format.h where the entries and
>> name length types are defined. SHould also try to derive it from
>> the namelen variable of one of the types rather than hard code it.
>>
>>> +#define MAX_NAMEVAL_LEN (MAX_NAME_LEN + ATTR_MAX_VALUELEN)
>>
>> as should this, I think.
>>> +
>>>   /*
>>>    * Define how lists of attribute names are returned to the user from
>>>    * the attr_list() call.  A large, 32bit aligned, buffer is passed in
>>> @@ -87,6 +93,19 @@ typedef struct attrlist_ent {    /* data from 
>>> attr_list() */
>>>   } attrlist_ent_t;
>>>     /*
>>> + * List of attrs to commit later.
>>> + */
>>> +struct xfs_attr_item {
>>> +    xfs_ino_t      xattri_ino;
>>> +    uint32_t      xattri_op_flags;
>>> +    uint32_t      xattri_nameval_len; /* length of name and val */
>>> +    uint32_t      xattri_name_len;    /* length of name */
>>> +    uint32_t      xattri_flags;       /* attr flags */
>>> +    char          xattri_nameval[MAX_NAMEVAL_LEN];
>>> +    struct list_head  xattri_list;
>>> +};
>>
>> Ok, that's a ~65kB structure.
>>
>> Oh, that means the ATTRI/ATTRD log format structures are also 65kB
>> structures. That's going to need fixing - that far too big an
>> allocation to be doing for tiny little xattrs like parent pointers.
>>
>
> Ok, I will see if I can come up with something more dynamic.
>
>>
>>
>>> +xfs_attri_item_free(
>>> +    struct xfs_attri_log_item    *attrip)
>>> +{
>>> +    kmem_free(attrip->attri_item.li_lv_shadow);
>>> +    kmem_free(attrip);
>>> +}
>>> +
>>> +/*
>>> + * This returns the number of iovecs needed to log the given attri 
>>> item.
>>> + * We only need 1 iovec for an attri item.  It just logs the 
>>> attri_log_format
>>> + * structure.
>>> + */
>>> +static inline int
>>> +xfs_attri_item_sizeof(
>>> +    struct xfs_attri_log_item *attrip)
>>> +{
>>> +    return sizeof(struct xfs_attri_log_format);
>>> +}
>>> +
>>> +STATIC void
>>> +xfs_attri_item_size(
>>> +    struct xfs_log_item    *lip,
>>> +    int            *nvecs,
>>> +    int            *nbytes)
>>> +{
>>> +    *nvecs += 1;
>>> +    *nbytes += xfs_attri_item_sizeof(ATTRI_ITEM(lip));
>>> +}
>>
>> This will trigger 65kB allocations.....
>>
>>> +
>>> +/*
>>> + * This is called to fill in the vector of log iovecs for the
>>> + * given attri log item. We use only 1 iovec, and we point that
>>> + * at the attri_log_format structure embedded in the attri item.
>>> + * It is at this point that we assert that all of the attr
>>> + * slots in the attri item have been filled.
>>> + */
>>> +STATIC void
>>> +xfs_attri_item_format(
>>> +    struct xfs_log_item    *lip,
>>> +    struct xfs_log_vec    *lv)
>>> +{
>>> +    struct xfs_attri_log_item    *attrip = ATTRI_ITEM(lip);
>>> +    struct xfs_log_iovec    *vecp = NULL;
>>> +
>>> +    attrip->attri_format.attri_type = XFS_LI_ATTRI;
>>> +    attrip->attri_format.attri_size = 1;
>>> +
>>> +    xlog_copy_iovec(lv, &vecp, XLOG_REG_TYPE_ATTRI_FORMAT,
>>> +            &attrip->attri_format,
>>> +            xfs_attri_item_sizeof(attrip));
>>> +}
>>
>> ANd we'll always copy 65kB structures here even if the attribute
>> is only a few tens of bytes. That's just going to burn through log
>> bandwidth and really needs fixing.
>>
>> THe log item (and log format) structures really need to point to the
>> attribute name/value information rather than contain copies of them.
>> That way the information that is logged and the allocations required
>> are sized exactly for the attribute being created/removed. The cost
>> of dynamically allocating the buffers is less than the cost of
>> unnecessarily copying and logging 64k on eveery attribute operation.
>>
>> Indeed, for a remove operation there is no value, so we should only
>> be logging an intent with a name (a few tens of bytes), not a 65kb
>> structure....
>>
>> I'll stop here for the moment, because most of this code is going to
>> change to support dynamic allocation of name/value buffers, anyway.
>>
>> Cheers,
>>
>> Dave.
>>
>
> Alrighty, thanks for the review Dave.  I will work on getting these 
> things updated and send out another set once I get it working.
>
> Thank you!
>
> Allison Henderson
> -- 
> To unsubscribe from this list: send the line "unsubscribe linux-xfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at 
> https://urldefense.proofpoint.com/v2/url?u=http-3A__vger.kernel.org_majordomo-2Dinfo.html&d=DwICaQ&c=RoP1YumCXCgaWHvlZYR8PQcxBKCX5YTpkKY057SbK10&r=XFp4B05bcXkJ0dhYaFjd3F8telP01COkBp9cI7mKLb4&m=nw_o5VNTP1pgN6tnmVUK8si19OGGbovdt9cIHgFXjww&s=TVRphCUEAJj2sjtglo4hDCrG8TqgKrNeG3GP5bVT2Oo&e= 



^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH 02/17] Set up infastructure for deferred attribute operations
  2017-10-09  4:20   ` Dave Chinner
@ 2017-10-09 21:25     ` Allison Henderson
  2017-10-09 22:51       ` Allison Henderson
  0 siblings, 1 reply; 66+ messages in thread
From: Allison Henderson @ 2017-10-09 21:25 UTC (permalink / raw)
  To: Dave Chinner; +Cc: linux-xfs



On 10/8/2017 9:20 PM, Dave Chinner wrote:
> 
> On Fri, Oct 06, 2017 at 03:05:33PM -0700, Allison Henderson wrote:
>> Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
> hi Allison,
> 
> I'm just having a quick browse, not a complete review. This is a
> really good start for deferred attributes, but I think there's bits
> we'll have to redesign slightly for performance reasons.
> 
> First: needs a commit message to describe the design and structure,
> so the reviewer is not left to guess. :P
> 
>> ---
>> :100644 100644 b3edd66... 8d2c152... M	fs/xfs/Makefile
>> :100644 100644 b00ec1f... 5325ec2... M	fs/xfs/libxfs/xfs_attr.c
>> :100644 100644 2ea26b1... 38ae64a... M	fs/xfs/libxfs/xfs_attr_leaf.c
>> :100644 100644 d4f046d... ef0f8bf... M	fs/xfs/libxfs/xfs_defer.h
>> :100644 100644 8372e9b... 3778c8e... M	fs/xfs/libxfs/xfs_log_format.h
>> :100644 100644 0220159... 5372063... M	fs/xfs/libxfs/xfs_types.h
>> :100644 100644 8542606... 06c4081... M	fs/xfs/xfs_attr.h
>> :000000 100644 0000000... 419f90a... A	fs/xfs/xfs_attr_item.c
>> :000000 100644 0000000... aec854f... A	fs/xfs/xfs_attr_item.h
>> :100644 100644 d9a3a55... a206d51... M	fs/xfs/xfs_super.c
>> :100644 100644 815b53d2.. 66c3c5f... M	fs/xfs/xfs_trans.h
>> :000000 100644 0000000... 183c841... A	fs/xfs/xfs_trans_attr.c
> 
> This info isn't needed. Diffstat is sufficient.
> 
>> @@ -254,7 +261,9 @@ typedef struct xfs_trans_header {
>>   	{ XFS_LI_CUI,		"XFS_LI_CUI" }, \
>>   	{ XFS_LI_CUD,		"XFS_LI_CUD" }, \
>>   	{ XFS_LI_BUI,		"XFS_LI_BUI" }, \
>> -	{ XFS_LI_BUD,		"XFS_LI_BUD" }
>> +	{ XFS_LI_BUD,		"XFS_LI_BUD" }, \
>> +	{ XFS_LI_ATTRI,		"XFS_LI_ATTRI" }, \
>> +	{ XFS_LI_ATTRD,		"XFS_LI_ATTRD" }
> 
> "attr intent", "attr done"?
> 
> What object/action are we taking here? Set, flip-flags or remove? Or
> something else?
> 

Yes, "intent" and "done" was the idea I was going for. The actions are 
set and remove. The info needed for both operations seemed similar 
enough that it seemed excessive to make another intent/done type.  The 
xfs_attr struct has an attr_op_flags that marks it as a set or a remove 
action.  I will add some comments in the code to help clarify.


>>   
>>   /*
>>    * Inode Log Item Format definitions.
>> @@ -863,4 +872,45 @@ struct xfs_icreate_log {
>>   	__be32		icl_gen;	/* inode generation number to use */
>>   };
>>   
>> +/* Flags for defered attribute operations */
>> +#define ATTR_OP_FLAGS_SET	0x01	/* Set the attribute */
>> +#define ATTR_OP_FLAGS_REMOVE	0x02	/* Remove the attribute */
>> +#define ATTR_OP_FLAGS_MAX	0x02	/* Max flags */
>> +
>> +/*
>> + * ATTRI/ATTRD log format definitions
>> + */
>> +struct xfs_attr {
>> +	xfs_ino_t	attr_ino;
>> +	uint32_t	attr_op_flags;
>> +	uint32_t	attr_nameval_len;
>> +	uint32_t	attr_name_len;
>> +	uint32_t        attr_flags;
>> +	uint8_t		attr_nameval[MAX_NAMEVAL_LEN];
>> +};
> 
> "struct xfs_attr" is very generic. This ends up in the log on disk,
> right? So it's a log format structure? struct xfs_attr_log_format?
> 
> This also needs padding to ensure it's size is 64bit aligned.
> 
>> +/*
>> + * This is the structure used to lay out an attri log item in the
>> + * log.  The attri_attrs field is a variable size array whose
>> + * size is given by attri_nattrs.
>> + */
>> +struct xfs_attri_log_format {
>> +	uint16_t		attri_type;	/* attri log item type */
>> +	uint16_t		attri_size;	/* size of this item */
>> +	uint64_t		attri_id;	/* attri identifier */
>> +	struct xfs_attr		attri_attr;	/* attribute */
>> +};
> 
> That's got a 4 byte hole in it between attri_size and attri_id,
> so needs explicit padding. What's attri_id supposed to be and how is
> it used?
> 
> Also, i'd drop the "attri" from these, so.....


Hmm, I don't think attri_id is used.  I had used the extent free intent 
code as a sort of template for this and probably missed culling out the 
id.  I will get this struct cleaned up and padded out.

> 
>> +
>> +/*
>> + * This is the structure used to lay out an attrd log item in the
>> + * log.  The attrd_attrs array is a variable size array whose
>> + * size is given by attrd_nattrs;
>> + */
>> +struct xfs_attrd_log_format {
>> +	uint16_t		attrd_type;	/* attrd log item type */
>> +	uint16_t		attrd_size;	/* size of this item */
>> +	uint64_t		attrd_attri_id;	/* id of corresponding attri */
>> +	struct xfs_attr		attrd_attr;	/* attribute */
>> +};
> 
> .... these can use the same struct xfs_attr_log_format structure.
> 

Alrighty

>>   #endif /* __XFS_LOG_FORMAT_H__ */
>> diff --git a/fs/xfs/libxfs/xfs_types.h b/fs/xfs/libxfs/xfs_types.h
>> index 0220159..5372063 100644
>> --- a/fs/xfs/libxfs/xfs_types.h
>> +++ b/fs/xfs/libxfs/xfs_types.h
>> @@ -23,6 +23,7 @@ typedef uint32_t	prid_t;		/* project ID */
>>   typedef uint32_t	xfs_agblock_t;	/* blockno in alloc. group */
>>   typedef uint32_t	xfs_agino_t;	/* inode # within allocation grp */
>>   typedef uint32_t	xfs_extlen_t;	/* extent length in blocks */
>> +typedef uint32_t	xfs_attrlen_t;	/* attr length */
>>   typedef uint32_t	xfs_agnumber_t;	/* allocation group number */
>>   typedef int32_t		xfs_extnum_t;	/* # of extents in a file */
>>   typedef int16_t		xfs_aextnum_t;	/* # extents in an attribute fork */
>> diff --git a/fs/xfs/xfs_attr.h b/fs/xfs/xfs_attr.h
>> index 8542606..06c4081 100644
>> --- a/fs/xfs/xfs_attr.h
>> +++ b/fs/xfs/xfs_attr.h
>> @@ -18,6 +18,8 @@
>>   #ifndef __XFS_ATTR_H__
>>   #define	__XFS_ATTR_H__
>>   
>> +#include "libxfs/xfs_defer.h"
>> +
>>   struct xfs_inode;
>>   struct xfs_da_args;
>>   struct xfs_attr_list_context;
>> @@ -65,6 +67,10 @@ struct xfs_attr_list_context;
>>    */
>>   #define	ATTR_MAX_VALUELEN	(64*1024)	/* max length of a value */
>>   
>> +/* Max name length in the xfs_attr_item */
>> +#define MAX_NAME_LEN		255
> 
> Should be defined in xfs_da_format.h where the entries and
> name length types are defined. SHould also try to derive it from
> the namelen variable of one of the types rather than hard code it.
> 
>> +#define MAX_NAMEVAL_LEN (MAX_NAME_LEN + ATTR_MAX_VALUELEN)
> 
> as should this, I think.
>> +
>>   /*
>>    * Define how lists of attribute names are returned to the user from
>>    * the attr_list() call.  A large, 32bit aligned, buffer is passed in
>> @@ -87,6 +93,19 @@ typedef struct attrlist_ent {	/* data from attr_list() */
>>   } attrlist_ent_t;
>>   
>>   /*
>> + * List of attrs to commit later.
>> + */
>> +struct xfs_attr_item {
>> +	xfs_ino_t	  xattri_ino;
>> +	uint32_t	  xattri_op_flags;
>> +	uint32_t	  xattri_nameval_len; /* length of name and val */
>> +	uint32_t	  xattri_name_len;    /* length of name */
>> +	uint32_t	  xattri_flags;       /* attr flags */
>> +	char		  xattri_nameval[MAX_NAMEVAL_LEN];
>> +	struct list_head  xattri_list;
>> +};
> 
> Ok, that's a ~65kB structure.
> 
> Oh, that means the ATTRI/ATTRD log format structures are also 65kB
> structures. That's going to need fixing - that far too big an
> allocation to be doing for tiny little xattrs like parent pointers.
> 

Ok, I will see if I can come up with something more dynamic.

> 
> 
>> +xfs_attri_item_free(
>> +	struct xfs_attri_log_item	*attrip)
>> +{
>> +	kmem_free(attrip->attri_item.li_lv_shadow);
>> +	kmem_free(attrip);
>> +}
>> +
>> +/*
>> + * This returns the number of iovecs needed to log the given attri item.
>> + * We only need 1 iovec for an attri item.  It just logs the attri_log_format
>> + * structure.
>> + */
>> +static inline int
>> +xfs_attri_item_sizeof(
>> +	struct xfs_attri_log_item *attrip)
>> +{
>> +	return sizeof(struct xfs_attri_log_format);
>> +}
>> +
>> +STATIC void
>> +xfs_attri_item_size(
>> +	struct xfs_log_item	*lip,
>> +	int			*nvecs,
>> +	int			*nbytes)
>> +{
>> +	*nvecs += 1;
>> +	*nbytes += xfs_attri_item_sizeof(ATTRI_ITEM(lip));
>> +}
> 
> This will trigger 65kB allocations.....
> 
>> +
>> +/*
>> + * This is called to fill in the vector of log iovecs for the
>> + * given attri log item. We use only 1 iovec, and we point that
>> + * at the attri_log_format structure embedded in the attri item.
>> + * It is at this point that we assert that all of the attr
>> + * slots in the attri item have been filled.
>> + */
>> +STATIC void
>> +xfs_attri_item_format(
>> +	struct xfs_log_item	*lip,
>> +	struct xfs_log_vec	*lv)
>> +{
>> +	struct xfs_attri_log_item	*attrip = ATTRI_ITEM(lip);
>> +	struct xfs_log_iovec	*vecp = NULL;
>> +
>> +	attrip->attri_format.attri_type = XFS_LI_ATTRI;
>> +	attrip->attri_format.attri_size = 1;
>> +
>> +	xlog_copy_iovec(lv, &vecp, XLOG_REG_TYPE_ATTRI_FORMAT,
>> +			&attrip->attri_format,
>> +			xfs_attri_item_sizeof(attrip));
>> +}
> 
> ANd we'll always copy 65kB structures here even if the attribute
> is only a few tens of bytes. That's just going to burn through log
> bandwidth and really needs fixing.
> 
> THe log item (and log format) structures really need to point to the
> attribute name/value information rather than contain copies of them.
> That way the information that is logged and the allocations required
> are sized exactly for the attribute being created/removed. The cost
> of dynamically allocating the buffers is less than the cost of
> unnecessarily copying and logging 64k on eveery attribute operation.
> 
> Indeed, for a remove operation there is no value, so we should only
> be logging an intent with a name (a few tens of bytes), not a 65kb
> structure....
> 
> I'll stop here for the moment, because most of this code is going to
> change to support dynamic allocation of name/value buffers, anyway.
> 
> Cheers,
> 
> Dave.
> 

Alrighty, thanks for the review Dave.  I will work on getting these 
things updated and send out another set once I get it working.

Thank you!

Allison Henderson

^ permalink raw reply	[flat|nested] 66+ messages in thread

* Re: [PATCH 02/17] Set up infastructure for deferred attribute operations
  2017-10-06 22:05 ` [PATCH 02/17] Set up infastructure for deferred attribute operations Allison Henderson
@ 2017-10-09  4:20   ` Dave Chinner
  2017-10-09 21:25     ` Allison Henderson
  0 siblings, 1 reply; 66+ messages in thread
From: Dave Chinner @ 2017-10-09  4:20 UTC (permalink / raw)
  To: Allison Henderson; +Cc: linux-xfs


On Fri, Oct 06, 2017 at 03:05:33PM -0700, Allison Henderson wrote:
> Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
hi Allison, 

I'm just having a quick browse, not a complete review. This is a
really good start for deferred attributes, but I think there's bits
we'll have to redesign slightly for performance reasons.

First: needs a commit message to describe the design and structure,
so the reviewer is not left to guess. :P

> ---
> :100644 100644 b3edd66... 8d2c152... M	fs/xfs/Makefile
> :100644 100644 b00ec1f... 5325ec2... M	fs/xfs/libxfs/xfs_attr.c
> :100644 100644 2ea26b1... 38ae64a... M	fs/xfs/libxfs/xfs_attr_leaf.c
> :100644 100644 d4f046d... ef0f8bf... M	fs/xfs/libxfs/xfs_defer.h
> :100644 100644 8372e9b... 3778c8e... M	fs/xfs/libxfs/xfs_log_format.h
> :100644 100644 0220159... 5372063... M	fs/xfs/libxfs/xfs_types.h
> :100644 100644 8542606... 06c4081... M	fs/xfs/xfs_attr.h
> :000000 100644 0000000... 419f90a... A	fs/xfs/xfs_attr_item.c
> :000000 100644 0000000... aec854f... A	fs/xfs/xfs_attr_item.h
> :100644 100644 d9a3a55... a206d51... M	fs/xfs/xfs_super.c
> :100644 100644 815b53d2.. 66c3c5f... M	fs/xfs/xfs_trans.h
> :000000 100644 0000000... 183c841... A	fs/xfs/xfs_trans_attr.c

This info isn't needed. Diffstat is sufficient.

> @@ -254,7 +261,9 @@ typedef struct xfs_trans_header {
>  	{ XFS_LI_CUI,		"XFS_LI_CUI" }, \
>  	{ XFS_LI_CUD,		"XFS_LI_CUD" }, \
>  	{ XFS_LI_BUI,		"XFS_LI_BUI" }, \
> -	{ XFS_LI_BUD,		"XFS_LI_BUD" }
> +	{ XFS_LI_BUD,		"XFS_LI_BUD" }, \
> +	{ XFS_LI_ATTRI,		"XFS_LI_ATTRI" }, \
> +	{ XFS_LI_ATTRD,		"XFS_LI_ATTRD" }

"attr intent", "attr done"?

What object/action are we taking here? Set, flip-flags or remove? Or
something else?

>  
>  /*
>   * Inode Log Item Format definitions.
> @@ -863,4 +872,45 @@ struct xfs_icreate_log {
>  	__be32		icl_gen;	/* inode generation number to use */
>  };
>  
> +/* Flags for defered attribute operations */
> +#define ATTR_OP_FLAGS_SET	0x01	/* Set the attribute */
> +#define ATTR_OP_FLAGS_REMOVE	0x02	/* Remove the attribute */
> +#define ATTR_OP_FLAGS_MAX	0x02	/* Max flags */
> +
> +/*
> + * ATTRI/ATTRD log format definitions
> + */
> +struct xfs_attr {
> +	xfs_ino_t	attr_ino;
> +	uint32_t	attr_op_flags;
> +	uint32_t	attr_nameval_len;
> +	uint32_t	attr_name_len;
> +	uint32_t        attr_flags;
> +	uint8_t		attr_nameval[MAX_NAMEVAL_LEN];
> +};

"struct xfs_attr" is very generic. This ends up in the log on disk,
right? So it's a log format structure? struct xfs_attr_log_format?

This also needs padding to ensure it's size is 64bit aligned.

> +/*
> + * This is the structure used to lay out an attri log item in the
> + * log.  The attri_attrs field is a variable size array whose
> + * size is given by attri_nattrs.
> + */
> +struct xfs_attri_log_format {
> +	uint16_t		attri_type;	/* attri log item type */
> +	uint16_t		attri_size;	/* size of this item */
> +	uint64_t		attri_id;	/* attri identifier */
> +	struct xfs_attr		attri_attr;	/* attribute */
> +};

That's got a 4 byte hole in it between attri_size and attri_id,
so needs explicit padding. What's attri_id supposed to be and how is
it used?

Also, i'd drop the "attri" from these, so.....

> +
> +/*
> + * This is the structure used to lay out an attrd log item in the
> + * log.  The attrd_attrs array is a variable size array whose
> + * size is given by attrd_nattrs;
> + */
> +struct xfs_attrd_log_format {
> +	uint16_t		attrd_type;	/* attrd log item type */
> +	uint16_t		attrd_size;	/* size of this item */
> +	uint64_t		attrd_attri_id;	/* id of corresponding attri */
> +	struct xfs_attr		attrd_attr;	/* attribute */
> +};

.... these can use the same struct xfs_attr_log_format structure.

>  #endif /* __XFS_LOG_FORMAT_H__ */
> diff --git a/fs/xfs/libxfs/xfs_types.h b/fs/xfs/libxfs/xfs_types.h
> index 0220159..5372063 100644
> --- a/fs/xfs/libxfs/xfs_types.h
> +++ b/fs/xfs/libxfs/xfs_types.h
> @@ -23,6 +23,7 @@ typedef uint32_t	prid_t;		/* project ID */
>  typedef uint32_t	xfs_agblock_t;	/* blockno in alloc. group */
>  typedef uint32_t	xfs_agino_t;	/* inode # within allocation grp */
>  typedef uint32_t	xfs_extlen_t;	/* extent length in blocks */
> +typedef uint32_t	xfs_attrlen_t;	/* attr length */
>  typedef uint32_t	xfs_agnumber_t;	/* allocation group number */
>  typedef int32_t		xfs_extnum_t;	/* # of extents in a file */
>  typedef int16_t		xfs_aextnum_t;	/* # extents in an attribute fork */
> diff --git a/fs/xfs/xfs_attr.h b/fs/xfs/xfs_attr.h
> index 8542606..06c4081 100644
> --- a/fs/xfs/xfs_attr.h
> +++ b/fs/xfs/xfs_attr.h
> @@ -18,6 +18,8 @@
>  #ifndef __XFS_ATTR_H__
>  #define	__XFS_ATTR_H__
>  
> +#include "libxfs/xfs_defer.h"
> +
>  struct xfs_inode;
>  struct xfs_da_args;
>  struct xfs_attr_list_context;
> @@ -65,6 +67,10 @@ struct xfs_attr_list_context;
>   */
>  #define	ATTR_MAX_VALUELEN	(64*1024)	/* max length of a value */
>  
> +/* Max name length in the xfs_attr_item */
> +#define MAX_NAME_LEN		255

Should be defined in xfs_da_format.h where the entries and
name length types are defined. SHould also try to derive it from
the namelen variable of one of the types rather than hard code it.

> +#define MAX_NAMEVAL_LEN (MAX_NAME_LEN + ATTR_MAX_VALUELEN)

as should this, I think.
> +
>  /*
>   * Define how lists of attribute names are returned to the user from
>   * the attr_list() call.  A large, 32bit aligned, buffer is passed in
> @@ -87,6 +93,19 @@ typedef struct attrlist_ent {	/* data from attr_list() */
>  } attrlist_ent_t;
>  
>  /*
> + * List of attrs to commit later.
> + */
> +struct xfs_attr_item {
> +	xfs_ino_t	  xattri_ino;
> +	uint32_t	  xattri_op_flags;
> +	uint32_t	  xattri_nameval_len; /* length of name and val */
> +	uint32_t	  xattri_name_len;    /* length of name */
> +	uint32_t	  xattri_flags;       /* attr flags */
> +	char		  xattri_nameval[MAX_NAMEVAL_LEN];
> +	struct list_head  xattri_list;
> +};

Ok, that's a ~65kB structure.

Oh, that means the ATTRI/ATTRD log format structures are also 65kB
structures. That's going to need fixing - that far too big an
allocation to be doing for tiny little xattrs like parent pointers.



> +xfs_attri_item_free(
> +	struct xfs_attri_log_item	*attrip)
> +{
> +	kmem_free(attrip->attri_item.li_lv_shadow);
> +	kmem_free(attrip);
> +}
> +
> +/*
> + * This returns the number of iovecs needed to log the given attri item.
> + * We only need 1 iovec for an attri item.  It just logs the attri_log_format
> + * structure.
> + */
> +static inline int
> +xfs_attri_item_sizeof(
> +	struct xfs_attri_log_item *attrip)
> +{
> +	return sizeof(struct xfs_attri_log_format);
> +}
> +
> +STATIC void
> +xfs_attri_item_size(
> +	struct xfs_log_item	*lip,
> +	int			*nvecs,
> +	int			*nbytes)
> +{
> +	*nvecs += 1;
> +	*nbytes += xfs_attri_item_sizeof(ATTRI_ITEM(lip));
> +}

This will trigger 65kB allocations.....

> +
> +/*
> + * This is called to fill in the vector of log iovecs for the
> + * given attri log item. We use only 1 iovec, and we point that
> + * at the attri_log_format structure embedded in the attri item.
> + * It is at this point that we assert that all of the attr
> + * slots in the attri item have been filled.
> + */
> +STATIC void
> +xfs_attri_item_format(
> +	struct xfs_log_item	*lip,
> +	struct xfs_log_vec	*lv)
> +{
> +	struct xfs_attri_log_item	*attrip = ATTRI_ITEM(lip);
> +	struct xfs_log_iovec	*vecp = NULL;
> +
> +	attrip->attri_format.attri_type = XFS_LI_ATTRI;
> +	attrip->attri_format.attri_size = 1;
> +
> +	xlog_copy_iovec(lv, &vecp, XLOG_REG_TYPE_ATTRI_FORMAT,
> +			&attrip->attri_format,
> +			xfs_attri_item_sizeof(attrip));
> +}

ANd we'll always copy 65kB structures here even if the attribute
is only a few tens of bytes. That's just going to burn through log
bandwidth and really needs fixing.

THe log item (and log format) structures really need to point to the
attribute name/value information rather than contain copies of them.
That way the information that is logged and the allocations required
are sized exactly for the attribute being created/removed. The cost
of dynamically allocating the buffers is less than the cost of
unnecessarily copying and logging 64k on eveery attribute operation.

Indeed, for a remove operation there is no value, so we should only
be logging an intent with a name (a few tens of bytes), not a 65kb
structure....

I'll stop here for the moment, because most of this code is going to
change to support dynamic allocation of name/value buffers, anyway.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 66+ messages in thread

* [PATCH 02/17] Set up infastructure for deferred attribute operations
  2017-10-06 22:05 [PATCH 00/17] Parent Pointers V2 Allison Henderson
@ 2017-10-06 22:05 ` Allison Henderson
  2017-10-09  4:20   ` Dave Chinner
  0 siblings, 1 reply; 66+ messages in thread
From: Allison Henderson @ 2017-10-06 22:05 UTC (permalink / raw)
  To: linux-xfs; +Cc: Allison Henderson

Signed-off-by: Allison Henderson <allison.henderson@oracle.com>
---
:100644 100644 b3edd66... 8d2c152... M	fs/xfs/Makefile
:100644 100644 b00ec1f... 5325ec2... M	fs/xfs/libxfs/xfs_attr.c
:100644 100644 2ea26b1... 38ae64a... M	fs/xfs/libxfs/xfs_attr_leaf.c
:100644 100644 d4f046d... ef0f8bf... M	fs/xfs/libxfs/xfs_defer.h
:100644 100644 8372e9b... 3778c8e... M	fs/xfs/libxfs/xfs_log_format.h
:100644 100644 0220159... 5372063... M	fs/xfs/libxfs/xfs_types.h
:100644 100644 8542606... 06c4081... M	fs/xfs/xfs_attr.h
:000000 100644 0000000... 419f90a... A	fs/xfs/xfs_attr_item.c
:000000 100644 0000000... aec854f... A	fs/xfs/xfs_attr_item.h
:100644 100644 d9a3a55... a206d51... M	fs/xfs/xfs_super.c
:100644 100644 815b53d2.. 66c3c5f... M	fs/xfs/xfs_trans.h
:000000 100644 0000000... 183c841... A	fs/xfs/xfs_trans_attr.c
 fs/xfs/Makefile                |   2 +
 fs/xfs/libxfs/xfs_attr.c       |   2 +-
 fs/xfs/libxfs/xfs_attr_leaf.c  |   1 +
 fs/xfs/libxfs/xfs_defer.h      |   1 +
 fs/xfs/libxfs/xfs_log_format.h |  54 ++++-
 fs/xfs/libxfs/xfs_types.h      |   1 +
 fs/xfs/xfs_attr.h              |  23 +-
 fs/xfs/xfs_attr_item.c         | 476 +++++++++++++++++++++++++++++++++++++++++
 fs/xfs/xfs_attr_item.h         | 104 +++++++++
 fs/xfs/xfs_super.c             |   1 +
 fs/xfs/xfs_trans.h             |  13 ++
 fs/xfs/xfs_trans_attr.c        | 293 +++++++++++++++++++++++++
 12 files changed, 967 insertions(+), 4 deletions(-)

diff --git a/fs/xfs/Makefile b/fs/xfs/Makefile
index b3edd66..8d2c152 100644
--- a/fs/xfs/Makefile
+++ b/fs/xfs/Makefile
@@ -106,6 +106,7 @@ xfs-y				+= xfs_log.o \
 				   xfs_bmap_item.o \
 				   xfs_buf_item.o \
 				   xfs_extfree_item.o \
+				   xfs_attr_item.o \
 				   xfs_icreate_item.o \
 				   xfs_inode_item.o \
 				   xfs_refcount_item.o \
@@ -115,6 +116,7 @@ xfs-y				+= xfs_log.o \
 				   xfs_trans_bmap.o \
 				   xfs_trans_buf.o \
 				   xfs_trans_extfree.o \
+				   xfs_trans_attr.o \
 				   xfs_trans_inode.o \
 				   xfs_trans_refcount.o \
 				   xfs_trans_rmap.o \
diff --git a/fs/xfs/libxfs/xfs_attr.c b/fs/xfs/libxfs/xfs_attr.c
index b00ec1f..5325ec2 100644
--- a/fs/xfs/libxfs/xfs_attr.c
+++ b/fs/xfs/libxfs/xfs_attr.c
@@ -74,7 +74,7 @@ STATIC int xfs_attr_fillstate(xfs_da_state_t *state);
 STATIC int xfs_attr_refillstate(xfs_da_state_t *state);
 
 
-STATIC int
+int
 xfs_attr_args_init(
 	struct xfs_da_args	*args,
 	struct xfs_inode	*dp,
diff --git a/fs/xfs/libxfs/xfs_attr_leaf.c b/fs/xfs/libxfs/xfs_attr_leaf.c
index 2ea26b1..38ae64a 100644
--- a/fs/xfs/libxfs/xfs_attr_leaf.c
+++ b/fs/xfs/libxfs/xfs_attr_leaf.c
@@ -579,6 +579,7 @@ xfs_attr_shortform_add(xfs_da_args_t *args, int forkoff)
 	memcpy(&sfe->nameval[args->namelen], args->value, args->valuelen);
 	sf->hdr.count++;
 	be16_add_cpu(&sf->hdr.totsize, size);
+
 	xfs_trans_log_inode(args->trans, dp, XFS_ILOG_CORE | XFS_ILOG_ADATA);
 
 	xfs_sbversion_add_attr2(mp, args->trans);
diff --git a/fs/xfs/libxfs/xfs_defer.h b/fs/xfs/libxfs/xfs_defer.h
index d4f046d..ef0f8bf 100644
--- a/fs/xfs/libxfs/xfs_defer.h
+++ b/fs/xfs/libxfs/xfs_defer.h
@@ -55,6 +55,7 @@ enum xfs_defer_ops_type {
 	XFS_DEFER_OPS_TYPE_REFCOUNT,
 	XFS_DEFER_OPS_TYPE_RMAP,
 	XFS_DEFER_OPS_TYPE_FREE,
+	XFS_DEFER_OPS_TYPE_ATTR,
 	XFS_DEFER_OPS_TYPE_MAX,
 };
 
diff --git a/fs/xfs/libxfs/xfs_log_format.h b/fs/xfs/libxfs/xfs_log_format.h
index 8372e9b..3778c8e 100644
--- a/fs/xfs/libxfs/xfs_log_format.h
+++ b/fs/xfs/libxfs/xfs_log_format.h
@@ -18,6 +18,8 @@
 #ifndef	__XFS_LOG_FORMAT_H__
 #define __XFS_LOG_FORMAT_H__
 
+#include "xfs_attr.h"
+
 struct xfs_mount;
 struct xfs_trans_res;
 
@@ -116,7 +118,10 @@ static inline uint xlog_get_cycle(char *ptr)
 #define XLOG_REG_TYPE_CUD_FORMAT	24
 #define XLOG_REG_TYPE_BUI_FORMAT	25
 #define XLOG_REG_TYPE_BUD_FORMAT	26
-#define XLOG_REG_TYPE_MAX		26
+#define XLOG_REG_TYPE_ATTRI_FORMAT	27
+#define XLOG_REG_TYPE_ATTRD_FORMAT	28
+#define XLOG_REG_TYPE_MAX		29
+
 
 /*
  * Flags to log operation header
@@ -239,6 +244,8 @@ typedef struct xfs_trans_header {
 #define	XFS_LI_CUD		0x1243
 #define	XFS_LI_BUI		0x1244	/* bmbt update intent */
 #define	XFS_LI_BUD		0x1245
+#define	XFS_LI_ATTRI		0x1246
+#define	XFS_LI_ATTRD		0x1247
 
 #define XFS_LI_TYPE_DESC \
 	{ XFS_LI_EFI,		"XFS_LI_EFI" }, \
@@ -254,7 +261,9 @@ typedef struct xfs_trans_header {
 	{ XFS_LI_CUI,		"XFS_LI_CUI" }, \
 	{ XFS_LI_CUD,		"XFS_LI_CUD" }, \
 	{ XFS_LI_BUI,		"XFS_LI_BUI" }, \
-	{ XFS_LI_BUD,		"XFS_LI_BUD" }
+	{ XFS_LI_BUD,		"XFS_LI_BUD" }, \
+	{ XFS_LI_ATTRI,		"XFS_LI_ATTRI" }, \
+	{ XFS_LI_ATTRD,		"XFS_LI_ATTRD" }
 
 /*
  * Inode Log Item Format definitions.
@@ -863,4 +872,45 @@ struct xfs_icreate_log {
 	__be32		icl_gen;	/* inode generation number to use */
 };
 
+/* Flags for defered attribute operations */
+#define ATTR_OP_FLAGS_SET	0x01	/* Set the attribute */
+#define ATTR_OP_FLAGS_REMOVE	0x02	/* Remove the attribute */
+#define ATTR_OP_FLAGS_MAX	0x02	/* Max flags */
+
+/*
+ * ATTRI/ATTRD log format definitions
+ */
+struct xfs_attr {
+	xfs_ino_t	attr_ino;
+	uint32_t	attr_op_flags;
+	uint32_t	attr_nameval_len;
+	uint32_t	attr_name_len;
+	uint32_t        attr_flags;
+	uint8_t		attr_nameval[MAX_NAMEVAL_LEN];
+};
+
+/*
+ * This is the structure used to lay out an attri log item in the
+ * log.  The attri_attrs field is a variable size array whose
+ * size is given by attri_nattrs.
+ */
+struct xfs_attri_log_format {
+	uint16_t		attri_type;	/* attri log item type */
+	uint16_t		attri_size;	/* size of this item */
+	uint64_t		attri_id;	/* attri identifier */
+	struct xfs_attr		attri_attr;	/* attribute */
+};
+
+/*
+ * This is the structure used to lay out an attrd log item in the
+ * log.  The attrd_attrs array is a variable size array whose
+ * size is given by attrd_nattrs;
+ */
+struct xfs_attrd_log_format {
+	uint16_t		attrd_type;	/* attrd log item type */
+	uint16_t		attrd_size;	/* size of this item */
+	uint64_t		attrd_attri_id;	/* id of corresponding attri */
+	struct xfs_attr		attrd_attr;	/* attribute */
+};
+
 #endif /* __XFS_LOG_FORMAT_H__ */
diff --git a/fs/xfs/libxfs/xfs_types.h b/fs/xfs/libxfs/xfs_types.h
index 0220159..5372063 100644
--- a/fs/xfs/libxfs/xfs_types.h
+++ b/fs/xfs/libxfs/xfs_types.h
@@ -23,6 +23,7 @@ typedef uint32_t	prid_t;		/* project ID */
 typedef uint32_t	xfs_agblock_t;	/* blockno in alloc. group */
 typedef uint32_t	xfs_agino_t;	/* inode # within allocation grp */
 typedef uint32_t	xfs_extlen_t;	/* extent length in blocks */
+typedef uint32_t	xfs_attrlen_t;	/* attr length */
 typedef uint32_t	xfs_agnumber_t;	/* allocation group number */
 typedef int32_t		xfs_extnum_t;	/* # of extents in a file */
 typedef int16_t		xfs_aextnum_t;	/* # extents in an attribute fork */
diff --git a/fs/xfs/xfs_attr.h b/fs/xfs/xfs_attr.h
index 8542606..06c4081 100644
--- a/fs/xfs/xfs_attr.h
+++ b/fs/xfs/xfs_attr.h
@@ -18,6 +18,8 @@
 #ifndef __XFS_ATTR_H__
 #define	__XFS_ATTR_H__
 
+#include "libxfs/xfs_defer.h"
+
 struct xfs_inode;
 struct xfs_da_args;
 struct xfs_attr_list_context;
@@ -65,6 +67,10 @@ struct xfs_attr_list_context;
  */
 #define	ATTR_MAX_VALUELEN	(64*1024)	/* max length of a value */
 
+/* Max name length in the xfs_attr_item */
+#define MAX_NAME_LEN		255
+#define MAX_NAMEVAL_LEN (MAX_NAME_LEN + ATTR_MAX_VALUELEN)
+
 /*
  * Define how lists of attribute names are returned to the user from
  * the attr_list() call.  A large, 32bit aligned, buffer is passed in
@@ -87,6 +93,19 @@ typedef struct attrlist_ent {	/* data from attr_list() */
 } attrlist_ent_t;
 
 /*
+ * List of attrs to commit later.
+ */
+struct xfs_attr_item {
+	xfs_ino_t	  xattri_ino;
+	uint32_t	  xattri_op_flags;
+	uint32_t	  xattri_nameval_len; /* length of name and val */
+	uint32_t	  xattri_name_len;    /* length of name */
+	uint32_t	  xattri_flags;       /* attr flags */
+	char		  xattri_nameval[MAX_NAMEVAL_LEN];
+	struct list_head  xattri_list;
+};
+
+/*
  * Given a pointer to the (char*) buffer containing the attr_list() result,
  * and an index, return a pointer to the indicated attribute in the buffer.
  */
@@ -154,6 +173,8 @@ int xfs_attr_remove(struct xfs_inode *dp, const unsigned char *name, int flags);
 int xfs_attr_remove_args(struct xfs_da_args *args, int flags);
 int xfs_attr_list(struct xfs_inode *dp, char *buffer, int bufsize,
 		  int flags, struct attrlist_cursor_kern *cursor);
-
+int xfs_attr_args_init(struct xfs_da_args *args, struct xfs_inode *dp,
+		       const unsigned char *name, int flags);
+int xfs_attr_calc_size(struct xfs_da_args *args, int *local);
 
 #endif	/* __XFS_ATTR_H__ */
diff --git a/fs/xfs/xfs_attr_item.c b/fs/xfs/xfs_attr_item.c
new file mode 100644
index 0000000..419f90a
--- /dev/null
+++ b/fs/xfs/xfs_attr_item.c
@@ -0,0 +1,476 @@
+/*
+ * Copyright (c) 2017 Oracle, Inc.
+ * All Rights Reserved.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it would be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write the Free Software Foundation Inc.
+ */
+#include "xfs.h"
+#include "xfs_fs.h"
+#include "xfs_format.h"
+#include "xfs_log_format.h"
+#include "xfs_trans_resv.h"
+#include "xfs_bit.h"
+#include "xfs_mount.h"
+#include "xfs_trans.h"
+#include "xfs_trans_priv.h"
+#include "xfs_buf_item.h"
+#include "xfs_attr_item.h"
+#include "xfs_log.h"
+#include "xfs_btree.h"
+#include "xfs_rmap.h"
+
+
+static inline struct xfs_attri_log_item *ATTRI_ITEM(struct xfs_log_item *lip)
+{
+	return container_of(lip, struct xfs_attri_log_item, attri_item);
+}
+
+void
+xfs_attri_item_free(
+	struct xfs_attri_log_item	*attrip)
+{
+	kmem_free(attrip->attri_item.li_lv_shadow);
+	kmem_free(attrip);
+}
+
+/*
+ * This returns the number of iovecs needed to log the given attri item.
+ * We only need 1 iovec for an attri item.  It just logs the attri_log_format
+ * structure.
+ */
+static inline int
+xfs_attri_item_sizeof(
+	struct xfs_attri_log_item *attrip)
+{
+	return sizeof(struct xfs_attri_log_format);
+}
+
+STATIC void
+xfs_attri_item_size(
+	struct xfs_log_item	*lip,
+	int			*nvecs,
+	int			*nbytes)
+{
+	*nvecs += 1;
+	*nbytes += xfs_attri_item_sizeof(ATTRI_ITEM(lip));
+}
+
+/*
+ * This is called to fill in the vector of log iovecs for the
+ * given attri log item. We use only 1 iovec, and we point that
+ * at the attri_log_format structure embedded in the attri item.
+ * It is at this point that we assert that all of the attr
+ * slots in the attri item have been filled.
+ */
+STATIC void
+xfs_attri_item_format(
+	struct xfs_log_item	*lip,
+	struct xfs_log_vec	*lv)
+{
+	struct xfs_attri_log_item	*attrip = ATTRI_ITEM(lip);
+	struct xfs_log_iovec	*vecp = NULL;
+
+	attrip->attri_format.attri_type = XFS_LI_ATTRI;
+	attrip->attri_format.attri_size = 1;
+
+	xlog_copy_iovec(lv, &vecp, XLOG_REG_TYPE_ATTRI_FORMAT,
+			&attrip->attri_format,
+			xfs_attri_item_sizeof(attrip));
+}
+
+
+/*
+ * Pinning has no meaning for an attri item, so just return.
+ */
+STATIC void
+xfs_attri_item_pin(
+	struct xfs_log_item	*lip)
+{
+}
+
+/*
+ * The unpin operation is the last place an ATTRI is manipulated in the log. It
+ * is either inserted in the AIL or aborted in the event of a log I/O error. In
+ * either case, the EFI transaction has been successfully committed to make it
+ * this far. Therefore, we expect whoever committed the ATTRI to either
+ * construct and commit the ATTRD or drop the ATTRD's reference in the event of
+ * error. Simply drop the log's ATTRI reference now that the log is done with
+ * it.
+ */
+STATIC void
+xfs_attri_item_unpin(
+	struct xfs_log_item	*lip,
+	int			remove)
+{
+	struct xfs_attri_log_item	*attrip = ATTRI_ITEM(lip);
+
+	xfs_attri_release(attrip);
+}
+
+/*
+ * attri items have no locking or pushing.  However, since ATTRIs are pulled
+ * from the AIL when their corresponding ATTRDs are committed to disk, their
+ * situation is very similar to being pinned.  Return XFS_ITEM_PINNED so that
+ * the caller will eventually flush the log.  This should help in getting the
+ * ATTRI out of the AIL.
+ */
+STATIC uint
+xfs_attri_item_push(
+	struct xfs_log_item	*lip,
+	struct list_head	*buffer_list)
+{
+	return XFS_ITEM_PINNED;
+}
+
+/*
+ * The ATTRI has been either committed or aborted if the transaction has been
+ * cancelled. If the transaction was cancelled, an ATTRD isn't going to be
+ * constructed and thus we free the ATTRI here directly.
+ */
+STATIC void
+xfs_attri_item_unlock(
+	struct xfs_log_item	*lip)
+{
+	if (lip->li_flags & XFS_LI_ABORTED)
+		xfs_attri_item_free(ATTRI_ITEM(lip));
+}
+
+/*
+ * The ATTRI is logged only once and cannot be moved in the log, so simply
+ * return the lsn at which it's been logged.
+ */
+STATIC xfs_lsn_t
+xfs_attri_item_committed(
+	struct xfs_log_item	*lip,
+	xfs_lsn_t		lsn)
+{
+	return lsn;
+}
+
+STATIC void
+xfs_attri_item_committing(
+	struct xfs_log_item	*lip,
+	xfs_lsn_t		lsn)
+{
+}
+
+/*
+ * This is the ops vector shared by all attri log items.
+ */
+static const struct xfs_item_ops xfs_attri_item_ops = {
+	.iop_size	= xfs_attri_item_size,
+	.iop_format	= xfs_attri_item_format,
+	.iop_pin	= xfs_attri_item_pin,
+	.iop_unpin	= xfs_attri_item_unpin,
+	.iop_unlock	= xfs_attri_item_unlock,
+	.iop_committed	= xfs_attri_item_committed,
+	.iop_push	= xfs_attri_item_push,
+	.iop_committing = xfs_attri_item_committing
+};
+
+
+/*
+ * Allocate and initialize an attri item
+ */
+struct xfs_attri_log_item *
+xfs_attri_init(
+	struct xfs_mount	*mp)
+
+{
+	struct xfs_attri_log_item	*attrip;
+	uint			size;
+
+	size = (uint)(sizeof(struct xfs_attri_log_item));
+	attrip = kmem_zalloc(size, KM_SLEEP);
+
+	xfs_log_item_init(mp, &(attrip->attri_item), XFS_LI_ATTRI,
+			  &xfs_attri_item_ops);
+	attrip->attri_format.attri_id = (uintptr_t)(void *)attrip;
+	atomic_set(&attrip->attri_next_attr, 0);
+	atomic_set(&attrip->attri_refcount, 2);
+
+	return attrip;
+}
+
+/*
+ * Copy an ATTRI format buffer from the given buf, and into the destination
+ * ATTRI format structure.
+ * The given buffer can be in 32 bit or 64 bit form (which has different
+ * padding), one of which will be the native format for this kernel.
+ * It will handle the conversion of formats if necessary.
+ */
+int
+xfs_attri_copy_format(struct xfs_log_iovec *buf,
+		      struct xfs_attri_log_format *dst_attri_fmt)
+{
+	struct xfs_attri_log_format *src_attri_fmt = buf->i_addr;
+	uint len = sizeof(struct xfs_attri_log_format);
+
+	if (buf->i_len == len) {
+		memcpy((char *)dst_attri_fmt, (char *)src_attri_fmt, len);
+		return 0;
+	}
+	return -EFSCORRUPTED;
+}
+
+/*
+ * Freeing the attri requires that we remove it from the AIL if it has already
+ * been placed there. However, the ATTRI may not yet have been placed in the
+ * AIL when called by xfs_attri_release() from ATTRD processing due to the
+ * ordering of committed vs unpin operations in bulk insert operations. Hence
+ * the reference count to ensure only the last caller frees the ATTRI.
+ */
+void
+xfs_attri_release(
+	struct xfs_attri_log_item	*attrip)
+{
+	ASSERT(atomic_read(&attrip->attri_refcount) > 0);
+	if (atomic_dec_and_test(&attrip->attri_refcount)) {
+		xfs_trans_ail_remove(&attrip->attri_item,
+				     SHUTDOWN_LOG_IO_ERROR);
+		xfs_attri_item_free(attrip);
+	}
+}
+
+static inline struct xfs_attrd_log_item *ATTRD_ITEM(struct xfs_log_item *lip)
+{
+	return container_of(lip, struct xfs_attrd_log_item, attrd_item);
+}
+
+STATIC void
+xfs_attrd_item_free(struct xfs_attrd_log_item *attrdp)
+{
+	kmem_free(attrdp->attrd_item.li_lv_shadow);
+	kmem_free(attrdp);
+}
+
+/*
+ * This returns the number of iovecs needed to log the given attrd item.
+ * We only need 1 iovec for an attrd item.  It just logs the attrd_log_format
+ * structure.
+ */
+static inline int
+xfs_attrd_item_sizeof(
+	struct xfs_attrd_log_item *attrdp)
+{
+	return sizeof(struct xfs_attrd_log_format);
+}
+
+STATIC void
+xfs_attrd_item_size(
+	struct xfs_log_item	*lip,
+	int			*nvecs,
+	int			*nbytes)
+{
+	*nvecs += 1;
+	*nbytes += xfs_attrd_item_sizeof(ATTRD_ITEM(lip));
+}
+
+/*
+ * This is called to fill in the vector of log iovecs for the
+ * given attrd log item. We use only 1 iovec, and we point that
+ * at the attrd_log_format structure embedded in the attrd item.
+ * It is at this point that we assert that all of the attr
+ * slots in the attrd item have been filled.
+ */
+STATIC void
+xfs_attrd_item_format(
+	struct xfs_log_item	*lip,
+	struct xfs_log_vec	*lv)
+{
+	struct xfs_attrd_log_item	*attrdp = ATTRD_ITEM(lip);
+	struct xfs_log_iovec	*vecp = NULL;
+
+	attrdp->attrd_format.attrd_type = XFS_LI_ATTRD;
+	attrdp->attrd_format.attrd_size = 1;
+
+	xlog_copy_iovec(lv, &vecp, XLOG_REG_TYPE_ATTRD_FORMAT,
+			&attrdp->attrd_format,
+			xfs_attrd_item_sizeof(attrdp));
+}
+
+/*
+ * Pinning has no meaning for an attrd item, so just return.
+ */
+STATIC void
+xfs_attrd_item_pin(
+	struct xfs_log_item	*lip)
+{
+}
+
+/*
+ * Since pinning has no meaning for an attrd item, unpinning does
+ * not either.
+ */
+STATIC void
+xfs_attrd_item_unpin(
+	struct xfs_log_item	*lip,
+	int			remove)
+{
+}
+
+/*
+ * There isn't much you can do to push on an attrd item.  It is simply stuck
+ * waiting for the log to be flushed to disk.
+ */
+STATIC uint
+xfs_attrd_item_push(
+	struct xfs_log_item	*lip,
+	struct list_head	*buffer_list)
+{
+	return XFS_ITEM_PINNED;
+}
+
+/*
+ * The ATTRD is either committed or aborted if the transaction is cancelled. If
+ * the transaction is cancelled, drop our reference to the ATTRI and free the
+ * ATTRD.
+ */
+STATIC void
+xfs_attrd_item_unlock(
+	struct xfs_log_item	*lip)
+{
+	struct xfs_attrd_log_item	*attrdp = ATTRD_ITEM(lip);
+
+	if (lip->li_flags & XFS_LI_ABORTED) {
+		xfs_attri_release(attrdp->attrd_attrip);
+		xfs_attrd_item_free(attrdp);
+	}
+}
+
+/*
+ * When the attrd item is committed to disk, all we need to do is delete our
+ * reference to our partner attri item and then free ourselves. Since we're
+ * freeing ourselves we must return -1 to keep the transaction code from
+ * further referencing this item.
+ */
+STATIC xfs_lsn_t
+xfs_attrd_item_committed(
+	struct xfs_log_item	*lip,
+	xfs_lsn_t		lsn)
+{
+	struct xfs_attrd_log_item	*attrdp = ATTRD_ITEM(lip);
+
+	/*
+	 * Drop the ATTRI reference regardless of whether the ATTRD has been
+	 * aborted. Once the ATTRD transaction is constructed, it is the sole
+	 * responsibility of the ATTRD to release the ATTRI (even if the ATTRI
+	 * is aborted due to log I/O error).
+	 */
+	xfs_attri_release(attrdp->attrd_attrip);
+	xfs_attrd_item_free(attrdp);
+
+	return (xfs_lsn_t)-1;
+}
+
+STATIC void
+xfs_attrd_item_committing(
+	struct xfs_log_item	*lip,
+	xfs_lsn_t		lsn)
+{
+}
+
+/*
+ * This is the ops vector shared by all attrd log items.
+ */
+static const struct xfs_item_ops xfs_attrd_item_ops = {
+	.iop_size	= xfs_attrd_item_size,
+	.iop_format	= xfs_attrd_item_format,
+	.iop_pin	= xfs_attrd_item_pin,
+	.iop_unpin	= xfs_attrd_item_unpin,
+	.iop_unlock	= xfs_attrd_item_unlock,
+	.iop_committed	= xfs_attrd_item_committed,
+	.iop_push	= xfs_attrd_item_push,
+	.iop_committing = xfs_attrd_item_committing
+};
+
+/*
+ * Allocate and initialize an attrd item
+ */
+struct xfs_attrd_log_item *
+xfs_attrd_init(
+	struct xfs_mount	*mp,
+	struct xfs_attri_log_item	*attrip)
+
+{
+	struct xfs_attrd_log_item	*attrdp;
+	uint			size;
+
+	size = (uint)(sizeof(struct xfs_attrd_log_item));
+	attrdp = kmem_zalloc(size, KM_SLEEP);
+
+	xfs_log_item_init(mp, &attrdp->attrd_item, XFS_LI_ATTRD,
+			  &xfs_attrd_item_ops);
+	attrdp->attrd_attrip = attrip;
+	attrdp->attrd_format.attrd_attri_id = attrip->attri_format.attri_id;
+
+	return attrdp;
+}
+
+/*
+ * Process an attr intent item that was recovered from
+ * the log.  We need to delete the attr that it describes.
+ */
+int
+xfs_attri_recover(
+	struct xfs_mount	*mp,
+	struct xfs_attri_log_item	*attrip)
+{
+	struct xfs_attrd_log_item	*attrdp;
+	struct xfs_trans	*tp;
+	int			error = 0;
+	struct xfs_attr		*attrp;
+
+	ASSERT(!test_bit(XFS_ATTRI_RECOVERED, &attrip->attri_flags));
+
+	/*
+	 * First check the validity of the attr described by the
+	 * ATTRI.  If any are bad, then assume that all are bad and
+	 * just toss the ATTRI.
+	 */
+	attrp = &attrip->attri_format.attri_attr;
+	if (attrp->attr_nameval_len == 0 ||
+	    attrp->attr_name_len == 0 ||
+	    attrp->attr_op_flags > ATTR_OP_FLAGS_MAX) {
+		/*
+		 * This will pull the ATTRI from the AIL and
+		 * free the memory associated with it.
+		 */
+		set_bit(XFS_ATTRI_RECOVERED, &attrip->attri_flags);
+		xfs_attri_release(attrip);
+		return -EIO;
+	}
+
+	error = xfs_trans_alloc(mp, &M_RES(mp)->tr_itruncate, 0, 0, 0, &tp);
+	if (error)
+		return error;
+	attrdp = xfs_trans_get_attrd(tp, attrip);
+	attrp = &attrip->attri_format.attri_attr;
+
+	error = xfs_trans_attr(tp, attrdp, attrp->attr_ino,
+				attrp->attr_op_flags,
+				attrp->attr_nameval_len,
+				attrp->attr_name_len, attrp->attr_flags,
+				attrp->attr_nameval);
+	if (error)
+		goto abort_error;
+
+
+	set_bit(XFS_ATTRI_RECOVERED, &attrip->attri_flags);
+	error = xfs_trans_commit(tp);
+	return error;
+
+abort_error:
+	xfs_trans_cancel(tp);
+	return error;
+}
diff --git a/fs/xfs/xfs_attr_item.h b/fs/xfs/xfs_attr_item.h
new file mode 100644
index 0000000..aec854f
--- /dev/null
+++ b/fs/xfs/xfs_attr_item.h
@@ -0,0 +1,104 @@
+/*
+ * Copyright (c) 2017 Oracle, Inc.
+ * All Rights Reserved.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it would be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write the Free Software Foundation Inc.
+ */
+#ifndef	__XFS_ATTR_ITEM_H__
+#define	__XFS_ATTR_ITEM_H__
+
+/* kernel only ATTRI/ATTRD definitions */
+
+struct xfs_mount;
+struct kmem_zone;
+
+/*
+ * Max number of attrs in fast allocation path.
+ */
+#define XFS_ATTRI_MAX_FAST_ATTRS        16
+
+
+/*
+ * Define ATTR flag bits. Manipulated by set/clear/test_bit operators.
+ */
+#define	XFS_ATTRI_RECOVERED	1
+
+/*
+ * This is the "attr intention" log item.  It is used to log the fact
+ * that some attrs need to be processed.  It is used in conjunction with the
+ * "attr done" log item described below.
+ *
+ * The ATTRI is reference counted so that it is not freed prior to both the
+ * ATTRI and ATTRD being committed and unpinned. This ensures the ATTRI is
+ * inserted into the AIL even in the event of out of order ATTRI/ATTRD
+ * processing. In other words, an ATTRI is born with two references:
+ *
+ *      1.) an ATTRI held reference to track ATTRI AIL insertion
+ *      2.) an ATTRD held reference to track ATTRD commit
+ *
+ * On allocation, both references are the responsibility of the caller. Once
+ * the ATTRI is added to and dirtied in a transaction, ownership of reference
+ * one transfers to the transaction. The reference is dropped once the ATTRI is
+ * inserted to the AIL or in the event of failure along the way (e.g., commit
+ * failure, log I/O error, etc.). Note that the caller remains responsible for
+ * the ATTRD reference under all circumstances to this point. The caller has no
+ * means to detect failure once the transaction is committed, however.
+ * Therefore, an ATTRD is required after this point, even in the event of
+ * unrelated failure.
+ *
+ * Once an ATTRD is allocated and dirtied in a transaction, reference two
+ * transfers to the transaction. The ATTRD reference is dropped once it reaches
+ * the unpin handler. Similar to the ATTRI, the reference also drops in the
+ *event of commit failure or log I/O errors. Note that the ATTRD is not
+ * inserted in the AIL, so at this point both the ATTI and ATTRD are freed.
+ */
+struct xfs_attri_log_item {
+	xfs_log_item_t			attri_item;
+	atomic_t			attri_refcount;
+	atomic_t			attri_next_attr;
+	unsigned long			attri_flags;	/* misc flags */
+	struct xfs_attri_log_format	attri_format;
+};
+
+/*
+ * This is the "attr done" log item.  It is used to log
+ * the fact that some attrs earlier mentioned in an attri item
+ * have been freed.
+ */
+struct xfs_attrd_log_item {
+	struct xfs_log_item		attrd_item;
+	struct xfs_attri_log_item	*attrd_attrip;
+	uint				attrd_next_attr;
+	struct xfs_attrd_log_format	attrd_format;
+};
+
+/*
+ * Max number of attrs in fast allocation path.
+ */
+#define	XFS_ATTRD_MAX_FAST_ATTRS	16
+
+extern struct kmem_zone	*xfs_attri_zone;
+extern struct kmem_zone	*xfs_attrd_zone;
+
+struct xfs_attri_log_item	*xfs_attri_init(struct xfs_mount *mp);
+struct xfs_attrd_log_item	*xfs_attrd_init(struct xfs_mount *mp,
+					struct xfs_attri_log_item *attrip);
+int xfs_attri_copy_format(struct xfs_log_iovec *buf,
+			  struct xfs_attri_log_format *dst_attri_fmt);
+void			xfs_attri_item_free(struct xfs_attri_log_item *attrip);
+void			xfs_attri_release(struct xfs_attri_log_item *attrip);
+
+int			xfs_attri_recover(struct xfs_mount *mp,
+					struct xfs_attri_log_item *attrip);
+
+#endif	/* __XFS_ATTR_ITEM_H__ */
diff --git a/fs/xfs/xfs_super.c b/fs/xfs/xfs_super.c
index d9a3a55..a206d51 100644
--- a/fs/xfs/xfs_super.c
+++ b/fs/xfs/xfs_super.c
@@ -2051,6 +2051,7 @@ init_xfs_fs(void)
 	xfs_rmap_update_init_defer_op();
 	xfs_refcount_update_init_defer_op();
 	xfs_bmap_update_init_defer_op();
+	xfs_attr_init_defer_op();
 
 	xfs_dir_startup();
 
diff --git a/fs/xfs/xfs_trans.h b/fs/xfs/xfs_trans.h
index 815b53d2..66c3c5f 100644
--- a/fs/xfs/xfs_trans.h
+++ b/fs/xfs/xfs_trans.h
@@ -40,6 +40,9 @@ struct xfs_cud_log_item;
 struct xfs_defer_ops;
 struct xfs_bui_log_item;
 struct xfs_bud_log_item;
+struct xfs_attrd_log_item;
+struct xfs_attri_log_item;
+
 
 typedef struct xfs_log_item {
 	struct list_head		li_ail;		/* AIL pointers */
@@ -223,12 +226,22 @@ void		xfs_trans_dirty_buf(struct xfs_trans *, struct xfs_buf *);
 void		xfs_trans_log_inode(xfs_trans_t *, struct xfs_inode *, uint);
 
 void		xfs_extent_free_init_defer_op(void);
+void            xfs_attr_init_defer_op(void);
+
 struct xfs_efd_log_item	*xfs_trans_get_efd(struct xfs_trans *,
 				  struct xfs_efi_log_item *,
 				  uint);
 int		xfs_trans_free_extent(struct xfs_trans *,
 				      struct xfs_efd_log_item *, xfs_fsblock_t,
 				      xfs_extlen_t, struct xfs_owner_info *);
+struct xfs_attrd_log_item *
+xfs_trans_get_attrd(struct xfs_trans *tp,
+		    struct xfs_attri_log_item *attrip);
+int xfs_trans_attr(struct xfs_trans *tp, struct xfs_attrd_log_item *attrdp,
+			xfs_ino_t ino, uint32_t attr_op_flags,
+			uint32_t nameval_len, uint32_t name_len,
+			uint32_t flags, char *nameval);
+
 int		xfs_trans_commit(struct xfs_trans *);
 int		xfs_trans_roll(struct xfs_trans **);
 int		xfs_trans_roll_inode(struct xfs_trans **, struct xfs_inode *);
diff --git a/fs/xfs/xfs_trans_attr.c b/fs/xfs/xfs_trans_attr.c
new file mode 100644
index 0000000..183c841
--- /dev/null
+++ b/fs/xfs/xfs_trans_attr.c
@@ -0,0 +1,293 @@
+/*
+ * Copyright (c) 2017, Oracle Inc.
+ * All Rights Reserved.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it would be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write the Free Software Foundation Inc.
+ */
+#include "xfs.h"
+#include "xfs_fs.h"
+#include "xfs_shared.h"
+#include "xfs_format.h"
+#include "xfs_log_format.h"
+#include "xfs_trans_resv.h"
+#include "xfs_bit.h"
+#include "xfs_mount.h"
+#include "xfs_defer.h"
+#include "xfs_trans.h"
+#include "xfs_trans_priv.h"
+#include "xfs_attr_item.h"
+#include "xfs_alloc.h"
+#include "xfs_bmap.h"
+#include "xfs_trace.h"
+#include "libxfs/xfs_da_format.h"
+#include "xfs_da_btree.h"
+#include "xfs_attr.h"
+#include "xfs_inode.h"
+#include "xfs_icache.h"
+
+/*
+ * This routine is called to allocate an "extent free done"
+ * log item that will hold nextents worth of extents.  The
+ * caller must use all nextents extents, because we are not
+ * flexible about this at all.
+ */
+struct xfs_attrd_log_item *
+xfs_trans_get_attrd(struct xfs_trans		*tp,
+		  struct xfs_attri_log_item	*attrip)
+{
+	struct xfs_attrd_log_item			*attrdp;
+
+	ASSERT(tp != NULL);
+
+	attrdp = xfs_attrd_init(tp->t_mountp, attrip);
+	ASSERT(attrdp != NULL);
+
+	/*
+	 * Get a log_item_desc to point at the new item.
+	 */
+	xfs_trans_add_item(tp, &attrdp->attrd_item);
+	return attrdp;
+}
+
+/*
+ * Delete an attr and log it to the ATTRD. Note that the transaction is marked
+ * dirty regardless of whether the attr delete succeeds or fails to support the
+ * ATTRI/ATTRD lifecycle rules.
+ */
+int
+xfs_trans_attr(
+	struct xfs_trans	*tp,
+	struct xfs_attrd_log_item	*attrdp,
+	xfs_ino_t		ino,
+	uint32_t		op_flags,
+	uint32_t		nameval_len,
+	uint32_t		name_len,
+	uint32_t                flags,
+	char			*nameval)
+{
+	uint			next_attr;
+	struct xfs_attr	*attrp;
+	int			error;
+	int                     local;
+	int			val_len = nameval_len - name_len;
+	char			name[name_len + 1];
+	char			value[val_len + 1];
+	struct xfs_da_args      args;
+	struct xfs_inode	*dp;
+	struct xfs_defer_ops    dfops;
+	xfs_fsblock_t		firstblock = NULLFSBLOCK;
+	struct xfs_mount	*mp = tp->t_mountp;
+
+	error = xfs_iget(mp, tp, ino, flags, 0, &dp);
+	if (error)
+		return error;
+
+	memcpy(name, nameval, name_len);
+	name[name_len] = 0;
+
+	memcpy(value, &nameval[name_len], val_len);
+	value[val_len] = 0;
+
+	ASSERT(XFS_IFORK_Q((dp)));
+	tp->t_flags |= XFS_TRANS_RESERVE;
+
+	error = xfs_attr_args_init(&args, dp, name, flags);
+	if (error)
+		return error;
+
+	args.name = (char *)name;
+	args.namelen = name_len;
+	args.hashval = xfs_da_hashname(args.name, args.namelen);
+	args.value = (char *)value;
+	args.valuelen = val_len;
+	args.dfops = &dfops;
+	args.firstblock = &firstblock;
+	args.op_flags = XFS_DA_OP_OKNOENT;
+	args.total = xfs_attr_calc_size(&args, &local);
+	args.trans = tp;
+	ASSERT(local);
+
+	xfs_ilock(dp, XFS_ILOCK_EXCL);
+	xfs_defer_init(args.dfops, args.firstblock);
+
+	if (op_flags & ATTR_OP_FLAGS_SET) {
+		args.op_flags |= XFS_DA_OP_ADDNAME;
+		error = xfs_attr_set_args(&args, flags, false);
+	} else if (op_flags & ATTR_OP_FLAGS_REMOVE) {
+		error = xfs_attr_remove_args(&args, flags);
+	} else {
+		ASSERT(0);
+	}
+
+	if (error)
+		xfs_defer_cancel(&dfops);
+
+	xfs_iunlock(dp, XFS_ILOCK_EXCL);
+
+	/*
+	 * Mark the transaction dirty, even on error. This ensures the
+	 * transaction is aborted, which:
+	 *
+	 * 1.) releases the ATTRI and frees the ATTRD
+	 * 2.) shuts down the filesystem
+	 */
+	tp->t_flags |= XFS_TRANS_DIRTY;
+	attrdp->attrd_item.li_desc->lid_flags |= XFS_LID_DIRTY;
+
+	next_attr = attrdp->attrd_next_attr;
+	attrp = &(attrdp->attrd_format.attrd_attr);
+	attrp->attr_ino = ino;
+	attrp->attr_op_flags = op_flags;
+	attrp->attr_nameval_len = nameval_len;
+	attrp->attr_name_len = name_len;
+	attrp->attr_flags = flags;
+	memcpy(attrp->attr_nameval, nameval, nameval_len);
+	attrdp->attrd_next_attr++;
+
+	return error;
+}
+
+static int
+xfs_attr_diff_items(
+	void				*priv,
+	struct list_head		*a,
+	struct list_head		*b)
+{
+	return 0;
+}
+
+/* Get an ATTRI. */
+STATIC void *
+xfs_attr_create_intent(
+	struct xfs_trans		*tp,
+	unsigned int			count)
+{
+	struct xfs_attri_log_item		*attrip;
+
+	ASSERT(tp != NULL);
+	ASSERT(count > 0);
+
+	attrip = xfs_attri_init(tp->t_mountp);
+	ASSERT(attrip != NULL);
+
+	/*
+	 * Get a log_item_desc to point at the new item.
+	 */
+	xfs_trans_add_item(tp, &attrip->attri_item);
+	return attrip;
+}
+
+/* Log an attr to the intent item. */
+STATIC void
+xfs_attr_log_item(
+	struct xfs_trans		*tp,
+	void				*intent,
+	struct list_head		*item)
+{
+	struct xfs_attri_log_item		*attrip = intent;
+	struct xfs_attr_item	*free;
+	uint				next_attr;
+	struct xfs_attr		*attrp;
+
+	free = container_of(item, struct xfs_attr_item, xattri_list);
+
+	tp->t_flags |= XFS_TRANS_DIRTY;
+	attrip->attri_item.li_desc->lid_flags |= XFS_LID_DIRTY;
+
+	/*
+	 * atomic_inc_return gives us the value after the increment;
+	 * we want to use it as an array index so we need to subtract 1 from
+	 * it.
+	 */
+	next_attr = atomic_inc_return(&attrip->attri_next_attr) - 1;
+	attrp = &attrip->attri_format.attri_attr;
+	attrp->attr_ino = free->xattri_ino;
+	attrp->attr_op_flags = free->xattri_op_flags;
+	attrp->attr_nameval_len = free->xattri_nameval_len;
+	attrp->attr_name_len = free->xattri_name_len;
+	attrp->attr_flags = free->xattri_flags;
+	memcpy(attrp->attr_nameval, free->xattri_nameval,
+	       free->xattri_nameval_len);
+}
+
+/* Get an ATTRD so we can process all the attrs. */
+STATIC void *
+xfs_attr_create_done(
+	struct xfs_trans		*tp,
+	void				*intent,
+	unsigned int			count)
+{
+	return xfs_trans_get_attrd(tp, intent);
+}
+
+/* Process an attr. */
+STATIC int
+xfs_attr_finish_item(
+	struct xfs_trans		*tp,
+	struct xfs_defer_ops		*dop,
+	struct list_head		*item,
+	void				*done_item,
+	void				**state)
+{
+	struct xfs_attr_item	*free;
+	int				error;
+
+	free = container_of(item, struct xfs_attr_item, xattri_list);
+	error = xfs_trans_attr(tp, done_item,
+			free->xattri_ino,
+			free->xattri_op_flags,
+			free->xattri_nameval_len,
+			free->xattri_name_len,
+			free->xattri_flags,
+			free->xattri_nameval);
+	kmem_free(free);
+	return error;
+}
+
+/* Abort all pending EFIs. */
+STATIC void
+xfs_attr_abort_intent(
+	void				*intent)
+{
+	xfs_attri_release(intent);
+}
+
+/* Cancel an attr */
+STATIC void
+xfs_attr_cancel_item(
+	struct list_head		*item)
+{
+	struct xfs_attr_item	*free;
+
+	free = container_of(item, struct xfs_attr_item, xattri_list);
+	kmem_free(free);
+}
+
+static const struct xfs_defer_op_type xfs_attr_defer_type = {
+	.type		= XFS_DEFER_OPS_TYPE_ATTR,
+	.max_items	= XFS_ATTRI_MAX_FAST_ATTRS,
+	.diff_items	= xfs_attr_diff_items,
+	.create_intent	= xfs_attr_create_intent,
+	.abort_intent	= xfs_attr_abort_intent,
+	.log_item	= xfs_attr_log_item,
+	.create_done	= xfs_attr_create_done,
+	.finish_item	= xfs_attr_finish_item,
+	.cancel_item	= xfs_attr_cancel_item,
+};
+
+/* Register the deferred op type. */
+void
+xfs_attr_init_defer_op(void)
+{
+	xfs_defer_init_op_type(&xfs_attr_defer_type);
+}
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 66+ messages in thread

end of thread, other threads:[~2017-10-23 19:20 UTC | newest]

Thread overview: 66+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-10-18 22:55 [PATCH 00/17] Parent Pointers V3 Allison Henderson
2017-10-18 22:55 ` [PATCH 01/17] Add helper functions xfs_attr_set_args and xfs_attr_remove_args Allison Henderson
2017-10-19 20:03   ` Darrick J. Wong
2017-10-21  1:14     ` Allison Henderson
2017-10-18 22:55 ` [PATCH 02/17] Set up infastructure for deferred attribute operations Allison Henderson
2017-10-19 19:02   ` Darrick J. Wong
2017-10-21  1:08     ` Allison Henderson
2017-10-18 22:55 ` [PATCH 03/17] Add xfs_attr_set_defered and xfs_attr_remove_defered Allison Henderson
2017-10-19 19:13   ` Darrick J. Wong
2017-10-21  1:08     ` Allison Henderson
2017-10-18 22:55 ` [PATCH 04/17] Remove all strlen calls in all xfs_attr_* functions for attr names Allison Henderson
2017-10-19 19:15   ` Darrick J. Wong
2017-10-21  1:10     ` Allison Henderson
2017-10-18 22:55 ` [PATCH 05/17] xfs: get directory offset when adding directory name Allison Henderson
2017-10-18 22:55 ` [PATCH 06/17] xfs: get directory offset when removing " Allison Henderson
2017-10-19 19:17   ` Darrick J. Wong
2017-10-21  1:11     ` Allison Henderson
2017-10-18 22:55 ` [PATCH 07/17] xfs: get directory offset when replacing a " Allison Henderson
2017-10-18 22:55 ` [PATCH 08/17] xfs: add parent pointer support to attribute code Allison Henderson
2017-10-18 22:55 ` [PATCH 09/17] xfs: define parent pointer xattr format Allison Henderson
2017-10-18 22:55 ` [PATCH 10/17] :xfs: extent transaction reservations for parent attributes Allison Henderson
2017-10-19 18:24   ` Darrick J. Wong
     [not found]     ` <8680e0c1-ada8-06e3-e397-61a5076030be@oracle.com>
2017-10-20 23:45       ` Darrick J. Wong
2017-10-21  0:12         ` Allison Henderson
2017-10-21  1:07     ` Allison Henderson
2017-10-18 22:55 ` [PATCH 11/17] Add the extra space requirements for parent pointer attributes when calculating the minimum log size during mkfs Allison Henderson
2017-10-19 18:13   ` Darrick J. Wong
2017-10-21  1:07     ` Allison Henderson
2017-10-18 22:55 ` [PATCH 12/17] xfs: parent pointer attribute creation Allison Henderson
2017-10-19 19:36   ` Darrick J. Wong
     [not found]     ` <9185d3e8-4b41-b2d8-294b-934f7d3409f0@oracle.com>
2017-10-21  0:03       ` Darrick J. Wong
2017-10-21  1:11     ` Allison Henderson
2017-10-18 22:55 ` [PATCH 13/17] xfs: add parent attributes to link Allison Henderson
2017-10-19 19:40   ` Darrick J. Wong
2017-10-21  1:12     ` Allison Henderson
2017-10-18 22:55 ` [PATCH 14/17] xfs: remove parent pointers in unlink Allison Henderson
2017-10-19 19:43   ` Darrick J. Wong
2017-10-21  1:12     ` Allison Henderson
2017-10-18 22:55 ` [PATCH 15/17] xfs_bmap_add_attrfork(): re-add error handling from set_attrforkoff() call Allison Henderson
2017-10-19 19:43   ` Darrick J. Wong
2017-10-21  1:13     ` Allison Henderson
2017-10-18 22:55 ` [PATCH 16/17] Add parent pointers to rename Allison Henderson
2017-10-18 22:55 ` [PATCH 17/17] Add the parent pointer support to the superblock version 5 Allison Henderson
2017-10-19  3:57   ` Amir Goldstein
2017-10-19 20:06     ` Darrick J. Wong
2017-10-20  3:18       ` Amir Goldstein
2017-10-19 19:45   ` Darrick J. Wong
2017-10-21  1:13     ` Allison Henderson
2017-10-19  4:11 ` [PATCH 00/17] Parent Pointers V3 Amir Goldstein
2017-10-20  3:22   ` Amir Goldstein
2017-10-21  1:06     ` Allison Henderson
2017-10-20 22:41   ` Dave Chinner
2017-10-21  7:34     ` Amir Goldstein
2017-10-22 23:27       ` Dave Chinner
2017-10-23  4:30         ` Amir Goldstein
2017-10-23  5:32           ` Dave Chinner
2017-10-23  6:48             ` Amir Goldstein
2017-10-23  8:40               ` Dave Chinner
2017-10-23  9:06                 ` Amir Goldstein
2017-10-23 17:14                   ` Darrick J. Wong
2017-10-23 19:20                     ` Amir Goldstein
  -- strict thread matches above, loose matches on Subject: below --
2017-10-06 22:05 [PATCH 00/17] Parent Pointers V2 Allison Henderson
2017-10-06 22:05 ` [PATCH 02/17] Set up infastructure for deferred attribute operations Allison Henderson
2017-10-09  4:20   ` Dave Chinner
2017-10-09 21:25     ` Allison Henderson
2017-10-09 22:51       ` Allison Henderson
2017-10-09 23:27         ` Dave Chinner

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.