All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] [RFC] xfs_fsr: Improve handling of attribute forks
@ 2010-03-05  4:40 Dave Chinner
  2010-03-06 10:43 ` Christoph Hellwig
  0 siblings, 1 reply; 3+ messages in thread
From: Dave Chinner @ 2010-03-05  4:40 UTC (permalink / raw)
  To: xfs

From: Dave Chinner <dchinner@redhat.com>

If the file being defragmented has attributes, then fsr puts a dummy
attribute on the temporary file to try to ensure that the inode
attribute fork offset is set correctly. This works perfectly well
for the old style of attributes that use a fixed fork offset - the
presence of any attribute of any size or shape will result in fsr
doing the correct thing.

However, for attr2 filesystems, the attribute fork offset is
dependent on the size and shape of both the data and attribute
forks. Hence setting a small attribute on the file does not
guarantee that the two inodes have the same fork offset and
therefore compatible for a data fork swap.

This patch improves the attribute fork handling of fsr. It checks
the filesystem version to see if the old style attributes are in
use, and if so uses the current method.

If attr2 is in use, fsr uses bulkstat output to determine what the
fork offset is. If the attribute fork offsets differ then fsr will
try to create attributes that will result in the correct offset. If
that fails, or the attribute fork is too large, it will give up and just
attempt the swap.

This fork offset value in bulkstat new functionality in the kernel,
so if there are attributes and a zero fork offset, then the kernel
does not support this feature and we simply fall back to the existing,
less effective code.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
---
 fsr/xfs_fsr.c    |  152 +++++++++++++++++++++++++++++++++++++++++++++++++++---
 include/xfs_fs.h |    3 +-
 2 files changed, 146 insertions(+), 9 deletions(-)

diff --git a/fsr/xfs_fsr.c b/fsr/xfs_fsr.c
index 1f933c7..0c22d68 100644
--- a/fsr/xfs_fsr.c
+++ b/fsr/xfs_fsr.c
@@ -17,8 +17,13 @@
  */
 
 #include <xfs/xfs.h>
+#include <libxfs.h>
+#include <xfs/xfs_types.h>
 #include <xfs/jdm.h>
 #include <xfs/xfs_dfrag.h>
+#include <xfs/xfs_bmap_btree.h>
+#include <xfs/xfs_dinode.h>
+#include <xfs/xfs_attr_sf.h>
 
 #include <fcntl.h>
 #include <errno.h>
@@ -946,6 +951,141 @@ fsrfile_common(
 	return -1; /* no error */
 }
 
+/*
+ * Attempt to set the attr fork up correctly. This is simple for attr1
+ * filesystems as they have a fixed inode fork offset. In that case
+ * just create an attribute and that's all we need to do.
+ *
+ * For attr2 filesystems, see if we have the actual fork offset in
+ * the bstat structure. If so, just create additional attributes on
+ * the temporary inode until the offset matches.
+ *
+ * If it doesn't exist, we can only do best effort. Remove any existing
+ * attributes from the temporary inode and copy the attributes across.
+ * This should hopefully put the fork offset in the right place. It's not
+ * a big deal if we don't get it right - the kernel will reject it when
+ * we try to swap extents.
+ */
+static int
+fsr_setup_attr_fork(
+	int		fd,
+	int		tfd,
+	xfs_bstat_t	*bstatp)
+{
+	struct stat64	tstatbuf;
+	int		i;
+
+	if (!(bstatp->bs_xflags & XFS_XFLAG_HASATTR))
+		return 0;
+
+	/*
+	 * use the old method if we have attr1 or the kernel does not yet
+	 * support passing the fork offset in the bulkstat data.
+	 */
+	if (!(fsgeom.flags & XFS_FSOP_GEOM_FLAGS_ATTR2) ||
+	    bstatp->bs_forkoff == 0) {
+		/* attr1 */
+		if (fsetxattr(tfd, "user.X", "X", 1, XATTR_CREATE) != 0) {
+			fsrprintf(_("could not set ATTR\n"));
+			return -1;
+		}
+		goto out;
+	}
+
+	/* attr2 w/ fork offsets */
+
+	/*
+	 * first, bulkstat the inode behind the tfd to see what the forkoff is
+	 * there. If they already match (e.g. default acls in use), we're done.
+	 * If not, we go the hard way.
+	 */
+	if (fstat64(tfd, &tstatbuf) < 0) {
+		fsrprintf(_("unable to stat temp file: %s\n"),
+					strerror(errno));
+		return -1;
+	}
+
+	i = 0;
+	do {
+		xfs_bstat_t	tbstat;
+		xfs_ino_t	ino;
+		char		buf[65536];
+		char		name[64];
+		int		diff;
+
+		ino = tstatbuf.st_ino;
+		if ((xfs_bulkstat_single(tfd, &ino, &tbstat)) < 0) {
+			fsrprintf(_("unable to get bstat on temp file: %s\n"),
+						strerror(errno));
+			return -1;
+		}
+
+
+		if (dflag)
+			fsrprintf(_("orig forkoff %d, temp forkoff %d\n"),
+					bstatp->bs_forkoff, tbstat.bs_forkoff);
+		/*
+		 * calculate difference in size,
+		 * If the temporary inode has a larger fork offset, then
+		 * creating a new attr will move it towards the correct size.
+		 *
+		 * If the temporary inode has a smaller fork offset, then it
+		 * becomes difficult - moving the fork offset the other way is
+		 * not so simple, so simply break out and hope that the swap
+		 * extents operation will succeed. (XXX: can probably do it
+		 * with single block sparse synchronous writes to the temporary
+		 * file that we truncate away again later)
+		 *
+		 * If the temporary inode does not have a fork offset set yet,
+		 * then just create an roughly the correct size and then
+		 * reset for another first pass.
+		 */
+		if (!tbstat.bs_forkoff) {
+			diff = bstatp->bs_forkoff -
+					sizeof(struct xfs_attr_sf_hdr);
+			i--;
+		} else
+			diff = tbstat.bs_forkoff - bstatp->bs_forkoff;
+		if (diff <= 0) /* same or temp inode forkoff smaller */
+			goto out;
+		if (diff > fsgeom.inodesize - sizeof(struct xfs_dinode)) {
+			fsrprintf(_("forkoff diff %d too large!\n"), diff);
+			return -1;
+		}
+
+		/*
+		 * take into account attr header sizes and name length.  If
+		 * creating 10 attributes is not enough, then we'll abort
+		 * because we assume single character attribute names.  This
+		 * attempts an attribute shortform addition first time through,
+		 * and if that doesn't work, it assumes we are in extent or
+		 * btree form so adds a block sized attribute at a time. If
+		 * adding 10 attributes doesn't get us there, then we give up.
+		 */
+		snprintf(name, sizeof(name), "user.%d", i);
+		if (i <= 0)
+			diff -= sizeof(struct xfs_attr_sf_entry) + 1;
+		else
+			diff = fsgeom.blocksize - 1 -
+					sizeof(struct xfs_attr_leaf_entry);
+
+		if (diff < 0)
+			goto out;
+
+		memset(buf, 'X', diff);
+		if (fsetxattr(tfd, name, buf, diff, XATTR_CREATE) != 0) {
+			fsrprintf(_("could not set ATTR\n"));
+			return -1;
+		}
+		goto out;
+
+	} while (++i < 100);
+
+out:
+	if (dflag)
+		fsrprintf(_("set temp attr\n"));
+	return 0;
+}
 
 /*
  * Do the defragmentation of a single file.
@@ -1000,14 +1140,10 @@ packfile(char *fname, char *tname, int fd,
 	unlink(tname);
 
 	/* Setup extended attributes */
-	if (statp->bs_xflags & XFS_XFLAG_HASATTR) {
-		if (fsetxattr(tfd, "user.X", "X", 1, XATTR_CREATE) != 0) {
-			fsrprintf(_("could not set ATTR on tmp: %s:\n"), tname);
-			close(tfd);
-			return -1;
-		}
-		if (dflag)
-			fsrprintf(_("%s set temp attr\n"), tname);
+	if (fsr_setup_attr_fork(fd, tfd, statp) != 0) {
+		fsrprintf(_("failed to set ATTR fork on tmp: %s:\n"), tname);
+		close(tfd);
+		return -1;
 	}
 
 	/* Setup extended inode flags, project identifier, etc */
diff --git a/include/xfs_fs.h b/include/xfs_fs.h
index 9aff7bb..2376abb 100644
--- a/include/xfs_fs.h
+++ b/include/xfs_fs.h
@@ -300,7 +300,8 @@ typedef struct xfs_bstat {
 	__s32		bs_extents;	/* number of extents		*/
 	__u32		bs_gen;		/* generation count		*/
 	__u16		bs_projid;	/* project id			*/
-	unsigned char	bs_pad[14];	/* pad space, unused		*/
+	__u16		bs_forkoff;	/* inode fork offset in bytes	*/
+	unsigned char	bs_pad[12];	/* pad space, unused		*/
 	__u32		bs_dmevmask;	/* DMIG event mask		*/
 	__u16		bs_dmstate;	/* DMIG state info		*/
 	__u16		bs_aextents;	/* attribute number of extents	*/
-- 
1.6.5

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply related	[flat|nested] 3+ messages in thread

* Re: [PATCH] [RFC] xfs_fsr: Improve handling of attribute forks
  2010-03-05  4:40 [PATCH] [RFC] xfs_fsr: Improve handling of attribute forks Dave Chinner
@ 2010-03-06 10:43 ` Christoph Hellwig
  2010-03-06 11:40   ` Dave Chinner
  0 siblings, 1 reply; 3+ messages in thread
From: Christoph Hellwig @ 2010-03-06 10:43 UTC (permalink / raw)
  To: Dave Chinner; +Cc: xfs

On Fri, Mar 05, 2010 at 03:40:49PM +1100, Dave Chinner wrote:
> From: Dave Chinner <dchinner@redhat.com>
> 
> If the file being defragmented has attributes, then fsr puts a dummy
> attribute on the temporary file to try to ensure that the inode
> attribute fork offset is set correctly. This works perfectly well
> for the old style of attributes that use a fixed fork offset - the
> presence of any attribute of any size or shape will result in fsr
> doing the correct thing.
> 
> However, for attr2 filesystems, the attribute fork offset is
> dependent on the size and shape of both the data and attribute
> forks. Hence setting a small attribute on the file does not
> guarantee that the two inodes have the same fork offset and
> therefore compatible for a data fork swap.
> 
> This patch improves the attribute fork handling of fsr. It checks
> the filesystem version to see if the old style attributes are in
> use, and if so uses the current method.
> 
> If attr2 is in use, fsr uses bulkstat output to determine what the
> fork offset is. If the attribute fork offsets differ then fsr will
> try to create attributes that will result in the correct offset. If
> that fails, or the attribute fork is too large, it will give up and just
> attempt the swap.
> 
> This fork offset value in bulkstat new functionality in the kernel,
> so if there are attributes and a zero fork offset, then the kernel
> does not support this feature and we simply fall back to the existing,
> less effective code.

Looks reasonable.  It would be good to have a testcase for this in
xfsqa to verify this works.

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [PATCH] [RFC] xfs_fsr: Improve handling of attribute forks
  2010-03-06 10:43 ` Christoph Hellwig
@ 2010-03-06 11:40   ` Dave Chinner
  0 siblings, 0 replies; 3+ messages in thread
From: Dave Chinner @ 2010-03-06 11:40 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: xfs

On Sat, Mar 06, 2010 at 05:43:57AM -0500, Christoph Hellwig wrote:
> On Fri, Mar 05, 2010 at 03:40:49PM +1100, Dave Chinner wrote:
> > From: Dave Chinner <dchinner@redhat.com>
> > 
> > If the file being defragmented has attributes, then fsr puts a dummy
> > attribute on the temporary file to try to ensure that the inode
> > attribute fork offset is set correctly. This works perfectly well
> > for the old style of attributes that use a fixed fork offset - the
> > presence of any attribute of any size or shape will result in fsr
> > doing the correct thing.
> > 
> > However, for attr2 filesystems, the attribute fork offset is
> > dependent on the size and shape of both the data and attribute
> > forks. Hence setting a small attribute on the file does not
> > guarantee that the two inodes have the same fork offset and
> > therefore compatible for a data fork swap.
> > 
> > This patch improves the attribute fork handling of fsr. It checks
> > the filesystem version to see if the old style attributes are in
> > use, and if so uses the current method.
> > 
> > If attr2 is in use, fsr uses bulkstat output to determine what the
> > fork offset is. If the attribute fork offsets differ then fsr will
> > try to create attributes that will result in the correct offset. If
> > that fails, or the attribute fork is too large, it will give up and just
> > attempt the swap.
> > 
> > This fork offset value in bulkstat new functionality in the kernel,
> > so if there are attributes and a zero fork offset, then the kernel
> > does not support this feature and we simply fall back to the existing,
> > less effective code.
> 
> Looks reasonable.  It would be good to have a testcase for this in
> xfsqa to verify this works.

Yeah, so far I've tested by running './check -g auto' and finding
the inodes that failed with the standard fsr, then running the fixed
fsr on them. It's not particularly efficient....

FWIW, there's enough information from fsr and xfs_db to be able to
recreate the failure scenario. e.g. from 'xfs_fsr -d -v ...' on a
failed inode:

ino=149
ino=149 extents=21 can_save=2 tmp=/mnt/test/.fsr/ag0/tmp18175
set temp attr
DEBUG: fsize=2873344 blsz_dio=2873344 d_min=512 d_max=2147483136
pgsz=4096
Temporary file has 18 extents (21 in original)
XFS_IOC_SWAPEXT failed: ino=149: Invalid argument

That inode:

$ sudo xfs_db -r -c "inode 149" -c "p" /dev/sdb1
core.magic = 0x494e                                                                      
core.mode = 0100666                                                                      
core.version = 2
core.format = 3 (btree)
....
core.size = 2873344
core.nblocks = 194
core.extsize = 0
core.nextents = 12
core.naextents = 3
core.forkoff = 15
core.aformat = 3 (btree)
....

has 12 data extents and 3 attribute extents. I should be able to
create this with xfs_io without too much trouble.

Actaully the patch I posted fails on this inode - the previous ones
that were created were btree data fork and extent attribute fork,
so there's more work on this to be done.

More importantly, I discovered a neat option to xfs_fsr yesterday:

$ export FSRXFSTEST=true
$ xfs_fsr -C 18 ....

which allows you to control the number of extents in the temporary
file, so I should be able to recreate the exact conditions that
triggered a failure as well....

I'll see what I can come up with and do some more rigorous testing
of the patch before moving forward with it.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2010-03-06 11:38 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2010-03-05  4:40 [PATCH] [RFC] xfs_fsr: Improve handling of attribute forks Dave Chinner
2010-03-06 10:43 ` Christoph Hellwig
2010-03-06 11:40   ` Dave Chinner

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.