All of lore.kernel.org
 help / color / mirror / Atom feed
* xfs: validate inode numbers in file handles correctly
@ 2010-06-18  7:32 Dave Chinner
  2010-06-18  7:32 ` [PATCH 1/4] xfs: always use iget in bulkstat Dave Chinner
                   ` (4 more replies)
  0 siblings, 5 replies; 21+ messages in thread
From: Dave Chinner @ 2010-06-18  7:32 UTC (permalink / raw)
  To: xfs; +Cc: security

This series closes a recently discovered problem in XFS filehandle conversion.
On systems where inodes are dynamically deleted, XFS does not adequately verify
the inode numbers in the filehandles, which results in reading stale inodes
from disk and potentially returning them as valid files. Because these unlinked
inodes were never zeroed out when the chunk was deallocated, some inodes in the
chunk can still appear to have to data extents attached to them. This can lead
to stale data exposure, exposure of active data and potentially overwriting of
active data if the stale extents referenced in the unlinked inodes have been
re-allocated.

Both NFS filehandles and local filehandles provided through libhandle have this
same problem. libhandle requires root permissions to use the interface, so it
is not exposing information that you can't get more easily with other means
(e.g. xfs_db or reading directly form the block device), so there isn't really
an issue here.

For NFS, we may incorrectly accept stale file handles for unlinked inodes after
a server reboot if the unlinked inodes have not been overwritten leading to the
above issues being triggered if multiple NFS clients are accessing the some
files.

Christoph's make-bulkstat-coherent patch is the basis for this series as
bulkstat can also expose unlinked inodes and information about them back to
userspace because it makes the same assumptions about inode lookups as the file
handle interfaces.

As a result, the first two patches of the series make up the real bug fix. The
last two patches make it clear we are looking up untrusted inode numbers and
remove a shortcut that these interfaces used that we do not want used any
more. Hence for backports to other kernels, only the first two patches are
necessary.

More information and the test program that demonstrates the issue via the
open_by_handle interface can be found here:

http://oss.sgi.com/archives/xfs/2010-06/msg00191.html

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 21+ messages in thread

* [PATCH 1/4] xfs: always use iget in bulkstat
  2010-06-18  7:32 xfs: validate inode numbers in file handles correctly Dave Chinner
@ 2010-06-18  7:32 ` Dave Chinner
  2010-06-18  7:32 ` [PATCH 2/4] xfs: validate untrusted inode numbers during lookup Dave Chinner
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 21+ messages in thread
From: Dave Chinner @ 2010-06-18  7:32 UTC (permalink / raw)
  To: xfs; +Cc: security

From: Christoph Hellwig <hch@lst.de>

The non-coherent bulkstat versionsthat look directly at the inode
buffers causes various problems with performance optimizations that
make increased use of just logging inodes.  This patch makes bulkstat
always use iget, which should be fast enough for normal use with the
radix-tree based inode cache introduced a while ago.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 fs/xfs/linux-2.6/xfs_ioctl.c   |    7 +-
 fs/xfs/linux-2.6/xfs_ioctl32.c |   12 +-
 fs/xfs/quota/xfs_qm.c          |   11 +-
 fs/xfs/quota/xfs_qm_syscalls.c |   16 +--
 fs/xfs/xfs_itable.c            |  281 ++++++----------------------------------
 fs/xfs/xfs_itable.h            |   14 --
 6 files changed, 59 insertions(+), 282 deletions(-)

diff --git a/fs/xfs/linux-2.6/xfs_ioctl.c b/fs/xfs/linux-2.6/xfs_ioctl.c
index faafd94..a12ddda 100644
--- a/fs/xfs/linux-2.6/xfs_ioctl.c
+++ b/fs/xfs/linux-2.6/xfs_ioctl.c
@@ -670,10 +670,9 @@ xfs_ioc_bulkstat(
 		error = xfs_bulkstat_single(mp, &inlast,
 						bulkreq.ubuffer, &done);
 	else	/* XFS_IOC_FSBULKSTAT */
-		error = xfs_bulkstat(mp, &inlast, &count,
-			(bulkstat_one_pf)xfs_bulkstat_one, NULL,
-			sizeof(xfs_bstat_t), bulkreq.ubuffer,
-			BULKSTAT_FG_QUICK, &done);
+		error = xfs_bulkstat(mp, &inlast, &count, xfs_bulkstat_one,
+				     sizeof(xfs_bstat_t), bulkreq.ubuffer,
+				     &done);
 
 	if (error)
 		return -error;
diff --git a/fs/xfs/linux-2.6/xfs_ioctl32.c b/fs/xfs/linux-2.6/xfs_ioctl32.c
index 40118b6..48c1aba 100644
--- a/fs/xfs/linux-2.6/xfs_ioctl32.c
+++ b/fs/xfs/linux-2.6/xfs_ioctl32.c
@@ -233,15 +233,13 @@ xfs_bulkstat_one_compat(
 	xfs_ino_t	ino,		/* inode number to get data for */
 	void		__user *buffer,	/* buffer to place output in */
 	int		ubsize,		/* size of buffer */
-	void		*private_data,	/* my private data */
 	xfs_daddr_t	bno,		/* starting bno of inode cluster */
 	int		*ubused,	/* bytes used by me */
-	void		*dibuff,	/* on-disk inode buffer */
 	int		*stat)		/* BULKSTAT_RV_... */
 {
 	return xfs_bulkstat_one_int(mp, ino, buffer, ubsize,
 				    xfs_bulkstat_one_fmt_compat, bno,
-				    ubused, dibuff, stat);
+				    ubused, stat);
 }
 
 /* copied from xfs_ioctl.c */
@@ -294,13 +292,11 @@ xfs_compat_ioc_bulkstat(
 		int res;
 
 		error = xfs_bulkstat_one_compat(mp, inlast, bulkreq.ubuffer,
-				sizeof(compat_xfs_bstat_t),
-				NULL, 0, NULL, NULL, &res);
+				sizeof(compat_xfs_bstat_t), 0, NULL, &res);
 	} else if (cmd == XFS_IOC_FSBULKSTAT_32) {
 		error = xfs_bulkstat(mp, &inlast, &count,
-			xfs_bulkstat_one_compat, NULL,
-			sizeof(compat_xfs_bstat_t), bulkreq.ubuffer,
-			BULKSTAT_FG_QUICK, &done);
+			xfs_bulkstat_one_compat, sizeof(compat_xfs_bstat_t),
+			bulkreq.ubuffer, &done);
 	} else
 		error = XFS_ERROR(EINVAL);
 	if (error)
diff --git a/fs/xfs/quota/xfs_qm.c b/fs/xfs/quota/xfs_qm.c
index 7a231ff..25bc7da 100644
--- a/fs/xfs/quota/xfs_qm.c
+++ b/fs/xfs/quota/xfs_qm.c
@@ -1625,10 +1625,8 @@ xfs_qm_dqusage_adjust(
 	xfs_ino_t	ino,		/* inode number to get data for */
 	void		__user *buffer,	/* not used */
 	int		ubsize,		/* not used */
-	void		*private_data,	/* not used */
 	xfs_daddr_t	bno,		/* starting block of inode cluster */
 	int		*ubused,	/* not used */
-	void		*dip,		/* on-disk inode pointer (not used) */
 	int		*res)		/* result code value */
 {
 	xfs_inode_t	*ip;
@@ -1789,12 +1787,13 @@ xfs_qm_quotacheck(
 		 * Iterate thru all the inodes in the file system,
 		 * adjusting the corresponding dquot counters in core.
 		 */
-		if ((error = xfs_bulkstat(mp, &lastino, &count,
-				     xfs_qm_dqusage_adjust, NULL,
-				     structsz, NULL, BULKSTAT_FG_IGET, &done)))
+		error = xfs_bulkstat(mp, &lastino, &count,
+				     xfs_qm_dqusage_adjust,
+				     structsz, NULL, &done);
+		if (error)
 			break;
 
-	} while (! done);
+	} while (!done);
 
 	/*
 	 * We've made all the changes that we need to make incore.
diff --git a/fs/xfs/quota/xfs_qm_syscalls.c b/fs/xfs/quota/xfs_qm_syscalls.c
index 19f218c..aa44c58 100644
--- a/fs/xfs/quota/xfs_qm_syscalls.c
+++ b/fs/xfs/quota/xfs_qm_syscalls.c
@@ -1095,10 +1095,8 @@ xfs_qm_internalqcheck_adjust(
 	xfs_ino_t	ino,		/* inode number to get data for */
 	void		__user *buffer,	/* not used */
 	int		ubsize,		/* not used */
-	void		*private_data,	/* not used */
 	xfs_daddr_t	bno,		/* starting block of inode cluster */
 	int		*ubused,	/* not used */
-	void		*dip,		/* not used */
 	int		*res)		/* bulkstat result code */
 {
 	xfs_inode_t		*ip;
@@ -1191,15 +1189,15 @@ xfs_qm_internalqcheck(
 		 * Iterate thru all the inodes in the file system,
 		 * adjusting the corresponding dquot counters
 		 */
-		if ((error = xfs_bulkstat(mp, &lastino, &count,
-				 xfs_qm_internalqcheck_adjust, NULL,
-				 0, NULL, BULKSTAT_FG_IGET, &done))) {
+		error = xfs_bulkstat(mp, &lastino, &count,
+				 xfs_qm_internalqcheck_adjust,
+				 0, NULL, &done);
+		if (error) {
+			cmn_err(CE_DEBUG, "Bulkstat returned error 0x%x", error);
 			break;
 		}
-	} while (! done);
-	if (error) {
-		cmn_err(CE_DEBUG, "Bulkstat returned error 0x%x", error);
-	}
+	} while (!done);
+
 	cmn_err(CE_DEBUG, "Checking results against system dquots");
 	for (i = 0; i < qmtest_hashmask; i++) {
 		xfs_dqtest_t	*d, *n;
diff --git a/fs/xfs/xfs_itable.c b/fs/xfs/xfs_itable.c
index 8314386..f554bd9 100644
--- a/fs/xfs/xfs_itable.c
+++ b/fs/xfs/xfs_itable.c
@@ -45,24 +45,41 @@ xfs_internal_inum(
 		 (ino == mp->m_sb.sb_uquotino || ino == mp->m_sb.sb_gquotino)));
 }
 
-STATIC int
-xfs_bulkstat_one_iget(
-	xfs_mount_t	*mp,		/* mount point for filesystem */
-	xfs_ino_t	ino,		/* inode number to get data for */
-	xfs_daddr_t	bno,		/* starting bno of inode cluster */
-	xfs_bstat_t	*buf,		/* return buffer */
-	int		*stat)		/* BULKSTAT_RV_... */
+/*
+ * Return stat information for one inode.
+ * Return 0 if ok, else errno.
+ */
+int
+xfs_bulkstat_one_int(
+	struct xfs_mount	*mp,		/* mount point for filesystem */
+	xfs_ino_t		ino,		/* inode to get data for */
+	void __user		*buffer,	/* buffer to place output in */
+	int			ubsize,		/* size of buffer */
+	bulkstat_one_fmt_pf	formatter,	/* formatter, copy to user */
+	xfs_daddr_t		bno,		/* starting bno of cluster */
+	int			*ubused,	/* bytes used by me */
+	int			*stat)		/* BULKSTAT_RV_... */
 {
-	xfs_icdinode_t	*dic;	/* dinode core info pointer */
-	xfs_inode_t	*ip;		/* incore inode pointer */
-	struct inode	*inode;
-	int		error;
+	struct xfs_icdinode	*dic;		/* dinode core info pointer */
+	struct xfs_inode	*ip;		/* incore inode pointer */
+	struct inode		*inode;
+	struct xfs_bstat	*buf;		/* return buffer */
+	int			error = 0;	/* error value */
+
+	*stat = BULKSTAT_RV_NOTHING;
+
+	if (!buffer || xfs_internal_inum(mp, ino))
+		return XFS_ERROR(EINVAL);
+
+	buf = kmem_alloc(sizeof(*buf), KM_SLEEP | KM_MAYFAIL);
+	if (!buf)
+		return XFS_ERROR(ENOMEM);
 
 	error = xfs_iget(mp, NULL, ino,
 			 XFS_IGET_BULKSTAT, XFS_ILOCK_SHARED, &ip, bno);
 	if (error) {
 		*stat = BULKSTAT_RV_NOTHING;
-		return error;
+		goto out_free;
 	}
 
 	ASSERT(ip != NULL);
@@ -123,77 +140,16 @@ xfs_bulkstat_one_iget(
 		buf->bs_blocks = dic->di_nblocks + ip->i_delayed_blks;
 		break;
 	}
-
 	xfs_iput(ip, XFS_ILOCK_SHARED);
-	return error;
-}
 
-STATIC void
-xfs_bulkstat_one_dinode(
-	xfs_mount_t	*mp,		/* mount point for filesystem */
-	xfs_ino_t	ino,		/* inode number to get data for */
-	xfs_dinode_t	*dic,		/* dinode inode pointer */
-	xfs_bstat_t	*buf)		/* return buffer */
-{
-	/*
-	 * The inode format changed when we moved the link count and
-	 * made it 32 bits long.  If this is an old format inode,
-	 * convert it in memory to look like a new one.  If it gets
-	 * flushed to disk we will convert back before flushing or
-	 * logging it.  We zero out the new projid field and the old link
-	 * count field.  We'll handle clearing the pad field (the remains
-	 * of the old uuid field) when we actually convert the inode to
-	 * the new format. We don't change the version number so that we
-	 * can distinguish this from a real new format inode.
-	 */
-	if (dic->di_version == 1) {
-		buf->bs_nlink = be16_to_cpu(dic->di_onlink);
-		buf->bs_projid = 0;
-	} else {
-		buf->bs_nlink = be32_to_cpu(dic->di_nlink);
-		buf->bs_projid = be16_to_cpu(dic->di_projid);
-	}
+	error = formatter(buffer, ubsize, ubused, buf);
 
-	buf->bs_ino = ino;
-	buf->bs_mode = be16_to_cpu(dic->di_mode);
-	buf->bs_uid = be32_to_cpu(dic->di_uid);
-	buf->bs_gid = be32_to_cpu(dic->di_gid);
-	buf->bs_size = be64_to_cpu(dic->di_size);
-	buf->bs_atime.tv_sec = be32_to_cpu(dic->di_atime.t_sec);
-	buf->bs_atime.tv_nsec = be32_to_cpu(dic->di_atime.t_nsec);
-	buf->bs_mtime.tv_sec = be32_to_cpu(dic->di_mtime.t_sec);
-	buf->bs_mtime.tv_nsec = be32_to_cpu(dic->di_mtime.t_nsec);
-	buf->bs_ctime.tv_sec = be32_to_cpu(dic->di_ctime.t_sec);
-	buf->bs_ctime.tv_nsec = be32_to_cpu(dic->di_ctime.t_nsec);
-	buf->bs_xflags = xfs_dic2xflags(dic);
-	buf->bs_extsize = be32_to_cpu(dic->di_extsize) << mp->m_sb.sb_blocklog;
-	buf->bs_extents = be32_to_cpu(dic->di_nextents);
-	buf->bs_gen = be32_to_cpu(dic->di_gen);
-	memset(buf->bs_pad, 0, sizeof(buf->bs_pad));
-	buf->bs_dmevmask = be32_to_cpu(dic->di_dmevmask);
-	buf->bs_dmstate = be16_to_cpu(dic->di_dmstate);
-	buf->bs_aextents = be16_to_cpu(dic->di_anextents);
-	buf->bs_forkoff = XFS_DFORK_BOFF(dic);
+	if (!error)
+		*stat = BULKSTAT_RV_DIDONE;
 
-	switch (dic->di_format) {
-	case XFS_DINODE_FMT_DEV:
-		buf->bs_rdev = xfs_dinode_get_rdev(dic);
-		buf->bs_blksize = BLKDEV_IOSIZE;
-		buf->bs_blocks = 0;
-		break;
-	case XFS_DINODE_FMT_LOCAL:
-	case XFS_DINODE_FMT_UUID:
-		buf->bs_rdev = 0;
-		buf->bs_blksize = mp->m_sb.sb_blocksize;
-		buf->bs_blocks = 0;
-		break;
-	case XFS_DINODE_FMT_EXTENTS:
-	case XFS_DINODE_FMT_BTREE:
-		buf->bs_rdev = 0;
-		buf->bs_blksize = mp->m_sb.sb_blocksize;
-		buf->bs_blocks = be64_to_cpu(dic->di_nblocks);
-		break;
-	}
+ out_free:
+	kmem_free(buf);
+	return error;
 }
 
 /* Return 0 on success or positive error */
@@ -213,118 +169,19 @@ xfs_bulkstat_one_fmt(
 	return 0;
 }
 
-/*
- * Return stat information for one inode.
- * Return 0 if ok, else errno.
- */
-int		   	    		/* error status */
-xfs_bulkstat_one_int(
-	xfs_mount_t	*mp,		/* mount point for filesystem */
-	xfs_ino_t	ino,		/* inode number to get data for */
-	void		__user *buffer,	/* buffer to place output in */
-	int		ubsize,		/* size of buffer */
-	bulkstat_one_fmt_pf formatter,	/* formatter, copy to user */
-	xfs_daddr_t	bno,		/* starting bno of inode cluster */
-	int		*ubused,	/* bytes used by me */
-	void		*dibuff,	/* on-disk inode buffer */
-	int		*stat)		/* BULKSTAT_RV_... */
-{
-	xfs_bstat_t	*buf;		/* return buffer */
-	int		error = 0;	/* error value */
-	xfs_dinode_t	*dip;		/* dinode inode pointer */
-
-	dip = (xfs_dinode_t *)dibuff;
-	*stat = BULKSTAT_RV_NOTHING;
-
-	if (!buffer || xfs_internal_inum(mp, ino))
-		return XFS_ERROR(EINVAL);
-
-	buf = kmem_alloc(sizeof(*buf), KM_SLEEP);
-
-	if (dip == NULL) {
-		/* We're not being passed a pointer to a dinode.  This happens
-		 * if BULKSTAT_FG_IGET is selected.  Do the iget.
-		 */
-		error = xfs_bulkstat_one_iget(mp, ino, bno, buf, stat);
-		if (error)
-			goto out_free;
-	} else {
-		xfs_bulkstat_one_dinode(mp, ino, dip, buf);
-	}
-
-	error = formatter(buffer, ubsize, ubused, buf);
-	if (error)
-		goto out_free;
-
-	*stat = BULKSTAT_RV_DIDONE;
-
- out_free:
-	kmem_free(buf);
-	return error;
-}
-
 int
 xfs_bulkstat_one(
 	xfs_mount_t	*mp,		/* mount point for filesystem */
 	xfs_ino_t	ino,		/* inode number to get data for */
 	void		__user *buffer,	/* buffer to place output in */
 	int		ubsize,		/* size of buffer */
-	void		*private_data,	/* my private data */
 	xfs_daddr_t	bno,		/* starting bno of inode cluster */
 	int		*ubused,	/* bytes used by me */
-	void		*dibuff,	/* on-disk inode buffer */
 	int		*stat)		/* BULKSTAT_RV_... */
 {
 	return xfs_bulkstat_one_int(mp, ino, buffer, ubsize,
 				    xfs_bulkstat_one_fmt, bno,
-				    ubused, dibuff, stat);
-}
-
-/*
- * Test to see whether we can use the ondisk inode directly, based
- * on the given bulkstat flags, filling in dipp accordingly.
- * Returns zero if the inode is dodgey.
- */
-STATIC int
-xfs_bulkstat_use_dinode(
-	xfs_mount_t	*mp,
-	int		flags,
-	xfs_buf_t	*bp,
-	int		clustidx,
-	xfs_dinode_t	**dipp)
-{
-	xfs_dinode_t	*dip;
-	unsigned int	aformat;
-
-	*dipp = NULL;
-	if (!bp || (flags & BULKSTAT_FG_IGET))
-		return 1;
-	dip = (xfs_dinode_t *)
-			xfs_buf_offset(bp, clustidx << mp->m_sb.sb_inodelog);
-	/*
-	 * Check the buffer containing the on-disk inode for di_mode == 0.
-	 * This is to prevent xfs_bulkstat from picking up just reclaimed
-	 * inodes that have their in-core state initialized but not flushed
-	 * to disk yet. This is a temporary hack that would require a proper
-	 * fix in the future.
-	 */
-	if (be16_to_cpu(dip->di_magic) != XFS_DINODE_MAGIC ||
-	    !XFS_DINODE_GOOD_VERSION(dip->di_version) ||
-	    !dip->di_mode)
-		return 0;
-	if (flags & BULKSTAT_FG_QUICK) {
-		*dipp = dip;
-		return 1;
-	}
-	/* BULKSTAT_FG_INLINE: if attr fork is local, or not there, use it */
-	aformat = dip->di_aformat;
-	if ((XFS_DFORK_Q(dip) == 0) ||
-	    (aformat == XFS_DINODE_FMT_LOCAL) ||
-	    (aformat == XFS_DINODE_FMT_EXTENTS && !dip->di_anextents)) {
-		*dipp = dip;
-		return 1;
-	}
-	return 1;
+				    ubused, stat);
 }
 
 #define XFS_BULKSTAT_UBLEFT(ubleft)	((ubleft) >= statstruct_size)
@@ -338,10 +195,8 @@ xfs_bulkstat(
 	xfs_ino_t		*lastinop, /* last inode returned */
 	int			*ubcountp, /* size of buffer/count returned */
 	bulkstat_one_pf		formatter, /* func that'd fill a single buf */
-	void			*private_data,/* private data for formatter */
 	size_t			statstruct_size, /* sizeof struct filling */
 	char			__user *ubuffer, /* buffer with inode stats */
-	int			flags,	/* defined in xfs_itable.h */
 	int			*done)	/* 1 if there are more stats to get */
 {
 	xfs_agblock_t		agbno=0;/* allocation group block number */
@@ -376,14 +231,12 @@ xfs_bulkstat(
 	int			ubelem;	/* spaces used in user's buffer */
 	int			ubused;	/* bytes used by formatter */
 	xfs_buf_t		*bp;	/* ptr to on-disk inode cluster buf */
-	xfs_dinode_t		*dip;	/* ptr into bp for specific inode */
 
 	/*
 	 * Get the last inode value, see if there's nothing to do.
 	 */
 	ino = (xfs_ino_t)*lastinop;
 	lastino = ino;
-	dip = NULL;
 	agno = XFS_INO_TO_AGNO(mp, ino);
 	agino = XFS_INO_TO_AGINO(mp, ino);
 	if (agno >= mp->m_sb.sb_agcount ||
@@ -608,37 +461,6 @@ xfs_bulkstat(
 							irbp->ir_startino) +
 						((chunkidx & nimask) >>
 						 mp->m_sb.sb_inopblog);
-
-					if (flags & (BULKSTAT_FG_QUICK |
-						     BULKSTAT_FG_INLINE)) {
-						int offset;
-
-						ino = XFS_AGINO_TO_INO(mp, agno,
-								       agino);
-						bno = XFS_AGB_TO_DADDR(mp, agno,
-								       agbno);
-
-						/*
-						 * Get the inode cluster buffer
-						 */
-						if (bp)
-							xfs_buf_relse(bp);
-
-						error = xfs_inotobp(mp, NULL, ino, &dip,
-								    &bp, &offset,
-								    XFS_IGET_BULKSTAT);
-
-						if (!error)
-							clustidx = offset / mp->m_sb.sb_inodesize;
-						if (XFS_TEST_ERROR(error != 0,
-								   mp, XFS_ERRTAG_BULKSTAT_READ_CHUNK,
-								   XFS_RANDOM_BULKSTAT_READ_CHUNK)) {
-							bp = NULL;
-							ubleft = 0;
-							rval = error;
-							break;
-						}
-					}
 				}
 				ino = XFS_AGINO_TO_INO(mp, agno, agino);
 				bno = XFS_AGB_TO_DADDR(mp, agno, agbno);
@@ -654,35 +476,13 @@ xfs_bulkstat(
 				 * when the chunk is used up.
 				 */
 				irbp->ir_freecount++;
-				if (!xfs_bulkstat_use_dinode(mp, flags, bp,
-							     clustidx, &dip)) {
-					lastino = ino;
-					continue;
-				}
-				/*
-				 * If we need to do an iget, cannot hold bp.
-				 * Drop it, until starting the next cluster.
-				 */
-				if ((flags & BULKSTAT_FG_INLINE) && !dip) {
-					if (bp)
-						xfs_buf_relse(bp);
-					bp = NULL;
-				}
 
 				/*
 				 * Get the inode and fill in a single buffer.
-				 * BULKSTAT_FG_QUICK uses dip to fill it in.
-				 * BULKSTAT_FG_IGET uses igets.
-				 * BULKSTAT_FG_INLINE uses dip if we have an
-				 * inline attr fork, else igets.
-				 * See: xfs_bulkstat_one & xfs_dm_bulkstat_one.
-				 * This is also used to count inodes/blks, etc
-				 * in xfs_qm_quotacheck.
 				 */
 				ubused = statstruct_size;
-				error = formatter(mp, ino, ubufp,
-						ubleft, private_data,
-						bno, &ubused, dip, &fmterror);
+				error = formatter(mp, ino, ubufp, ubleft, bno,
+						  &ubused, &fmterror);
 				if (fmterror == BULKSTAT_RV_NOTHING) {
 					if (error && error != ENOENT &&
 						error != EINVAL) {
@@ -775,7 +575,7 @@ xfs_bulkstat_single(
 
 	ino = (xfs_ino_t)*lastinop;
 	error = xfs_bulkstat_one(mp, ino, buffer, sizeof(xfs_bstat_t),
-				 NULL, 0, NULL, NULL, &res);
+				 0, NULL, &res);
 	if (error) {
 		/*
 		 * Special case way failed, do it the "long" way
@@ -784,8 +584,7 @@ xfs_bulkstat_single(
 		(*lastinop)--;
 		count = 1;
 		if (xfs_bulkstat(mp, lastinop, &count, xfs_bulkstat_one,
-				NULL, sizeof(xfs_bstat_t), buffer,
-				BULKSTAT_FG_IGET, done))
+				sizeof(xfs_bstat_t), buffer, done))
 			return error;
 		if (count == 0 || (xfs_ino_t)*lastinop != ino)
 			return error == EFSCORRUPTED ?
diff --git a/fs/xfs/xfs_itable.h b/fs/xfs/xfs_itable.h
index 20792bf..fea0339 100644
--- a/fs/xfs/xfs_itable.h
+++ b/fs/xfs/xfs_itable.h
@@ -27,10 +27,8 @@ typedef int (*bulkstat_one_pf)(struct xfs_mount	*mp,
 			       xfs_ino_t	ino,
 			       void		__user *buffer,
 			       int		ubsize,
-			       void		*private_data,
 			       xfs_daddr_t	bno,
 			       int		*ubused,
-			       void		*dip,
 			       int		*stat);
 
 /*
@@ -41,13 +39,6 @@ typedef int (*bulkstat_one_pf)(struct xfs_mount	*mp,
 #define BULKSTAT_RV_GIVEUP	2
 
 /*
- * Values for bulkstat flag argument.
- */
-#define BULKSTAT_FG_IGET	0x1	/* Go through the buffer cache */
-#define BULKSTAT_FG_QUICK	0x2	/* No iget, walk the dinode cluster */
-#define BULKSTAT_FG_INLINE	0x4	/* No iget if inline attrs */
-
-/*
  * Return stat information in bulk (by-inode) for the filesystem.
  */
 int					/* error status */
@@ -56,10 +47,8 @@ xfs_bulkstat(
 	xfs_ino_t	*lastino,	/* last inode returned */
 	int		*count,		/* size of buffer/count returned */
 	bulkstat_one_pf formatter,	/* func that'd fill a single buf */
-	void		*private_data,	/* private data for formatter */
 	size_t		statstruct_size,/* sizeof struct that we're filling */
 	char		__user *ubuffer,/* buffer with inode stats */
-	int		flags,		/* flag to control access method */
 	int		*done);		/* 1 if there are more stats to get */
 
 int
@@ -84,7 +73,6 @@ xfs_bulkstat_one_int(
 	bulkstat_one_fmt_pf	formatter,
 	xfs_daddr_t		bno,
 	int			*ubused,
-	void			*dibuff,
 	int			*stat);
 
 int
@@ -93,10 +81,8 @@ xfs_bulkstat_one(
 	xfs_ino_t		ino,
 	void			__user *buffer,
 	int			ubsize,
-	void			*private_data,
 	xfs_daddr_t		bno,
 	int			*ubused,
-	void			*dibuff,
 	int			*stat);
 
 typedef int (*inumbers_fmt_pf)(
-- 
1.7.1

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH 2/4] xfs: validate untrusted inode numbers during lookup
  2010-06-18  7:32 xfs: validate inode numbers in file handles correctly Dave Chinner
  2010-06-18  7:32 ` [PATCH 1/4] xfs: always use iget in bulkstat Dave Chinner
@ 2010-06-18  7:32 ` Dave Chinner
  2010-06-18 11:41   ` Christoph Hellwig
  2010-06-18  7:32 ` [PATCH 3/4] xfs: rename XFS_IGET_BULKSTAT to XFS_IGET_UNTRUSTED Dave Chinner
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 21+ messages in thread
From: Dave Chinner @ 2010-06-18  7:32 UTC (permalink / raw)
  To: xfs; +Cc: security

From: Dave Chinner <dchinner@redhat.com>

When we decode a handle or do a bulkstat lookup, we are using an
inode number we cannot trust to be valid. If we are deleting inode
chunks from disk (default noikeep mode), then we cannot trust the on
disk inode buffer for any given inode number to correctly reflect
whether the inode has been unlinked as the di_mode nor the
generation number may have been updated on disk.

This is due to the fact that when we delete an inode chunk, we do
not write the clusters back to disk when they are removed - instead
we mark them stale to avoid them being written back potentially over
the top of something that has been subsequently allocated at that
location. The result is that we can have locations of disk that look
like they contain valid inodes but in reality do not. Hence we
cannot simply convert the inode number to a block number and read
the location from disk to determine if the inode is valid or not.

As a result, and XFS_IGET_BULKSTAT lookup needs to actually look the
inode up in the inode allocation btree to determine if the inode
number is valid or not.

It should be noted even on ikeep filesystems, there is the
possibility that blocks on disk may look like valid inode clusters.
e.g. if there are filesystem images hosted on the filesystem. Hence
even for ikeep filesystems we really need to validate that the inode
number is valid before issuing the inode buffer read.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
---
 fs/xfs/xfs_ialloc.c |  139 +++++++++++++++++++++++++++++++++++----------------
 1 files changed, 96 insertions(+), 43 deletions(-)

diff --git a/fs/xfs/xfs_ialloc.c b/fs/xfs/xfs_ialloc.c
index 0cf49f0..06eb987 100644
--- a/fs/xfs/xfs_ialloc.c
+++ b/fs/xfs/xfs_ialloc.c
@@ -1199,6 +1199,81 @@ error0:
 	return error;
 }
 
+static int
+xfs_imap_lookup(
+	xfs_mount_t	*mp,
+	xfs_trans_t	*tp,
+	xfs_agnumber_t	agno,
+	xfs_agino_t	agino,
+	xfs_agblock_t	agbno,
+	xfs_agblock_t	*chunk_agbno,
+	xfs_agblock_t	*offset_agbno,
+	int		flags)
+{
+	xfs_inobt_rec_incore_t rec;
+	xfs_btree_cur_t	*cur;
+	xfs_buf_t	*agbp;
+	int		error;
+	int		i;
+	xfs_agino_t	startino;
+
+	error = xfs_ialloc_read_agi(mp, tp, agno, &agbp);
+	if (error) {
+		xfs_fs_cmn_err(CE_ALERT, mp, "xfs_imap: "
+				"xfs_ialloc_read_agi() returned "
+				"error %d, agno %d",
+				error, agno);
+		return error;
+	}
+
+	/*
+	 * derive and lookup the exact inode record for the given agino. If the
+	 * record cannot be found, then it's an invalid inode number and we
+	 * should abort.
+	 */
+	cur = xfs_inobt_init_cursor(mp, tp, agbp, agno);
+	startino = agino & ~(XFS_IALLOC_INODES(mp) - 1);
+	error = xfs_inobt_lookup(cur, startino, XFS_LOOKUP_EQ, &i);
+	if (error) {
+		xfs_fs_cmn_err(CE_ALERT, mp, "xfs_imap: "
+				"xfs_inobt_lookup() failed");
+		goto error0;
+	}
+	if (i == 0) {
+		error = EINVAL;
+		goto error0;
+	}
+
+	error = xfs_inobt_get_rec(cur, &rec, &i);
+	if (error) {
+		xfs_fs_cmn_err(CE_ALERT, mp, "xfs_imap: "
+				"xfs_inobt_get_rec() failed");
+		goto error0;
+	}
+	if (i == 0) {
+#ifdef DEBUG
+		xfs_fs_cmn_err(CE_ALERT, mp, "xfs_imap: "
+				"xfs_inobt_get_rec() failed");
+#endif /* DEBUG */
+		error = XFS_ERROR(EINVAL);
+	}
+error0:
+	xfs_trans_brelse(tp, agbp);
+	xfs_btree_del_cursor(cur, XFS_BTREE_NOERROR);
+
+	if (error)
+		return error;
+
+	/* for untrusted inodes check it is allocated first */
+	if ((flags & XFS_IGET_BULKSTAT) &&
+	    (rec.ir_free & XFS_INOBT_MASK(agino - rec.ir_startino)))
+		return EINVAL;
+
+	*chunk_agbno = XFS_AGINO_TO_AGBNO(mp, rec.ir_startino);
+	*offset_agbno = agbno - *chunk_agbno;
+	return 0;
+}
+
 /*
  * Return the location of the inode in imap, for mapping it into a buffer.
  */
@@ -1259,6 +1334,23 @@ xfs_imap(
 		return XFS_ERROR(EINVAL);
 	}
 
+	blks_per_cluster = XFS_INODE_CLUSTER_SIZE(mp) >> mp->m_sb.sb_blocklog;
+
+	/*
+	 * For bulkstat and handle lookups, we have an untrusted inode number
+	 * that we have to verify is valid. We cannot do this just by reading
+	 * the inode buffer as it may have been unlinked and removed leaving
+	 * inodes in stale state on disk. Hence we have to do a btree lookup
+	 * in all cases where an untrusted inode number is passed.
+	 */
+	if (flags & XFS_IGET_BULKSTAT) {
+		error = xfs_imap_lookup(mp, tp, agno, agino, agbno,
+					&chunk_agbno, &offset_agbno, flags);
+		if (error)
+			return error;
+		goto out_map;
+	}
+
 	/*
 	 * If the inode cluster size is the same as the blocksize or
 	 * smaller we get to the buffer by simple arithmetics.
@@ -1273,10 +1365,8 @@ xfs_imap(
 		return 0;
 	}
 
-	blks_per_cluster = XFS_INODE_CLUSTER_SIZE(mp) >> mp->m_sb.sb_blocklog;
-
 	/*
-	 * If we get a block number passed from bulkstat we can use it to
+	 * If we get a block number passed we can use it to
 	 * find the buffer easily.
 	 */
 	if (imap->im_blkno) {
@@ -1300,50 +1390,13 @@ xfs_imap(
 		offset_agbno = agbno & mp->m_inoalign_mask;
 		chunk_agbno = agbno - offset_agbno;
 	} else {
-		xfs_btree_cur_t	*cur;	/* inode btree cursor */
-		xfs_inobt_rec_incore_t chunk_rec;
-		xfs_buf_t	*agbp;	/* agi buffer */
-		int		i;	/* temp state */
-
-		error = xfs_ialloc_read_agi(mp, tp, agno, &agbp);
-		if (error) {
-			xfs_fs_cmn_err(CE_ALERT, mp, "xfs_imap: "
-					"xfs_ialloc_read_agi() returned "
-					"error %d, agno %d",
-					error, agno);
-			return error;
-		}
-
-		cur = xfs_inobt_init_cursor(mp, tp, agbp, agno);
-		error = xfs_inobt_lookup(cur, agino, XFS_LOOKUP_LE, &i);
-		if (error) {
-			xfs_fs_cmn_err(CE_ALERT, mp, "xfs_imap: "
-					"xfs_inobt_lookup() failed");
-			goto error0;
-		}
-
-		error = xfs_inobt_get_rec(cur, &chunk_rec, &i);
-		if (error) {
-			xfs_fs_cmn_err(CE_ALERT, mp, "xfs_imap: "
-					"xfs_inobt_get_rec() failed");
-			goto error0;
-		}
-		if (i == 0) {
-#ifdef DEBUG
-			xfs_fs_cmn_err(CE_ALERT, mp, "xfs_imap: "
-					"xfs_inobt_get_rec() failed");
-#endif /* DEBUG */
-			error = XFS_ERROR(EINVAL);
-		}
- error0:
-		xfs_trans_brelse(tp, agbp);
-		xfs_btree_del_cursor(cur, XFS_BTREE_NOERROR);
+		error = xfs_imap_lookup(mp, tp, agno, agino, agbno,
+					&chunk_agbno, &offset_agbno, flags);
 		if (error)
 			return error;
-		chunk_agbno = XFS_AGINO_TO_AGBNO(mp, chunk_rec.ir_startino);
-		offset_agbno = agbno - chunk_agbno;
 	}
 
+out_map:
 	ASSERT(agbno >= chunk_agbno);
 	cluster_agbno = chunk_agbno +
 		((offset_agbno / blks_per_cluster) * blks_per_cluster);
-- 
1.7.1

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH 3/4] xfs: rename XFS_IGET_BULKSTAT to XFS_IGET_UNTRUSTED
  2010-06-18  7:32 xfs: validate inode numbers in file handles correctly Dave Chinner
  2010-06-18  7:32 ` [PATCH 1/4] xfs: always use iget in bulkstat Dave Chinner
  2010-06-18  7:32 ` [PATCH 2/4] xfs: validate untrusted inode numbers during lookup Dave Chinner
@ 2010-06-18  7:32 ` Dave Chinner
  2010-06-18 11:42   ` Christoph Hellwig
  2010-06-18  7:32 ` [PATCH 4/4] xfs: remove block number from inode lookup code Dave Chinner
  2011-11-23 13:04 ` xfs: validate inode numbers in file handles correctly Guoquan Yang
  4 siblings, 1 reply; 21+ messages in thread
From: Dave Chinner @ 2010-06-18  7:32 UTC (permalink / raw)
  To: xfs; +Cc: security

From: Dave Chinner <dchinner@redhat.com>

Inode numbers may come from somewhere external to the filesystem
(e.g. file handles, bulkstat information) and so are inherently
untrusted. Rename the flag we use for these lookups to make it
obvious we are doing a lookup of an untrusted inode number and need
to verify it completely before trying to read it from disk.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
---
 fs/xfs/linux-2.6/xfs_export.c |    9 ++++-----
 fs/xfs/xfs_ialloc.c           |    6 +++---
 fs/xfs/xfs_inode.c            |    2 +-
 fs/xfs/xfs_inode.h            |    2 +-
 fs/xfs/xfs_itable.c           |    2 +-
 5 files changed, 10 insertions(+), 11 deletions(-)

diff --git a/fs/xfs/linux-2.6/xfs_export.c b/fs/xfs/linux-2.6/xfs_export.c
index e61232f..b39c05c 100644
--- a/fs/xfs/linux-2.6/xfs_export.c
+++ b/fs/xfs/linux-2.6/xfs_export.c
@@ -127,12 +127,11 @@ xfs_nfs_get_inode(
 		return ERR_PTR(-ESTALE);
 
 	/*
-	 * The XFS_IGET_BULKSTAT means that an invalid inode number is just
-	 * fine and not an indication of a corrupted filesystem.  Because
-	 * clients can send any kind of invalid file handle, e.g. after
-	 * a restore on the server we have to deal with this case gracefully.
+	 * The XFS_IGET_UNTRUSTED means that an invalid inode number is just
+	 * fine and not an indication of a corrupted filesystem as clients can
+	 * send invalid file handles and we have to handle it gracefully..
 	 */
-	error = xfs_iget(mp, NULL, ino, XFS_IGET_BULKSTAT,
+	error = xfs_iget(mp, NULL, ino, XFS_IGET_UNTRUSTED,
 			 XFS_ILOCK_SHARED, &ip, 0);
 	if (error) {
 		/*
diff --git a/fs/xfs/xfs_ialloc.c b/fs/xfs/xfs_ialloc.c
index 06eb987..4efc23b 100644
--- a/fs/xfs/xfs_ialloc.c
+++ b/fs/xfs/xfs_ialloc.c
@@ -1265,7 +1265,7 @@ error0:
 		return error;
 
 	/* for untrusted inodes check it is allocated first */
-	if ((flags & XFS_IGET_BULKSTAT) &&
+	if ((flags & XFS_IGET_UNTRUSTED) &&
 	    (rec.ir_free & XFS_INOBT_MASK(agino - rec.ir_startino)))
 		return EINVAL;
 
@@ -1307,7 +1307,7 @@ xfs_imap(
 	    ino != XFS_AGINO_TO_INO(mp, agno, agino)) {
 #ifdef DEBUG
 		/* no diagnostics for bulkstat, ino comes from userspace */
-		if (flags & XFS_IGET_BULKSTAT)
+		if (flags & XFS_IGET_UNTRUSTED)
 			return XFS_ERROR(EINVAL);
 		if (agno >= mp->m_sb.sb_agcount) {
 			xfs_fs_cmn_err(CE_ALERT, mp,
@@ -1343,7 +1343,7 @@ xfs_imap(
 	 * inodes in stale state on disk. Hence we have to do a btree lookup
 	 * in all cases where an untrusted inode number is passed.
 	 */
-	if (flags & XFS_IGET_BULKSTAT) {
+	if (flags & XFS_IGET_UNTRUSTED) {
 		error = xfs_imap_lookup(mp, tp, agno, agino, agbno,
 					&chunk_agbno, &offset_agbno, flags);
 		if (error)
diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c
index 5c2ada4..9101e79 100644
--- a/fs/xfs/xfs_inode.c
+++ b/fs/xfs/xfs_inode.c
@@ -173,7 +173,7 @@ xfs_imap_to_bp(
 		if (unlikely(XFS_TEST_ERROR(!di_ok, mp,
 						XFS_ERRTAG_ITOBP_INOTOBP,
 						XFS_RANDOM_ITOBP_INOTOBP))) {
-			if (iget_flags & XFS_IGET_BULKSTAT) {
+			if (iget_flags & XFS_IGET_UNTRUSTED) {
 				xfs_trans_brelse(tp, bp);
 				return XFS_ERROR(EINVAL);
 			}
diff --git a/fs/xfs/xfs_inode.h b/fs/xfs/xfs_inode.h
index 10dd2e0..7a2f347 100644
--- a/fs/xfs/xfs_inode.h
+++ b/fs/xfs/xfs_inode.h
@@ -499,7 +499,7 @@ do { \
  * Flags for xfs_iget()
  */
 #define XFS_IGET_CREATE		0x1
-#define XFS_IGET_BULKSTAT	0x2
+#define XFS_IGET_UNTRUSTED	0x2
 
 int		xfs_inotobp(struct xfs_mount *, struct xfs_trans *,
 			    xfs_ino_t, struct xfs_dinode **,
diff --git a/fs/xfs/xfs_itable.c b/fs/xfs/xfs_itable.c
index f554bd9..5fccd84 100644
--- a/fs/xfs/xfs_itable.c
+++ b/fs/xfs/xfs_itable.c
@@ -76,7 +76,7 @@ xfs_bulkstat_one_int(
 		return XFS_ERROR(ENOMEM);
 
 	error = xfs_iget(mp, NULL, ino,
-			 XFS_IGET_BULKSTAT, XFS_ILOCK_SHARED, &ip, bno);
+			 XFS_IGET_UNTRUSTED, XFS_ILOCK_SHARED, &ip, bno);
 	if (error) {
 		*stat = BULKSTAT_RV_NOTHING;
 		goto out_free;
-- 
1.7.1

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH 4/4] xfs: remove block number from inode lookup code
  2010-06-18  7:32 xfs: validate inode numbers in file handles correctly Dave Chinner
                   ` (2 preceding siblings ...)
  2010-06-18  7:32 ` [PATCH 3/4] xfs: rename XFS_IGET_BULKSTAT to XFS_IGET_UNTRUSTED Dave Chinner
@ 2010-06-18  7:32 ` Dave Chinner
  2010-06-18  8:22   ` Christoph Hellwig
  2011-11-23 13:04 ` xfs: validate inode numbers in file handles correctly Guoquan Yang
  4 siblings, 1 reply; 21+ messages in thread
From: Dave Chinner @ 2010-06-18  7:32 UTC (permalink / raw)
  To: xfs; +Cc: security

From: Dave Chinner <dchinner@redhat.com>

The block number comes from bulkstat based inode lookups to shortcut
the mapping calculations. We ar enot able to trust anything from
bulkstat, so drop the block number as well so that the correct
lookups and mappings are always done.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
---
 fs/xfs/linux-2.6/xfs_export.c  |    2 +-
 fs/xfs/linux-2.6/xfs_ioctl32.c |    5 ++---
 fs/xfs/quota/xfs_qm.c          |    7 +++----
 fs/xfs/quota/xfs_qm_syscalls.c |   11 +++++------
 fs/xfs/xfs_ialloc.c            |   16 ----------------
 fs/xfs/xfs_iget.c              |   10 +++-------
 fs/xfs/xfs_inode.c             |    4 +---
 fs/xfs/xfs_inode.h             |    4 ++--
 fs/xfs/xfs_itable.c            |   12 ++++--------
 fs/xfs/xfs_itable.h            |    3 ---
 fs/xfs/xfs_log_recover.c       |    2 +-
 fs/xfs/xfs_mount.c             |    2 +-
 fs/xfs/xfs_rtalloc.c           |    4 ++--
 fs/xfs/xfs_trans_inode.c       |    2 +-
 fs/xfs/xfs_vnodeops.c          |    2 +-
 15 files changed, 27 insertions(+), 59 deletions(-)

diff --git a/fs/xfs/linux-2.6/xfs_export.c b/fs/xfs/linux-2.6/xfs_export.c
index b39c05c..55df942 100644
--- a/fs/xfs/linux-2.6/xfs_export.c
+++ b/fs/xfs/linux-2.6/xfs_export.c
@@ -132,7 +132,7 @@ xfs_nfs_get_inode(
 	 * send invalid file handles and we have to handle it gracefully..
 	 */
 	error = xfs_iget(mp, NULL, ino, XFS_IGET_UNTRUSTED,
-			 XFS_ILOCK_SHARED, &ip, 0);
+			 XFS_ILOCK_SHARED, &ip);
 	if (error) {
 		/*
 		 * EINVAL means the inode cluster doesn't exist anymore.
diff --git a/fs/xfs/linux-2.6/xfs_ioctl32.c b/fs/xfs/linux-2.6/xfs_ioctl32.c
index 48c1aba..6cd1225 100644
--- a/fs/xfs/linux-2.6/xfs_ioctl32.c
+++ b/fs/xfs/linux-2.6/xfs_ioctl32.c
@@ -233,12 +233,11 @@ xfs_bulkstat_one_compat(
 	xfs_ino_t	ino,		/* inode number to get data for */
 	void		__user *buffer,	/* buffer to place output in */
 	int		ubsize,		/* size of buffer */
-	xfs_daddr_t	bno,		/* starting bno of inode cluster */
 	int		*ubused,	/* bytes used by me */
 	int		*stat)		/* BULKSTAT_RV_... */
 {
 	return xfs_bulkstat_one_int(mp, ino, buffer, ubsize,
-				    xfs_bulkstat_one_fmt_compat, bno,
+				    xfs_bulkstat_one_fmt_compat,
 				    ubused, stat);
 }
 
@@ -292,7 +291,7 @@ xfs_compat_ioc_bulkstat(
 		int res;
 
 		error = xfs_bulkstat_one_compat(mp, inlast, bulkreq.ubuffer,
-				sizeof(compat_xfs_bstat_t), 0, NULL, &res);
+				sizeof(compat_xfs_bstat_t), 0, &res);
 	} else if (cmd == XFS_IOC_FSBULKSTAT_32) {
 		error = xfs_bulkstat(mp, &inlast, &count,
 			xfs_bulkstat_one_compat, sizeof(compat_xfs_bstat_t),
diff --git a/fs/xfs/quota/xfs_qm.c b/fs/xfs/quota/xfs_qm.c
index 25bc7da..175032f 100644
--- a/fs/xfs/quota/xfs_qm.c
+++ b/fs/xfs/quota/xfs_qm.c
@@ -1625,7 +1625,6 @@ xfs_qm_dqusage_adjust(
 	xfs_ino_t	ino,		/* inode number to get data for */
 	void		__user *buffer,	/* not used */
 	int		ubsize,		/* not used */
-	xfs_daddr_t	bno,		/* starting block of inode cluster */
 	int		*ubused,	/* not used */
 	int		*res)		/* result code value */
 {
@@ -1651,7 +1650,7 @@ xfs_qm_dqusage_adjust(
 	 * the case in all other instances. It's OK that we do this because
 	 * quotacheck is done only at mount time.
 	 */
-	if ((error = xfs_iget(mp, NULL, ino, 0, XFS_ILOCK_EXCL, &ip, bno))) {
+	if ((error = xfs_iget(mp, NULL, ino, 0, XFS_ILOCK_EXCL, &ip))) {
 		*res = BULKSTAT_RV_NOTHING;
 		return error;
 	}
@@ -1881,14 +1880,14 @@ xfs_qm_init_quotainos(
 		    mp->m_sb.sb_uquotino != NULLFSINO) {
 			ASSERT(mp->m_sb.sb_uquotino > 0);
 			if ((error = xfs_iget(mp, NULL, mp->m_sb.sb_uquotino,
-					     0, 0, &uip, 0)))
+					     0, 0, &uip)))
 				return XFS_ERROR(error);
 		}
 		if (XFS_IS_OQUOTA_ON(mp) &&
 		    mp->m_sb.sb_gquotino != NULLFSINO) {
 			ASSERT(mp->m_sb.sb_gquotino > 0);
 			if ((error = xfs_iget(mp, NULL, mp->m_sb.sb_gquotino,
-					     0, 0, &gip, 0))) {
+					     0, 0, &gip))) {
 				if (uip)
 					IRELE(uip);
 				return XFS_ERROR(error);
diff --git a/fs/xfs/quota/xfs_qm_syscalls.c b/fs/xfs/quota/xfs_qm_syscalls.c
index aa44c58..2d1abbf 100644
--- a/fs/xfs/quota/xfs_qm_syscalls.c
+++ b/fs/xfs/quota/xfs_qm_syscalls.c
@@ -252,7 +252,7 @@ xfs_qm_scall_trunc_qfiles(
 	}
 
 	if ((flags & XFS_DQ_USER) && mp->m_sb.sb_uquotino != NULLFSINO) {
-		error = xfs_iget(mp, NULL, mp->m_sb.sb_uquotino, 0, 0, &qip, 0);
+		error = xfs_iget(mp, NULL, mp->m_sb.sb_uquotino, 0, 0, &qip);
 		if (!error) {
 			error = xfs_truncate_file(mp, qip);
 			IRELE(qip);
@@ -261,7 +261,7 @@ xfs_qm_scall_trunc_qfiles(
 
 	if ((flags & (XFS_DQ_GROUP|XFS_DQ_PROJ)) &&
 	    mp->m_sb.sb_gquotino != NULLFSINO) {
-		error2 = xfs_iget(mp, NULL, mp->m_sb.sb_gquotino, 0, 0, &qip, 0);
+		error2 = xfs_iget(mp, NULL, mp->m_sb.sb_gquotino, 0, 0, &qip);
 		if (!error2) {
 			error2 = xfs_truncate_file(mp, qip);
 			IRELE(qip);
@@ -407,12 +407,12 @@ xfs_qm_scall_getqstat(
 	}
 	if (!uip && mp->m_sb.sb_uquotino != NULLFSINO) {
 		if (xfs_iget(mp, NULL, mp->m_sb.sb_uquotino,
-					0, 0, &uip, 0) == 0)
+					0, 0, &uip) == 0)
 			tempuqip = B_TRUE;
 	}
 	if (!gip && mp->m_sb.sb_gquotino != NULLFSINO) {
 		if (xfs_iget(mp, NULL, mp->m_sb.sb_gquotino,
-					0, 0, &gip, 0) == 0)
+					0, 0, &gip) == 0)
 			tempgqip = B_TRUE;
 	}
 	if (uip) {
@@ -1095,7 +1095,6 @@ xfs_qm_internalqcheck_adjust(
 	xfs_ino_t	ino,		/* inode number to get data for */
 	void		__user *buffer,	/* not used */
 	int		ubsize,		/* not used */
-	xfs_daddr_t	bno,		/* starting block of inode cluster */
 	int		*ubused,	/* not used */
 	int		*res)		/* bulkstat result code */
 {
@@ -1118,7 +1117,7 @@ xfs_qm_internalqcheck_adjust(
 	ipreleased = B_FALSE;
  again:
 	lock_flags = XFS_ILOCK_SHARED;
-	if ((error = xfs_iget(mp, NULL, ino, 0, lock_flags, &ip, bno))) {
+	if ((error = xfs_iget(mp, NULL, ino, 0, lock_flags, &ip))) {
 		*res = BULKSTAT_RV_NOTHING;
 		return (error);
 	}
diff --git a/fs/xfs/xfs_ialloc.c b/fs/xfs/xfs_ialloc.c
index 4efc23b..b4351a7 100644
--- a/fs/xfs/xfs_ialloc.c
+++ b/fs/xfs/xfs_ialloc.c
@@ -1366,22 +1366,6 @@ xfs_imap(
 	}
 
 	/*
-	 * If we get a block number passed we can use it to
-	 * find the buffer easily.
-	 */
-	if (imap->im_blkno) {
-		offset = XFS_INO_TO_OFFSET(mp, ino);
-		ASSERT(offset < mp->m_sb.sb_inopblock);
-
-		cluster_agbno = xfs_daddr_to_agbno(mp, imap->im_blkno);
-		offset += (agbno - cluster_agbno) * mp->m_sb.sb_inopblock;
-
-		imap->im_len = XFS_FSB_TO_BB(mp, blks_per_cluster);
-		imap->im_boffset = (ushort)(offset << mp->m_sb.sb_inodelog);
-		return 0;
-	}
-
-	/*
 	 * If the inode chunks are aligned then use simple maths to
 	 * find the location. Otherwise we have to do a btree
 	 * lookup to find the location.
diff --git a/fs/xfs/xfs_iget.c b/fs/xfs/xfs_iget.c
index a80b9b9..d6ef971 100644
--- a/fs/xfs/xfs_iget.c
+++ b/fs/xfs/xfs_iget.c
@@ -255,7 +255,6 @@ xfs_iget_cache_miss(
 	xfs_trans_t		*tp,
 	xfs_ino_t		ino,
 	struct xfs_inode	**ipp,
-	xfs_daddr_t		bno,
 	int			flags,
 	int			lock_flags)
 {
@@ -268,7 +267,7 @@ xfs_iget_cache_miss(
 	if (!ip)
 		return ENOMEM;
 
-	error = xfs_iread(mp, tp, ip, bno, flags);
+	error = xfs_iread(mp, tp, ip, flags);
 	if (error)
 		goto out_destroy;
 
@@ -354,8 +353,6 @@ out_destroy:
  *        within the file system for the inode being requested.
  * lock_flags -- flags indicating how to lock the inode.  See the comment
  *		 for xfs_ilock() for a list of valid values.
- * bno -- the block number starting the buffer containing the inode,
- *	  if known (as by bulkstat), else 0.
  */
 int
 xfs_iget(
@@ -364,8 +361,7 @@ xfs_iget(
 	xfs_ino_t	ino,
 	uint		flags,
 	uint		lock_flags,
-	xfs_inode_t	**ipp,
-	xfs_daddr_t	bno)
+	xfs_inode_t	**ipp)
 {
 	xfs_inode_t	*ip;
 	int		error;
@@ -393,7 +389,7 @@ again:
 		read_unlock(&pag->pag_ici_lock);
 		XFS_STATS_INC(xs_ig_missed);
 
-		error = xfs_iget_cache_miss(mp, pag, tp, ino, &ip, bno,
+		error = xfs_iget_cache_miss(mp, pag, tp, ino, &ip,
 							flags, lock_flags);
 		if (error)
 			goto out_error_or_again;
diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c
index 9101e79..d09bc53 100644
--- a/fs/xfs/xfs_inode.c
+++ b/fs/xfs/xfs_inode.c
@@ -783,7 +783,6 @@ xfs_iread(
 	xfs_mount_t	*mp,
 	xfs_trans_t	*tp,
 	xfs_inode_t	*ip,
-	xfs_daddr_t	bno,
 	uint		iget_flags)
 {
 	xfs_buf_t	*bp;
@@ -793,11 +792,10 @@ xfs_iread(
 	/*
 	 * Fill in the location information in the in-core inode.
 	 */
-	ip->i_imap.im_blkno = bno;
+	ip->i_imap.im_blkno = 0;
 	error = xfs_imap(mp, tp, ip->i_ino, &ip->i_imap, iget_flags);
 	if (error)
 		return error;
-	ASSERT(bno == 0 || bno == ip->i_imap.im_blkno);
 
 	/*
 	 * Get pointers to the on-disk inode and the buffer containing it.
diff --git a/fs/xfs/xfs_inode.h b/fs/xfs/xfs_inode.h
index 7a2f347..7a19d52 100644
--- a/fs/xfs/xfs_inode.h
+++ b/fs/xfs/xfs_inode.h
@@ -442,7 +442,7 @@ static inline void xfs_ifunlock(xfs_inode_t *ip)
  * xfs_iget.c prototypes.
  */
 int		xfs_iget(struct xfs_mount *, struct xfs_trans *, xfs_ino_t,
-			 uint, uint, xfs_inode_t **, xfs_daddr_t);
+			 uint, uint, xfs_inode_t **);
 void		xfs_iput(xfs_inode_t *, uint);
 void		xfs_ilock(xfs_inode_t *, uint);
 int		xfs_ilock_nowait(xfs_inode_t *, uint);
@@ -508,7 +508,7 @@ int		xfs_itobp(struct xfs_mount *, struct xfs_trans *,
 			  struct xfs_inode *, struct xfs_dinode **,
 			  struct xfs_buf **, uint);
 int		xfs_iread(struct xfs_mount *, struct xfs_trans *,
-			  struct xfs_inode *, xfs_daddr_t, uint);
+			  struct xfs_inode *, uint);
 void		xfs_dinode_to_disk(struct xfs_dinode *,
 				   struct xfs_icdinode *);
 void		xfs_idestroy_fork(struct xfs_inode *, int);
diff --git a/fs/xfs/xfs_itable.c b/fs/xfs/xfs_itable.c
index 5fccd84..200dc6f 100644
--- a/fs/xfs/xfs_itable.c
+++ b/fs/xfs/xfs_itable.c
@@ -56,7 +56,6 @@ xfs_bulkstat_one_int(
 	void __user		*buffer,	/* buffer to place output in */
 	int			ubsize,		/* size of buffer */
 	bulkstat_one_fmt_pf	formatter,	/* formatter, copy to user */
-	xfs_daddr_t		bno,		/* starting bno of cluster */
 	int			*ubused,	/* bytes used by me */
 	int			*stat)		/* BULKSTAT_RV_... */
 {
@@ -76,7 +75,7 @@ xfs_bulkstat_one_int(
 		return XFS_ERROR(ENOMEM);
 
 	error = xfs_iget(mp, NULL, ino,
-			 XFS_IGET_UNTRUSTED, XFS_ILOCK_SHARED, &ip, bno);
+			 XFS_IGET_UNTRUSTED, XFS_ILOCK_SHARED, &ip);
 	if (error) {
 		*stat = BULKSTAT_RV_NOTHING;
 		goto out_free;
@@ -175,13 +174,11 @@ xfs_bulkstat_one(
 	xfs_ino_t	ino,		/* inode number to get data for */
 	void		__user *buffer,	/* buffer to place output in */
 	int		ubsize,		/* size of buffer */
-	xfs_daddr_t	bno,		/* starting bno of inode cluster */
 	int		*ubused,	/* bytes used by me */
 	int		*stat)		/* BULKSTAT_RV_... */
 {
 	return xfs_bulkstat_one_int(mp, ino, buffer, ubsize,
-				    xfs_bulkstat_one_fmt, bno,
-				    ubused, stat);
+				    xfs_bulkstat_one_fmt, ubused, stat);
 }
 
 #define XFS_BULKSTAT_UBLEFT(ubleft)	((ubleft) >= statstruct_size)
@@ -481,7 +478,7 @@ xfs_bulkstat(
 				 * Get the inode and fill in a single buffer.
 				 */
 				ubused = statstruct_size;
-				error = formatter(mp, ino, ubufp, ubleft, bno,
+				error = formatter(mp, ino, ubufp, ubleft,
 						  &ubused, &fmterror);
 				if (fmterror == BULKSTAT_RV_NOTHING) {
 					if (error && error != ENOENT &&
@@ -574,8 +571,7 @@ xfs_bulkstat_single(
 	 */
 
 	ino = (xfs_ino_t)*lastinop;
-	error = xfs_bulkstat_one(mp, ino, buffer, sizeof(xfs_bstat_t),
-				 0, NULL, &res);
+	error = xfs_bulkstat_one(mp, ino, buffer, sizeof(xfs_bstat_t), 0, &res);
 	if (error) {
 		/*
 		 * Special case way failed, do it the "long" way
diff --git a/fs/xfs/xfs_itable.h b/fs/xfs/xfs_itable.h
index fea0339..97295d9 100644
--- a/fs/xfs/xfs_itable.h
+++ b/fs/xfs/xfs_itable.h
@@ -27,7 +27,6 @@ typedef int (*bulkstat_one_pf)(struct xfs_mount	*mp,
 			       xfs_ino_t	ino,
 			       void		__user *buffer,
 			       int		ubsize,
-			       xfs_daddr_t	bno,
 			       int		*ubused,
 			       int		*stat);
 
@@ -71,7 +70,6 @@ xfs_bulkstat_one_int(
 	void			__user *buffer,
 	int			ubsize,
 	bulkstat_one_fmt_pf	formatter,
-	xfs_daddr_t		bno,
 	int			*ubused,
 	int			*stat);
 
@@ -81,7 +79,6 @@ xfs_bulkstat_one(
 	xfs_ino_t		ino,
 	void			__user *buffer,
 	int			ubsize,
-	xfs_daddr_t		bno,
 	int			*ubused,
 	int			*stat);
 
diff --git a/fs/xfs/xfs_log_recover.c b/fs/xfs/xfs_log_recover.c
index 1ac0b79..6f3f5fa 100644
--- a/fs/xfs/xfs_log_recover.c
+++ b/fs/xfs/xfs_log_recover.c
@@ -3186,7 +3186,7 @@ xlog_recover_process_one_iunlink(
 	int				error;
 
 	ino = XFS_AGINO_TO_INO(mp, agno, agino);
-	error = xfs_iget(mp, NULL, ino, 0, 0, &ip, 0);
+	error = xfs_iget(mp, NULL, ino, 0, 0, &ip);
 	if (error)
 		goto fail;
 
diff --git a/fs/xfs/xfs_mount.c b/fs/xfs/xfs_mount.c
index 4e1725b..aeb9d72 100644
--- a/fs/xfs/xfs_mount.c
+++ b/fs/xfs/xfs_mount.c
@@ -1297,7 +1297,7 @@ xfs_mountfs(
 	 * Get and sanity-check the root inode.
 	 * Save the pointer to it in the mount structure.
 	 */
-	error = xfs_iget(mp, NULL, sbp->sb_rootino, 0, XFS_ILOCK_EXCL, &rip, 0);
+	error = xfs_iget(mp, NULL, sbp->sb_rootino, 0, XFS_ILOCK_EXCL, &rip);
 	if (error) {
 		cmn_err(CE_WARN, "XFS: failed to read root inode");
 		goto out_log_dealloc;
diff --git a/fs/xfs/xfs_rtalloc.c b/fs/xfs/xfs_rtalloc.c
index 20b5eb7..891260f 100644
--- a/fs/xfs/xfs_rtalloc.c
+++ b/fs/xfs/xfs_rtalloc.c
@@ -2270,12 +2270,12 @@ xfs_rtmount_inodes(
 	sbp = &mp->m_sb;
 	if (sbp->sb_rbmino == NULLFSINO)
 		return 0;
-	error = xfs_iget(mp, NULL, sbp->sb_rbmino, 0, 0, &mp->m_rbmip, 0);
+	error = xfs_iget(mp, NULL, sbp->sb_rbmino, 0, 0, &mp->m_rbmip);
 	if (error)
 		return error;
 	ASSERT(mp->m_rbmip != NULL);
 	ASSERT(sbp->sb_rsumino != NULLFSINO);
-	error = xfs_iget(mp, NULL, sbp->sb_rsumino, 0, 0, &mp->m_rsumip, 0);
+	error = xfs_iget(mp, NULL, sbp->sb_rsumino, 0, 0, &mp->m_rsumip);
 	if (error) {
 		IRELE(mp->m_rbmip);
 		return error;
diff --git a/fs/xfs/xfs_trans_inode.c b/fs/xfs/xfs_trans_inode.c
index 10534c2..cdc53a1 100644
--- a/fs/xfs/xfs_trans_inode.c
+++ b/fs/xfs/xfs_trans_inode.c
@@ -57,7 +57,7 @@ xfs_trans_iget(
 {
 	int			error;
 
-	error = xfs_iget(mp, tp, ino, flags, lock_flags, ipp, 0);
+	error = xfs_iget(mp, tp, ino, flags, lock_flags, ipp);
 	if (!error && tp) {
 		xfs_trans_ijoin(tp, *ipp);
 		(*ipp)->i_itemp->ili_lock_flags = lock_flags;
diff --git a/fs/xfs/xfs_vnodeops.c b/fs/xfs/xfs_vnodeops.c
index df6d1b5..ad599cc 100644
--- a/fs/xfs/xfs_vnodeops.c
+++ b/fs/xfs/xfs_vnodeops.c
@@ -1233,7 +1233,7 @@ xfs_lookup(
 	if (error)
 		goto out;
 
-	error = xfs_iget(dp->i_mount, NULL, inum, 0, 0, ipp, 0);
+	error = xfs_iget(dp->i_mount, NULL, inum, 0, 0, ipp);
 	if (error)
 		goto out_free_name;
 
-- 
1.7.1

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply related	[flat|nested] 21+ messages in thread

* Re: [PATCH 4/4] xfs: remove block number from inode lookup code
  2010-06-18  7:32 ` [PATCH 4/4] xfs: remove block number from inode lookup code Dave Chinner
@ 2010-06-18  8:22   ` Christoph Hellwig
  0 siblings, 0 replies; 21+ messages in thread
From: Christoph Hellwig @ 2010-06-18  8:22 UTC (permalink / raw)
  To: Dave Chinner; +Cc: security, xfs

> -	ip->i_imap.im_blkno = bno;
> +	ip->i_imap.im_blkno = 0;
>  	error = xfs_imap(mp, tp, ip->i_ino, &ip->i_imap, iget_flags);

The im_blkno initialization before calling xfs_imap can be removed now.

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 2/4] xfs: validate untrusted inode numbers during lookup
  2010-06-18  7:32 ` [PATCH 2/4] xfs: validate untrusted inode numbers during lookup Dave Chinner
@ 2010-06-18 11:41   ` Christoph Hellwig
  2010-06-19  0:07     ` Dave Chinner
  0 siblings, 1 reply; 21+ messages in thread
From: Christoph Hellwig @ 2010-06-18 11:41 UTC (permalink / raw)
  To: Dave Chinner; +Cc: security, xfs

On Fri, Jun 18, 2010 at 05:32:52PM +1000, Dave Chinner wrote:
> +static int
> +xfs_imap_lookup(

STATIC to keep the gcc inliner from overdoing thing?

> +	xfs_mount_t	*mp,
> +	xfs_trans_t	*tp,

> +{
> +	xfs_inobt_rec_incore_t rec;
> +	xfs_btree_cur_t	*cur;
> +	xfs_buf_t	*agbp;

Please use the struct versions of these instead of the typedefs.

> +#ifdef DEBUG
> +		xfs_fs_cmn_err(CE_ALERT, mp, "xfs_imap: "
> +				"xfs_inobt_get_rec() failed");
> +#endif /* DEBUG */
> +		error = XFS_ERROR(EINVAL);

No need to print these even for debug kernels I think.  And even then
we shouldn't do it if the untrusted flag is set.

> +	}
> +error0:

I'd just call it out, or replace the goto by and if/else

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 3/4] xfs: rename XFS_IGET_BULKSTAT to XFS_IGET_UNTRUSTED
  2010-06-18  7:32 ` [PATCH 3/4] xfs: rename XFS_IGET_BULKSTAT to XFS_IGET_UNTRUSTED Dave Chinner
@ 2010-06-18 11:42   ` Christoph Hellwig
  0 siblings, 0 replies; 21+ messages in thread
From: Christoph Hellwig @ 2010-06-18 11:42 UTC (permalink / raw)
  To: Dave Chinner; +Cc: security, xfs

On Fri, Jun 18, 2010 at 05:32:53PM +1000, Dave Chinner wrote:
>  		/* no diagnostics for bulkstat, ino comes from userspace */
> -		if (flags & XFS_IGET_BULKSTAT)
> +		if (flags & XFS_IGET_UNTRUSTED)

The comment could use an update as well.

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH 2/4] xfs: validate untrusted inode numbers during lookup
  2010-06-18 11:41   ` Christoph Hellwig
@ 2010-06-19  0:07     ` Dave Chinner
  0 siblings, 0 replies; 21+ messages in thread
From: Dave Chinner @ 2010-06-19  0:07 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: security, xfs

On Fri, Jun 18, 2010 at 07:41:56AM -0400, Christoph Hellwig wrote:
> On Fri, Jun 18, 2010 at 05:32:52PM +1000, Dave Chinner wrote:
> > +static int
> > +xfs_imap_lookup(
> 
> STATIC to keep the gcc inliner from overdoing thing?

*Nod*.

> 
> > +	xfs_mount_t	*mp,
> > +	xfs_trans_t	*tp,
> 
> > +{
> > +	xfs_inobt_rec_incore_t rec;
> > +	xfs_btree_cur_t	*cur;
> > +	xfs_buf_t	*agbp;
> 
> Please use the struct versions of these instead of the typedefs.

Copy-n-paste error - my bad.

> > +#ifdef DEBUG
> > +		xfs_fs_cmn_err(CE_ALERT, mp, "xfs_imap: "
> > +				"xfs_inobt_get_rec() failed");
> > +#endif /* DEBUG */
> > +		error = XFS_ERROR(EINVAL);
> 
> No need to print these even for debug kernels I think.  And even then
> we shouldn't do it if the untrusted flag is set.

Ok, I killed all the prints - they don't tell us what inode number
the error occurred on anyway, so they aren't very useful anyway....

> > +	}
> > +error0:
> 
> I'd just call it out, or replace the goto by and if/else

Ok, I rearranged the code to kill the goto. I will repost
the series after a QA run.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: xfs: validate inode numbers in file handles correctly
  2010-06-18  7:32 xfs: validate inode numbers in file handles correctly Dave Chinner
                   ` (3 preceding siblings ...)
  2010-06-18  7:32 ` [PATCH 4/4] xfs: remove block number from inode lookup code Dave Chinner
@ 2011-11-23 13:04 ` Guoquan Yang
  2011-11-23 14:30   ` Christoph Hellwig
  4 siblings, 1 reply; 21+ messages in thread
From: Guoquan Yang @ 2011-11-23 13:04 UTC (permalink / raw)
  To: linux-xfs

Dave Chinner <david <at> fromorbit.com> writes:

> 
> This series closes a recently discovered problem in XFS filehandle 
conversion.
> On systems where inodes are dynamically deleted, XFS does not adequately 
verify
> the inode numbers in the filehandles, which results in reading stale inodes
> from disk and potentially returning them as valid files. Because these 
unlinked
> inodes were never zeroed out when the chunk was deallocated, some inodes in 
the
> chunk can still appear to have to data extents attached to them. This can 
lead
> to stale data exposure, exposure of active data and potentially overwriting 
of
> active data if the stale extents referenced in the unlinked inodes have been
> re-allocated.
> 
> Both NFS filehandles and local filehandles provided through libhandle have 
this
> same problem. libhandle requires root permissions to use the interface, so it
> is not exposing information that you can't get more easily with other means
> (e.g. xfs_db or reading directly form the block device), so there isn't 
really
> an issue here.
> 
> For NFS, we may incorrectly accept stale file handles for unlinked inodes 
after
> a server reboot if the unlinked inodes have not been overwritten leading to 
the
> above issues being triggered if multiple NFS clients are accessing the some
> files.
> 
> Christoph's make-bulkstat-coherent patch is the basis for this series as
> bulkstat can also expose unlinked inodes and information about them back to
> userspace because it makes the same assumptions about inode lookups as the 
file
> handle interfaces.
> 
> As a result, the first two patches of the series make up the real bug fix. 
The
> last two patches make it clear we are looking up untrusted inode numbers and
> remove a shortcut that these interfaces used that we do not want used any
> more. Hence for backports to other kernels, only the first two patches are
> necessary.
> 
> More information and the test program that demonstrates the issue via the
> open_by_handle interface can be found here:
> 
> http://oss.sgi.com/archives/xfs/2010-06/msg00191.html
> 
> _______________________________________________
> xfs mailing list
> xfs <at> oss.sgi.com
> http://oss.sgi.com/mailman/listinfo/xfs
> 
> 

I meet with a problem when using 64bit XFS and NFS,

When I access a directory from the NFS client,I get Stale NFS file Handle 
error.But it is ok when accessing on the server without NFS.

And I have checked that the inode num in the NFS file handle is the same as 
inode num on the server, Using ls -il.

I found that XFS_IGET_UNTRUSTED in xfs_imap() filtered the function 
xfs_imap_lookup(),It fails in xfs_imap_lookup() when access from NFS 
client.local access does not go into xfs_imap_lookup().

My kernel is Linux2.6.35.6 from kernel.org. please help me to find out this 
problem!

Thanks!



_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: xfs: validate inode numbers in file handles correctly
  2011-11-23 13:04 ` xfs: validate inode numbers in file handles correctly Guoquan Yang
@ 2011-11-23 14:30   ` Christoph Hellwig
       [not found]     ` <SNT135-W7F5C64C2A3F67B48EFF3AA4CE0@phx.gbl>
  2011-11-28 11:19     ` Christoph Hellwig
  0 siblings, 2 replies; 21+ messages in thread
From: Christoph Hellwig @ 2011-11-23 14:30 UTC (permalink / raw)
  To: Guoquan Yang; +Cc: linux-xfs, hank peng

On Wed, Nov 23, 2011 at 01:04:58PM +0000, Guoquan Yang wrote:
> I meet with a problem when using 64bit XFS and NFS,
> 
> When I access a directory from the NFS client,I get Stale NFS file Handle 
> error.But it is ok when accessing on the server without NFS.
> 
> And I have checked that the inode num in the NFS file handle is the same as 
> inode num on the server, Using ls -il.
> 
> I found that XFS_IGET_UNTRUSTED in xfs_imap() filtered the function 
> xfs_imap_lookup(),It fails in xfs_imap_lookup() when access from NFS 
> client.local access does not go into xfs_imap_lookup().
> 
> My kernel is Linux2.6.35.6 from kernel.org. please help me to find out this 
> problem!

It seems like you hit the same issue hank peng reported recently, and
in facr your are on the same kernel for the serve as he is.

>From a closer look it seems like the changes you mentioned above indeed
had a bug in Linux 2.6.35, which was later fixed with the following
commit

mmit 4536f2ad8b330453d7ebec0746c4374eadd649b1
Author: Dave Chinner <dchinner@redhat.com>
Date:   Tue Aug 24 11:42:30 2010 +1000

    xfs: fix untrusted inode number lookup

which should be included in Linux 2.6.35.6.  Can you make sure you
really have the commit?  Can you also verify that a recent kernel
like Linux 3.0-stable shows the same behaviour?

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: xfs: validate inode numbers in file handles correctly
       [not found]     ` <SNT135-W7F5C64C2A3F67B48EFF3AA4CE0@phx.gbl>
@ 2011-11-24 12:52       ` Christoph Hellwig
  0 siblings, 0 replies; 21+ messages in thread
From: Christoph Hellwig @ 2011-11-24 12:52 UTC (permalink / raw)
  To: yangguoquan; +Cc: hch, linux-xfs, pengxihan

On Thu, Nov 24, 2011 at 06:29:18PM +0800, yangguoquan wrote:
> 
> I have checked that Linux2.6.35.6 is up-to-date with the issue "xfs: fix untrusted inode number lookup"(you had mentioned in the former reply).
> What can I do.
> Look forward to your reply.

Can you please try a recent kernel (3.0-stable or 3.1)?  2.6.35 is so
old that we can't easily support it except for looking over the
list of commits.

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: xfs: validate inode numbers in file handles correctly
  2011-11-23 14:30   ` Christoph Hellwig
       [not found]     ` <SNT135-W7F5C64C2A3F67B48EFF3AA4CE0@phx.gbl>
@ 2011-11-28 11:19     ` Christoph Hellwig
  2011-12-03  8:27       ` hank peng
  2011-12-03  9:56       ` yangguoquan
  1 sibling, 2 replies; 21+ messages in thread
From: Christoph Hellwig @ 2011-11-28 11:19 UTC (permalink / raw)
  To: Guoquan Yang; +Cc: linux-xfs, hank peng

Guoquan and hank,

are you using 32-bit or 64-bit kernels?  I just noticed we have a
problem with exporting 64-bit inodes on 32-bit kernel because the
VFS i_ino field is just 32-bits long.  The patch below would fix
that issue.

--- xfs.orig/fs/xfs/linux-2.6/xfs_export.c	2011-11-28 12:11:08.923630697 +0100
+++ xfs/fs/xfs/linux-2.6/xfs_export.c	2011-11-28 12:13:21.766244360 +0100
@@ -61,6 +61,8 @@ xfs_fs_encode_fh(
 	struct fid		*fid = (struct fid *)fh;
 	struct xfs_fid64	*fid64 = (struct xfs_fid64 *)fh;
 	struct inode		*inode = dentry->d_inode;
+	struct inode		*parent;
+	struct xfs_inode	*ip = XFS_I(inode);
 	int			fileid_type;
 	int			len;
 
@@ -98,22 +100,24 @@ xfs_fs_encode_fh(
 	switch (fileid_type) {
 	case FILEID_INO32_GEN_PARENT:
 		spin_lock(&dentry->d_lock);
-		fid->i32.parent_ino = dentry->d_parent->d_inode->i_ino;
-		fid->i32.parent_gen = dentry->d_parent->d_inode->i_generation;
+		parent = dentry->d_parent->d_inode;
+		fid->i32.parent_ino = XFS_I(parent)->i_ino;
+		fid->i32.parent_gen = parent->i_generation;
 		spin_unlock(&dentry->d_lock);
 		/*FALLTHRU*/
 	case FILEID_INO32_GEN:
-		fid->i32.ino = inode->i_ino;
+		fid->i32.ino = ip->i_ino;
 		fid->i32.gen = inode->i_generation;
 		break;
 	case FILEID_INO32_GEN_PARENT | XFS_FILEID_TYPE_64FLAG:
 		spin_lock(&dentry->d_lock);
-		fid64->parent_ino = dentry->d_parent->d_inode->i_ino;
-		fid64->parent_gen = dentry->d_parent->d_inode->i_generation;
+		parent = dentry->d_parent->d_inode;
+		fid64->parent_ino = XFS_I(parent)->i_ino;
+		fid64->parent_gen = parent->i_generation;
 		spin_unlock(&dentry->d_lock);
 		/*FALLTHRU*/
 	case FILEID_INO32_GEN | XFS_FILEID_TYPE_64FLAG:
-		fid64->ino = inode->i_ino;
+		fid64->ino = ip->i_ino;
 		fid64->gen = inode->i_generation;
 		break;
 	}

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: xfs: validate inode numbers in file handles correctly
  2011-11-28 11:19     ` Christoph Hellwig
@ 2011-12-03  8:27       ` hank peng
  2011-12-06 15:17         ` Christoph Hellwig
  2011-12-03  9:56       ` yangguoquan
  1 sibling, 1 reply; 21+ messages in thread
From: hank peng @ 2011-12-03  8:27 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: linux-xfs, Guoquan Yang

2011/11/28 Christoph Hellwig <hch@infradead.org>:
> Guoquan and hank,
>
> are you using 32-bit or 64-bit kernels?  I just noticed we have a
> problem with exporting 64-bit inodes on 32-bit kernel because the
> VFS i_ino field is just 32-bits long.  The patch below would fix
> that issue.
>
yes, we use inode64 on 32-bit kernel.


> --- xfs.orig/fs/xfs/linux-2.6/xfs_export.c      2011-11-28 12:11:08.923630697 +0100
> +++ xfs/fs/xfs/linux-2.6/xfs_export.c   2011-11-28 12:13:21.766244360 +0100
> @@ -61,6 +61,8 @@ xfs_fs_encode_fh(
>        struct fid              *fid = (struct fid *)fh;
>        struct xfs_fid64        *fid64 = (struct xfs_fid64 *)fh;
>        struct inode            *inode = dentry->d_inode;
> +       struct inode            *parent;
> +       struct xfs_inode        *ip = XFS_I(inode);
>        int                     fileid_type;
>        int                     len;
>
> @@ -98,22 +100,24 @@ xfs_fs_encode_fh(
>        switch (fileid_type) {
>        case FILEID_INO32_GEN_PARENT:
>                spin_lock(&dentry->d_lock);
> -               fid->i32.parent_ino = dentry->d_parent->d_inode->i_ino;
> -               fid->i32.parent_gen = dentry->d_parent->d_inode->i_generation;
> +               parent = dentry->d_parent->d_inode;
> +               fid->i32.parent_ino = XFS_I(parent)->i_ino;
> +               fid->i32.parent_gen = parent->i_generation;
>                spin_unlock(&dentry->d_lock);
>                /*FALLTHRU*/
>        case FILEID_INO32_GEN:
> -               fid->i32.ino = inode->i_ino;
> +               fid->i32.ino = ip->i_ino;
>                fid->i32.gen = inode->i_generation;
>                break;
>        case FILEID_INO32_GEN_PARENT | XFS_FILEID_TYPE_64FLAG:
>                spin_lock(&dentry->d_lock);
> -               fid64->parent_ino = dentry->d_parent->d_inode->i_ino;
> -               fid64->parent_gen = dentry->d_parent->d_inode->i_generation;
> +               parent = dentry->d_parent->d_inode;
> +               fid64->parent_ino = XFS_I(parent)->i_ino;
> +               fid64->parent_gen = parent->i_generation;
>                spin_unlock(&dentry->d_lock);
>                /*FALLTHRU*/
>        case FILEID_INO32_GEN | XFS_FILEID_TYPE_64FLAG:
> -               fid64->ino = inode->i_ino;
> +               fid64->ino = ip->i_ino;
>                fid64->gen = inode->i_generation;
>                break;
>        }

I haven't tested this patch, but I have a question now: although I use
inode64 option when mounting, my filesystem did not exceed 2T limit,
so, 32-bit inode would be no problem, right?

-- 
The simplest is not all best but the best is surely the simplest!
_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 21+ messages in thread

* RE: xfs: validate inode numbers in file handles correctly
  2011-11-28 11:19     ` Christoph Hellwig
  2011-12-03  8:27       ` hank peng
@ 2011-12-03  9:56       ` yangguoquan
  2011-12-29  9:19         ` xfs: validate inode numbers in file handles correctly--NFS Stale File Handle Again yangguoquan
  1 sibling, 1 reply; 21+ messages in thread
From: yangguoquan @ 2011-12-03  9:56 UTC (permalink / raw)
  To: hch; +Cc: linux-xfs, pengxihan


[-- Attachment #1.1: Type: text/plain, Size: 2219 bytes --]


I have tried this patch, And problem resolved!
Thank you for your help!

 

> Date: Mon, 28 Nov 2011 06:19:47 -0500
> From: hch@infradead.org
> To: ygq51@hotmail.com
> Subject: Re: xfs: validate inode numbers in file handles correctly
> CC: linux-xfs@oss.sgi.com; pengxihan@gmail.com
> 
> Guoquan and hank,
> 
> are you using 32-bit or 64-bit kernels? I just noticed we have a
> problem with exporting 64-bit inodes on 32-bit kernel because the
> VFS i_ino field is just 32-bits long. The patch below would fix
> that issue.
> 
> --- xfs.orig/fs/xfs/linux-2.6/xfs_export.c 2011-11-28 12:11:08.923630697 +0100
> +++ xfs/fs/xfs/linux-2.6/xfs_export.c 2011-11-28 12:13:21.766244360 +0100
> @@ -61,6 +61,8 @@ xfs_fs_encode_fh(
> struct fid *fid = (struct fid *)fh;
> struct xfs_fid64 *fid64 = (struct xfs_fid64 *)fh;
> struct inode *inode = dentry->d_inode;
> + struct inode *parent;
> + struct xfs_inode *ip = XFS_I(inode);
> int fileid_type;
> int len;
> 
> @@ -98,22 +100,24 @@ xfs_fs_encode_fh(
> switch (fileid_type) {
> case FILEID_INO32_GEN_PARENT:
> spin_lock(&dentry->d_lock);
> - fid->i32.parent_ino = dentry->d_parent->d_inode->i_ino;
> - fid->i32.parent_gen = dentry->d_parent->d_inode->i_generation;
> + parent = dentry->d_parent->d_inode;
> + fid->i32.parent_ino = XFS_I(parent)->i_ino;
> + fid->i32.parent_gen = parent->i_generation;
> spin_unlock(&dentry->d_lock);
> /*FALLTHRU*/
> case FILEID_INO32_GEN:
> - fid->i32.ino = inode->i_ino;
> + fid->i32.ino = ip->i_ino;
> fid->i32.gen = inode->i_generation;
> break;
> case FILEID_INO32_GEN_PARENT | XFS_FILEID_TYPE_64FLAG:
> spin_lock(&dentry->d_lock);
> - fid64->parent_ino = dentry->d_parent->d_inode->i_ino;
> - fid64->parent_gen = dentry->d_parent->d_inode->i_generation;
> + parent = dentry->d_parent->d_inode;
> + fid64->parent_ino = XFS_I(parent)->i_ino;
> + fid64->parent_gen = parent->i_generation;
> spin_unlock(&dentry->d_lock);
> /*FALLTHRU*/
> case FILEID_INO32_GEN | XFS_FILEID_TYPE_64FLAG:
> - fid64->ino = inode->i_ino;
> + fid64->ino = ip->i_ino;
> fid64->gen = inode->i_generation;
> break;
> }
> 
> _______________________________________________
> xfs mailing list
> xfs@oss.sgi.com
> http://oss.sgi.com/mailman/listinfo/xfs
 		 	   		  

[-- Attachment #1.2: Type: text/html, Size: 2971 bytes --]

[-- Attachment #2: Type: text/plain, Size: 121 bytes --]

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: xfs: validate inode numbers in file handles correctly
  2011-12-03  8:27       ` hank peng
@ 2011-12-06 15:17         ` Christoph Hellwig
  0 siblings, 0 replies; 21+ messages in thread
From: Christoph Hellwig @ 2011-12-06 15:17 UTC (permalink / raw)
  To: hank peng; +Cc: Christoph Hellwig, linux-xfs, Guoquan Yang

On Sat, Dec 03, 2011 at 04:27:39PM +0800, hank peng wrote:
> I haven't tested this patch, but I have a question now: although I use
> inode64 option when mounting, my filesystem did not exceed 2T limit,
> so, 32-bit inode would be no problem, right?

In the normal configuration > 32bit inode numbers kick in at 1TB
filesystem sizes.

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 21+ messages in thread

* RE: xfs: validate inode numbers in file handles correctly--NFS Stale File Handle Again
  2011-12-03  9:56       ` yangguoquan
@ 2011-12-29  9:19         ` yangguoquan
  2012-01-02 15:02           ` Christoph Hellwig
  0 siblings, 1 reply; 21+ messages in thread
From: yangguoquan @ 2011-12-29  9:19 UTC (permalink / raw)
  To: hch; +Cc: linux-xfs, pengxihan


[-- Attachment #1.1: Type: text/plain, Size: 5963 bytes --]


NFS Stale File Handle Again, But this happens when mount operation not during accessing file or directory.
 
My /etc/exports file:
 
/mnt/storage_pool/testnfs00     *(rw,sync,no_root_squash)
/mnt/storage_pool/testnfs01     *(rw,sync,no_root_squash)
/mnt/storage_pool/testnfs02     *(rw,sync,no_root_squash)
/mnt/storage_pool/testnfs03     *(rw,sync,no_root_squash)
/mnt/storage_pool/testnfs04     *(rw,sync,no_root_squash)
/mnt/storage_pool/testnfs05     *(rw,sync,no_root_squash)
/mnt/storage_pool/testnfs06     *(rw,sync,no_root_squash)
/mnt/storage_pool/testnfs07     *(rw,sync,no_root_squash)
/mnt/storage_pool/testnfs08     *(rw,sync,no_root_squash)
/mnt/storage_pool/testnfs09     *(rw,sync,no_root_squash)
 
Proc info as follows:
root@Dahua_Storage:~# cat /proc/net/rpc/nfsd
rc 0 5428 3350
fh 600 0 0 0 0
io 0 1195819008
th 8 0 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000 0.000
ra 32 0 0 0 0 0 0 0 0 0 0 0
net 8779 1 8771 1
rpc 8778 0 0 0 0
proc2 18 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
proc3 22 1 819 0 840 1432 0 0 4587 12 0 0 0 829 0 0 0 0 37 10 0 0 210
proc4 2 0 0
proc4ops 59 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0    
root@Dahua_Storage:~# cat /proc/net/rpc/nfsd.export/content 
#path domain(flags)
/mnt/storage_pool/testnfs02     *(rw,no_root_squash,sync,wdelay,no_subtree_check,uuid=0000fd02:00000000:00000000:00000000)
/mnt/storage_pool/testnfs04     *(rw,no_root_squash,sync,wdelay,no_subtree_check,uuid=0000fd04:00000000:00000000:00000000)
/mnt/storage_pool/testnfs00     *(rw,no_root_squash,sync,wdelay,no_subtree_check,uuid=0000fd00:00000000:00000000:00000000)
/mnt/storage_pool/testnfs05     *(rw,no_root_squash,sync,wdelay,no_subtree_check,uuid=0000fd05:00000000:00000000:00000000)
/mnt/storage_pool/testnfs01     *(rw,no_root_squash,sync,wdelay,no_subtree_check,uuid=0000fd01:00000000:00000000:00000000)
/mnt/storage_pool/testnfs03     *(rw,no_root_squash,sync,wdelay,no_subtree_check,uuid=0000fd03:00000000:00000000:00000000)
root@Dahua_Storage:~# cat /proc/net/rpc/nfsd.fh/            
channel  content  flush    
root@Dahua_Storage:~# cat /proc/net/rpc/nfsd.fh/content 
#domain fsidtype fsid [path]
* 6 0x00fd0000000000000000000000000000 /mnt/storage_pool/testnfs00
* 6 0x03fd0000000000000000000000000000 /mnt/storage_pool/testnfs03
* 6 0x02fd0000000000000000000000000000 /mnt/storage_pool/testnfs02
* 6 0x01fd0000000000000000000000000000 /mnt/storage_pool/testnfs01
# * 6 0x06fd0000000000000000000000000000
* 6 0x05fd0000000000000000000000000000 /mnt/storage_pool/testnfs05
* 6 0x04fd0000000000000000000000000000 /mnt/storage_pool/testnfs04
# * 6 0x09fd0000000000000000000000000000
# * 6 0x08fd0000000000000000000000000000
# * 6 0x07fd0000000000000000000000000000
 
When I mount "/mnt/storage_pool/testnfs07" which does not appear in the proc info , problem happens, I can not even mount from local machine.
 
More info: 
1.       I use tcpdump and find that STALE returns when FSINFO call during mount procdure, please check the attached screenshot.
2.       I debug nfsd, and find that FSINFO fail stack:
          cache_check() return -ENOENT or -EAGAIN
          exp_find_key()
          exp_find()
          rqst_exp_find()
          nfsd_set_fh_dentry()
          fh_verify()
          nfsd3_proc_fsinfo()
 
Please help me, my kernel is still 2.6.35.6.
Looking forward for your reply!
 
Thanks!
 



From: ygq51@hotmail.com
To: hch@infradead.org
Subject: RE: xfs: validate inode numbers in file handles correctly
Date: Sat, 3 Dec 2011 17:56:20 +0800
CC: linux-xfs@oss.sgi.com; pengxihan@gmail.com




I have tried this patch, And problem resolved!
Thank you for your help!

 

> Date: Mon, 28 Nov 2011 06:19:47 -0500
> From: hch@infradead.org
> To: ygq51@hotmail.com
> Subject: Re: xfs: validate inode numbers in file handles correctly
> CC: linux-xfs@oss.sgi.com; pengxihan@gmail.com
> 
> Guoquan and hank,
> 
> are you using 32-bit or 64-bit kernels? I just noticed we have a
> problem with exporting 64-bit inodes on 32-bit kernel because the
> VFS i_ino field is just 32-bits long. The patch below would fix
> that issue.
> 
> --- xfs.orig/fs/xfs/linux-2.6/xfs_export.c 2011-11-28 12:11:08.923630697 +0100
> +++ xfs/fs/xfs/linux-2.6/xfs_export.c 2011-11-28 12:13:21.766244360 +0100
> @@ -61,6 +61,8 @@ xfs_fs_encode_fh(
> struct fid *fid = (struct fid *)fh;
> struct xfs_fid64 *fid64 = (struct xfs_fid64 *)fh;
> struct inode *inode = dentry->d_inode;
> + struct inode *parent;
> + struct xfs_inode *ip = XFS_I(inode);
> int fileid_ type;
> int len;
> 
> @@ -98,22 +100,24 @@ xfs_fs_encode_fh(
> switch (fileid_type) {
> case FILEID_INO32_GEN_PARENT:
> spin_lock(&dentry->d_lock);
> - fid->i32.parent_ino = dentry->d_parent->d_inode->i_ino;
> - fid->i32.parent_gen = dentry->d_parent->d_inode->i_generation;
> + parent = dentry->d_parent->d_inode;
> + fid->i32.parent_ino = XFS_I(parent)->i_ino;
> + fid->i32.parent_gen = parent->i_generation;
> spin_unlock(&dentry->d_lock);
> /*FALLTHRU*/
> case FILEID_INO32_GEN:
> - fid->i32.ino = inode->i_ino;
> + fid->i32.ino = ip->i_ino;
> fid->i32.gen = inode->i_generation;
> break;
> case FILEID_INO32_GEN_PARENT | XFS_FILEID_TYPE_64FLAG:
> spin_lock(&dentry->d_lock);
> - fid64->parent_ino = dentry->d_parent->d_inode->i_ino;
> - fid64->parent_gen = dentry->d_parent->d_inode->i_generation;
> + parent = dentry->d_parent->d_inode;
> + fid64->parent_ino = XFS_I(parent)->i_ino;
> + fid64->parent_gen = parent->i_generation;
> spin_unlock(&dentry->d_lock);
> /*FALLTHRU*/
> case FILEID_INO32_GEN | XFS_FILEID_TYPE_64FLAG:
> - fid64->ino = inode->i_ino;
> + fid64->ino = ip->i_ino;
> fid64->gen = inode->i_generation;
> break;
> }
> 
> _______________________________________________
> xfs mailing list
> xfs@oss.sgi.com
> http://oss.sgi.com/mailman/listinfo/xfs

_______________________________________________ xfs mailing list xfs@oss.sgi.com http://oss.sgi.com/mailman/listinfo/xfs 		 	   		  

[-- Attachment #1.2: Type: text/html, Size: 8241 bytes --]

[-- Attachment #2: Stale.JPG --]
[-- Type: image/jpeg, Size: 146524 bytes --]

[-- Attachment #3: Type: text/plain, Size: 121 bytes --]

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: xfs: validate inode numbers in file handles correctly--NFS Stale File Handle Again
  2011-12-29  9:19         ` xfs: validate inode numbers in file handles correctly--NFS Stale File Handle Again yangguoquan
@ 2012-01-02 15:02           ` Christoph Hellwig
  2012-01-04  2:20             ` yangguoquan
  0 siblings, 1 reply; 21+ messages in thread
From: Christoph Hellwig @ 2012-01-02 15:02 UTC (permalink / raw)
  To: yangguoquan; +Cc: hch, linux-xfs, pengxihan

On Thu, Dec 29, 2011 at 05:19:55PM +0800, yangguoquan wrote:
> 
> NFS Stale File Handle Again, But this happens when mount operation not during accessing file or directory.
>  
> My /etc/exports file:
>  
> /mnt/storage_pool/testnfs00     *(rw,sync,no_root_squash)
> /mnt/storage_pool/testnfs01     *(rw,sync,no_root_squash)
> /mnt/storage_pool/testnfs02     *(rw,sync,no_root_squash)
> /mnt/storage_pool/testnfs03     *(rw,sync,no_root_squash)
> /mnt/storage_pool/testnfs04     *(rw,sync,no_root_squash)
> /mnt/storage_pool/testnfs05     *(rw,sync,no_root_squash)
> /mnt/storage_pool/testnfs06     *(rw,sync,no_root_squash)
> /mnt/storage_pool/testnfs07     *(rw,sync,no_root_squash)
> /mnt/storage_pool/testnfs08     *(rw,sync,no_root_squash)
> /mnt/storage_pool/testnfs09     *(rw,sync,no_root_squash)

Are these mount points?  If not take a look at

	http://xfs.org/index.php/XFS_FAQ#Q:_Why_doesn.27t_NFS-exporting_subdirectories_of_inode64-mounted_filesystem_work.3F

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 21+ messages in thread

* RE: xfs: validate inode numbers in file handles correctly--NFS Stale File Handle Again
  2012-01-02 15:02           ` Christoph Hellwig
@ 2012-01-04  2:20             ` yangguoquan
  2012-01-24 17:58               ` Christoph Hellwig
  0 siblings, 1 reply; 21+ messages in thread
From: yangguoquan @ 2012-01-04  2:20 UTC (permalink / raw)
  To: hch; +Cc: linux-xfs, pengxihan


[-- Attachment #1.1: Type: text/plain, Size: 2204 bytes --]


Yes, They are mount points.
 
/dev/mapper/storage_pool-testnfs00 on /mnt/storage_pool/testnfs00 type xfs (rw)
/dev/mapper/storage_pool-testnfs01 on /mnt/storage_pool/testnfs01 type xfs (rw)
/dev/mapper/storage_pool-testnfs02 on /mnt/storage_pool/testnfs02 type xfs (rw)
/dev/mapper/storage_pool-testnfs03 on /mnt/storage_pool/testnfs03 type xfs (rw)
/dev/mapper/storage_pool-testnfs04 on /mnt/storage_pool/testnfs04 type xfs (rw)
/dev/mapper/storage_pool-testnfs05 on /mnt/storage_pool/testnfs05 type xfs (rw)
/dev/mapper/storage_pool-testnfs06 on /mnt/storage_pool/testnfs06 type xfs (rw)
/dev/mapper/storage_pool-testnfs07 on /mnt/storage_pool/testnfs07 type xfs (rw)
/dev/mapper/storage_pool-testnfs08 on /mnt/storage_pool/testnfs08 type xfs (rw)
/dev/mapper/storage_pool-testnfs09 on /mnt/storage_pool/testnfs09 type xfs (rw)
 

> Date: Mon, 2 Jan 2012 10:02:01 -0500
> From: hch@infradead.org
> To: ygq51@hotmail.com
> Subject: Re: xfs: validate inode numbers in file handles correctly--NFS Stale File Handle Again
> CC: hch@infradead.org; linux-xfs@oss.sgi.com; pengxihan@gmail.com
> 
> On Thu, Dec 29, 2011 at 05:19:55PM +0800, yangguoquan wrote:
> > 
> > NFS Stale File Handle Again, But this happens when mount operation not during accessing file or directory.
> > 
> > My /etc/exports file:
> > 
> > /mnt/storage_pool/testnfs00 *(rw,sync,no_root_squash)
> > /mnt/storage_pool/testnfs01 *(rw,sync,no_root_squash)
> > /mnt/storage_pool/testnfs02 *(rw,sync,no_root_squash)
> > /mnt/storage_pool/testnfs03 *(rw,sync,no_root_squash)
> > /mnt/storage_pool/testnfs04 *(rw,sync,no_root_squash)
> > /mnt/storage_pool/testnfs05 *(rw,sync,no_root_squash)
> > /mnt/storage_pool/testnfs06 *(rw,sync,no_root_squash)
> > /mnt/storage_pool/testnfs07 *(rw,sync,no_root_squash)
> > /mnt/storage_pool/testnfs08 *(rw,sync,no_root_squash)
> > /mnt/storage_pool/testnfs09 *(rw,sync,no_root_squash)
> 
> Are these mount points? If not take a look at
> 
> http://xfs.org/index.php/XFS_FAQ#Q:_Why_doesn.27t_NFS-exporting_subdirectories_of_inode64-mounted_filesystem_work.3F
> 
> _______________________________________________
> xfs mailing list
> xfs@oss.sgi.com
> http://oss.sgi.com/mailman/listinfo/xfs
 		 	   		  

[-- Attachment #1.2: Type: text/html, Size: 2746 bytes --]

[-- Attachment #2: Type: text/plain, Size: 121 bytes --]

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: xfs: validate inode numbers in file handles correctly--NFS Stale File Handle Again
  2012-01-04  2:20             ` yangguoquan
@ 2012-01-24 17:58               ` Christoph Hellwig
  2012-02-01  5:46                 ` yangguoquan
  0 siblings, 1 reply; 21+ messages in thread
From: Christoph Hellwig @ 2012-01-24 17:58 UTC (permalink / raw)
  To: yangguoquan; +Cc: hch, linux-xfs, pengxihan

On Wed, Jan 04, 2012 at 10:20:31AM +0800, yangguoquan wrote:
> 
> Yes, They are mount points.
>  
> /dev/mapper/storage_pool-testnfs00 on /mnt/storage_pool/testnfs00 type xfs (rw)

In that case I really have no idea.  Did you make sure you no
outstanding "old" filehandles before the fix on clients?  How do the
handles look in wireshark traces?

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 21+ messages in thread

* RE: xfs: validate inode numbers in file handles correctly--NFS Stale File Handle Again
  2012-01-24 17:58               ` Christoph Hellwig
@ 2012-02-01  5:46                 ` yangguoquan
  0 siblings, 0 replies; 21+ messages in thread
From: yangguoquan @ 2012-02-01  5:46 UTC (permalink / raw)
  To: hch; +Cc: linux-xfs, pengxihan


[-- Attachment #1.1: Type: text/plain, Size: 1739 bytes --]


The handles appear in wireshark is the same as in the /proc/net/rpc/nfsd.fh/content info
but there no corresponding directory path. just like this:
# * 6 0x06fd0000000000000000000000000000

root@Dahua_Storage:~# cat /proc/net/rpc/nfsd.fh/content 
#domain fsidtype fsid [path]
* 6 0x00fd0000000000000000000000000000 /mnt/storage_pool/testnfs00
* 6 0x03fd0000000000000000000000000000 /mnt/storage_pool/testnfs03
* 6 0x02fd0000000000000000000000000000 /mnt/storage_pool/testnfs02
* 6 0x01fd0000000000000000000000000000 /mnt/storage_pool/testnfs01
# * 6 0x06fd0000000000000000000000000000
* 6 0x05fd0000000000000000000000000000 /mnt/storage_pool/testnfs05
* 6 0x04fd0000000000000000000000000000 /mnt/storage_pool/testnfs04
# * 6 0x09fd0000000000000000000000000000
# * 6 0x08fd0000000000000000000000000000
# * 6 0x07fd0000000000000000000000000000
 
more info, this problem happens when rebooting the storage server while the nfs clients is still accessing the mounting points.


> Date: Tue, 24 Jan 2012 12:58:26 -0500
> From: hch@infradead.org
> To: ygq51@hotmail.com
> Subject: Re: xfs: validate inode numbers in file handles correctly--NFS Stale File Handle Again
> CC: hch@infradead.org; linux-xfs@oss.sgi.com; pengxihan@gmail.com
> 
> On Wed, Jan 04, 2012 at 10:20:31AM +0800, yangguoquan wrote:
> > 
> > Yes, They are mount points.
> > 
> > /dev/mapper/storage_pool-testnfs00 on /mnt/storage_pool/testnfs00 type xfs (rw)
> 
> In that case I really have no idea. Did you make sure you no
> outstanding "old" filehandles before the fix on clients? How do the
> handles look in wireshark traces?
> 
> _______________________________________________
> xfs mailing list
> xfs@oss.sgi.com
> http://oss.sgi.com/mailman/listinfo/xfs
 		 	   		  

[-- Attachment #1.2: Type: text/html, Size: 2403 bytes --]

[-- Attachment #2: Type: text/plain, Size: 121 bytes --]

_______________________________________________
xfs mailing list
xfs@oss.sgi.com
http://oss.sgi.com/mailman/listinfo/xfs

^ permalink raw reply	[flat|nested] 21+ messages in thread

end of thread, other threads:[~2012-02-01  5:47 UTC | newest]

Thread overview: 21+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2010-06-18  7:32 xfs: validate inode numbers in file handles correctly Dave Chinner
2010-06-18  7:32 ` [PATCH 1/4] xfs: always use iget in bulkstat Dave Chinner
2010-06-18  7:32 ` [PATCH 2/4] xfs: validate untrusted inode numbers during lookup Dave Chinner
2010-06-18 11:41   ` Christoph Hellwig
2010-06-19  0:07     ` Dave Chinner
2010-06-18  7:32 ` [PATCH 3/4] xfs: rename XFS_IGET_BULKSTAT to XFS_IGET_UNTRUSTED Dave Chinner
2010-06-18 11:42   ` Christoph Hellwig
2010-06-18  7:32 ` [PATCH 4/4] xfs: remove block number from inode lookup code Dave Chinner
2010-06-18  8:22   ` Christoph Hellwig
2011-11-23 13:04 ` xfs: validate inode numbers in file handles correctly Guoquan Yang
2011-11-23 14:30   ` Christoph Hellwig
     [not found]     ` <SNT135-W7F5C64C2A3F67B48EFF3AA4CE0@phx.gbl>
2011-11-24 12:52       ` Christoph Hellwig
2011-11-28 11:19     ` Christoph Hellwig
2011-12-03  8:27       ` hank peng
2011-12-06 15:17         ` Christoph Hellwig
2011-12-03  9:56       ` yangguoquan
2011-12-29  9:19         ` xfs: validate inode numbers in file handles correctly--NFS Stale File Handle Again yangguoquan
2012-01-02 15:02           ` Christoph Hellwig
2012-01-04  2:20             ` yangguoquan
2012-01-24 17:58               ` Christoph Hellwig
2012-02-01  5:46                 ` yangguoquan

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.