* dinode reading cleanups
@ 2020-05-01  8:14 Christoph Hellwig
  2020-05-01  8:14 ` [PATCH 01/12] xfs: xfs_bmapi_read doesn't take a fork id as the last argument Christoph Hellwig
                   ` (11 more replies)
  0 siblings, 12 replies; 41+ messages in thread
From: Christoph Hellwig @ 2020-05-01  8:14 UTC (permalink / raw)
  To: linux-xfs

Hi all,

while dusting off my series to move the per-fork extent count and format
into the xfs_ifork structure I found that we added a few more hacks in
the area it touched.  This series combines some of the prep patches with
a restructure of the dinode reading path that cleans up handling of early
errors during dinode reading, and allows dropping a workaround in
xfs_bmapi_read.

Another side effect is that we can share more code with xfsprogs.

Git tree:

    git://git.infradead.org/users/hch/xfs.git xfs-inode-read-cleanup

Gitweb:

    http://git.infradead.org/users/hch/xfs.git/shortlog/refs/heads/xfs-inode-read-cleanup


* [PATCH 01/12] xfs: xfs_bmapi_read doesn't take a fork id as the last argument
  2020-05-01  8:14 dinode reading cleanups Christoph Hellwig
@ 2020-05-01  8:14 ` Christoph Hellwig
  2020-05-01 13:33   ` Brian Foster
  2020-05-01  8:14 ` [PATCH 02/12] xfs: call xfs_iformat_fork from xfs_inode_from_disk Christoph Hellwig
                   ` (10 subsequent siblings)
  11 siblings, 1 reply; 41+ messages in thread
From: Christoph Hellwig @ 2020-05-01  8:14 UTC (permalink / raw)
  To: linux-xfs

The last argument to xfs_bmapi_read contains XFS_BMAPI_* flags, not the
fork.  Given that XFS_DATA_FORK evaluates to 0 no real harm is done,
but let's fix this anyway.
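
To illustrate why the old call was harmless only by coincidence, here is a
minimal userspace sketch.  All demo_* / DEMO_* names and values are
hypothetical stand-ins chosen for illustration, not the kernel definitions.
It shows that a fork id of 0 happens to look like "no flags", while a
non-zero fork id would silently be read as a flag bit.

```c
#include <assert.h>

/* Hypothetical fork identifiers (illustration only). */
enum demo_fork { DEMO_DATA_FORK = 0, DEMO_ATTR_FORK = 1 };

/* Hypothetical flag bits for the last argument (illustration only). */
#define DEMO_BMAPI_ENTIRE	(1u << 0)
#define DEMO_BMAPI_ATTRFORK	(1u << 2)
#define DEMO_BMAPI_ALL		(DEMO_BMAPI_ENTIRE | DEMO_BMAPI_ATTRFORK)

/*
 * Model of a mapping call whose last parameter is a flags bitmask.
 * Returns the fork that would be mapped, derived only from the flag bits.
 */
static int demo_bmapi_read(unsigned int flags)
{
	assert((flags & ~DEMO_BMAPI_ALL) == 0);
	return (flags & DEMO_BMAPI_ATTRFORK) ? DEMO_ATTR_FORK : DEMO_DATA_FORK;
}
```

In this model demo_bmapi_read(DEMO_DATA_FORK) behaves like passing 0, but
demo_bmapi_read(DEMO_ATTR_FORK) would still map the data fork, because the
value 1 is interpreted as DEMO_BMAPI_ENTIRE rather than as a fork id.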

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 fs/xfs/libxfs/xfs_rtbitmap.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/xfs/libxfs/xfs_rtbitmap.c b/fs/xfs/libxfs/xfs_rtbitmap.c
index f42c74cb8be53..9498ced947be9 100644
--- a/fs/xfs/libxfs/xfs_rtbitmap.c
+++ b/fs/xfs/libxfs/xfs_rtbitmap.c
@@ -66,7 +66,7 @@ xfs_rtbuf_get(
 
 	ip = issum ? mp->m_rsumip : mp->m_rbmip;
 
-	error = xfs_bmapi_read(ip, block, 1, &map, &nmap, XFS_DATA_FORK);
+	error = xfs_bmapi_read(ip, block, 1, &map, &nmap, 0);
 	if (error)
 		return error;
 
-- 
2.26.2



* [PATCH 02/12] xfs: call xfs_iformat_fork from xfs_inode_from_disk
  2020-05-01  8:14 dinode reading cleanups Christoph Hellwig
  2020-05-01  8:14 ` [PATCH 01/12] xfs: xfs_bmapi_read doesn't take a fork id as the last argument Christoph Hellwig
@ 2020-05-01  8:14 ` Christoph Hellwig
  2020-05-01 13:33   ` Brian Foster
  2020-05-01  8:14 ` [PATCH 03/12] xfs: split xfs_iformat_fork Christoph Hellwig
                   ` (9 subsequent siblings)
  11 siblings, 1 reply; 41+ messages in thread
From: Christoph Hellwig @ 2020-05-01  8:14 UTC (permalink / raw)
  To: linux-xfs

We always need to fill out the fork structures when reading the inode,
so call xfs_iformat_fork from the tail of xfs_inode_from_disk.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 fs/xfs/libxfs/xfs_inode_buf.c | 7 ++++---
 fs/xfs/libxfs/xfs_inode_buf.h | 2 +-
 fs/xfs/xfs_log_recover.c      | 4 +---
 3 files changed, 6 insertions(+), 7 deletions(-)

diff --git a/fs/xfs/libxfs/xfs_inode_buf.c b/fs/xfs/libxfs/xfs_inode_buf.c
index 39c5a6e24915c..02f06dec0a5a6 100644
--- a/fs/xfs/libxfs/xfs_inode_buf.c
+++ b/fs/xfs/libxfs/xfs_inode_buf.c
@@ -186,7 +186,7 @@ xfs_imap_to_bp(
 	return 0;
 }
 
-void
+int
 xfs_inode_from_disk(
 	struct xfs_inode	*ip,
 	struct xfs_dinode	*from)
@@ -247,6 +247,8 @@ xfs_inode_from_disk(
 		to->di_flags2 = be64_to_cpu(from->di_flags2);
 		to->di_cowextsize = be32_to_cpu(from->di_cowextsize);
 	}
+
+	return xfs_iformat_fork(ip, from);
 }
 
 void
@@ -647,8 +649,7 @@ xfs_iread(
 	 * Otherwise, just get the truly permanent information.
 	 */
 	if (dip->di_mode) {
-		xfs_inode_from_disk(ip, dip);
-		error = xfs_iformat_fork(ip, dip);
+		error = xfs_inode_from_disk(ip, dip);
 		if (error)  {
 #ifdef DEBUG
 			xfs_alert(mp, "%s: xfs_iformat() returned error %d",
diff --git a/fs/xfs/libxfs/xfs_inode_buf.h b/fs/xfs/libxfs/xfs_inode_buf.h
index 9b373dcf9e34d..081230faf7bdc 100644
--- a/fs/xfs/libxfs/xfs_inode_buf.h
+++ b/fs/xfs/libxfs/xfs_inode_buf.h
@@ -54,7 +54,7 @@ int	xfs_iread(struct xfs_mount *, struct xfs_trans *,
 void	xfs_dinode_calc_crc(struct xfs_mount *, struct xfs_dinode *);
 void	xfs_inode_to_disk(struct xfs_inode *ip, struct xfs_dinode *to,
 			  xfs_lsn_t lsn);
-void	xfs_inode_from_disk(struct xfs_inode *ip, struct xfs_dinode *from);
+int	xfs_inode_from_disk(struct xfs_inode *ip, struct xfs_dinode *from);
 void	xfs_log_dinode_to_disk(struct xfs_log_dinode *from,
 			       struct xfs_dinode *to);
 
diff --git a/fs/xfs/xfs_log_recover.c b/fs/xfs/xfs_log_recover.c
index 11c3502b07b13..464388125d20b 100644
--- a/fs/xfs/xfs_log_recover.c
+++ b/fs/xfs/xfs_log_recover.c
@@ -2870,9 +2870,7 @@ xfs_recover_inode_owner_change(
 
 	/* instantiate the inode */
 	ASSERT(dip->di_version >= 3);
-	xfs_inode_from_disk(ip, dip);
-
-	error = xfs_iformat_fork(ip, dip);
+	error = xfs_inode_from_disk(ip, dip);
 	if (error)
 		goto out_free_ip;
 
-- 
2.26.2



* [PATCH 03/12] xfs: split xfs_iformat_fork
  2020-05-01  8:14 dinode reading cleanups Christoph Hellwig
  2020-05-01  8:14 ` [PATCH 01/12] xfs: xfs_bmapi_read doesn't take a fork id as the last argument Christoph Hellwig
  2020-05-01  8:14 ` [PATCH 02/12] xfs: call xfs_iformat_fork from xfs_inode_from_disk Christoph Hellwig
@ 2020-05-01  8:14 ` Christoph Hellwig
  2020-05-01 13:34   ` Brian Foster
  2020-05-01  8:14 ` [PATCH 04/12] xfs: handle unallocated inodes in xfs_inode_from_disk Christoph Hellwig
                   ` (8 subsequent siblings)
  11 siblings, 1 reply; 41+ messages in thread
From: Christoph Hellwig @ 2020-05-01  8:14 UTC (permalink / raw)
  To: linux-xfs

xfs_iformat_fork is a weird catchall.  Split it into one helper for
the data fork and one for the attr fork, and then call both helpers
as well as the COW fork initialization from xfs_inode_from_disk.  Order
the COW fork initialization after the attr fork initialization, given
that it can't fail, to simplify the error handling.

Note that the newly split helpers are moved down the file in
xfs_inode_fork.c to avoid the need for forward declarations.
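
The ordering argument can be sketched in plain userspace C.  The p3_* names
below are hypothetical and only model the control flow: the step that cannot
fail runs last, so a single unwind label that tears down the data fork is
enough.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical miniature inode: one flag per initialized fork. */
struct p3_inode { bool has_data; bool has_attr; bool has_cow; };

/* Hypothetical description of what the on-disk inode asks for. */
struct p3_disk { bool has_attr; bool reflink; bool data_bad; bool attr_bad; };

static int p3_format_data(struct p3_inode *ip, bool bad)
{
	if (bad)
		return -EINVAL;
	ip->has_data = true;
	return 0;
}

static int p3_format_attr(struct p3_inode *ip, bool bad)
{
	if (bad)
		return -EINVAL;	/* leaves has_attr unset, like the helper */
	ip->has_attr = true;
	return 0;
}

/* Cannot fail, so it is ordered last and needs no unwinding. */
static void p3_init_cow(struct p3_inode *ip)
{
	ip->has_cow = true;
}

static int p3_setup_forks(struct p3_inode *ip, const struct p3_disk *from)
{
	int error;

	assert(!ip->has_attr && !ip->has_cow);

	error = p3_format_data(ip, from->data_bad);
	if (error)
		return error;
	if (from->has_attr) {
		error = p3_format_attr(ip, from->attr_bad);
		if (error)
			goto out_destroy_data;
	}
	if (from->reflink)
		p3_init_cow(ip);
	return 0;

out_destroy_data:
	ip->has_data = false;
	return error;
}
```

Had the COW fork been set up before the attr fork, the error path would also
have to free it, which is what the old combined function did.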

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 fs/xfs/libxfs/xfs_inode_buf.c  |  20 +++-
 fs/xfs/libxfs/xfs_inode_fork.c | 186 +++++++++++++++------------------
 fs/xfs/libxfs/xfs_inode_fork.h |   3 +-
 3 files changed, 103 insertions(+), 106 deletions(-)

diff --git a/fs/xfs/libxfs/xfs_inode_buf.c b/fs/xfs/libxfs/xfs_inode_buf.c
index 02f06dec0a5a6..983beb680e81a 100644
--- a/fs/xfs/libxfs/xfs_inode_buf.c
+++ b/fs/xfs/libxfs/xfs_inode_buf.c
@@ -193,6 +193,10 @@ xfs_inode_from_disk(
 {
 	struct xfs_icdinode	*to = &ip->i_d;
 	struct inode		*inode = VFS_I(ip);
+	int			error;
+
+	ASSERT(ip->i_cowfp == NULL);
+	ASSERT(ip->i_afp == NULL);
 
 	/*
 	 * Convert v1 inodes immediately to v2 inode format as this is the
@@ -248,7 +252,21 @@ xfs_inode_from_disk(
 		to->di_cowextsize = be32_to_cpu(from->di_cowextsize);
 	}
 
-	return xfs_iformat_fork(ip, from);
+	error = xfs_iformat_data_fork(ip, from);
+	if (error)
+		return error;
+	if (XFS_DFORK_Q(from)) {
+		error = xfs_iformat_attr_fork(ip, from);
+		if (error)
+			goto out_destroy_data_fork;
+	}
+	if (xfs_is_reflink_inode(ip))
+		xfs_ifork_init_cow(ip);
+	return 0;
+
+out_destroy_data_fork:
+	xfs_idestroy_fork(ip, XFS_DATA_FORK);
+	return error;
 }
 
 void
diff --git a/fs/xfs/libxfs/xfs_inode_fork.c b/fs/xfs/libxfs/xfs_inode_fork.c
index 518c6f0ec3a61..f30d43364aa92 100644
--- a/fs/xfs/libxfs/xfs_inode_fork.c
+++ b/fs/xfs/libxfs/xfs_inode_fork.c
@@ -26,110 +26,6 @@
 
 kmem_zone_t *xfs_ifork_zone;
 
-STATIC int xfs_iformat_local(xfs_inode_t *, xfs_dinode_t *, int, int);
-STATIC int xfs_iformat_extents(xfs_inode_t *, xfs_dinode_t *, int);
-STATIC int xfs_iformat_btree(xfs_inode_t *, xfs_dinode_t *, int);
-
-/*
- * Copy inode type and data and attr format specific information from the
- * on-disk inode to the in-core inode and fork structures.  For fifos, devices,
- * and sockets this means set i_rdev to the proper value.  For files,
- * directories, and symlinks this means to bring in the in-line data or extent
- * pointers as well as the attribute fork.  For a fork in B-tree format, only
- * the root is immediately brought in-core.  The rest will be read in later when
- * first referenced (see xfs_iread_extents()).
- */
-int
-xfs_iformat_fork(
-	struct xfs_inode	*ip,
-	struct xfs_dinode	*dip)
-{
-	struct inode		*inode = VFS_I(ip);
-	struct xfs_attr_shortform *atp;
-	int			size;
-	int			error = 0;
-	xfs_fsize_t             di_size;
-
-	switch (inode->i_mode & S_IFMT) {
-	case S_IFIFO:
-	case S_IFCHR:
-	case S_IFBLK:
-	case S_IFSOCK:
-		ip->i_d.di_size = 0;
-		inode->i_rdev = xfs_to_linux_dev_t(xfs_dinode_get_rdev(dip));
-		break;
-
-	case S_IFREG:
-	case S_IFLNK:
-	case S_IFDIR:
-		switch (dip->di_format) {
-		case XFS_DINODE_FMT_LOCAL:
-			di_size = be64_to_cpu(dip->di_size);
-			size = (int)di_size;
-			error = xfs_iformat_local(ip, dip, XFS_DATA_FORK, size);
-			break;
-		case XFS_DINODE_FMT_EXTENTS:
-			error = xfs_iformat_extents(ip, dip, XFS_DATA_FORK);
-			break;
-		case XFS_DINODE_FMT_BTREE:
-			error = xfs_iformat_btree(ip, dip, XFS_DATA_FORK);
-			break;
-		default:
-			xfs_inode_verifier_error(ip, -EFSCORRUPTED, __func__,
-					dip, sizeof(*dip), __this_address);
-			return -EFSCORRUPTED;
-		}
-		break;
-
-	default:
-		xfs_inode_verifier_error(ip, -EFSCORRUPTED, __func__, dip,
-				sizeof(*dip), __this_address);
-		return -EFSCORRUPTED;
-	}
-	if (error)
-		return error;
-
-	if (xfs_is_reflink_inode(ip)) {
-		ASSERT(ip->i_cowfp == NULL);
-		xfs_ifork_init_cow(ip);
-	}
-
-	if (!XFS_DFORK_Q(dip))
-		return 0;
-
-	ASSERT(ip->i_afp == NULL);
-	ip->i_afp = kmem_zone_zalloc(xfs_ifork_zone, KM_NOFS);
-
-	switch (dip->di_aformat) {
-	case XFS_DINODE_FMT_LOCAL:
-		atp = (xfs_attr_shortform_t *)XFS_DFORK_APTR(dip);
-		size = be16_to_cpu(atp->hdr.totsize);
-
-		error = xfs_iformat_local(ip, dip, XFS_ATTR_FORK, size);
-		break;
-	case XFS_DINODE_FMT_EXTENTS:
-		error = xfs_iformat_extents(ip, dip, XFS_ATTR_FORK);
-		break;
-	case XFS_DINODE_FMT_BTREE:
-		error = xfs_iformat_btree(ip, dip, XFS_ATTR_FORK);
-		break;
-	default:
-		xfs_inode_verifier_error(ip, error, __func__, dip,
-				sizeof(*dip), __this_address);
-		error = -EFSCORRUPTED;
-		break;
-	}
-	if (error) {
-		kmem_cache_free(xfs_ifork_zone, ip->i_afp);
-		ip->i_afp = NULL;
-		if (ip->i_cowfp)
-			kmem_cache_free(xfs_ifork_zone, ip->i_cowfp);
-		ip->i_cowfp = NULL;
-		xfs_idestroy_fork(ip, XFS_DATA_FORK);
-	}
-	return error;
-}
-
 void
 xfs_init_local_fork(
 	struct xfs_inode	*ip,
@@ -325,6 +221,88 @@ xfs_iformat_btree(
 	return 0;
 }
 
+int
+xfs_iformat_data_fork(
+	struct xfs_inode	*ip,
+	struct xfs_dinode	*dip)
+{
+	struct inode		*inode = VFS_I(ip);
+
+	switch (inode->i_mode & S_IFMT) {
+	case S_IFIFO:
+	case S_IFCHR:
+	case S_IFBLK:
+	case S_IFSOCK:
+		ip->i_d.di_size = 0;
+		inode->i_rdev = xfs_to_linux_dev_t(xfs_dinode_get_rdev(dip));
+		return 0;
+	case S_IFREG:
+	case S_IFLNK:
+	case S_IFDIR:
+		switch (dip->di_format) {
+		case XFS_DINODE_FMT_LOCAL:
+			return xfs_iformat_local(ip, dip, XFS_DATA_FORK,
+					be64_to_cpu(dip->di_size));
+		case XFS_DINODE_FMT_EXTENTS:
+			return xfs_iformat_extents(ip, dip, XFS_DATA_FORK);
+		case XFS_DINODE_FMT_BTREE:
+			return xfs_iformat_btree(ip, dip, XFS_DATA_FORK);
+		default:
+			xfs_inode_verifier_error(ip, -EFSCORRUPTED, __func__,
+					dip, sizeof(*dip), __this_address);
+			return -EFSCORRUPTED;
+		}
+		break;
+	default:
+		xfs_inode_verifier_error(ip, -EFSCORRUPTED, __func__, dip,
+				sizeof(*dip), __this_address);
+		return -EFSCORRUPTED;
+	}
+}
+
+static uint16_t
+xfs_dfork_attr_shortform_size(
+	struct xfs_dinode		*dip)
+{
+	struct xfs_attr_shortform	*atp =
+		(struct xfs_attr_shortform *)XFS_DFORK_APTR(dip);
+
+	return be16_to_cpu(atp->hdr.totsize);
+}
+
+int
+xfs_iformat_attr_fork(
+	struct xfs_inode	*ip,
+	struct xfs_dinode	*dip)
+{
+	int			error = 0;
+
+	ip->i_afp = kmem_zone_zalloc(xfs_ifork_zone, KM_NOFS);
+	switch (dip->di_aformat) {
+	case XFS_DINODE_FMT_LOCAL:
+		error = xfs_iformat_local(ip, dip, XFS_ATTR_FORK,
+				xfs_dfork_attr_shortform_size(dip));
+		break;
+	case XFS_DINODE_FMT_EXTENTS:
+		error = xfs_iformat_extents(ip, dip, XFS_ATTR_FORK);
+		break;
+	case XFS_DINODE_FMT_BTREE:
+		error = xfs_iformat_btree(ip, dip, XFS_ATTR_FORK);
+		break;
+	default:
+		xfs_inode_verifier_error(ip, error, __func__, dip,
+				sizeof(*dip), __this_address);
+		error = -EFSCORRUPTED;
+		break;
+	}
+
+	if (error) {
+		kmem_cache_free(xfs_ifork_zone, ip->i_afp);
+		ip->i_afp = NULL;
+	}
+	return error;
+}
+
 /*
  * Reallocate the space for if_broot based on the number of records
  * being added or deleted as indicated in rec_diff.  Move the records
diff --git a/fs/xfs/libxfs/xfs_inode_fork.h b/fs/xfs/libxfs/xfs_inode_fork.h
index 668ee942be224..8487b0c88a75e 100644
--- a/fs/xfs/libxfs/xfs_inode_fork.h
+++ b/fs/xfs/libxfs/xfs_inode_fork.h
@@ -88,7 +88,8 @@ struct xfs_ifork {
 
 struct xfs_ifork *xfs_iext_state_to_fork(struct xfs_inode *ip, int state);
 
-int		xfs_iformat_fork(struct xfs_inode *, struct xfs_dinode *);
+int		xfs_iformat_data_fork(struct xfs_inode *, struct xfs_dinode *);
+int		xfs_iformat_attr_fork(struct xfs_inode *, struct xfs_dinode *);
 void		xfs_iflush_fork(struct xfs_inode *, struct xfs_dinode *,
 				struct xfs_inode_log_item *, int);
 void		xfs_idestroy_fork(struct xfs_inode *, int);
-- 
2.26.2



* [PATCH 04/12] xfs: handle unallocated inodes in xfs_inode_from_disk
  2020-05-01  8:14 dinode reading cleanups Christoph Hellwig
                   ` (2 preceding siblings ...)
  2020-05-01  8:14 ` [PATCH 03/12] xfs: split xfs_iformat_fork Christoph Hellwig
@ 2020-05-01  8:14 ` Christoph Hellwig
  2020-05-01 13:34   ` Brian Foster
  2020-05-01  8:14 ` [PATCH 05/12] xfs: call xfs_dinode_verify from xfs_inode_from_disk Christoph Hellwig
                   ` (7 subsequent siblings)
  11 siblings, 1 reply; 41+ messages in thread
From: Christoph Hellwig @ 2020-05-01  8:14 UTC (permalink / raw)
  To: linux-xfs

Handle inodes with a 0 di_mode in xfs_inode_from_disk, instead of partially
duplicating inode reading in xfs_iread.
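
As a rough model of the resulting flow (hypothetical p4_* types, fields
assumed to be in CPU byte order already to keep the sketch short): the few
fields that must always be valid are copied first, and everything else is
skipped when the mode is zero.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, heavily trimmed on-disk and in-core inode layouts. */
struct p4_dinode { uint16_t mode; uint32_t gen; uint16_t flushiter; uint64_t size; };
struct p4_icore  { uint16_t mode; uint32_t gen; uint16_t flushiter; uint64_t size; };

static int p4_copy_core(struct p4_icore *to, const struct p4_dinode *from)
{
	assert(to && from);

	/* Always copied, even for a free inode. */
	to->gen = from->gen;
	to->mode = from->mode;
	to->flushiter = from->flushiter;

	/* A zero mode marks an unallocated inode: nothing else is valid. */
	if (!to->mode)
		return 0;

	to->size = from->size;
	return 0;
}
```

The caller no longer needs its own branch for the unallocated case; it sees
a zero mode in the in-core inode and a successful return.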

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 fs/xfs/libxfs/xfs_inode_buf.c | 54 ++++++++++++-----------------------
 1 file changed, 18 insertions(+), 36 deletions(-)

diff --git a/fs/xfs/libxfs/xfs_inode_buf.c b/fs/xfs/libxfs/xfs_inode_buf.c
index 983beb680e81a..b136f29f7d9d3 100644
--- a/fs/xfs/libxfs/xfs_inode_buf.c
+++ b/fs/xfs/libxfs/xfs_inode_buf.c
@@ -198,6 +198,21 @@ xfs_inode_from_disk(
 	ASSERT(ip->i_cowfp == NULL);
 	ASSERT(ip->i_afp == NULL);
 
+	/*
+	 * Get the truly permanent information first that is not overwritten by
+	 * xfs_ialloc first.  This also includes i_mode so that a newly read
+	 * in inode structure for an allocation is marked as already free.
+	 */
+	inode->i_generation = be32_to_cpu(from->di_gen);
+	inode->i_mode = be16_to_cpu(from->di_mode);
+	to->di_flushiter = be16_to_cpu(from->di_flushiter);
+
+	/*
+	 * Only copy the rest if the inode is actually allocated.
+	 */
+	if (!inode->i_mode)
+		return 0;
+
 	/*
 	 * Convert v1 inodes immediately to v2 inode format as this is the
 	 * minimum inode version format we support in the rest of the code.
@@ -215,7 +230,6 @@ xfs_inode_from_disk(
 	to->di_format = from->di_format;
 	i_uid_write(inode, be32_to_cpu(from->di_uid));
 	i_gid_write(inode, be32_to_cpu(from->di_gid));
-	to->di_flushiter = be16_to_cpu(from->di_flushiter);
 
 	/*
 	 * Time is signed, so need to convert to signed 32 bit before
@@ -229,8 +243,6 @@ xfs_inode_from_disk(
 	inode->i_mtime.tv_nsec = (int)be32_to_cpu(from->di_mtime.t_nsec);
 	inode->i_ctime.tv_sec = (int)be32_to_cpu(from->di_ctime.t_sec);
 	inode->i_ctime.tv_nsec = (int)be32_to_cpu(from->di_ctime.t_nsec);
-	inode->i_generation = be32_to_cpu(from->di_gen);
-	inode->i_mode = be16_to_cpu(from->di_mode);
 
 	to->di_size = be64_to_cpu(from->di_size);
 	to->di_nblocks = be64_to_cpu(from->di_nblocks);
@@ -659,39 +671,9 @@ xfs_iread(
 		goto out_brelse;
 	}
 
-	/*
-	 * If the on-disk inode is already linked to a directory
-	 * entry, copy all of the inode into the in-core inode.
-	 * xfs_iformat_fork() handles copying in the inode format
-	 * specific information.
-	 * Otherwise, just get the truly permanent information.
-	 */
-	if (dip->di_mode) {
-		error = xfs_inode_from_disk(ip, dip);
-		if (error)  {
-#ifdef DEBUG
-			xfs_alert(mp, "%s: xfs_iformat() returned error %d",
-				__func__, error);
-#endif /* DEBUG */
-			goto out_brelse;
-		}
-	} else {
-		/*
-		 * Partial initialisation of the in-core inode. Just the bits
-		 * that xfs_ialloc won't overwrite or relies on being correct.
-		 */
-		VFS_I(ip)->i_generation = be32_to_cpu(dip->di_gen);
-		ip->i_d.di_flushiter = be16_to_cpu(dip->di_flushiter);
-
-		/*
-		 * Make sure to pull in the mode here as well in
-		 * case the inode is released without being used.
-		 * This ensures that xfs_inactive() will see that
-		 * the inode is already free and not try to mess
-		 * with the uninitialized part of it.
-		 */
-		VFS_I(ip)->i_mode = 0;
-	}
+	error = xfs_inode_from_disk(ip, dip);
+	if (error)
+		goto out_brelse;
 
 	ip->i_delayed_blks = 0;
 
-- 
2.26.2



* [PATCH 05/12] xfs: call xfs_dinode_verify from xfs_inode_from_disk
  2020-05-01  8:14 dinode reading cleanups Christoph Hellwig
                   ` (3 preceding siblings ...)
  2020-05-01  8:14 ` [PATCH 04/12] xfs: handle unallocated inodes in xfs_inode_from_disk Christoph Hellwig
@ 2020-05-01  8:14 ` Christoph Hellwig
  2020-05-01 13:34   ` Brian Foster
  2020-05-01  8:14 ` [PATCH 06/12] xfs: don't reset i_delayed_blks in xfs_iread Christoph Hellwig
                   ` (6 subsequent siblings)
  11 siblings, 1 reply; 41+ messages in thread
From: Christoph Hellwig @ 2020-05-01  8:14 UTC (permalink / raw)
  To: linux-xfs

Keep the code dealing with the dinode together, and also ensure we verify
the dinode in the owner change log recovery case as well.
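
A small sketch of the idea, using hypothetical p5_* names and -EINVAL as a
stand-in for -EFSCORRUPTED: because the verifier sits at the top of the
conversion routine, every caller gets the check, not just the one that
remembered to call it.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define P5_DINODE_MAGIC	0x494eu		/* demo value */

/* NULL means "no failure"; otherwise identifies the failed check. */
typedef const char *p5_failaddr_t;

struct p5_dinode { uint16_t magic; uint16_t mode; uint64_t size; };
struct p5_icore  { uint16_t mode; uint64_t size; };

static p5_failaddr_t p5_dinode_verify(const struct p5_dinode *dip)
{
	if (dip->magic != P5_DINODE_MAGIC)
		return "bad magic";
	if (dip->size >> 63)
		return "size has the sign bit set";
	return NULL;
}

static int p5_inode_from_disk(struct p5_icore *to, const struct p5_dinode *from)
{
	assert(to && from);

	/* Verify before any field is copied into the in-core inode. */
	if (p5_dinode_verify(from))
		return -EINVAL;

	to->mode = from->mode;
	to->size = from->size;
	return 0;
}
```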

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 .../xfs-self-describing-metadata.txt           | 10 +++++-----
 fs/xfs/libxfs/xfs_inode_buf.c                  | 18 ++++++++----------
 2 files changed, 13 insertions(+), 15 deletions(-)

diff --git a/Documentation/filesystems/xfs-self-describing-metadata.txt b/Documentation/filesystems/xfs-self-describing-metadata.txt
index 8db0121d0980c..e912699d74301 100644
--- a/Documentation/filesystems/xfs-self-describing-metadata.txt
+++ b/Documentation/filesystems/xfs-self-describing-metadata.txt
@@ -337,11 +337,11 @@ buffer.
 
 The structure of the verifiers and the identifiers checks is very similar to the
 buffer code described above. The only difference is where they are called. For
-example, inode read verification is done in xfs_iread() when the inode is first
-read out of the buffer and the struct xfs_inode is instantiated. The inode is
-already extensively verified during writeback in xfs_iflush_int, so the only
-addition here is to add the LSN and CRC to the inode as it is copied back into
-the buffer.
+example, inode read verification is done in xfs_inode_from_disk() when the inode
+is first read out of the buffer and the struct xfs_inode is instantiated. The
+inode is already extensively verified during writeback in xfs_iflush_int, so the
+only addition here is to add the LSN and CRC to the inode as it is copied back
+into the buffer.
 
 XXX: inode unlinked list modification doesn't recalculate the inode CRC! None of
 the unlinked list modifications check or update CRCs, neither during unlink nor
diff --git a/fs/xfs/libxfs/xfs_inode_buf.c b/fs/xfs/libxfs/xfs_inode_buf.c
index b136f29f7d9d3..a00001a2336ef 100644
--- a/fs/xfs/libxfs/xfs_inode_buf.c
+++ b/fs/xfs/libxfs/xfs_inode_buf.c
@@ -194,10 +194,18 @@ xfs_inode_from_disk(
 	struct xfs_icdinode	*to = &ip->i_d;
 	struct inode		*inode = VFS_I(ip);
 	int			error;
+	xfs_failaddr_t		fa;
 
 	ASSERT(ip->i_cowfp == NULL);
 	ASSERT(ip->i_afp == NULL);
 
+	fa = xfs_dinode_verify(ip->i_mount, ip->i_ino, from);
+	if (fa) {
+		xfs_inode_verifier_error(ip, -EFSCORRUPTED, "dinode", from,
+				sizeof(*from), fa);
+		return -EFSCORRUPTED;
+	}
+
 	/*
 	 * Get the truly permanent information first that is not overwritten by
 	 * xfs_ialloc first.  This also includes i_mode so that a newly read
@@ -637,7 +645,6 @@ xfs_iread(
 {
 	xfs_buf_t	*bp;
 	xfs_dinode_t	*dip;
-	xfs_failaddr_t	fa;
 	int		error;
 
 	/*
@@ -662,15 +669,6 @@ xfs_iread(
 	if (error)
 		return error;
 
-	/* even unallocated inodes are verified */
-	fa = xfs_dinode_verify(mp, ip->i_ino, dip);
-	if (fa) {
-		xfs_inode_verifier_error(ip, -EFSCORRUPTED, "dinode", dip,
-				sizeof(*dip), fa);
-		error = -EFSCORRUPTED;
-		goto out_brelse;
-	}
-
 	error = xfs_inode_from_disk(ip, dip);
 	if (error)
 		goto out_brelse;
-- 
2.26.2



* [PATCH 06/12] xfs: don't reset i_delayed_blks in xfs_iread
  2020-05-01  8:14 dinode reading cleanups Christoph Hellwig
                   ` (4 preceding siblings ...)
  2020-05-01  8:14 ` [PATCH 05/12] xfs: call xfs_dinode_verify from xfs_inode_from_disk Christoph Hellwig
@ 2020-05-01  8:14 ` Christoph Hellwig
  2020-05-01 13:34   ` Brian Foster
  2020-05-01  8:14 ` [PATCH 07/12] xfs: remove xfs_iread Christoph Hellwig
                   ` (5 subsequent siblings)
  11 siblings, 1 reply; 41+ messages in thread
From: Christoph Hellwig @ 2020-05-01  8:14 UTC (permalink / raw)
  To: linux-xfs

i_delayed_blks is set to 0 in xfs_inode_alloc and can't have anything
assigned to it until the inode is visible to the VFS.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 fs/xfs/libxfs/xfs_inode_buf.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/fs/xfs/libxfs/xfs_inode_buf.c b/fs/xfs/libxfs/xfs_inode_buf.c
index a00001a2336ef..0357dc4b29481 100644
--- a/fs/xfs/libxfs/xfs_inode_buf.c
+++ b/fs/xfs/libxfs/xfs_inode_buf.c
@@ -673,8 +673,6 @@ xfs_iread(
 	if (error)
 		goto out_brelse;
 
-	ip->i_delayed_blks = 0;
-
 	/*
 	 * Mark the buffer containing the inode as something to keep
 	 * around for a while.  This helps to keep recently accessed
-- 
2.26.2



* [PATCH 07/12] xfs: remove xfs_iread
  2020-05-01  8:14 dinode reading cleanups Christoph Hellwig
                   ` (5 preceding siblings ...)
  2020-05-01  8:14 ` [PATCH 06/12] xfs: don't reset i_delayed_blks in xfs_iread Christoph Hellwig
@ 2020-05-01  8:14 ` Christoph Hellwig
  2020-05-01 15:56   ` Brian Foster
  2020-05-01  8:14 ` [PATCH 08/12] xfs: remove xfs_ifork_ops Christoph Hellwig
                   ` (4 subsequent siblings)
  11 siblings, 1 reply; 41+ messages in thread
From: Christoph Hellwig @ 2020-05-01  8:14 UTC (permalink / raw)
  To: linux-xfs

There is not much point in the xfs_iread function, as it has a single
caller and not a whole lot of code.  Move it into the only caller,
and trim down the overdocumentation to just documenting the important
"why" instead of a lot of redundant "what".

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 fs/xfs/libxfs/xfs_inode_buf.c | 73 -----------------------------------
 fs/xfs/libxfs/xfs_inode_buf.h |  2 -
 fs/xfs/xfs_icache.c           | 33 +++++++++++++++-
 3 files changed, 32 insertions(+), 76 deletions(-)

diff --git a/fs/xfs/libxfs/xfs_inode_buf.c b/fs/xfs/libxfs/xfs_inode_buf.c
index 0357dc4b29481..698314d078b8a 100644
--- a/fs/xfs/libxfs/xfs_inode_buf.c
+++ b/fs/xfs/libxfs/xfs_inode_buf.c
@@ -624,79 +624,6 @@ xfs_dinode_calc_crc(
 	dip->di_crc = xfs_end_cksum(crc);
 }
 
-/*
- * Read the disk inode attributes into the in-core inode structure.
- *
- * For version 5 superblocks, if we are initialising a new inode and we are not
- * utilising the XFS_MOUNT_IKEEP inode cluster mode, we can simple build the new
- * inode core with a random generation number. If we are keeping inodes around,
- * we need to read the inode cluster to get the existing generation number off
- * disk. Further, if we are using version 4 superblocks (i.e. v1/v2 inode
- * format) then log recovery is dependent on the di_flushiter field being
- * initialised from the current on-disk value and hence we must also read the
- * inode off disk.
- */
-int
-xfs_iread(
-	xfs_mount_t	*mp,
-	xfs_trans_t	*tp,
-	xfs_inode_t	*ip,
-	uint		iget_flags)
-{
-	xfs_buf_t	*bp;
-	xfs_dinode_t	*dip;
-	int		error;
-
-	/*
-	 * Fill in the location information in the in-core inode.
-	 */
-	error = xfs_imap(mp, tp, ip->i_ino, &ip->i_imap, iget_flags);
-	if (error)
-		return error;
-
-	/* shortcut IO on inode allocation if possible */
-	if ((iget_flags & XFS_IGET_CREATE) &&
-	    xfs_sb_version_has_v3inode(&mp->m_sb) &&
-	    !(mp->m_flags & XFS_MOUNT_IKEEP)) {
-		VFS_I(ip)->i_generation = prandom_u32();
-		return 0;
-	}
-
-	/*
-	 * Get pointers to the on-disk inode and the buffer containing it.
-	 */
-	error = xfs_imap_to_bp(mp, tp, &ip->i_imap, &dip, &bp, 0, iget_flags);
-	if (error)
-		return error;
-
-	error = xfs_inode_from_disk(ip, dip);
-	if (error)
-		goto out_brelse;
-
-	/*
-	 * Mark the buffer containing the inode as something to keep
-	 * around for a while.  This helps to keep recently accessed
-	 * meta-data in-core longer.
-	 */
-	xfs_buf_set_ref(bp, XFS_INO_REF);
-
-	/*
-	 * Use xfs_trans_brelse() to release the buffer containing the on-disk
-	 * inode, because it was acquired with xfs_trans_read_buf() in
-	 * xfs_imap_to_bp() above.  If tp is NULL, this is just a normal
-	 * brelse().  If we're within a transaction, then xfs_trans_brelse()
-	 * will only release the buffer if it is not dirty within the
-	 * transaction.  It will be OK to release the buffer in this case,
-	 * because inodes on disk are never destroyed and we will be locking the
-	 * new in-core inode before putting it in the cache where other
-	 * processes can find it.  Thus we don't have to worry about the inode
-	 * being changed just because we released the buffer.
-	 */
- out_brelse:
-	xfs_trans_brelse(tp, bp);
-	return error;
-}
-
 /*
  * Validate di_extsize hint.
  *
diff --git a/fs/xfs/libxfs/xfs_inode_buf.h b/fs/xfs/libxfs/xfs_inode_buf.h
index 081230faf7bdc..4de8673bfa61c 100644
--- a/fs/xfs/libxfs/xfs_inode_buf.h
+++ b/fs/xfs/libxfs/xfs_inode_buf.h
@@ -49,8 +49,6 @@ struct xfs_imap {
 int	xfs_imap_to_bp(struct xfs_mount *, struct xfs_trans *,
 		       struct xfs_imap *, struct xfs_dinode **,
 		       struct xfs_buf **, uint, uint);
-int	xfs_iread(struct xfs_mount *, struct xfs_trans *,
-		  struct xfs_inode *, uint);
 void	xfs_dinode_calc_crc(struct xfs_mount *, struct xfs_dinode *);
 void	xfs_inode_to_disk(struct xfs_inode *ip, struct xfs_dinode *to,
 			  xfs_lsn_t lsn);
diff --git a/fs/xfs/xfs_icache.c b/fs/xfs/xfs_icache.c
index 8bf1d15be3f6a..dd757c6614956 100644
--- a/fs/xfs/xfs_icache.c
+++ b/fs/xfs/xfs_icache.c
@@ -22,6 +22,7 @@
 #include "xfs_dquot_item.h"
 #include "xfs_dquot.h"
 #include "xfs_reflink.h"
+#include "xfs_ialloc.h"
 
 #include <linux/iversion.h>
 
@@ -510,10 +511,40 @@ xfs_iget_cache_miss(
 	if (!ip)
 		return -ENOMEM;
 
-	error = xfs_iread(mp, tp, ip, flags);
+	error = xfs_imap(mp, tp, ip->i_ino, &ip->i_imap, flags);
 	if (error)
 		goto out_destroy;
 
+	/*
+	 * For version 5 superblocks, if we are initialising a new inode and we
+	 * are not utilising the XFS_MOUNT_IKEEP inode cluster mode, we can
+	 * simple build the new inode core with a random generation number.
+	 *
+	 * For version 4 (and older) superblocks, log recovery is dependent on
+	 * the di_flushiter field being initialised from the current on-disk
+	 * value and hence we must also read the inode off disk even when
+	 * initializing new inodes.
+	 */
+	if (xfs_sb_version_has_v3inode(&mp->m_sb) &&
+	    (flags & XFS_IGET_CREATE) && !(mp->m_flags & XFS_MOUNT_IKEEP)) {
+		VFS_I(ip)->i_generation = prandom_u32();
+	} else {
+		struct xfs_dinode	*dip;
+		struct xfs_buf		*bp;
+
+		error = xfs_imap_to_bp(mp, tp, &ip->i_imap, &dip, &bp, 0, flags);
+		if (error)
+			goto out_destroy;
+
+		error = xfs_inode_from_disk(ip, dip);
+		if (!error)
+			xfs_buf_set_ref(bp, XFS_INO_REF);
+		xfs_trans_brelse(tp, bp);
+
+		if (error)
+			goto out_destroy;
+	}
+
 	if (!xfs_inode_verify_forks(ip)) {
 		error = -EFSCORRUPTED;
 		goto out_destroy;
-- 
2.26.2



* [PATCH 08/12] xfs: remove xfs_ifork_ops
  2020-05-01  8:14 dinode reading cleanups Christoph Hellwig
                   ` (6 preceding siblings ...)
  2020-05-01  8:14 ` [PATCH 07/12] xfs: remove xfs_iread Christoph Hellwig
@ 2020-05-01  8:14 ` Christoph Hellwig
  2020-05-01 15:56   ` Brian Foster
  2020-05-01  8:14 ` [PATCH 09/12] xfs: refactor xfs_inode_verify_forks Christoph Hellwig
                   ` (3 subsequent siblings)
  11 siblings, 1 reply; 41+ messages in thread
From: Christoph Hellwig @ 2020-05-01  8:14 UTC (permalink / raw)
  To: linux-xfs

xfs_ifork_ops adds up to two indirect calls per inode read and flush,
despite just having a single instance in the kernel.  In xfsprogs,
phase6 in xfs_repair overrides the verify_dir method to deal with inodes
that do not have a valid parent.  Instead of the costly indirection just
lift the repair code into xfs_dir2_sf.c under a condition that ensures
it is compiled as part of a kernel build, but instantly eliminated as
it is unreachable.
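
The pattern can be sketched in userspace as follows.  The p8_* names are
hypothetical; the sketch only assumes that a build wanting the relaxed
behaviour defines p8_parent_bypass() before this code is compiled, and that
everyone else gets the constant-0 default so the relaxed branch is dead
code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical shortform directory: parent inode plus the fs root inode. */
struct p8_dir { uint64_t parent; uint64_t root; };

/* Default: bypass is compiled in but can never be taken. */
#ifndef p8_parent_bypass
#define p8_parent_bypass(dir)	0
#endif

/* Strict check: 0 when valid, -1 when the parent is the zero canary. */
static int p8_verify_strict(const struct p8_dir *dir)
{
	return dir->parent ? 0 : -1;
}

/* Temporarily patch in a valid parent, verify, then put the canary back. */
static int p8_verify_relaxed(struct p8_dir *dir)
{
	uint64_t old_parent = dir->parent;
	int patched = 0;
	int ret;

	assert(dir->root != 0);
	if (!old_parent) {
		dir->parent = dir->root;
		patched = 1;
	}
	ret = p8_verify_strict(dir);
	if (patched)
		dir->parent = old_parent;
	return ret;
}

static int p8_verify(struct p8_dir *dir)
{
	if (p8_parent_bypass(dir))
		return p8_verify_relaxed(dir);
	return p8_verify_strict(dir);
}
```

With the default macro the condition is a compile-time constant, so an
optimizing compiler can drop the relaxed path without any function pointer
being involved.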

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 fs/xfs/libxfs/xfs_dir2_sf.c    | 64 ++++++++++++++++++++++++++++++++--
 fs/xfs/libxfs/xfs_inode_fork.c | 19 +++-------
 fs/xfs/libxfs/xfs_inode_fork.h | 15 ++------
 fs/xfs/xfs_inode.c             |  4 +--
 4 files changed, 71 insertions(+), 31 deletions(-)

diff --git a/fs/xfs/libxfs/xfs_dir2_sf.c b/fs/xfs/libxfs/xfs_dir2_sf.c
index 7b7f6fb2ea3b2..1f6c30b68917c 100644
--- a/fs/xfs/libxfs/xfs_dir2_sf.c
+++ b/fs/xfs/libxfs/xfs_dir2_sf.c
@@ -705,8 +705,8 @@ xfs_dir2_sf_check(
 #endif	/* DEBUG */
 
 /* Verify the consistency of an inline directory. */
-xfs_failaddr_t
-xfs_dir2_sf_verify(
+static xfs_failaddr_t
+__xfs_dir2_sf_verify(
 	struct xfs_inode		*ip)
 {
 	struct xfs_mount		*mp = ip->i_mount;
@@ -804,6 +804,66 @@ xfs_dir2_sf_verify(
 	return NULL;
 }
 
+/*
+ * When we're checking directory inodes, we're allowed to set a directory's
+ * dotdot entry to zero to signal that the parent needs to be reconnected
+ * during xfs_repair phase 6.  If we're handling a shortform directory the ifork
+ * verifiers will fail, so temporarily patch out this canary so that we can
+ * verify the rest of the fork and move on to fixing the dir.
+ */
+static xfs_failaddr_t
+xfs_dir2_sf_verify_dir_check(
+	struct xfs_inode		*ip)
+{
+	struct xfs_mount		*mp = ip->i_mount;
+	struct xfs_ifork		*ifp = XFS_IFORK_PTR(ip, XFS_DATA_FORK);
+	struct xfs_dir2_sf_hdr		*sfp =
+		(struct xfs_dir2_sf_hdr *)ifp->if_u1.if_data;
+	int				size = ifp->if_bytes;
+	bool				parent_bypass = false;
+	xfs_ino_t			old_parent;
+	xfs_failaddr_t			fa;
+
+	/*
+	 * If this is a shortform directory, phase4 in xfs_repair may have set
+	 * the parent inode to zero to indicate that it must be fixed.
+	 * Temporarily set a valid parent so that the directory verifier will
+	 * pass.
+	 */
+	if (size > offsetof(struct xfs_dir2_sf_hdr, parent) &&
+	    size >= xfs_dir2_sf_hdr_size(sfp->i8count)) {
+		old_parent = xfs_dir2_sf_get_parent_ino(sfp);
+		if (!old_parent) {
+			xfs_dir2_sf_put_parent_ino(sfp, mp->m_sb.sb_rootino);
+			parent_bypass = true;
+		}
+	}
+
+	fa = __xfs_dir2_sf_verify(ip);
+
+	/* Put it back. */
+	if (parent_bypass)
+		xfs_dir2_sf_put_parent_ino(sfp, old_parent);
+	return fa;
+}
+
+/*
+ * Allow xfs_repair to enable the parent bypass mode.  For now this is entirely
+ * unused in the kernel, but might come in useful for online repair eventually.
+ */
+#ifndef xfs_inode_parent_bypass
+#define xfs_inode_parent_bypass(ip)	0
+#endif
+
+xfs_failaddr_t
+xfs_dir2_sf_verify(
+	struct xfs_inode		*ip)
+{
+	if (xfs_inode_parent_bypass(ip))
+		return xfs_dir2_sf_verify_dir_check(ip);
+	return __xfs_dir2_sf_verify(ip);
+}
+
 /*
  * Create a new (shortform) directory.
  */
diff --git a/fs/xfs/libxfs/xfs_inode_fork.c b/fs/xfs/libxfs/xfs_inode_fork.c
index f30d43364aa92..f6dcee919f59e 100644
--- a/fs/xfs/libxfs/xfs_inode_fork.c
+++ b/fs/xfs/libxfs/xfs_inode_fork.c
@@ -673,18 +673,10 @@ xfs_ifork_init_cow(
 	ip->i_cnextents = 0;
 }
 
-/* Default fork content verifiers. */
-struct xfs_ifork_ops xfs_default_ifork_ops = {
-	.verify_attr	= xfs_attr_shortform_verify,
-	.verify_dir	= xfs_dir2_sf_verify,
-	.verify_symlink	= xfs_symlink_shortform_verify,
-};
-
 /* Verify the inline contents of the data fork of an inode. */
 xfs_failaddr_t
 xfs_ifork_verify_data(
-	struct xfs_inode	*ip,
-	struct xfs_ifork_ops	*ops)
+	struct xfs_inode	*ip)
 {
 	/* Non-local data fork, we're done. */
 	if (ip->i_d.di_format != XFS_DINODE_FMT_LOCAL)
@@ -693,9 +685,9 @@ xfs_ifork_verify_data(
 	/* Check the inline data fork if there is one. */
 	switch (VFS_I(ip)->i_mode & S_IFMT) {
 	case S_IFDIR:
-		return ops->verify_dir(ip);
+		return xfs_dir2_sf_verify(ip);
 	case S_IFLNK:
-		return ops->verify_symlink(ip);
+		return xfs_symlink_shortform_verify(ip);
 	default:
 		return NULL;
 	}
@@ -704,13 +696,12 @@ xfs_ifork_verify_data(
 /* Verify the inline contents of the attr fork of an inode. */
 xfs_failaddr_t
 xfs_ifork_verify_attr(
-	struct xfs_inode	*ip,
-	struct xfs_ifork_ops	*ops)
+	struct xfs_inode	*ip)
 {
 	/* There has to be an attr fork allocated if aformat is local. */
 	if (ip->i_d.di_aformat != XFS_DINODE_FMT_LOCAL)
 		return NULL;
 	if (!XFS_IFORK_PTR(ip, XFS_ATTR_FORK))
 		return __this_address;
-	return ops->verify_attr(ip);
+	return xfs_attr_shortform_verify(ip);
 }
diff --git a/fs/xfs/libxfs/xfs_inode_fork.h b/fs/xfs/libxfs/xfs_inode_fork.h
index 8487b0c88a75e..3f84d33abd3b7 100644
--- a/fs/xfs/libxfs/xfs_inode_fork.h
+++ b/fs/xfs/libxfs/xfs_inode_fork.h
@@ -176,18 +176,7 @@ extern struct kmem_zone	*xfs_ifork_zone;
 
 extern void xfs_ifork_init_cow(struct xfs_inode *ip);
 
-typedef xfs_failaddr_t (*xfs_ifork_verifier_t)(struct xfs_inode *);
-
-struct xfs_ifork_ops {
-	xfs_ifork_verifier_t	verify_symlink;
-	xfs_ifork_verifier_t	verify_dir;
-	xfs_ifork_verifier_t	verify_attr;
-};
-extern struct xfs_ifork_ops	xfs_default_ifork_ops;
-
-xfs_failaddr_t xfs_ifork_verify_data(struct xfs_inode *ip,
-		struct xfs_ifork_ops *ops);
-xfs_failaddr_t xfs_ifork_verify_attr(struct xfs_inode *ip,
-		struct xfs_ifork_ops *ops);
+xfs_failaddr_t xfs_ifork_verify_data(struct xfs_inode *ip);
+xfs_failaddr_t xfs_ifork_verify_attr(struct xfs_inode *ip);
 
 #endif	/* __XFS_INODE_FORK_H__ */
diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c
index d1772786af29d..93967278355de 100644
--- a/fs/xfs/xfs_inode.c
+++ b/fs/xfs/xfs_inode.c
@@ -3769,7 +3769,7 @@ xfs_inode_verify_forks(
 	struct xfs_ifork	*ifp;
 	xfs_failaddr_t		fa;
 
-	fa = xfs_ifork_verify_data(ip, &xfs_default_ifork_ops);
+	fa = xfs_ifork_verify_data(ip);
 	if (fa) {
 		ifp = XFS_IFORK_PTR(ip, XFS_DATA_FORK);
 		xfs_inode_verifier_error(ip, -EFSCORRUPTED, "data fork",
@@ -3777,7 +3777,7 @@ xfs_inode_verify_forks(
 		return false;
 	}
 
-	fa = xfs_ifork_verify_attr(ip, &xfs_default_ifork_ops);
+	fa = xfs_ifork_verify_attr(ip);
 	if (fa) {
 		ifp = XFS_IFORK_PTR(ip, XFS_ATTR_FORK);
 		xfs_inode_verifier_error(ip, -EFSCORRUPTED, "attr fork",
-- 
2.26.2
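
The "temporarily patch out the canary" idea from the comment above
xfs_dir2_sf_verify_dir_check can be sketched in isolation as below.
This is only a user-space illustration under stated assumptions:
struct toy_dir, toy_verify(), toy_verify_parent_bypass() and
TOY_ROOT_INO are hypothetical stand-ins, not XFS interfaces.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a shortform directory header. */
struct toy_dir {
	uint64_t	parent;		/* 0 means "parent must be reconnected" */
	unsigned int	nentries;
};

#define TOY_ROOT_INO	128ULL		/* assumed to be a valid inode number */

/* Strict verifier: returns NULL if ok, else a short failure description. */
static const char *
toy_verify(const struct toy_dir *dir)
{
	if (dir->parent == 0)
		return "bad parent";
	if (dir->nentries > 64)
		return "too many entries";
	return NULL;
}

/* Verify everything except the zeroed-parent canary. */
static const char *
toy_verify_parent_bypass(struct toy_dir *dir)
{
	uint64_t	old_parent;
	bool		patched = false;
	const char	*fa;

	assert(dir != NULL);

	/* Temporarily install a valid parent so the strict check passes. */
	old_parent = dir->parent;
	if (!old_parent) {
		dir->parent = TOY_ROOT_INO;
		patched = true;
	}

	fa = toy_verify(dir);

	/* Put it back so later repair code still sees the canary. */
	if (patched)
		dir->parent = old_parent;
	return fa;
}
```

The point of restoring the old value is that the bypass only hides the
canary from the verifier; other corruption is still reported.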



* [PATCH 09/12] xfs: refactor xfs_inode_verify_forks
  2020-05-01  8:14 dinode reading cleanups Christoph Hellwig
                   ` (7 preceding siblings ...)
  2020-05-01  8:14 ` [PATCH 08/12] xfs: remove xfs_ifork_ops Christoph Hellwig
@ 2020-05-01  8:14 ` Christoph Hellwig
  2020-05-01 15:57   ` Brian Foster
  2020-05-01  8:14 ` [PATCH 10/12] xfs: improve local fork verification Christoph Hellwig
                   ` (2 subsequent siblings)
  11 siblings, 1 reply; 41+ messages in thread
From: Christoph Hellwig @ 2020-05-01  8:14 UTC (permalink / raw)
  To: linux-xfs

The split between xfs_inode_verify_forks and the two helpers
implementing the actual functionality is a little strange.  Reshuffle
it so that xfs_inode_verify_forks checks whether the data and attr forks
are actually in local format and only calls the low-level helpers if
that is the case.  Handle the actual error reporting in the low-level
handlers to streamline the caller.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 fs/xfs/libxfs/xfs_inode_fork.c | 47 ++++++++++++++++++++++------------
 fs/xfs/libxfs/xfs_inode_fork.h |  4 +--
 fs/xfs/xfs_inode.c             | 21 +++------------
 3 files changed, 37 insertions(+), 35 deletions(-)

diff --git a/fs/xfs/libxfs/xfs_inode_fork.c b/fs/xfs/libxfs/xfs_inode_fork.c
index f6dcee919f59e..7e129ed2f345f 100644
--- a/fs/xfs/libxfs/xfs_inode_fork.c
+++ b/fs/xfs/libxfs/xfs_inode_fork.c
@@ -674,34 +674,49 @@ xfs_ifork_init_cow(
 }
 
 /* Verify the inline contents of the data fork of an inode. */
-xfs_failaddr_t
-xfs_ifork_verify_data(
+int
+xfs_ifork_verify_local_data(
 	struct xfs_inode	*ip)
 {
-	/* Non-local data fork, we're done. */
-	if (ip->i_d.di_format != XFS_DINODE_FMT_LOCAL)
-		return NULL;
+	xfs_failaddr_t		fa = NULL;
 
-	/* Check the inline data fork if there is one. */
 	switch (VFS_I(ip)->i_mode & S_IFMT) {
 	case S_IFDIR:
-		return xfs_dir2_sf_verify(ip);
+		fa = xfs_dir2_sf_verify(ip);
+		break;
 	case S_IFLNK:
-		return xfs_symlink_shortform_verify(ip);
+		fa = xfs_symlink_shortform_verify(ip);
+		break;
 	default:
-		return NULL;
+		break;
 	}
+
+	if (fa) {
+		xfs_inode_verifier_error(ip, -EFSCORRUPTED, "data fork",
+			ip->i_df.if_u1.if_data, ip->i_df.if_bytes, fa);
+		return -EFSCORRUPTED;
+	}
+
+	return 0;
 }
 
 /* Verify the inline contents of the attr fork of an inode. */
-xfs_failaddr_t
-xfs_ifork_verify_attr(
+int
+xfs_ifork_verify_local_attr(
 	struct xfs_inode	*ip)
 {
-	/* There has to be an attr fork allocated if aformat is local. */
-	if (ip->i_d.di_aformat != XFS_DINODE_FMT_LOCAL)
-		return NULL;
+	xfs_failaddr_t		fa;
+
 	if (!XFS_IFORK_PTR(ip, XFS_ATTR_FORK))
-		return __this_address;
-	return xfs_attr_shortform_verify(ip);
+		fa = __this_address;
+	else
+		fa = xfs_attr_shortform_verify(ip);
+
+	if (fa) {
+		xfs_inode_verifier_error(ip, -EFSCORRUPTED, "attr fork",
+			ip->i_afp->if_u1.if_data, ip->i_afp->if_bytes, fa);
+		return -EFSCORRUPTED;
+	}
+
+	return 0;
 }
diff --git a/fs/xfs/libxfs/xfs_inode_fork.h b/fs/xfs/libxfs/xfs_inode_fork.h
index 3f84d33abd3b7..f46a8c1db5964 100644
--- a/fs/xfs/libxfs/xfs_inode_fork.h
+++ b/fs/xfs/libxfs/xfs_inode_fork.h
@@ -176,7 +176,7 @@ extern struct kmem_zone	*xfs_ifork_zone;
 
 extern void xfs_ifork_init_cow(struct xfs_inode *ip);
 
-xfs_failaddr_t xfs_ifork_verify_data(struct xfs_inode *ip);
-xfs_failaddr_t xfs_ifork_verify_attr(struct xfs_inode *ip);
+int xfs_ifork_verify_local_data(struct xfs_inode *ip);
+int xfs_ifork_verify_local_attr(struct xfs_inode *ip);
 
 #endif	/* __XFS_INODE_FORK_H__ */
diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c
index 93967278355de..2ec7789317133 100644
--- a/fs/xfs/xfs_inode.c
+++ b/fs/xfs/xfs_inode.c
@@ -3766,25 +3766,12 @@ bool
 xfs_inode_verify_forks(
 	struct xfs_inode	*ip)
 {
-	struct xfs_ifork	*ifp;
-	xfs_failaddr_t		fa;
-
-	fa = xfs_ifork_verify_data(ip);
-	if (fa) {
-		ifp = XFS_IFORK_PTR(ip, XFS_DATA_FORK);
-		xfs_inode_verifier_error(ip, -EFSCORRUPTED, "data fork",
-				ifp->if_u1.if_data, ifp->if_bytes, fa);
+	if (ip->i_d.di_format == XFS_DINODE_FMT_LOCAL &&
+	    xfs_ifork_verify_local_data(ip))
 		return false;
-	}
-
-	fa = xfs_ifork_verify_attr(ip);
-	if (fa) {
-		ifp = XFS_IFORK_PTR(ip, XFS_ATTR_FORK);
-		xfs_inode_verifier_error(ip, -EFSCORRUPTED, "attr fork",
-				ifp ? ifp->if_u1.if_data : NULL,
-				ifp ? ifp->if_bytes : 0, fa);
+	if (ip->i_d.di_aformat == XFS_DINODE_FMT_LOCAL &&
+	    xfs_ifork_verify_local_attr(ip))
 		return false;
-	}
 	return true;
 }
 
-- 
2.26.2
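
The division of labour this patch ends up with (caller checks the
format, helper reports the error and returns an errno) can be sketched
roughly as follows.  All toy_* names are hypothetical, the *_ok fields
merely simulate a verifier result, and EIO is used as a portable
stand-in for EFSCORRUPTED.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

enum toy_format { TOY_FMT_LOCAL, TOY_FMT_EXTENTS };

struct toy_inode {
	enum toy_format	data_format;
	enum toy_format	attr_format;
	bool		data_ok;	/* simulated data fork verifier result */
	bool		attr_ok;	/* simulated attr fork verifier result */
};

#define TOY_EFSCORRUPTED	EIO	/* stand-in, not the kernel value */

/* Low-level helper: verifies, reports, and returns an errno. */
static int
toy_verify_local_data(const struct toy_inode *ip)
{
	if (!ip->data_ok) {
		fprintf(stderr, "data fork corrupt\n");
		return -TOY_EFSCORRUPTED;
	}
	return 0;
}

static int
toy_verify_local_attr(const struct toy_inode *ip)
{
	if (!ip->attr_ok) {
		fprintf(stderr, "attr fork corrupt\n");
		return -TOY_EFSCORRUPTED;
	}
	return 0;
}

/* Caller: only decides whether a fork is local, no reporting here. */
static bool
toy_inode_verify_forks(const struct toy_inode *ip)
{
	if (ip->data_format == TOY_FMT_LOCAL && toy_verify_local_data(ip))
		return false;
	if (ip->attr_format == TOY_FMT_LOCAL && toy_verify_local_attr(ip))
		return false;
	return true;
}
```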



* [PATCH 10/12] xfs: improve local fork verification
  2020-05-01  8:14 dinode reading cleanups Christoph Hellwig
                   ` (8 preceding siblings ...)
  2020-05-01  8:14 ` [PATCH 09/12] xfs: refactor xfs_inode_verify_forks Christoph Hellwig
@ 2020-05-01  8:14 ` Christoph Hellwig
  2020-05-01  8:14 ` [PATCH 11/12] xfs: remove the special COW fork handling in xfs_bmapi_read Christoph Hellwig
  2020-05-01  8:14 ` [PATCH 12/12] xfs: remove the NULL " Christoph Hellwig
  11 siblings, 0 replies; 41+ messages in thread
From: Christoph Hellwig @ 2020-05-01  8:14 UTC (permalink / raw)
  To: linux-xfs

Call the data/attr local fork verifiers as soon as we are ready for them.
This keeps them close to the code setting up the forks, and avoids a
few branches later on.  Also open code xfs_inode_verify_forks in the
only remaining caller.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 fs/xfs/libxfs/xfs_inode_fork.c |  8 +++++++-
 fs/xfs/xfs_icache.c            |  6 ------
 fs/xfs/xfs_inode.c             | 28 +++++++++-------------------
 fs/xfs/xfs_inode.h             |  2 --
 fs/xfs/xfs_log_recover.c       |  5 -----
 5 files changed, 16 insertions(+), 33 deletions(-)

diff --git a/fs/xfs/libxfs/xfs_inode_fork.c b/fs/xfs/libxfs/xfs_inode_fork.c
index 7e129ed2f345f..cbe3347d6f23a 100644
--- a/fs/xfs/libxfs/xfs_inode_fork.c
+++ b/fs/xfs/libxfs/xfs_inode_fork.c
@@ -227,6 +227,7 @@ xfs_iformat_data_fork(
 	struct xfs_dinode	*dip)
 {
 	struct inode		*inode = VFS_I(ip);
+	int			error;
 
 	switch (inode->i_mode & S_IFMT) {
 	case S_IFIFO:
@@ -241,8 +242,11 @@ xfs_iformat_data_fork(
 	case S_IFDIR:
 		switch (dip->di_format) {
 		case XFS_DINODE_FMT_LOCAL:
-			return xfs_iformat_local(ip, dip, XFS_DATA_FORK,
+			error = xfs_iformat_local(ip, dip, XFS_DATA_FORK,
 					be64_to_cpu(dip->di_size));
+			if (!error)
+				error = xfs_ifork_verify_local_data(ip);
+			return error;
 		case XFS_DINODE_FMT_EXTENTS:
 			return xfs_iformat_extents(ip, dip, XFS_DATA_FORK);
 		case XFS_DINODE_FMT_BTREE:
@@ -282,6 +286,8 @@ xfs_iformat_attr_fork(
 	case XFS_DINODE_FMT_LOCAL:
 		error = xfs_iformat_local(ip, dip, XFS_ATTR_FORK,
 				xfs_dfork_attr_shortform_size(dip));
+		if (!error)
+			error = xfs_ifork_verify_local_attr(ip);
 		break;
 	case XFS_DINODE_FMT_EXTENTS:
 		error = xfs_iformat_extents(ip, dip, XFS_ATTR_FORK);
diff --git a/fs/xfs/xfs_icache.c b/fs/xfs/xfs_icache.c
index dd757c6614956..f9ed02aafa89a 100644
--- a/fs/xfs/xfs_icache.c
+++ b/fs/xfs/xfs_icache.c
@@ -545,14 +545,8 @@ xfs_iget_cache_miss(
 			goto out_destroy;
 	}
 
-	if (!xfs_inode_verify_forks(ip)) {
-		error = -EFSCORRUPTED;
-		goto out_destroy;
-	}
-
 	trace_xfs_iget_miss(ip);
 
-
 	/*
 	 * Check the inode free state is valid. This also detects lookup
 	 * racing with unlinks.
diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c
index 2ec7789317133..a0dfdee7db4f1 100644
--- a/fs/xfs/xfs_inode.c
+++ b/fs/xfs/xfs_inode.c
@@ -3758,23 +3758,6 @@ xfs_iflush(
 	return error;
 }
 
-/*
- * If there are inline format data / attr forks attached to this inode,
- * make sure they're not corrupt.
- */
-bool
-xfs_inode_verify_forks(
-	struct xfs_inode	*ip)
-{
-	if (ip->i_d.di_format == XFS_DINODE_FMT_LOCAL &&
-	    xfs_ifork_verify_local_data(ip))
-		return false;
-	if (ip->i_d.di_aformat == XFS_DINODE_FMT_LOCAL &&
-	    xfs_ifork_verify_local_attr(ip))
-		return false;
-	return true;
-}
-
 STATIC int
 xfs_iflush_int(
 	struct xfs_inode	*ip,
@@ -3852,8 +3835,15 @@ xfs_iflush_int(
 	if (!xfs_sb_version_has_v3inode(&mp->m_sb))
 		ip->i_d.di_flushiter++;
 
-	/* Check the inline fork data before we write out. */
-	if (!xfs_inode_verify_forks(ip))
+	/*
+	 * If there are inline format data / attr forks attached to this inode,
+	 * make sure they are not corrupt.
+	 */
+	if (ip->i_d.di_format == XFS_DINODE_FMT_LOCAL &&
+	    xfs_ifork_verify_local_data(ip))
+		goto corrupt_out;
+	if (ip->i_d.di_aformat == XFS_DINODE_FMT_LOCAL &&
+	    xfs_ifork_verify_local_attr(ip))
 		goto corrupt_out;
 
 	/*
diff --git a/fs/xfs/xfs_inode.h b/fs/xfs/xfs_inode.h
index c6a63f6764a67..c7b201b4655ba 100644
--- a/fs/xfs/xfs_inode.h
+++ b/fs/xfs/xfs_inode.h
@@ -497,8 +497,6 @@ extern struct kmem_zone	*xfs_inode_zone;
 /* The default CoW extent size hint. */
 #define XFS_DEFAULT_COWEXTSZ_HINT 32
 
-bool xfs_inode_verify_forks(struct xfs_inode *ip);
-
 int xfs_iunlink_init(struct xfs_perag *pag);
 void xfs_iunlink_destroy(struct xfs_perag *pag);
 
diff --git a/fs/xfs/xfs_log_recover.c b/fs/xfs/xfs_log_recover.c
index 464388125d20b..554646892c62a 100644
--- a/fs/xfs/xfs_log_recover.c
+++ b/fs/xfs/xfs_log_recover.c
@@ -2874,11 +2874,6 @@ xfs_recover_inode_owner_change(
 	if (error)
 		goto out_free_ip;
 
-	if (!xfs_inode_verify_forks(ip)) {
-		error = -EFSCORRUPTED;
-		goto out_free_ip;
-	}
-
 	if (in_f->ilf_fields & XFS_ILOG_DOWNER) {
 		ASSERT(in_f->ilf_fields & XFS_ILOG_DBROOT);
 		error = xfs_bmbt_change_owner(NULL, ip, XFS_DATA_FORK,
-- 
2.26.2
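
The "verify as soon as the fork is set up" pattern used in
xfs_iformat_data_fork and xfs_iformat_attr_fork above can be sketched
as below.  This is a user-space approximation: struct toy_fork and the
toy_* functions are hypothetical, the "no embedded NUL" rule is an
invented check, and EIO stands in for EFSCORRUPTED.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

struct toy_fork {
	char	*data;
	size_t	bytes;
};

/* Copy the inline bytes into the in-memory fork. */
static int
toy_format_local(struct toy_fork *fork, const char *src, size_t len)
{
	fork->data = malloc(len ? len : 1);
	if (!fork->data)
		return -ENOMEM;
	memcpy(fork->data, src, len);
	fork->bytes = len;
	return 0;
}

/* Invented rule for the sketch: non-empty, no embedded NUL bytes. */
static int
toy_verify_local(const struct toy_fork *fork)
{
	if (fork->bytes == 0 || memchr(fork->data, '\0', fork->bytes))
		return -EIO;	/* stand-in for -EFSCORRUPTED */
	return 0;
}

/* Set up the fork, then verify it right away while we are here. */
static int
toy_read_local_fork(struct toy_fork *fork, const char *src, size_t len)
{
	int	error;

	error = toy_format_local(fork, src, len);
	if (!error)
		error = toy_verify_local(fork);
	return error;
}
```

In this sketch the fork stays allocated when verification fails, so
the caller is expected to free it on the error path.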



* [PATCH 11/12] xfs: remove the special COW fork handling in xfs_bmapi_read
  2020-05-01  8:14 dinode reading cleanups Christoph Hellwig
                   ` (9 preceding siblings ...)
  2020-05-01  8:14 ` [PATCH 10/12] xfs: improve local fork verification Christoph Hellwig
@ 2020-05-01  8:14 ` Christoph Hellwig
  2020-05-01 15:57   ` Brian Foster
  2020-05-01  8:14 ` [PATCH 12/12] xfs: remove the NULL " Christoph Hellwig
  11 siblings, 1 reply; 41+ messages in thread
From: Christoph Hellwig @ 2020-05-01  8:14 UTC (permalink / raw)
  To: linux-xfs

We don't call xfs_bmapi_read for the COW fork anymore, so remove the
special casing.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 fs/xfs/libxfs/xfs_bmap.c | 13 +------------
 1 file changed, 1 insertion(+), 12 deletions(-)

diff --git a/fs/xfs/libxfs/xfs_bmap.c b/fs/xfs/libxfs/xfs_bmap.c
index fda13cd7add0e..76be1a18e2442 100644
--- a/fs/xfs/libxfs/xfs_bmap.c
+++ b/fs/xfs/libxfs/xfs_bmap.c
@@ -3902,8 +3902,7 @@ xfs_bmapi_read(
 	int			whichfork = xfs_bmapi_whichfork(flags);
 
 	ASSERT(*nmap >= 1);
-	ASSERT(!(flags & ~(XFS_BMAPI_ATTRFORK|XFS_BMAPI_ENTIRE|
-			   XFS_BMAPI_COWFORK)));
+	ASSERT(!(flags & ~(XFS_BMAPI_ATTRFORK | XFS_BMAPI_ENTIRE)));
 	ASSERT(xfs_isilocked(ip, XFS_ILOCK_SHARED|XFS_ILOCK_EXCL));
 
 	if (XFS_IS_CORRUPT(mp, !xfs_ifork_has_extents(ip, whichfork)) ||
@@ -3918,16 +3917,6 @@ xfs_bmapi_read(
 
 	ifp = XFS_IFORK_PTR(ip, whichfork);
 	if (!ifp) {
-		/* No CoW fork?  Return a hole. */
-		if (whichfork == XFS_COW_FORK) {
-			mval->br_startoff = bno;
-			mval->br_startblock = HOLESTARTBLOCK;
-			mval->br_blockcount = len;
-			mval->br_state = XFS_EXT_NORM;
-			*nmap = 1;
-			return 0;
-		}
-
 		/*
 		 * A missing attr ifork implies that the inode says we're in
 		 * extents or btree format but failed to pass the inode fork
-- 
2.26.2
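
The tightened assertion in this patch is a plain "no bits outside the
allowed mask" test.  A minimal sketch, with hypothetical TOY_* names
and flag values chosen for illustration only:

```c
#include <assert.h>
#include <stdbool.h>

/* Flag bits; the values are assumptions made for this sketch. */
#define TOY_BMAPI_ENTIRE	0x001
#define TOY_BMAPI_ATTRFORK	0x004
#define TOY_BMAPI_COWFORK	0x200

/* True if only flags permitted for a read mapping are set. */
static bool
toy_read_flags_valid(int flags)
{
	return !(flags & ~(TOY_BMAPI_ATTRFORK | TOY_BMAPI_ENTIRE));
}
```

With the COW fork flag dropped from the mask, a caller that still
passes it would trip the assertion instead of silently getting a hole
mapping back.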



* [PATCH 12/12] xfs: remove the NULL fork handling in xfs_bmapi_read
  2020-05-01  8:14 dinode reading cleanups Christoph Hellwig
                   ` (10 preceding siblings ...)
  2020-05-01  8:14 ` [PATCH 11/12] xfs: remove the special COW fork handling in xfs_bmapi_read Christoph Hellwig
@ 2020-05-01  8:14 ` Christoph Hellwig
  2020-05-01 15:58   ` Brian Foster
  11 siblings, 1 reply; 41+ messages in thread
From: Christoph Hellwig @ 2020-05-01  8:14 UTC (permalink / raw)
  To: linux-xfs

Now that we fully verify the inode forks before they are added to the
inode cache, the crash reported in

  https://bugzilla.kernel.org/show_bug.cgi?id=204031

can't happen anymore, as we'll never let an inode that has inconsistent
nextents counts vs the presence of an in-core attr fork leak into the
inactivate code path.

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 fs/xfs/libxfs/xfs_bmap.c | 19 ++-----------------
 1 file changed, 2 insertions(+), 17 deletions(-)

diff --git a/fs/xfs/libxfs/xfs_bmap.c b/fs/xfs/libxfs/xfs_bmap.c
index 76be1a18e2442..4246f2fd5b144 100644
--- a/fs/xfs/libxfs/xfs_bmap.c
+++ b/fs/xfs/libxfs/xfs_bmap.c
@@ -3891,7 +3891,8 @@ xfs_bmapi_read(
 	int			flags)
 {
 	struct xfs_mount	*mp = ip->i_mount;
-	struct xfs_ifork	*ifp;
+	int			whichfork = xfs_bmapi_whichfork(flags);
+	struct xfs_ifork	*ifp = XFS_IFORK_PTR(ip, whichfork);
 	struct xfs_bmbt_irec	got;
 	xfs_fileoff_t		obno;
 	xfs_fileoff_t		end;
@@ -3899,7 +3900,6 @@ xfs_bmapi_read(
 	int			error;
 	bool			eof = false;
 	int			n = 0;
-	int			whichfork = xfs_bmapi_whichfork(flags);
 
 	ASSERT(*nmap >= 1);
 	ASSERT(!(flags & ~(XFS_BMAPI_ATTRFORK | XFS_BMAPI_ENTIRE)));
@@ -3915,21 +3915,6 @@ xfs_bmapi_read(
 
 	XFS_STATS_INC(mp, xs_blk_mapr);
 
-	ifp = XFS_IFORK_PTR(ip, whichfork);
-	if (!ifp) {
-		/*
-		 * A missing attr ifork implies that the inode says we're in
-		 * extents or btree format but failed to pass the inode fork
-		 * verifier while trying to load it.  Treat that as a file
-		 * corruption too.
-		 */
-#ifdef DEBUG
-		xfs_alert(mp, "%s: inode %llu missing fork %d",
-				__func__, ip->i_ino, whichfork);
-#endif /* DEBUG */
-		return -EFSCORRUPTED;
-	}
-
 	if (!(ifp->if_flags & XFS_IFEXTENTS)) {
 		error = xfs_iread_extents(NULL, ip, whichfork);
 		if (error)
-- 
2.26.2



* Re: [PATCH 01/12] xfs: xfs_bmapi_read doesn't take a fork id as the last argument
  2020-05-01  8:14 ` [PATCH 01/12] xfs: xfs_bmapi_read doesn't take a fork id as the last argument Christoph Hellwig
@ 2020-05-01 13:33   ` Brian Foster
  0 siblings, 0 replies; 41+ messages in thread
From: Brian Foster @ 2020-05-01 13:33 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: linux-xfs

On Fri, May 01, 2020 at 10:14:13AM +0200, Christoph Hellwig wrote:
> The last argument to xfs_bmapi_read contains XFS_BMAPI_* flags, not the
> fork.  Given that XFS_DATA_FORK evaluates to 0 no real harm is done,
> but let's fix this anyway.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---

Reviewed-by: Brian Foster <bfoster@redhat.com>

>  fs/xfs/libxfs/xfs_rtbitmap.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/fs/xfs/libxfs/xfs_rtbitmap.c b/fs/xfs/libxfs/xfs_rtbitmap.c
> index f42c74cb8be53..9498ced947be9 100644
> --- a/fs/xfs/libxfs/xfs_rtbitmap.c
> +++ b/fs/xfs/libxfs/xfs_rtbitmap.c
> @@ -66,7 +66,7 @@ xfs_rtbuf_get(
>  
>  	ip = issum ? mp->m_rsumip : mp->m_rbmip;
>  
> -	error = xfs_bmapi_read(ip, block, 1, &map, &nmap, XFS_DATA_FORK);
> +	error = xfs_bmapi_read(ip, block, 1, &map, &nmap, 0);
>  	if (error)
>  		return error;
>  
> -- 
> 2.26.2
> 
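
Why passing a fork id where flags are expected only works by accident
can be shown with a small sketch.  The TOY_* names and values below
are assumptions made for illustration; only the fact that the data
fork id evaluates to 0 comes from the commit message.

```c
#include <assert.h>

/* Fork identifiers (the data fork being 0 is what the patch relies on). */
#define TOY_DATA_FORK		0
#define TOY_ATTR_FORK		1

/* Flag bits; values assumed for this sketch. */
#define TOY_BMAPI_ENTIRE	0x001
#define TOY_BMAPI_ATTRFORK	0x004

/* Derive the fork from the flags, as a flags-taking API would. */
static int
toy_whichfork(int flags)
{
	return (flags & TOY_BMAPI_ATTRFORK) ? TOY_ATTR_FORK : TOY_DATA_FORK;
}
```

Passing TOY_DATA_FORK is indistinguishable from passing no flags, so
nothing breaks.  Passing TOY_ATTR_FORK would be read as an unrelated
flag bit and still select the data fork, which is the kind of mistake
the cleanup guards against.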



* Re: [PATCH 02/12] xfs: call xfs_iformat_fork from xfs_inode_from_disk
  2020-05-01  8:14 ` [PATCH 02/12] xfs: call xfs_iformat_fork from xfs_inode_from_disk Christoph Hellwig
@ 2020-05-01 13:33   ` Brian Foster
  0 siblings, 0 replies; 41+ messages in thread
From: Brian Foster @ 2020-05-01 13:33 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: linux-xfs

On Fri, May 01, 2020 at 10:14:14AM +0200, Christoph Hellwig wrote:
> We always need to fill out the fork structures when reading the inode,
> so call xfs_iformat_fork from the tail of xfs_inode_from_disk.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---

Reviewed-by: Brian Foster <bfoster@redhat.com>

>  fs/xfs/libxfs/xfs_inode_buf.c | 7 ++++---
>  fs/xfs/libxfs/xfs_inode_buf.h | 2 +-
>  fs/xfs/xfs_log_recover.c      | 4 +---
>  3 files changed, 6 insertions(+), 7 deletions(-)
> 
> diff --git a/fs/xfs/libxfs/xfs_inode_buf.c b/fs/xfs/libxfs/xfs_inode_buf.c
> index 39c5a6e24915c..02f06dec0a5a6 100644
> --- a/fs/xfs/libxfs/xfs_inode_buf.c
> +++ b/fs/xfs/libxfs/xfs_inode_buf.c
> @@ -186,7 +186,7 @@ xfs_imap_to_bp(
>  	return 0;
>  }
>  
> -void
> +int
>  xfs_inode_from_disk(
>  	struct xfs_inode	*ip,
>  	struct xfs_dinode	*from)
> @@ -247,6 +247,8 @@ xfs_inode_from_disk(
>  		to->di_flags2 = be64_to_cpu(from->di_flags2);
>  		to->di_cowextsize = be32_to_cpu(from->di_cowextsize);
>  	}
> +
> +	return xfs_iformat_fork(ip, from);
>  }
>  
>  void
> @@ -647,8 +649,7 @@ xfs_iread(
>  	 * Otherwise, just get the truly permanent information.
>  	 */
>  	if (dip->di_mode) {
> -		xfs_inode_from_disk(ip, dip);
> -		error = xfs_iformat_fork(ip, dip);
> +		error = xfs_inode_from_disk(ip, dip);
>  		if (error)  {
>  #ifdef DEBUG
>  			xfs_alert(mp, "%s: xfs_iformat() returned error %d",
> diff --git a/fs/xfs/libxfs/xfs_inode_buf.h b/fs/xfs/libxfs/xfs_inode_buf.h
> index 9b373dcf9e34d..081230faf7bdc 100644
> --- a/fs/xfs/libxfs/xfs_inode_buf.h
> +++ b/fs/xfs/libxfs/xfs_inode_buf.h
> @@ -54,7 +54,7 @@ int	xfs_iread(struct xfs_mount *, struct xfs_trans *,
>  void	xfs_dinode_calc_crc(struct xfs_mount *, struct xfs_dinode *);
>  void	xfs_inode_to_disk(struct xfs_inode *ip, struct xfs_dinode *to,
>  			  xfs_lsn_t lsn);
> -void	xfs_inode_from_disk(struct xfs_inode *ip, struct xfs_dinode *from);
> +int	xfs_inode_from_disk(struct xfs_inode *ip, struct xfs_dinode *from);
>  void	xfs_log_dinode_to_disk(struct xfs_log_dinode *from,
>  			       struct xfs_dinode *to);
>  
> diff --git a/fs/xfs/xfs_log_recover.c b/fs/xfs/xfs_log_recover.c
> index 11c3502b07b13..464388125d20b 100644
> --- a/fs/xfs/xfs_log_recover.c
> +++ b/fs/xfs/xfs_log_recover.c
> @@ -2870,9 +2870,7 @@ xfs_recover_inode_owner_change(
>  
>  	/* instantiate the inode */
>  	ASSERT(dip->di_version >= 3);
> -	xfs_inode_from_disk(ip, dip);
> -
> -	error = xfs_iformat_fork(ip, dip);
> +	error = xfs_inode_from_disk(ip, dip);
>  	if (error)
>  		goto out_free_ip;
>  
> -- 
> 2.26.2
> 



* Re: [PATCH 03/12] xfs: split xfs_iformat_fork
  2020-05-01  8:14 ` [PATCH 03/12] xfs: split xfs_iformat_fork Christoph Hellwig
@ 2020-05-01 13:34   ` Brian Foster
  2020-05-07 12:27     ` Christoph Hellwig
  0 siblings, 1 reply; 41+ messages in thread
From: Brian Foster @ 2020-05-01 13:34 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: linux-xfs

On Fri, May 01, 2020 at 10:14:15AM +0200, Christoph Hellwig wrote:
> xfs_iformat_fork is a weird catchall.  Split it into one helper for
> the data fork and one for the attr fork, and then call both helpers
> as well as the COW fork initialization from xfs_inode_from_disk.  Order
> the COW fork initialization after the attr fork initialization given
> that it can't fail to simplify the error handling.
> 
> Note that the newly split helpers are moved down the file in
> xfs_inode_fork.c to avoid the need for forward declarations.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>  fs/xfs/libxfs/xfs_inode_buf.c  |  20 +++-
>  fs/xfs/libxfs/xfs_inode_fork.c | 186 +++++++++++++++------------------
>  fs/xfs/libxfs/xfs_inode_fork.h |   3 +-
>  3 files changed, 103 insertions(+), 106 deletions(-)
> 
...
> diff --git a/fs/xfs/libxfs/xfs_inode_fork.c b/fs/xfs/libxfs/xfs_inode_fork.c
> index 518c6f0ec3a61..f30d43364aa92 100644
> --- a/fs/xfs/libxfs/xfs_inode_fork.c
> +++ b/fs/xfs/libxfs/xfs_inode_fork.c
...
> @@ -325,6 +221,88 @@ xfs_iformat_btree(
>  	return 0;
>  }
>  
> +int
> +xfs_iformat_data_fork(
> +	struct xfs_inode	*ip,
> +	struct xfs_dinode	*dip)
> +{
> +	struct inode		*inode = VFS_I(ip);
> +
> +	switch (inode->i_mode & S_IFMT) {
> +	case S_IFIFO:
> +	case S_IFCHR:
> +	case S_IFBLK:
> +	case S_IFSOCK:
> +		ip->i_d.di_size = 0;
> +		inode->i_rdev = xfs_to_linux_dev_t(xfs_dinode_get_rdev(dip));
> +		return 0;
> +	case S_IFREG:
> +	case S_IFLNK:
> +	case S_IFDIR:
> +		switch (dip->di_format) {
> +		case XFS_DINODE_FMT_LOCAL:
> +			return xfs_iformat_local(ip, dip, XFS_DATA_FORK,
> +					be64_to_cpu(dip->di_size));
> +		case XFS_DINODE_FMT_EXTENTS:
> +			return xfs_iformat_extents(ip, dip, XFS_DATA_FORK);
> +		case XFS_DINODE_FMT_BTREE:
> +			return xfs_iformat_btree(ip, dip, XFS_DATA_FORK);
> +		default:
> +			xfs_inode_verifier_error(ip, -EFSCORRUPTED, __func__,
> +					dip, sizeof(*dip), __this_address);
> +			return -EFSCORRUPTED;
> +		}
> +		break;
> +	default:
> +		xfs_inode_verifier_error(ip, -EFSCORRUPTED, __func__, dip,
> +				sizeof(*dip), __this_address);
> +		return -EFSCORRUPTED;
> +	}

Can we fix this function up to use an error variable and return error at
the end like xfs_iformat_attr_fork() does? Otherwise nice cleanup.

Brian
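
For reference, the single-exit style asked for here might look roughly
like the following.  This is a hypothetical user-space sketch, not the
proposed kernel change: the toy_* helpers are placeholders that always
succeed, and EIO stands in for EFSCORRUPTED.

```c
#include <assert.h>
#include <errno.h>

enum toy_fmt { TOY_FMT_LOCAL = 1, TOY_FMT_EXTENTS, TOY_FMT_BTREE };

/* Placeholder per-format helpers; each would return 0 or -errno. */
static int toy_format_local(void)   { return 0; }
static int toy_format_extents(void) { return 0; }
static int toy_format_btree(void)   { return 0; }

static int
toy_format_data_fork(int format)
{
	int	error;

	switch (format) {
	case TOY_FMT_LOCAL:
		error = toy_format_local();
		break;
	case TOY_FMT_EXTENTS:
		error = toy_format_extents();
		break;
	case TOY_FMT_BTREE:
		error = toy_format_btree();
		break;
	default:
		error = -EIO;	/* stand-in for -EFSCORRUPTED */
		break;
	}

	/* Single exit point: later checks can be added here. */
	return error;
}
```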

> +}
> +
> +static uint16_t
> +xfs_dfork_attr_shortform_size(
> +	struct xfs_dinode		*dip)
> +{
> +	struct xfs_attr_shortform	*atp =
> +		(struct xfs_attr_shortform *)XFS_DFORK_APTR(dip);
> +
> +	return be16_to_cpu(atp->hdr.totsize);
> +}
> +
> +int
> +xfs_iformat_attr_fork(
> +	struct xfs_inode	*ip,
> +	struct xfs_dinode	*dip)
> +{
> +	int			error = 0;
> +
> +	ip->i_afp = kmem_zone_zalloc(xfs_ifork_zone, KM_NOFS);
> +	switch (dip->di_aformat) {
> +	case XFS_DINODE_FMT_LOCAL:
> +		error = xfs_iformat_local(ip, dip, XFS_ATTR_FORK,
> +				xfs_dfork_attr_shortform_size(dip));
> +		break;
> +	case XFS_DINODE_FMT_EXTENTS:
> +		error = xfs_iformat_extents(ip, dip, XFS_ATTR_FORK);
> +		break;
> +	case XFS_DINODE_FMT_BTREE:
> +		error = xfs_iformat_btree(ip, dip, XFS_ATTR_FORK);
> +		break;
> +	default:
> +		xfs_inode_verifier_error(ip, error, __func__, dip,
> +				sizeof(*dip), __this_address);
> +		error = -EFSCORRUPTED;
> +		break;
> +	}
> +
> +	if (error) {
> +		kmem_cache_free(xfs_ifork_zone, ip->i_afp);
> +		ip->i_afp = NULL;
> +	}
> +	return error;
> +}
> +
>  /*
>   * Reallocate the space for if_broot based on the number of records
>   * being added or deleted as indicated in rec_diff.  Move the records
> diff --git a/fs/xfs/libxfs/xfs_inode_fork.h b/fs/xfs/libxfs/xfs_inode_fork.h
> index 668ee942be224..8487b0c88a75e 100644
> --- a/fs/xfs/libxfs/xfs_inode_fork.h
> +++ b/fs/xfs/libxfs/xfs_inode_fork.h
> @@ -88,7 +88,8 @@ struct xfs_ifork {
>  
>  struct xfs_ifork *xfs_iext_state_to_fork(struct xfs_inode *ip, int state);
>  
> -int		xfs_iformat_fork(struct xfs_inode *, struct xfs_dinode *);
> +int		xfs_iformat_data_fork(struct xfs_inode *, struct xfs_dinode *);
> +int		xfs_iformat_attr_fork(struct xfs_inode *, struct xfs_dinode *);
>  void		xfs_iflush_fork(struct xfs_inode *, struct xfs_dinode *,
>  				struct xfs_inode_log_item *, int);
>  void		xfs_idestroy_fork(struct xfs_inode *, int);
> -- 
> 2.26.2
> 



* Re: [PATCH 04/12] xfs: handle unallocated inodes in xfs_inode_from_disk
  2020-05-01  8:14 ` [PATCH 04/12] xfs: handle unallocated inodes in xfs_inode_from_disk Christoph Hellwig
@ 2020-05-01 13:34   ` Brian Foster
  0 siblings, 0 replies; 41+ messages in thread
From: Brian Foster @ 2020-05-01 13:34 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: linux-xfs

On Fri, May 01, 2020 at 10:14:16AM +0200, Christoph Hellwig wrote:
> Handle inodes with a 0 di_mode in xfs_inode_from_disk, instead of partially
> duplicating inode reading in xfs_iread.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>  fs/xfs/libxfs/xfs_inode_buf.c | 54 ++++++++++++-----------------------
>  1 file changed, 18 insertions(+), 36 deletions(-)
> 
> diff --git a/fs/xfs/libxfs/xfs_inode_buf.c b/fs/xfs/libxfs/xfs_inode_buf.c
> index 983beb680e81a..b136f29f7d9d3 100644
> --- a/fs/xfs/libxfs/xfs_inode_buf.c
> +++ b/fs/xfs/libxfs/xfs_inode_buf.c
> @@ -198,6 +198,21 @@ xfs_inode_from_disk(
>  	ASSERT(ip->i_cowfp == NULL);
>  	ASSERT(ip->i_afp == NULL);
>  
> +	/*
> +	 * Get the truly permanent information first that is not overwritten by
> +	 * xfs_ialloc first.  This also includes i_mode so that a newly read
> +	 * in inode structure for an allocation is marked as already free.
> +	 */

The first sentence has a wording issue and the second is kind of
confusing. I think this can be simplified and condensed further (see the
diff below for example). Otherwise looks good.

Brian

diff --git a/fs/xfs/libxfs/xfs_inode_buf.c b/fs/xfs/libxfs/xfs_inode_buf.c
index b136f29f7d9d..ed02649138aa 100644
--- a/fs/xfs/libxfs/xfs_inode_buf.c
+++ b/fs/xfs/libxfs/xfs_inode_buf.c
@@ -199,17 +199,14 @@ xfs_inode_from_disk(
 	ASSERT(ip->i_afp == NULL);
 
 	/*
-	 * Get the truly permanent information first that is not overwritten by
-	 * xfs_ialloc first.  This also includes i_mode so that a newly read
-	 * in inode structure for an allocation is marked as already free.
+	 * First get the permanent information that is needed to allocate an
+	 * inode. If the inode is unused, mode is zero and we shouldn't mess
+	 * with the uninitialized part of it.
 	 */
+	to->di_flushiter = be16_to_cpu(from->di_flushiter);
 	inode->i_generation = be32_to_cpu(from->di_gen);
 	inode->i_mode = be16_to_cpu(from->di_mode);
-	to->di_flushiter = be16_to_cpu(from->di_flushiter);
 
-	/*
-	 * Only copy the rest if the inode is actually allocated.
-	 */
 	if (!inode->i_mode)
 		return 0;
 

> +	inode->i_generation = be32_to_cpu(from->di_gen);
> +	inode->i_mode = be16_to_cpu(from->di_mode);
> +	to->di_flushiter = be16_to_cpu(from->di_flushiter);
> +
> +	/*
> +	 * Only copy the rest if the inode is actually allocated.
> +	 */
> +	if (!inode->i_mode)
> +		return 0;
> +
>  	/*
>  	 * Convert v1 inodes immediately to v2 inode format as this is the
>  	 * minimum inode version format we support in the rest of the code.
> @@ -215,7 +230,6 @@ xfs_inode_from_disk(
>  	to->di_format = from->di_format;
>  	i_uid_write(inode, be32_to_cpu(from->di_uid));
>  	i_gid_write(inode, be32_to_cpu(from->di_gid));
> -	to->di_flushiter = be16_to_cpu(from->di_flushiter);
>  
>  	/*
>  	 * Time is signed, so need to convert to signed 32 bit before
> @@ -229,8 +243,6 @@ xfs_inode_from_disk(
>  	inode->i_mtime.tv_nsec = (int)be32_to_cpu(from->di_mtime.t_nsec);
>  	inode->i_ctime.tv_sec = (int)be32_to_cpu(from->di_ctime.t_sec);
>  	inode->i_ctime.tv_nsec = (int)be32_to_cpu(from->di_ctime.t_nsec);
> -	inode->i_generation = be32_to_cpu(from->di_gen);
> -	inode->i_mode = be16_to_cpu(from->di_mode);
>  
>  	to->di_size = be64_to_cpu(from->di_size);
>  	to->di_nblocks = be64_to_cpu(from->di_nblocks);
> @@ -659,39 +671,9 @@ xfs_iread(
>  		goto out_brelse;
>  	}
>  
> -	/*
> -	 * If the on-disk inode is already linked to a directory
> -	 * entry, copy all of the inode into the in-core inode.
> -	 * xfs_iformat_fork() handles copying in the inode format
> -	 * specific information.
> -	 * Otherwise, just get the truly permanent information.
> -	 */
> -	if (dip->di_mode) {
> -		error = xfs_inode_from_disk(ip, dip);
> -		if (error)  {
> -#ifdef DEBUG
> -			xfs_alert(mp, "%s: xfs_iformat() returned error %d",
> -				__func__, error);
> -#endif /* DEBUG */
> -			goto out_brelse;
> -		}
> -	} else {
> -		/*
> -		 * Partial initialisation of the in-core inode. Just the bits
> -		 * that xfs_ialloc won't overwrite or relies on being correct.
> -		 */
> -		VFS_I(ip)->i_generation = be32_to_cpu(dip->di_gen);
> -		ip->i_d.di_flushiter = be16_to_cpu(dip->di_flushiter);
> -
> -		/*
> -		 * Make sure to pull in the mode here as well in
> -		 * case the inode is released without being used.
> -		 * This ensures that xfs_inactive() will see that
> -		 * the inode is already free and not try to mess
> -		 * with the uninitialized part of it.
> -		 */
> -		VFS_I(ip)->i_mode = 0;
> -	}
> +	error = xfs_inode_from_disk(ip, dip);
> +	if (error)
> +		goto out_brelse;
>  
>  	ip->i_delayed_blks = 0;
>  
> -- 
> 2.26.2
> 
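
The early-return structure discussed in this message (copy the few
permanent fields first, then stop if the inode is unallocated) can be
sketched as below.  The toy_* structures are hypothetical and skip the
endian conversion the real code performs.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical on-disk and in-core inode layouts for the sketch. */
struct toy_dinode {
	uint16_t	mode;
	uint32_t	gen;
	uint16_t	flushiter;
	uint64_t	size;
};

struct toy_inode {
	uint16_t	mode;
	uint32_t	gen;
	uint16_t	flushiter;
	uint64_t	size;
};

static int
toy_inode_from_disk(struct toy_inode *to, const struct toy_dinode *from)
{
	/* Fields that are needed even for a free inode. */
	to->flushiter = from->flushiter;
	to->gen = from->gen;
	to->mode = from->mode;

	/* Unallocated inode: leave the rest of the in-core inode alone. */
	if (!to->mode)
		return 0;

	to->size = from->size;
	return 0;
}
```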



* Re: [PATCH 05/12] xfs: call xfs_dinode_verify from xfs_inode_from_disk
  2020-05-01  8:14 ` [PATCH 05/12] xfs: call xfs_dinode_verify from xfs_inode_from_disk Christoph Hellwig
@ 2020-05-01 13:34   ` Brian Foster
  0 siblings, 0 replies; 41+ messages in thread
From: Brian Foster @ 2020-05-01 13:34 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: linux-xfs

On Fri, May 01, 2020 at 10:14:17AM +0200, Christoph Hellwig wrote:
> Keep the code dealing with the dinode together, and also ensure we verify
> the dinode in the onwer change log recovery case as well.

		    owner

> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---

Reviewed-by: Brian Foster <bfoster@redhat.com>

>  .../xfs-self-describing-metadata.txt           | 10 +++++-----
>  fs/xfs/libxfs/xfs_inode_buf.c                  | 18 ++++++++----------
>  2 files changed, 13 insertions(+), 15 deletions(-)
> 
> diff --git a/Documentation/filesystems/xfs-self-describing-metadata.txt b/Documentation/filesystems/xfs-self-describing-metadata.txt
> index 8db0121d0980c..e912699d74301 100644
> --- a/Documentation/filesystems/xfs-self-describing-metadata.txt
> +++ b/Documentation/filesystems/xfs-self-describing-metadata.txt
> @@ -337,11 +337,11 @@ buffer.
>  
>  The structure of the verifiers and the identifiers checks is very similar to the
>  buffer code described above. The only difference is where they are called. For
> -example, inode read verification is done in xfs_iread() when the inode is first
> -read out of the buffer and the struct xfs_inode is instantiated. The inode is
> -already extensively verified during writeback in xfs_iflush_int, so the only
> -addition here is to add the LSN and CRC to the inode as it is copied back into
> -the buffer.
> +example, inode read verification is done in xfs_inode_from_disk() when the inode
> +is first read out of the buffer and the struct xfs_inode is instantiated. The
> +inode is already extensively verified during writeback in xfs_iflush_int, so the
> +only addition here is to add the LSN and CRC to the inode as it is copied back
> +into the buffer.
>  
>  XXX: inode unlinked list modification doesn't recalculate the inode CRC! None of
>  the unlinked list modifications check or update CRCs, neither during unlink nor
> diff --git a/fs/xfs/libxfs/xfs_inode_buf.c b/fs/xfs/libxfs/xfs_inode_buf.c
> index b136f29f7d9d3..a00001a2336ef 100644
> --- a/fs/xfs/libxfs/xfs_inode_buf.c
> +++ b/fs/xfs/libxfs/xfs_inode_buf.c
> @@ -194,10 +194,18 @@ xfs_inode_from_disk(
>  	struct xfs_icdinode	*to = &ip->i_d;
>  	struct inode		*inode = VFS_I(ip);
>  	int			error;
> +	xfs_failaddr_t		fa;
>  
>  	ASSERT(ip->i_cowfp == NULL);
>  	ASSERT(ip->i_afp == NULL);
>  
> +	fa = xfs_dinode_verify(ip->i_mount, ip->i_ino, from);
> +	if (fa) {
> +		xfs_inode_verifier_error(ip, -EFSCORRUPTED, "dinode", from,
> +				sizeof(*from), fa);
> +		return -EFSCORRUPTED;
> +	}
> +
>  	/*
>  	 * Get the truly permanent information first that is not overwritten by
>  	 * xfs_ialloc first.  This also includes i_mode so that a newly read
> @@ -637,7 +645,6 @@ xfs_iread(
>  {
>  	xfs_buf_t	*bp;
>  	xfs_dinode_t	*dip;
> -	xfs_failaddr_t	fa;
>  	int		error;
>  
>  	/*
> @@ -662,15 +669,6 @@ xfs_iread(
>  	if (error)
>  		return error;
>  
> -	/* even unallocated inodes are verified */
> -	fa = xfs_dinode_verify(mp, ip->i_ino, dip);
> -	if (fa) {
> -		xfs_inode_verifier_error(ip, -EFSCORRUPTED, "dinode", dip,
> -				sizeof(*dip), fa);
> -		error = -EFSCORRUPTED;
> -		goto out_brelse;
> -	}
> -
>  	error = xfs_inode_from_disk(ip, dip);
>  	if (error)
>  		goto out_brelse;
> -- 
> 2.26.2
> 


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH 06/12] xfs: don't reset i_delayed_blks in xfs_iread
  2020-05-01  8:14 ` [PATCH 06/12] xfs: don't reset i_delayed_blks in xfs_iread Christoph Hellwig
@ 2020-05-01 13:34   ` Brian Foster
  0 siblings, 0 replies; 41+ messages in thread
From: Brian Foster @ 2020-05-01 13:34 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: linux-xfs

On Fri, May 01, 2020 at 10:14:18AM +0200, Christoph Hellwig wrote:
> i_delayed_blks is set to 0 in xfs_inode_alloc and can't have anything
> assigned to it until the inode is visible to the VFS.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---

Reviewed-by: Brian Foster <bfoster@redhat.com>

>  fs/xfs/libxfs/xfs_inode_buf.c | 2 --
>  1 file changed, 2 deletions(-)
> 
> diff --git a/fs/xfs/libxfs/xfs_inode_buf.c b/fs/xfs/libxfs/xfs_inode_buf.c
> index a00001a2336ef..0357dc4b29481 100644
> --- a/fs/xfs/libxfs/xfs_inode_buf.c
> +++ b/fs/xfs/libxfs/xfs_inode_buf.c
> @@ -673,8 +673,6 @@ xfs_iread(
>  	if (error)
>  		goto out_brelse;
>  
> -	ip->i_delayed_blks = 0;
> -
>  	/*
>  	 * Mark the buffer containing the inode as something to keep
>  	 * around for a while.  This helps to keep recently accessed
> -- 
> 2.26.2
> 


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH 07/12] xfs: remove xfs_iread
  2020-05-01  8:14 ` [PATCH 07/12] xfs: remove xfs_iread Christoph Hellwig
@ 2020-05-01 15:56   ` Brian Foster
  0 siblings, 0 replies; 41+ messages in thread
From: Brian Foster @ 2020-05-01 15:56 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: linux-xfs

On Fri, May 01, 2020 at 10:14:19AM +0200, Christoph Hellwig wrote:
> There is not much point in the xfs_iread function, as it has a single
> caller and not a whole lot of code.  Move it into the only caller,
> and trim down the overdocumentation to just documenting the important
> "why" instead of a lot of redundant "what".
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>  fs/xfs/libxfs/xfs_inode_buf.c | 73 -----------------------------------
>  fs/xfs/libxfs/xfs_inode_buf.h |  2 -
>  fs/xfs/xfs_icache.c           | 33 +++++++++++++++-
>  3 files changed, 32 insertions(+), 76 deletions(-)
> 
...
> diff --git a/fs/xfs/xfs_icache.c b/fs/xfs/xfs_icache.c
> index 8bf1d15be3f6a..dd757c6614956 100644
> --- a/fs/xfs/xfs_icache.c
> +++ b/fs/xfs/xfs_icache.c
...
> @@ -510,10 +511,40 @@ xfs_iget_cache_miss(
>  	if (!ip)
>  		return -ENOMEM;
>  
> -	error = xfs_iread(mp, tp, ip, flags);
> +	error = xfs_imap(mp, tp, ip->i_ino, &ip->i_imap, flags);
>  	if (error)
>  		goto out_destroy;
>  
> +	/*
> +	 * For version 5 superblocks, if we are initialising a new inode and we
> +	 * are not utilising the XFS_MOUNT_IKEEP inode cluster mode, we can
> +	 * simple build the new inode core with a random generation number.

I'm assuming the original comment meant to say "simply" here instead of
"simple." Otherwise looks good to me:

Reviewed-by: Brian Foster <bfoster@redhat.com>

> +	 *
> +	 * For version 4 (and older) superblocks, log recovery is dependent on
> +	 * the di_flushiter field being initialised from the current on-disk
> +	 * value and hence we must also read the inode off disk even when
> +	 * initializing new inodes.
> +	 */
> +	if (xfs_sb_version_has_v3inode(&mp->m_sb) &&
> +	    (flags & XFS_IGET_CREATE) && !(mp->m_flags & XFS_MOUNT_IKEEP)) {
> +		VFS_I(ip)->i_generation = prandom_u32();
> +	} else {
> +		struct xfs_dinode	*dip;
> +		struct xfs_buf		*bp;
> +
> +		error = xfs_imap_to_bp(mp, tp, &ip->i_imap, &dip, &bp, 0, flags);
> +		if (error)
> +			goto out_destroy;
> +
> +		error = xfs_inode_from_disk(ip, dip);
> +		if (!error)
> +			xfs_buf_set_ref(bp, XFS_INO_REF);
> +		xfs_trans_brelse(tp, bp);
> +
> +		if (error)
> +			goto out_destroy;
> +	}
> +
>  	if (!xfs_inode_verify_forks(ip)) {
>  		error = -EFSCORRUPTED;
>  		goto out_destroy;
> -- 
> 2.26.2
> 


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH 08/12] xfs: remove xfs_ifork_ops
  2020-05-01  8:14 ` [PATCH 08/12] xfs: remove xfs_ifork_ops Christoph Hellwig
@ 2020-05-01 15:56   ` Brian Foster
  2020-05-01 16:08     ` Darrick J. Wong
  0 siblings, 1 reply; 41+ messages in thread
From: Brian Foster @ 2020-05-01 15:56 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: linux-xfs

On Fri, May 01, 2020 at 10:14:20AM +0200, Christoph Hellwig wrote:
> xfs_ifork_ops add up to two indirect calls per inode read and flush,
> despite just having a single instance in the kernel.  In xfsprogs
> phase6 in xfs_repair overrides the verify_dir method to deal with inodes
> that do not have a valid parent.  Instead of the costly indirection just
> lift the repair code into xfs_dir2_sf.c under a condition that ensures
> it is compiled as part of a kernel build, but instantly eliminated as
> it is unreachable.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>  fs/xfs/libxfs/xfs_dir2_sf.c    | 64 ++++++++++++++++++++++++++++++++--
>  fs/xfs/libxfs/xfs_inode_fork.c | 19 +++-------
>  fs/xfs/libxfs/xfs_inode_fork.h | 15 ++------
>  fs/xfs/xfs_inode.c             |  4 +--
>  4 files changed, 71 insertions(+), 31 deletions(-)
> 
> diff --git a/fs/xfs/libxfs/xfs_dir2_sf.c b/fs/xfs/libxfs/xfs_dir2_sf.c
> index 7b7f6fb2ea3b2..1f6c30b68917c 100644
> --- a/fs/xfs/libxfs/xfs_dir2_sf.c
> +++ b/fs/xfs/libxfs/xfs_dir2_sf.c
...
> @@ -804,6 +804,66 @@ xfs_dir2_sf_verify(
>  	return NULL;
>  }
>  
> +/*
> + * When we're checking directory inodes, we're allowed to set a directory's
> + * dotdot entry to zero to signal that the parent needs to be reconnected
> + * during xfs_repair phase 6.  If we're handling a shortform directory the ifork
> + * verifiers will fail, so temporarily patch out this canary so that we can
> + * verify the rest of the fork and move on to fixing the dir.
> + */
> +static xfs_failaddr_t
> +xfs_dir2_sf_verify_dir_check(
> +	struct xfs_inode		*ip)
> +{
> +	struct xfs_mount		*mp = ip->i_mount;
> +	struct xfs_ifork		*ifp = XFS_IFORK_PTR(ip, XFS_DATA_FORK);
> +	struct xfs_dir2_sf_hdr		*sfp =
> +		(struct xfs_dir2_sf_hdr *)ifp->if_u1.if_data;
> +	int				size = ifp->if_bytes;
> +	bool				parent_bypass = false;
> +	xfs_ino_t			old_parent;
> +	xfs_failaddr_t			fa;
> +
> +	/*
> +	 * If this is a shortform directory, phase4 in xfs_repair may have set
> +	 * the parent inode to zero to indicate that it must be fixed.
> +	 * Temporarily set a valid parent so that the directory verifier will
> +	 * pass.
> +	 */
> +	if (size > offsetof(struct xfs_dir2_sf_hdr, parent) &&
> +	    size >= xfs_dir2_sf_hdr_size(sfp->i8count)) {
> +		old_parent = xfs_dir2_sf_get_parent_ino(sfp);
> +		if (!old_parent) {
> +			xfs_dir2_sf_put_parent_ino(sfp, mp->m_sb.sb_rootino);
> +			parent_bypass = true;
> +		}
> +	}
> +
> +	fa = __xfs_dir2_sf_verify(ip);
> +
> +	/* Put it back. */
> +	if (parent_bypass)
> +		xfs_dir2_sf_put_parent_ino(sfp, old_parent);
> +	return fa;
> +}

I'm not sure the cleanup is worth the kludge of including repair code in
the kernel like this. It might be better to reduce or replace ifork_ops
to a single directory function pointer until there's a reason for this
to become common. I dunno, maybe others have thoughts...

Brian

> +
> +/*
> + * Allow xfs_repair to enable the parent bypass mode.  For now this is entirely
> + * unused in the kernel, but might come in useful for online repair eventually.
> + */
> +#ifndef xfs_inode_parent_bypass
> +#define xfs_inode_parent_bypass(ip)	0
> +#endif
> +
> +xfs_failaddr_t
> +xfs_dir2_sf_verify(
> +	struct xfs_inode		*ip)
> +{
> +	if (xfs_inode_parent_bypass(ip))
> +		return xfs_dir2_sf_verify_dir_check(ip);
> +	return __xfs_dir2_sf_verify(ip);
> +}
> +
>  /*
>   * Create a new (shortform) directory.
>   */
> diff --git a/fs/xfs/libxfs/xfs_inode_fork.c b/fs/xfs/libxfs/xfs_inode_fork.c
> index f30d43364aa92..f6dcee919f59e 100644
> --- a/fs/xfs/libxfs/xfs_inode_fork.c
> +++ b/fs/xfs/libxfs/xfs_inode_fork.c
> @@ -673,18 +673,10 @@ xfs_ifork_init_cow(
>  	ip->i_cnextents = 0;
>  }
>  
> -/* Default fork content verifiers. */
> -struct xfs_ifork_ops xfs_default_ifork_ops = {
> -	.verify_attr	= xfs_attr_shortform_verify,
> -	.verify_dir	= xfs_dir2_sf_verify,
> -	.verify_symlink	= xfs_symlink_shortform_verify,
> -};
> -
>  /* Verify the inline contents of the data fork of an inode. */
>  xfs_failaddr_t
>  xfs_ifork_verify_data(
> -	struct xfs_inode	*ip,
> -	struct xfs_ifork_ops	*ops)
> +	struct xfs_inode	*ip)
>  {
>  	/* Non-local data fork, we're done. */
>  	if (ip->i_d.di_format != XFS_DINODE_FMT_LOCAL)
> @@ -693,9 +685,9 @@ xfs_ifork_verify_data(
>  	/* Check the inline data fork if there is one. */
>  	switch (VFS_I(ip)->i_mode & S_IFMT) {
>  	case S_IFDIR:
> -		return ops->verify_dir(ip);
> +		return xfs_dir2_sf_verify(ip);
>  	case S_IFLNK:
> -		return ops->verify_symlink(ip);
> +		return xfs_symlink_shortform_verify(ip);
>  	default:
>  		return NULL;
>  	}
> @@ -704,13 +696,12 @@ xfs_ifork_verify_data(
>  /* Verify the inline contents of the attr fork of an inode. */
>  xfs_failaddr_t
>  xfs_ifork_verify_attr(
> -	struct xfs_inode	*ip,
> -	struct xfs_ifork_ops	*ops)
> +	struct xfs_inode	*ip)
>  {
>  	/* There has to be an attr fork allocated if aformat is local. */
>  	if (ip->i_d.di_aformat != XFS_DINODE_FMT_LOCAL)
>  		return NULL;
>  	if (!XFS_IFORK_PTR(ip, XFS_ATTR_FORK))
>  		return __this_address;
> -	return ops->verify_attr(ip);
> +	return xfs_attr_shortform_verify(ip);
>  }
> diff --git a/fs/xfs/libxfs/xfs_inode_fork.h b/fs/xfs/libxfs/xfs_inode_fork.h
> index 8487b0c88a75e..3f84d33abd3b7 100644
> --- a/fs/xfs/libxfs/xfs_inode_fork.h
> +++ b/fs/xfs/libxfs/xfs_inode_fork.h
> @@ -176,18 +176,7 @@ extern struct kmem_zone	*xfs_ifork_zone;
>  
>  extern void xfs_ifork_init_cow(struct xfs_inode *ip);
>  
> -typedef xfs_failaddr_t (*xfs_ifork_verifier_t)(struct xfs_inode *);
> -
> -struct xfs_ifork_ops {
> -	xfs_ifork_verifier_t	verify_symlink;
> -	xfs_ifork_verifier_t	verify_dir;
> -	xfs_ifork_verifier_t	verify_attr;
> -};
> -extern struct xfs_ifork_ops	xfs_default_ifork_ops;
> -
> -xfs_failaddr_t xfs_ifork_verify_data(struct xfs_inode *ip,
> -		struct xfs_ifork_ops *ops);
> -xfs_failaddr_t xfs_ifork_verify_attr(struct xfs_inode *ip,
> -		struct xfs_ifork_ops *ops);
> +xfs_failaddr_t xfs_ifork_verify_data(struct xfs_inode *ip);
> +xfs_failaddr_t xfs_ifork_verify_attr(struct xfs_inode *ip);
>  
>  #endif	/* __XFS_INODE_FORK_H__ */
> diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c
> index d1772786af29d..93967278355de 100644
> --- a/fs/xfs/xfs_inode.c
> +++ b/fs/xfs/xfs_inode.c
> @@ -3769,7 +3769,7 @@ xfs_inode_verify_forks(
>  	struct xfs_ifork	*ifp;
>  	xfs_failaddr_t		fa;
>  
> -	fa = xfs_ifork_verify_data(ip, &xfs_default_ifork_ops);
> +	fa = xfs_ifork_verify_data(ip);
>  	if (fa) {
>  		ifp = XFS_IFORK_PTR(ip, XFS_DATA_FORK);
>  		xfs_inode_verifier_error(ip, -EFSCORRUPTED, "data fork",
> @@ -3777,7 +3777,7 @@ xfs_inode_verify_forks(
>  		return false;
>  	}
>  
> -	fa = xfs_ifork_verify_attr(ip, &xfs_default_ifork_ops);
> +	fa = xfs_ifork_verify_attr(ip);
>  	if (fa) {
>  		ifp = XFS_IFORK_PTR(ip, XFS_ATTR_FORK);
>  		xfs_inode_verifier_error(ip, -EFSCORRUPTED, "attr fork",
> -- 
> 2.26.2
> 


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH 09/12] xfs: refactor xfs_inode_verify_forks
  2020-05-01  8:14 ` [PATCH 09/12] xfs: refactor xfs_inode_verify_forks Christoph Hellwig
@ 2020-05-01 15:57   ` Brian Foster
  2020-05-01 16:40     ` Christoph Hellwig
  0 siblings, 1 reply; 41+ messages in thread
From: Brian Foster @ 2020-05-01 15:57 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: linux-xfs

On Fri, May 01, 2020 at 10:14:21AM +0200, Christoph Hellwig wrote:
> The split between xfs_inode_verify_forks and the two helpers
> implementing the actual functionality is a little strange.  Reshuffle
> it so that xfs_inode_verify_forks verifies if the data and attr forks
> are actually in local format and only call the low-level helpers if
> that is the case.  Handle the actual error reporting in the low-level
> handlers to streamline the caller.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>  fs/xfs/libxfs/xfs_inode_fork.c | 47 ++++++++++++++++++++++------------
>  fs/xfs/libxfs/xfs_inode_fork.h |  4 +--
>  fs/xfs/xfs_inode.c             | 21 +++------------
>  3 files changed, 37 insertions(+), 35 deletions(-)
> 
> diff --git a/fs/xfs/libxfs/xfs_inode_fork.c b/fs/xfs/libxfs/xfs_inode_fork.c
> index f6dcee919f59e..7e129ed2f345f 100644
> --- a/fs/xfs/libxfs/xfs_inode_fork.c
> +++ b/fs/xfs/libxfs/xfs_inode_fork.c
> @@ -674,34 +674,49 @@ xfs_ifork_init_cow(
>  }
>  
...
>  
>  /* Verify the inline contents of the attr fork of an inode. */
> -xfs_failaddr_t
> -xfs_ifork_verify_attr(
> +int
> +xfs_ifork_verify_local_attr(
>  	struct xfs_inode	*ip)
>  {
> -	/* There has to be an attr fork allocated if aformat is local. */
> -	if (ip->i_d.di_aformat != XFS_DINODE_FMT_LOCAL)
> -		return NULL;
> +	xfs_failaddr_t		fa;
> +
>  	if (!XFS_IFORK_PTR(ip, XFS_ATTR_FORK))
> -		return __this_address;
> -	return xfs_attr_shortform_verify(ip);
> +		fa = __this_address;
> +	else
> +		fa = xfs_attr_shortform_verify(ip);
> +
> +	if (fa) {
> +		xfs_inode_verifier_error(ip, -EFSCORRUPTED, "attr fork",
> +			ip->i_afp->if_u1.if_data, ip->i_afp->if_bytes, fa);
> +		return -EFSCORRUPTED;

This explicitly makes !ip->i_afp one of the handled corruption cases for
XFS_DINODE_FMT_LOCAL, but then attempts to access it anyways. Otherwise
seems Ok modulo the comments on the previous patch...
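
To illustrate the concern, here is a minimal standalone sketch of NULL-safe
report arguments, along the lines of the "ifp ? ... : NULL" logic the old
caller had. All demo_* names are hypothetical stand-ins, not the real XFS
structures or helpers.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct xfs_ifork. */
struct demo_fork {
	void	*data;
	size_t	bytes;
};

/* Hypothetical stand-in for struct xfs_inode. */
struct demo_inode {
	struct demo_fork	*afp;	/* NULL when no attr fork is allocated */
};

/* What a (hypothetical) corruption reporter would be handed. */
struct demo_report {
	const void	*buf;
	size_t		len;
};

/*
 * Pick the buffer to dump in a corruption report without dereferencing a
 * missing fork: a NULL fork yields an empty report instead of a crash.
 */
static struct demo_report
demo_attr_report_args(const struct demo_inode *ip)
{
	struct demo_report r = { NULL, 0 };

	if (ip->afp) {
		r.buf = ip->afp->data;
		r.len = ip->afp->bytes;
	}
	return r;
}
```

With something like this, the missing-fork case can still be reported as
corruption, just with no buffer contents attached.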

Brian

> +	}
> +
> +	return 0;
>  }
> diff --git a/fs/xfs/libxfs/xfs_inode_fork.h b/fs/xfs/libxfs/xfs_inode_fork.h
> index 3f84d33abd3b7..f46a8c1db5964 100644
> --- a/fs/xfs/libxfs/xfs_inode_fork.h
> +++ b/fs/xfs/libxfs/xfs_inode_fork.h
> @@ -176,7 +176,7 @@ extern struct kmem_zone	*xfs_ifork_zone;
>  
>  extern void xfs_ifork_init_cow(struct xfs_inode *ip);
>  
> -xfs_failaddr_t xfs_ifork_verify_data(struct xfs_inode *ip);
> -xfs_failaddr_t xfs_ifork_verify_attr(struct xfs_inode *ip);
> +int xfs_ifork_verify_local_data(struct xfs_inode *ip);
> +int xfs_ifork_verify_local_attr(struct xfs_inode *ip);
>  
>  #endif	/* __XFS_INODE_FORK_H__ */
> diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c
> index 93967278355de..2ec7789317133 100644
> --- a/fs/xfs/xfs_inode.c
> +++ b/fs/xfs/xfs_inode.c
> @@ -3766,25 +3766,12 @@ bool
>  xfs_inode_verify_forks(
>  	struct xfs_inode	*ip)
>  {
> -	struct xfs_ifork	*ifp;
> -	xfs_failaddr_t		fa;
> -
> -	fa = xfs_ifork_verify_data(ip);
> -	if (fa) {
> -		ifp = XFS_IFORK_PTR(ip, XFS_DATA_FORK);
> -		xfs_inode_verifier_error(ip, -EFSCORRUPTED, "data fork",
> -				ifp->if_u1.if_data, ifp->if_bytes, fa);
> +	if (ip->i_d.di_format == XFS_DINODE_FMT_LOCAL &&
> +	    xfs_ifork_verify_local_data(ip))
>  		return false;
> -	}
> -
> -	fa = xfs_ifork_verify_attr(ip);
> -	if (fa) {
> -		ifp = XFS_IFORK_PTR(ip, XFS_ATTR_FORK);
> -		xfs_inode_verifier_error(ip, -EFSCORRUPTED, "attr fork",
> -				ifp ? ifp->if_u1.if_data : NULL,
> -				ifp ? ifp->if_bytes : 0, fa);
> +	if (ip->i_d.di_aformat == XFS_DINODE_FMT_LOCAL &&
> +	    xfs_ifork_verify_local_attr(ip))
>  		return false;
> -	}
>  	return true;
>  }
>  
> -- 
> 2.26.2
> 


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH 11/12] xfs: remove the special COW fork handling in xfs_bmapi_read
  2020-05-01  8:14 ` [PATCH 11/12] xfs: remove the special COW fork handling in xfs_bmapi_read Christoph Hellwig
@ 2020-05-01 15:57   ` Brian Foster
  0 siblings, 0 replies; 41+ messages in thread
From: Brian Foster @ 2020-05-01 15:57 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: linux-xfs

On Fri, May 01, 2020 at 10:14:23AM +0200, Christoph Hellwig wrote:
> We don't call xfs_bmapi_read for the COW fork anymore, so remove the
> special casing.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---

Seems fine, though I wonder if we really need that DEBUG check just
for the alert message in the same branch. ISTM we could drop the ifdef
or the whole hunk.

Reviewed-by: Brian Foster <bfoster@redhat.com>

>  fs/xfs/libxfs/xfs_bmap.c | 13 +------------
>  1 file changed, 1 insertion(+), 12 deletions(-)
> 
> diff --git a/fs/xfs/libxfs/xfs_bmap.c b/fs/xfs/libxfs/xfs_bmap.c
> index fda13cd7add0e..76be1a18e2442 100644
> --- a/fs/xfs/libxfs/xfs_bmap.c
> +++ b/fs/xfs/libxfs/xfs_bmap.c
> @@ -3902,8 +3902,7 @@ xfs_bmapi_read(
>  	int			whichfork = xfs_bmapi_whichfork(flags);
>  
>  	ASSERT(*nmap >= 1);
> -	ASSERT(!(flags & ~(XFS_BMAPI_ATTRFORK|XFS_BMAPI_ENTIRE|
> -			   XFS_BMAPI_COWFORK)));
> +	ASSERT(!(flags & ~(XFS_BMAPI_ATTRFORK | XFS_BMAPI_ENTIRE)));
>  	ASSERT(xfs_isilocked(ip, XFS_ILOCK_SHARED|XFS_ILOCK_EXCL));
>  
>  	if (XFS_IS_CORRUPT(mp, !xfs_ifork_has_extents(ip, whichfork)) ||
> @@ -3918,16 +3917,6 @@ xfs_bmapi_read(
>  
>  	ifp = XFS_IFORK_PTR(ip, whichfork);
>  	if (!ifp) {
> -		/* No CoW fork?  Return a hole. */
> -		if (whichfork == XFS_COW_FORK) {
> -			mval->br_startoff = bno;
> -			mval->br_startblock = HOLESTARTBLOCK;
> -			mval->br_blockcount = len;
> -			mval->br_state = XFS_EXT_NORM;
> -			*nmap = 1;
> -			return 0;
> -		}
> -
>  		/*
>  		 * A missing attr ifork implies that the inode says we're in
>  		 * extents or btree format but failed to pass the inode fork
> -- 
> 2.26.2
> 


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH 12/12] xfs: remove the NULL fork handling in xfs_bmapi_read
  2020-05-01  8:14 ` [PATCH 12/12] xfs: remove the NULL " Christoph Hellwig
@ 2020-05-01 15:58   ` Brian Foster
  0 siblings, 0 replies; 41+ messages in thread
From: Brian Foster @ 2020-05-01 15:58 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: linux-xfs

On Fri, May 01, 2020 at 10:14:24AM +0200, Christoph Hellwig wrote:
> Now that we fully verify the inode forks before they are added to the
> inode cache, the crash reported in
> 
>   https://bugzilla.kernel.org/show_bug.cgi?id=204031
> 
> can't happen anymore, as we'll never let an inode that has inconsistent
> nextents counts vs the presence of an in-core attr fork leak into the
> inactivate code path.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>  fs/xfs/libxfs/xfs_bmap.c | 19 ++-----------------
>  1 file changed, 2 insertions(+), 17 deletions(-)
> 
> diff --git a/fs/xfs/libxfs/xfs_bmap.c b/fs/xfs/libxfs/xfs_bmap.c
> index 76be1a18e2442..4246f2fd5b144 100644
> --- a/fs/xfs/libxfs/xfs_bmap.c
> +++ b/fs/xfs/libxfs/xfs_bmap.c
> @@ -3891,7 +3891,8 @@ xfs_bmapi_read(
>  	int			flags)
>  {
>  	struct xfs_mount	*mp = ip->i_mount;
> -	struct xfs_ifork	*ifp;
> +	int			whichfork = xfs_bmapi_whichfork(flags);
> +	struct xfs_ifork	*ifp = XFS_IFORK_PTR(ip, whichfork);
>  	struct xfs_bmbt_irec	got;
>  	xfs_fileoff_t		obno;
>  	xfs_fileoff_t		end;
> @@ -3899,7 +3900,6 @@ xfs_bmapi_read(
>  	int			error;
>  	bool			eof = false;
>  	int			n = 0;
> -	int			whichfork = xfs_bmapi_whichfork(flags);
>  
>  	ASSERT(*nmap >= 1);
>  	ASSERT(!(flags & ~(XFS_BMAPI_ATTRFORK | XFS_BMAPI_ENTIRE)));
> @@ -3915,21 +3915,6 @@ xfs_bmapi_read(
>  
>  	XFS_STATS_INC(mp, xs_blk_mapr);
>  
> -	ifp = XFS_IFORK_PTR(ip, whichfork);
> -	if (!ifp) {
> -		/*
> -		 * A missing attr ifork implies that the inode says we're in
> -		 * extents or btree format but failed to pass the inode fork
> -		 * verifier while trying to load it.  Treat that as a file
> -		 * corruption too.
> -		 */
> -#ifdef DEBUG
> -		xfs_alert(mp, "%s: inode %llu missing fork %d",
> -				__func__, ip->i_ino, whichfork);
> -#endif /* DEBUG */
> -		return -EFSCORRUPTED;
> -	}
> -

Well that addresses my thought on the previous patch, but I don't see
the value in removing the check entirely. It might be safe for the inode
from disk path, but that doesn't preclude current or future runtime bugs
associated with xattr removal (i.e. fork removal) or inappropriate use
of XFS_BMAPI_ATTRFORK, for example. In fact, I think it makes sense for
any inappropriate use of xfs_bmapi_read() due to lack of the associated
fork to return an error rather than explode.
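
Roughly what I have in mind, as a standalone sketch rather than the actual
xfs_bmapi_read code. The demo_* names and the DEMO_EFSCORRUPTED value are
made up for illustration only.

```c
#include <stddef.h>

/* Hypothetical error code standing in for EFSCORRUPTED. */
#define DEMO_EFSCORRUPTED	117

enum demo_whichfork { DEMO_DATA_FORK, DEMO_ATTR_FORK };

/* Hypothetical stand-in for struct xfs_ifork. */
struct demo_fork {
	int	nextents;
};

/* Hypothetical stand-in for struct xfs_inode. */
struct demo_inode {
	struct demo_fork	df;	/* data fork, always present */
	struct demo_fork	*afp;	/* attr fork, may be NULL */
};

/* Return the requested fork, or NULL if the attr fork does not exist. */
static struct demo_fork *
demo_fork_ptr(struct demo_inode *ip, enum demo_whichfork which)
{
	return which == DEMO_ATTR_FORK ? ip->afp : &ip->df;
}

/*
 * Read-side mapping helper: a missing fork is reported to the caller as
 * an error instead of being dereferenced.
 */
static int
demo_bmapi_read(struct demo_inode *ip, enum demo_whichfork which,
		int *nextents)
{
	struct demo_fork	*ifp = demo_fork_ptr(ip, which);

	if (!ifp)
		return -DEMO_EFSCORRUPTED;
	*nextents = ifp->nextents;
	return 0;
}
```

The check costs one branch and turns a misuse of the attr fork flag into an
error return rather than a NULL dereference.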

Brian

>  	if (!(ifp->if_flags & XFS_IFEXTENTS)) {
>  		error = xfs_iread_extents(NULL, ip, whichfork);
>  		if (error)
> -- 
> 2.26.2
> 


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH 08/12] xfs: remove xfs_ifork_ops
  2020-05-01 15:56   ` Brian Foster
@ 2020-05-01 16:08     ` Darrick J. Wong
  2020-05-01 16:38       ` Christoph Hellwig
  0 siblings, 1 reply; 41+ messages in thread
From: Darrick J. Wong @ 2020-05-01 16:08 UTC (permalink / raw)
  To: Brian Foster; +Cc: Christoph Hellwig, linux-xfs

On Fri, May 01, 2020 at 11:56:49AM -0400, Brian Foster wrote:
> On Fri, May 01, 2020 at 10:14:20AM +0200, Christoph Hellwig wrote:
> > xfs_ifork_ops add up to two indirect calls per inode read and flush,
> > despite just having a single instance in the kernel.  In xfsprogs
> > phase6 in xfs_repair overrides the verify_dir method to deal with inodes
> > that do not have a valid parent.  Instead of the costly indirection just
> > life the repair code into xfs_dir2_sf.c under a condition that ensures
> > it is compiled as part of a kernel build, but instantly eliminated as
> > it is unreachable.
> > 
> > Signed-off-by: Christoph Hellwig <hch@lst.de>
> > ---
> >  fs/xfs/libxfs/xfs_dir2_sf.c    | 64 ++++++++++++++++++++++++++++++++--
> >  fs/xfs/libxfs/xfs_inode_fork.c | 19 +++-------
> >  fs/xfs/libxfs/xfs_inode_fork.h | 15 ++------
> >  fs/xfs/xfs_inode.c             |  4 +--
> >  4 files changed, 71 insertions(+), 31 deletions(-)
> > 
> > diff --git a/fs/xfs/libxfs/xfs_dir2_sf.c b/fs/xfs/libxfs/xfs_dir2_sf.c
> > index 7b7f6fb2ea3b2..1f6c30b68917c 100644
> > --- a/fs/xfs/libxfs/xfs_dir2_sf.c
> > +++ b/fs/xfs/libxfs/xfs_dir2_sf.c
> ...
> > @@ -804,6 +804,66 @@ xfs_dir2_sf_verify(
> >  	return NULL;
> >  }
> >  
> > +/*
> > + * When we're checking directory inodes, we're allowed to set a directory's
> > + * dotdot entry to zero to signal that the parent needs to be reconnected
> > + * during xfs_repair phase 6.  If we're handling a shortform directory the ifork
> > + * verifiers will fail, so temporarily patch out this canary so that we can
> > + * verify the rest of the fork and move on to fixing the dir.
> > + */
> > +static xfs_failaddr_t
> > +xfs_dir2_sf_verify_dir_check(
> > +	struct xfs_inode		*ip)
> > +{
> > +	struct xfs_mount		*mp = ip->i_mount;
> > +	struct xfs_ifork		*ifp = XFS_IFORK_PTR(ip, XFS_DATA_FORK);
> > +	struct xfs_dir2_sf_hdr		*sfp =
> > +		(struct xfs_dir2_sf_hdr *)ifp->if_u1.if_data;
> > +	int				size = ifp->if_bytes;
> > +	bool				parent_bypass = false;
> > +	xfs_ino_t			old_parent;
> > +	xfs_failaddr_t			fa;
> > +
> > +	/*
> > +	 * If this is a shortform directory, phase4 in xfs_repair may have set
> > +	 * the parent inode to zero to indicate that it must be fixed.
> > +	 * Temporarily set a valid parent so that the directory verifier will
> > +	 * pass.
> > +	 */
> > +	if (size > offsetof(struct xfs_dir2_sf_hdr, parent) &&
> > +	    size >= xfs_dir2_sf_hdr_size(sfp->i8count)) {
> > +		old_parent = xfs_dir2_sf_get_parent_ino(sfp);
> > +		if (!old_parent) {
> > +			xfs_dir2_sf_put_parent_ino(sfp, mp->m_sb.sb_rootino);
> > +			parent_bypass = true;
> > +		}
> > +	}
> > +
> > +	fa = __xfs_dir2_sf_verify(ip);
> > +
> > +	/* Put it back. */
> > +	if (parent_bypass)
> > +		xfs_dir2_sf_put_parent_ino(sfp, old_parent);
> > +	return fa;
> > +}
> 
> I'm not sure the cleanup is worth the kludge of including repair code in
> the kernel like this. It might be better to reduce or replace ifork_ops
> to a single directory function pointer until there's a reason for this
> to become common. I dunno, maybe others have thoughts...

One of the online repair gaps I haven't figured out how to close yet is
what to do when there's a short format directory that fails validation
(such that iget fails).  The inode repairer gets stuck with the job of
fixing the sf dir, but the (future) directory repair code will have all
the expertise in fixing directories.  Regrettably, it also requires a
working xfs_inode.

So I could just set the sf parent to some obviously garbage value (like
repair does) to make the verifiers pass and then trip the directory
repair, and then this hunk would be useful to have in the kernel.  OTOH
that means more special case flags and other junk, just to end up with
this kludge that sucks even for xfs_repair.

OTOH I have spent quite a bit of time trying to figure out how to kill
that stupid kludge of repair's, and come up emptyhanded, so <shrug>?

--D

> Brian
> 
> > +
> > +/*
> > + * Allow xfs_repair to enable the parent bypass mode.  For now this is entirely
> > + * unused in the kernel, but might come in useful for online repair eventually.
> > + */
> > +#ifndef xfs_inode_parent_bypass
> > +#define xfs_inode_parent_bypass(ip)	0
> > +#endif
> > +
> > +xfs_failaddr_t
> > +xfs_dir2_sf_verify(
> > +	struct xfs_inode		*ip)
> > +{
> > +	if (xfs_inode_parent_bypass(ip))
> > +		return xfs_dir2_sf_verify_dir_check(ip);
> > +	return __xfs_dir2_sf_verify(ip);
> > +}
> > +
> >  /*
> >   * Create a new (shortform) directory.
> >   */
> > diff --git a/fs/xfs/libxfs/xfs_inode_fork.c b/fs/xfs/libxfs/xfs_inode_fork.c
> > index f30d43364aa92..f6dcee919f59e 100644
> > --- a/fs/xfs/libxfs/xfs_inode_fork.c
> > +++ b/fs/xfs/libxfs/xfs_inode_fork.c
> > @@ -673,18 +673,10 @@ xfs_ifork_init_cow(
> >  	ip->i_cnextents = 0;
> >  }
> >  
> > -/* Default fork content verifiers. */
> > -struct xfs_ifork_ops xfs_default_ifork_ops = {
> > -	.verify_attr	= xfs_attr_shortform_verify,
> > -	.verify_dir	= xfs_dir2_sf_verify,
> > -	.verify_symlink	= xfs_symlink_shortform_verify,
> > -};
> > -
> >  /* Verify the inline contents of the data fork of an inode. */
> >  xfs_failaddr_t
> >  xfs_ifork_verify_data(
> > -	struct xfs_inode	*ip,
> > -	struct xfs_ifork_ops	*ops)
> > +	struct xfs_inode	*ip)
> >  {
> >  	/* Non-local data fork, we're done. */
> >  	if (ip->i_d.di_format != XFS_DINODE_FMT_LOCAL)
> > @@ -693,9 +685,9 @@ xfs_ifork_verify_data(
> >  	/* Check the inline data fork if there is one. */
> >  	switch (VFS_I(ip)->i_mode & S_IFMT) {
> >  	case S_IFDIR:
> > -		return ops->verify_dir(ip);
> > +		return xfs_dir2_sf_verify(ip);
> >  	case S_IFLNK:
> > -		return ops->verify_symlink(ip);
> > +		return xfs_symlink_shortform_verify(ip);
> >  	default:
> >  		return NULL;
> >  	}
> > @@ -704,13 +696,12 @@ xfs_ifork_verify_data(
> >  /* Verify the inline contents of the attr fork of an inode. */
> >  xfs_failaddr_t
> >  xfs_ifork_verify_attr(
> > -	struct xfs_inode	*ip,
> > -	struct xfs_ifork_ops	*ops)
> > +	struct xfs_inode	*ip)
> >  {
> >  	/* There has to be an attr fork allocated if aformat is local. */
> >  	if (ip->i_d.di_aformat != XFS_DINODE_FMT_LOCAL)
> >  		return NULL;
> >  	if (!XFS_IFORK_PTR(ip, XFS_ATTR_FORK))
> >  		return __this_address;
> > -	return ops->verify_attr(ip);
> > +	return xfs_attr_shortform_verify(ip);
> >  }
> > diff --git a/fs/xfs/libxfs/xfs_inode_fork.h b/fs/xfs/libxfs/xfs_inode_fork.h
> > index 8487b0c88a75e..3f84d33abd3b7 100644
> > --- a/fs/xfs/libxfs/xfs_inode_fork.h
> > +++ b/fs/xfs/libxfs/xfs_inode_fork.h
> > @@ -176,18 +176,7 @@ extern struct kmem_zone	*xfs_ifork_zone;
> >  
> >  extern void xfs_ifork_init_cow(struct xfs_inode *ip);
> >  
> > -typedef xfs_failaddr_t (*xfs_ifork_verifier_t)(struct xfs_inode *);
> > -
> > -struct xfs_ifork_ops {
> > -	xfs_ifork_verifier_t	verify_symlink;
> > -	xfs_ifork_verifier_t	verify_dir;
> > -	xfs_ifork_verifier_t	verify_attr;
> > -};
> > -extern struct xfs_ifork_ops	xfs_default_ifork_ops;
> > -
> > -xfs_failaddr_t xfs_ifork_verify_data(struct xfs_inode *ip,
> > -		struct xfs_ifork_ops *ops);
> > -xfs_failaddr_t xfs_ifork_verify_attr(struct xfs_inode *ip,
> > -		struct xfs_ifork_ops *ops);
> > +xfs_failaddr_t xfs_ifork_verify_data(struct xfs_inode *ip);
> > +xfs_failaddr_t xfs_ifork_verify_attr(struct xfs_inode *ip);
> >  
> >  #endif	/* __XFS_INODE_FORK_H__ */
> > diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c
> > index d1772786af29d..93967278355de 100644
> > --- a/fs/xfs/xfs_inode.c
> > +++ b/fs/xfs/xfs_inode.c
> > @@ -3769,7 +3769,7 @@ xfs_inode_verify_forks(
> >  	struct xfs_ifork	*ifp;
> >  	xfs_failaddr_t		fa;
> >  
> > -	fa = xfs_ifork_verify_data(ip, &xfs_default_ifork_ops);
> > +	fa = xfs_ifork_verify_data(ip);
> >  	if (fa) {
> >  		ifp = XFS_IFORK_PTR(ip, XFS_DATA_FORK);
> >  		xfs_inode_verifier_error(ip, -EFSCORRUPTED, "data fork",
> > @@ -3777,7 +3777,7 @@ xfs_inode_verify_forks(
> >  		return false;
> >  	}
> >  
> > -	fa = xfs_ifork_verify_attr(ip, &xfs_default_ifork_ops);
> > +	fa = xfs_ifork_verify_attr(ip);
> >  	if (fa) {
> >  		ifp = XFS_IFORK_PTR(ip, XFS_ATTR_FORK);
> >  		xfs_inode_verifier_error(ip, -EFSCORRUPTED, "attr fork",
> > -- 
> > 2.26.2
> > 
> 

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH 08/12] xfs: remove xfs_ifork_ops
  2020-05-01 16:08     ` Darrick J. Wong
@ 2020-05-01 16:38       ` Christoph Hellwig
  2020-05-01 16:50         ` Christoph Hellwig
  0 siblings, 1 reply; 41+ messages in thread
From: Christoph Hellwig @ 2020-05-01 16:38 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: Brian Foster, Christoph Hellwig, linux-xfs

On Fri, May 01, 2020 at 09:08:09AM -0700, Darrick J. Wong wrote:
> > I'm not sure the cleanup is worth the kludge of including repair code in
> > the kernel like this. It might be better to reduce or replace ifork_ops
> > to a single directory function pointer until there's a reason for this
> > to become common. I dunno, maybe others have thoughts...

The whole point is trying to avoid calling function pointers and
keeping that cruft around.

> 
> One of the online repair gaps I haven't figured out how to close yet is
> what to do when there's a short format directory that fails validation
> (such that iget fails).  The inode repairer gets stuck with the job of
> fixing the sf dir, but the (future) directory repair code will have all
> the expertise in fixing directories.  Regrettably, it also requires a
> working xfs_inode.
> 
> So I could just set the sf parent to some obviously garbage value (like
> repair does) to make the verifiers pass and then trip the directory
> repair, and then this hunk would be useful to have in the kernel.  OTOH
> that means more special case flags and other junk, just to end up with
> this kludge that sucks even for xfs_repair.

That being said my approach here was a little too dumb.  Once we are
all in the same code base we can stop the stupid patching of the
parent and just handle the case directly.  Something like this
incremental diff on top of the sent out version (not actually tested).

Total diffstat with the original patch is:

 4 files changed, 37 insertions(+), 35 deletions(-)

and this should also help with online repair while killing a horrible
kludge.


diff --git a/fs/xfs/libxfs/xfs_dir2_sf.c b/fs/xfs/libxfs/xfs_dir2_sf.c
index 1f6c30b68917c..b4195fafe2172 100644
--- a/fs/xfs/libxfs/xfs_dir2_sf.c
+++ b/fs/xfs/libxfs/xfs_dir2_sf.c
@@ -704,9 +704,17 @@ xfs_dir2_sf_check(
 }
 #endif	/* DEBUG */
 
+/*
+ * Allow xfs_repair to enable the parent bypass mode.  For now this is entirely
+ * unused in the kernel, but might come in useful for online repair eventually.
+ */
+#ifndef xfs_inode_parent_bypass
+#define xfs_inode_parent_bypass(ip)	0
+#endif
+
 /* Verify the consistency of an inline directory. */
-static xfs_failaddr_t
-__xfs_dir2_sf_verify(
+xfs_failaddr_t
+xfs_dir2_sf_verify(
 	struct xfs_inode		*ip)
 {
 	struct xfs_mount		*mp = ip->i_mount;
@@ -738,12 +746,26 @@ __xfs_dir2_sf_verify(
 
 	endp = (char *)sfp + size;
 
-	/* Check .. entry */
-	ino = xfs_dir2_sf_get_parent_ino(sfp);
-	i8count = ino > XFS_DIR2_MAX_SHORT_INUM;
-	error = xfs_dir_ino_validate(mp, ino);
-	if (error)
-		return __this_address;
+	/*
+	 * Check the .. entry.
+	 *
+	 * If we are running a repair, phase4 may have set the parent inode to
+	 * zero to indicate that it must be fixed.  Skip validating the parent
+	 * in that case.
+	 */
+	if (likely(!xfs_inode_parent_bypass(ip))) {
+		ino = xfs_dir2_sf_get_parent_ino(sfp);
+		i8count = ino > XFS_DIR2_MAX_SHORT_INUM;
+		error = xfs_dir_ino_validate(mp, ino);
+		if (error)
+			return __this_address;
+	} else {
+		/*
+		 * Ensure we account the missing parent as in the right format.
+		 */
+		if (sfp->i8count)
+			i8count++;
+	}
 	offset = mp->m_dir_geo->data_first_offset;
 
 	/* Check all reported entries */
@@ -804,66 +826,6 @@ __xfs_dir2_sf_verify(
 	return NULL;
 }
 
-/*
- * When we're checking directory inodes, we're allowed to set a directory's
- * dotdot entry to zero to signal that the parent needs to be reconnected
- * during xfs_repair phase 6.  If we're handling a shortform directory the ifork
- * verifiers will fail, so temporarily patch out this canary so that we can
- * verify the rest of the fork and move on to fixing the dir.
- */
-static xfs_failaddr_t
-xfs_dir2_sf_verify_dir_check(
-	struct xfs_inode		*ip)
-{
-	struct xfs_mount		*mp = ip->i_mount;
-	struct xfs_ifork		*ifp = XFS_IFORK_PTR(ip, XFS_DATA_FORK);
-	struct xfs_dir2_sf_hdr		*sfp =
-		(struct xfs_dir2_sf_hdr *)ifp->if_u1.if_data;
-	int				size = ifp->if_bytes;
-	bool				parent_bypass = false;
-	xfs_ino_t			old_parent;
-	xfs_failaddr_t			fa;
-
-	/*
-	 * If this is a shortform directory, phase4 in xfs_repair may have set
-	 * the parent inode to zero to indicate that it must be fixed.
-	 * Temporarily set a valid parent so that the directory verifier will
-	 * pass.
-	 */
-	if (size > offsetof(struct xfs_dir2_sf_hdr, parent) &&
-	    size >= xfs_dir2_sf_hdr_size(sfp->i8count)) {
-		old_parent = xfs_dir2_sf_get_parent_ino(sfp);
-		if (!old_parent) {
-			xfs_dir2_sf_put_parent_ino(sfp, mp->m_sb.sb_rootino);
-			parent_bypass = true;
-		}
-	}
-
-	fa = __xfs_dir2_sf_verify(ip);
-
-	/* Put it back. */
-	if (parent_bypass)
-		xfs_dir2_sf_put_parent_ino(sfp, old_parent);
-	return fa;
-}
-
-/*
- * Allow xfs_repair to enable the parent bypass mode.  For now this is entirely
- * unused in the kernel, but might come in useful for online repair eventually.
- */
-#ifndef xfs_inode_parent_bypass
-#define xfs_inode_parent_bypass(ip)	0
-#endif
-
-xfs_failaddr_t
-xfs_dir2_sf_verify(
-	struct xfs_inode		*ip)
-{
-	if (xfs_inode_parent_bypass(ip))
-		return xfs_dir2_sf_verify_dir_check(ip);
-	return __xfs_dir2_sf_verify(ip);
-}
-
 /*
  * Create a new (shortform) directory.
  */

^ permalink raw reply related	[flat|nested] 41+ messages in thread

* Re: [PATCH 09/12] xfs: refactor xfs_inode_verify_forks
  2020-05-01 15:57   ` Brian Foster
@ 2020-05-01 16:40     ` Christoph Hellwig
  2020-05-01 17:02       ` Brian Foster
  0 siblings, 1 reply; 41+ messages in thread
From: Christoph Hellwig @ 2020-05-01 16:40 UTC (permalink / raw)
  To: Brian Foster; +Cc: Christoph Hellwig, linux-xfs

On Fri, May 01, 2020 at 11:57:24AM -0400, Brian Foster wrote:
> >  	if (!XFS_IFORK_PTR(ip, XFS_ATTR_FORK))
> > -		return __this_address;
> > -	return xfs_attr_shortform_verify(ip);
> > +		fa = __this_address;
> > +	else
> > +		fa = xfs_attr_shortform_verify(ip);
> > +
> > +	if (fa) {
> > +		xfs_inode_verifier_error(ip, -EFSCORRUPTED, "attr fork",
> > +			ip->i_afp->if_u1.if_data, ip->i_afp->if_bytes, fa);
> > +		return -EFSCORRUPTED;
> 
> This explicitly makes !ip->i_afp one of the handled corruption cases for
> XFS_DINODE_FMT_LOCAL, but then attempts to access it anyways. Otherwise
> seems Ok modulo the comments on the previous patch...

No, it keeps the existing bogus behavior, just making the bug more
obvious by moving the bits closer together :(

That being said this is getting fixed later in the series.

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH 08/12] xfs: remove xfs_ifork_ops
  2020-05-01 16:38       ` Christoph Hellwig
@ 2020-05-01 16:50         ` Christoph Hellwig
  2020-05-01 18:23           ` Brian Foster
  0 siblings, 1 reply; 41+ messages in thread
From: Christoph Hellwig @ 2020-05-01 16:50 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: Brian Foster, Christoph Hellwig, linux-xfs

On Fri, May 01, 2020 at 06:38:09PM +0200, Christoph Hellwig wrote:
> That being said my approach here was a little too dumb.  Once we are
> all in the same code base we can stop the stupid patching of the
> parent and just handle the case directly.  Something like this
> incremental diff on top of the sent out version (not actually tested).
> 
> Total diffstat with the original patch is:
> 
>  4 files changed, 37 insertions(+), 35 deletions(-)
> 
> and this should also help with online repair while killing a horrible
> kludge.

Btw, I wonder if for repair / online repair just skipping the verifiers
entirely would make more sense.  But I think we can go there
incrementally and just keep the existing repair behavior for now.

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH 09/12] xfs: refactor xfs_inode_verify_forks
  2020-05-01 16:40     ` Christoph Hellwig
@ 2020-05-01 17:02       ` Brian Foster
  2020-05-01 17:08         ` Christoph Hellwig
  0 siblings, 1 reply; 41+ messages in thread
From: Brian Foster @ 2020-05-01 17:02 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: linux-xfs

On Fri, May 01, 2020 at 06:40:09PM +0200, Christoph Hellwig wrote:
> On Fri, May 01, 2020 at 11:57:24AM -0400, Brian Foster wrote:
> > >  	if (!XFS_IFORK_PTR(ip, XFS_ATTR_FORK))
> > > -		return __this_address;
> > > -	return xfs_attr_shortform_verify(ip);
> > > +		fa = __this_address;
> > > +	else
> > > +		fa = xfs_attr_shortform_verify(ip);
> > > +
> > > +	if (fa) {
> > > +		xfs_inode_verifier_error(ip, -EFSCORRUPTED, "attr fork",
> > > +			ip->i_afp->if_u1.if_data, ip->i_afp->if_bytes, fa);
> > > +		return -EFSCORRUPTED;
> > 
> > This explicitly makes !ip->i_afp one of the handled corruption cases for
> > XFS_DINODE_FMT_LOCAL, but then attempts to access it anyways. Otherwise
> > seems Ok modulo the comments on the previous patch...
> 
> No, it keeps the existing bogus behavior, just making the bug more
> obvious by moving the bits closer together :(
> 

The associated code replaced in this patch checks the attr fork pointer:

-               ifp = XFS_IFORK_PTR(ip, XFS_ATTR_FORK);
-               xfs_inode_verifier_error(ip, -EFSCORRUPTED, "attr fork",
-                               ifp ? ifp->if_u1.if_data : NULL,
-                               ifp ? ifp->if_bytes : 0, fa);

Brian

> That being said this is getting fixed later in the series.
> 


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH 09/12] xfs: refactor xfs_inode_verify_forks
  2020-05-01 17:02       ` Brian Foster
@ 2020-05-01 17:08         ` Christoph Hellwig
  0 siblings, 0 replies; 41+ messages in thread
From: Christoph Hellwig @ 2020-05-01 17:08 UTC (permalink / raw)
  To: Brian Foster; +Cc: Christoph Hellwig, linux-xfs

On Fri, May 01, 2020 at 01:02:27PM -0400, Brian Foster wrote:
> The associated code replaced in this patch checks the attr fork pointer:
> 
> -               ifp = XFS_IFORK_PTR(ip, XFS_ATTR_FORK);
> -               xfs_inode_verifier_error(ip, -EFSCORRUPTED, "attr fork",
> -                               ifp ? ifp->if_u1.if_data : NULL,
> -                               ifp ? ifp->if_bytes : 0, fa);

Oh, true.  I'll fix this up to keep bisectability.
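
For reference, the guard Brian points at can be sketched outside the kernel as below. This is only an illustration under stated assumptions: struct demo_ifork, struct demo_inode, struct demo_report and demo_attr_fork_report() are hypothetical stand-ins, not the real xfs_ifork / xfs_inode definitions. The point is that the buffer and length handed to the error report have to be derived without dereferencing a fork pointer that may be NULL.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures, not the real layout. */
struct demo_ifork {
	void		*if_data;	/* inline fork data */
	size_t		if_bytes;	/* size of the inline data */
};

struct demo_inode {
	struct demo_ifork	*i_afp;	/* NULL when there is no attr fork */
};

/* What would be handed to the corruption report. */
struct demo_report {
	const void	*buf;
	size_t		len;
};

/*
 * Build the buffer/length pair for the report.  A missing attr fork is
 * itself one of the corruption cases being reported, so it must not be
 * dereferenced here.
 */
static struct demo_report
demo_attr_fork_report(
	const struct demo_inode	*ip)
{
	struct demo_report	r = { NULL, 0 };

	if (ip->i_afp) {
		r.buf = ip->i_afp->if_data;
		r.len = ip->i_afp->if_bytes;
	}
	return r;
}
```

With a NULL i_afp the helper yields an empty report instead of faulting, which matches the ifp ? ... : ... form in the code being replaced.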

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH 08/12] xfs: remove xfs_ifork_ops
  2020-05-01 16:50         ` Christoph Hellwig
@ 2020-05-01 18:23           ` Brian Foster
  2020-05-07 12:34             ` Christoph Hellwig
  0 siblings, 1 reply; 41+ messages in thread
From: Brian Foster @ 2020-05-01 18:23 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Darrick J. Wong, linux-xfs

On Fri, May 01, 2020 at 06:50:17PM +0200, Christoph Hellwig wrote:
> On Fri, May 01, 2020 at 06:38:09PM +0200, Christoph Hellwig wrote:
> > That being said my approach here was a little too dumb.  Once we are
> > all in the same code base we can stop the stupid patching of the
> > parent and just handle the case directly.  Something like this
> > incremental diff on top of the sent out version (not actually tested).
> > 
> > Total diffstat with the original patch is:
> > 
> >  4 files changed, 37 insertions(+), 35 deletions(-)
> > 
> > and this should also help with online repair while killing a horrible
> > kludge.
> 
> Btw, I wonder if for repair / online repair just skipping the verifiers
> entirely would make more sense.  But I think we can go there
> incrementally and just keep the existing repair behavior for now.
> 

Can we use another dummy parent inode value in xfs_repair? It looks to
me that we set it to zero in phase 4 if it fails verification and set
the parent to NULLFSINO (i.e. unknown) in repair's in-core tracking.
Phase 6 walks the directory entries and explicitly sets the parent inode
number of entries with an unknown parent (according to the in-core
tracking). IOW, I don't see where we actually rely on the directory
header having a parent inode of zero outside of detecting it in the
custom verifier. If that's the only functional purpose, I wonder if we
could do something like set the bogus parent field of a sf dir to the
root inode or to itself, that way the default verifier wouldn't trip
over it..

Brian


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH 03/12] xfs: split xfs_iformat_fork
  2020-05-01 13:34   ` Brian Foster
@ 2020-05-07 12:27     ` Christoph Hellwig
  2020-05-07 13:40       ` Brian Foster
  0 siblings, 1 reply; 41+ messages in thread
From: Christoph Hellwig @ 2020-05-07 12:27 UTC (permalink / raw)
  To: Brian Foster; +Cc: Christoph Hellwig, linux-xfs

On Fri, May 01, 2020 at 09:34:31AM -0400, Brian Foster wrote:
> > +	default:
> > +		xfs_inode_verifier_error(ip, -EFSCORRUPTED, __func__, dip,
> > +				sizeof(*dip), __this_address);
> > +		return -EFSCORRUPTED;
> > +	}
> 
> Can we fix this function up to use an error variable and return error at
> the end like xfs_iformat_attr_fork() does? Otherwise nice cleanup..

What would the benefit of a local variable be here?  It just adds a
little extra code for no real gain.

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH 08/12] xfs: remove xfs_ifork_ops
  2020-05-01 18:23           ` Brian Foster
@ 2020-05-07 12:34             ` Christoph Hellwig
  2020-05-07 13:43               ` Brian Foster
  0 siblings, 1 reply; 41+ messages in thread
From: Christoph Hellwig @ 2020-05-07 12:34 UTC (permalink / raw)
  To: Brian Foster; +Cc: Christoph Hellwig, Darrick J. Wong, linux-xfs

On Fri, May 01, 2020 at 02:23:16PM -0400, Brian Foster wrote:
> Can we use another dummy parent inode value in xfs_repair? It looks to
> me that we set it to zero in phase 4 if it fails verification and set
> the parent to NULLFSINO (i.e. unknown) in repair's in-core tracking.
> Phase 6 walks the directory entries and explicitly sets the parent inode
> number of entries with an unknown parent (according to the in-core
> tracking). IOW, I don't see where we actually rely on the directory
> header having a parent inode of zero outside of detecting it in the
> custom verifier. If that's the only functional purpose, I wonder if we
> could do something like set the bogus parent field of a sf dir to the
> root inode or to itself, that way the default verifier wouldn't trip
> over it..

I don't think we need a dummy parent at all - we can just skip the
parent validation entirely, which is what my incremental patch does.

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH 03/12] xfs: split xfs_iformat_fork
  2020-05-07 12:27     ` Christoph Hellwig
@ 2020-05-07 13:40       ` Brian Foster
  2020-05-07 13:42         ` Christoph Hellwig
  0 siblings, 1 reply; 41+ messages in thread
From: Brian Foster @ 2020-05-07 13:40 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: linux-xfs

On Thu, May 07, 2020 at 02:27:18PM +0200, Christoph Hellwig wrote:
> On Fri, May 01, 2020 at 09:34:31AM -0400, Brian Foster wrote:
> > > +	default:
> > > +		xfs_inode_verifier_error(ip, -EFSCORRUPTED, __func__, dip,
> > > +				sizeof(*dip), __this_address);
> > > +		return -EFSCORRUPTED;
> > > +	}
> > 
> > Can we fix this function up to use an error variable and return error at
> > the end like xfs_iformat_attr_fork() does? Otherwise nice cleanup..
> 
> What would the benefit of a local variable be here?  It just adds a
> little extra code for no real gain.
> 

It looks like the variable is already defined; it's just not used
consistently. The only extra code is the break statements in the switch and
a return statement at the end of the function, which currently looks odd
without it IMO.

Brian


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH 03/12] xfs: split xfs_iformat_fork
  2020-05-07 13:40       ` Brian Foster
@ 2020-05-07 13:42         ` Christoph Hellwig
  0 siblings, 0 replies; 41+ messages in thread
From: Christoph Hellwig @ 2020-05-07 13:42 UTC (permalink / raw)
  To: Brian Foster; +Cc: Christoph Hellwig, linux-xfs

On Thu, May 07, 2020 at 09:40:22AM -0400, Brian Foster wrote:
> On Thu, May 07, 2020 at 02:27:18PM +0200, Christoph Hellwig wrote:
> > On Fri, May 01, 2020 at 09:34:31AM -0400, Brian Foster wrote:
> > > > +	default:
> > > > +		xfs_inode_verifier_error(ip, -EFSCORRUPTED, __func__, dip,
> > > > +				sizeof(*dip), __this_address);
> > > > +		return -EFSCORRUPTED;
> > > > +	}
> > > 
> > > Can we fix this function up to use an error variable and return error at
> > > > the end like xfs_iformat_attr_fork() does? Otherwise nice cleanup..
> > 
> > What would the benefit of a local variable be here?  It just adds a
> > little extra code for no real gain.
> > 
> 
> It looks like the variable is already defined; it's just not used
> consistently. The only extra code is the break statements in the switch and
> a return statement at the end of the function, which currently looks odd
> without it IMO.

As of this patch there is no local error variable.  Later on it gets
added, but only used in a single place for the fork verifier.  I find
functions that basically are a big switch statement and return from
each case pretty normal.
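
To make the two styles under discussion concrete, here is a minimal standalone sketch. Everything in it is assumed for illustration: enum demo_fmt, DEMO_EFSCORRUPTED and both functions are hypothetical and are not the body of the function in the patch. The first variant returns from every case, the second routes all cases through an error variable and a single return.

```c
#include <assert.h>

/* Illustrative value only; the kernel defines EFSCORRUPTED itself. */
#define DEMO_EFSCORRUPTED	117

/* Hypothetical fork formats; DEMO_FMT_BOGUS models a corrupt value. */
enum demo_fmt {
	DEMO_FMT_LOCAL,
	DEMO_FMT_EXTENTS,
	DEMO_FMT_BTREE,
	DEMO_FMT_BOGUS,
};

/* Style 1: a big switch that returns directly from each case. */
static int
demo_format_return(
	enum demo_fmt	fmt)
{
	switch (fmt) {
	case DEMO_FMT_LOCAL:
	case DEMO_FMT_EXTENTS:
	case DEMO_FMT_BTREE:
		return 0;
	default:
		return -DEMO_EFSCORRUPTED;
	}
}

/* Style 2: set an error variable, break, and return once at the end. */
static int
demo_format_single_exit(
	enum demo_fmt	fmt)
{
	int		error = 0;

	switch (fmt) {
	case DEMO_FMT_LOCAL:
	case DEMO_FMT_EXTENTS:
	case DEMO_FMT_BTREE:
		break;
	default:
		error = -DEMO_EFSCORRUPTED;
		break;
	}
	return error;
}
```

Both variants behave identically for every input, so the choice is purely stylistic. The single-exit form mostly pays off once there is common cleanup to run before returning.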

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH 08/12] xfs: remove xfs_ifork_ops
  2020-05-07 12:34             ` Christoph Hellwig
@ 2020-05-07 13:43               ` Brian Foster
  2020-05-07 16:28                 ` Brian Foster
  0 siblings, 1 reply; 41+ messages in thread
From: Brian Foster @ 2020-05-07 13:43 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Darrick J. Wong, linux-xfs

On Thu, May 07, 2020 at 02:34:11PM +0200, Christoph Hellwig wrote:
> On Fri, May 01, 2020 at 02:23:16PM -0400, Brian Foster wrote:
> > Can we use another dummy parent inode value in xfs_repair? It looks to
> > me that we set it to zero in phase 4 if it fails verification and set
> > the parent to NULLFSINO (i.e. unknown) in repair's in-core tracking.
> > Phase 6 walks the directory entries and explicitly sets the parent inode
> > number of entries with an unknown parent (according to the in-core
> > tracking). IOW, I don't see where we actually rely on the directory
> > header having a parent inode of zero outside of detecting it in the
> > custom verifier. If that's the only functional purpose, I wonder if we
> > could do something like set the bogus parent field of a sf dir to the
> > root inode or to itself, that way the default verifier wouldn't trip
> > over it..
> 
> I don't think we need a dummy parent at all - we can just skip the
> parent validation entirely, which is what my incremental patch does.
> 

xfs_repair already skips the parent validation; this patch just
refactors it. What I was considering above is whether repair uses the
current dummy value of zero for any functional reason. If not, it kind
of looks like the earlier phase of repair checks the parent, sees that
it would fail a verifier, replaces it with zero (which would also fail
the verifier) and then eventually replaces zero with a valid parent or
ditches the entry in phase 6. If we placed a temporary parent value in
the early phase that wouldn't explicitly fail a verifier by being an
invalid inode number (instead of using 0 to notify the verifier to skip
the validation), then we wouldn't need to skip the parent validation in
phase 6 when we look up the inode again.

Brian


^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH 08/12] xfs: remove xfs_ifork_ops
  2020-05-07 13:43               ` Brian Foster
@ 2020-05-07 16:28                 ` Brian Foster
  2020-05-07 17:18                   ` Christoph Hellwig
  0 siblings, 1 reply; 41+ messages in thread
From: Brian Foster @ 2020-05-07 16:28 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Darrick J. Wong, linux-xfs

On Thu, May 07, 2020 at 09:43:55AM -0400, Brian Foster wrote:
> On Thu, May 07, 2020 at 02:34:11PM +0200, Christoph Hellwig wrote:
> > On Fri, May 01, 2020 at 02:23:16PM -0400, Brian Foster wrote:
> > > Can we use another dummy parent inode value in xfs_repair? It looks to
> > > me that we set it to zero in phase 4 if it fails verification and set
> > > the parent to NULLFSINO (i.e. unknown) in repair's in-core tracking.
> > > Phase 6 walks the directory entries and explicitly sets the parent inode
> > > number of entries with an unknown parent (according to the in-core
> > > tracking). IOW, I don't see where we actually rely on the directory
> > > header having a parent inode of zero outside of detecting it in the
> > > custom verifier. If that's the only functional purpose, I wonder if we
> > > could do something like set the bogus parent field of a sf dir to the
> > > root inode or to itself, that way the default verifier wouldn't trip
> > > over it..
> > 
> > I don't think we need a dummy parent at all - we can just skip the
> > parent validation entirely, which is what my incremental patch does.
> > 
> 
> xfs_repair already skips the parent validation; this patch just
> refactors it. What I was considering above is whether repair uses the
> current dummy value of zero for any functional reason. If not, it kind
> of looks like the earlier phase of repair checks the parent, sees that
> it would fail a verifier, replaces it with zero (which would also fail
> the verifier) and then eventually replaces zero with a valid parent or
> ditches the entry in phase 6. If we placed a temporary parent value in
> the early phase that wouldn't explicitly fail a verifier by being an
> invalid inode number (instead of using 0 to notify the verifier to skip
> the validation), then we wouldn't need to skip the parent validation in
> phase 6 when we look up the inode again.
> 
...

To demonstrate, I hacked on repair a bit using an fs with an
intentionally corrupted shortform parent inode and had to make the
following tweaks to work around the custom fork verifier. The
ino_discovery checks were added because phases 3 and 4 toggle that flag
such that the former clears the parent value in the inode, but the
latter actually updates the external parent tracking. IOW, setting a
"valid" inode in phase 3 would otherwise trick phase 4 into using it.
I'd probably try to think of something cleaner for that issue if we were
to take such an approach.

Brian

diff --git a/repair/dir2.c b/repair/dir2.c
index cbbce601..c30ccb37 100644
--- a/repair/dir2.c
+++ b/repair/dir2.c
@@ -165,7 +165,7 @@ process_sf_dir2(
 	int			tmp_elen;
 	int			tmp_len;
 	xfs_dir2_sf_entry_t	*tmp_sfep;
-	xfs_ino_t		zero = 0;
+	xfs_ino_t		zero = mp->m_sb.sb_rootino;
 
 	sfp = (struct xfs_dir2_sf_hdr *)XFS_DFORK_DPTR(dip);
 	max_size = XFS_DFORK_DSIZE(dip, mp);
@@ -494,7 +494,8 @@ _("bogus .. inode number (%" PRIu64 ") in directory inode %" PRIu64 ", "),
 		if (!no_modify)  {
 			do_warn(_("clearing inode number\n"));
 
-			libxfs_dir2_sf_put_parent_ino(sfp, zero);
+			if (!ino_discovery)
+				libxfs_dir2_sf_put_parent_ino(sfp, zero);
 			*dino_dirty = 1;
 			*repair = 1;
 		} else  {
@@ -528,8 +529,8 @@ _("bad .. entry in directory inode %" PRIu64 ", points to self, "),
 			ino);
 		if (!no_modify)  {
 			do_warn(_("clearing inode number\n"));
-
-			libxfs_dir2_sf_put_parent_ino(sfp, zero);
+			if (!ino_discovery)
+				libxfs_dir2_sf_put_parent_ino(sfp, zero);
 			*dino_dirty = 1;
 			*repair = 1;
 		} else  {
diff --git a/repair/phase6.c b/repair/phase6.c
index beceea9a..613ca578 100644
--- a/repair/phase6.c
+++ b/repair/phase6.c
@@ -1104,7 +1104,7 @@ mv_orphanage(
 					(unsigned long long)ino, ++incr);
 
 	/* Orphans may not have a proper parent, so use custom ops here */
-	err = -libxfs_iget(mp, NULL, ino, 0, &ino_p, &phase6_ifork_ops);
+	err = -libxfs_iget(mp, NULL, ino, 0, &ino_p, &xfs_default_ifork_ops);
 	if (err)
 		do_error(_("%d - couldn't iget disconnected inode\n"), err);
 
@@ -2875,7 +2875,7 @@ process_dir_inode(
 
 	ASSERT(!is_inode_refchecked(irec, ino_offset) || dotdot_update);
 
-	error = -libxfs_iget(mp, NULL, ino, 0, &ip, &phase6_ifork_ops);
+	error = -libxfs_iget(mp, NULL, ino, 0, &ip, &xfs_default_ifork_ops);
 	if (error) {
 		if (!no_modify)
 			do_error(


^ permalink raw reply related	[flat|nested] 41+ messages in thread

* Re: [PATCH 08/12] xfs: remove xfs_ifork_ops
  2020-05-07 16:28                 ` Brian Foster
@ 2020-05-07 17:18                   ` Christoph Hellwig
  2020-05-12 23:50                     ` Darrick J. Wong
  0 siblings, 1 reply; 41+ messages in thread
From: Christoph Hellwig @ 2020-05-07 17:18 UTC (permalink / raw)
  To: Brian Foster; +Cc: Christoph Hellwig, Darrick J. Wong, linux-xfs

On Thu, May 07, 2020 at 12:28:46PM -0400, Brian Foster wrote:
> To demonstrate, I hacked on repair a bit using an fs with an
> intentionally corrupted shortform parent inode and had to make the
> following tweaks to work around the custom fork verifier. The
> ino_discovery checks were added because phases 3 and 4 toggle that flag
> such that the former clears the parent value in the inode, but the
> latter actually updates the external parent tracking. IOW, setting a
> "valid" inode in phase 3 would otherwise trick phase 4 into using it.
> I'd probably try to think of something cleaner for that issue if we were
> to take such an approach.

Ok, so instead of clearing the parent we'll set it to a guaranteed good
value (the root ino).  That could kill the workaround I had entirely.
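
A rough standalone sketch of that idea, under loudly stated assumptions: struct demo_mount, demo_ino_valid() and demo_placeholder_parent() are hypothetical, and the range check is a simplified stand-in for the real inode number validation, not what xfs_dir_ino_validate() actually does. It only shows that storing the root inode instead of zero leaves a value the ordinary verifier can accept.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t demo_ino_t;

/* Hypothetical mount geometry: the root inode and the highest valid inode. */
struct demo_mount {
	demo_ino_t	rootino;
	demo_ino_t	max_ino;
};

/* Simplified stand-in for inode number validation: nonzero and in range. */
static bool
demo_ino_valid(
	const struct demo_mount	*mp,
	demo_ino_t		ino)
{
	return ino != 0 && ino <= mp->max_ino;
}

/*
 * Pick the value to store in the shortform ".." field.  A bad parent is
 * replaced by the root inode rather than zero, so a later verifier pass
 * does not trip over the placeholder.
 */
static demo_ino_t
demo_placeholder_parent(
	const struct demo_mount	*mp,
	demo_ino_t		parent)
{
	if (demo_ino_valid(mp, parent))
		return parent;
	return mp->rootino;
}
```

The real parent still has to come from repair's in-core tracking, as Brian describes. The placeholder only keeps the on-disk field valid until phase 6 rewrites it.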

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH 08/12] xfs: remove xfs_ifork_ops
  2020-05-07 17:18                   ` Christoph Hellwig
@ 2020-05-12 23:50                     ` Darrick J. Wong
  0 siblings, 0 replies; 41+ messages in thread
From: Darrick J. Wong @ 2020-05-12 23:50 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Brian Foster, linux-xfs

On Thu, May 07, 2020 at 07:18:57PM +0200, Christoph Hellwig wrote:
> On Thu, May 07, 2020 at 12:28:46PM -0400, Brian Foster wrote:
> > To demonstrate, I hacked on repair a bit using an fs with an
> > intentionally corrupted shortform parent inode and had to make the
> > following tweaks to work around the custom fork verifier. The
> > ino_discovery checks were added because phases 3 and 4 toggle that flag
> > such that the former clears the parent value in the inode, but the
> > latter actually updates the external parent tracking. IOW, setting a
> > "valid" inode in phase 3 would otherwise trick phase 4 into using it.
> > I'd probably try to think of something cleaner for that issue if we were
> > to take such an approach.
> 
> Ok, so instead of clearing the parent we'll set it to a guaranteed good
> value (the root ino).  That could kill the workaround I had entirely.

Seems reasonable to me, but someone should try it and see how xfs_repair
reacts.  I think it ought to be fine since it will detect the lack of a
rootdir entry pointing to the damaged dir and move it to lost+found.

--D

^ permalink raw reply	[flat|nested] 41+ messages in thread

* Re: [PATCH 05/12] xfs: call xfs_dinode_verify from xfs_inode_from_disk
  2020-05-08  6:34 ` [PATCH 05/12] xfs: call xfs_dinode_verify from xfs_inode_from_disk Christoph Hellwig
@ 2020-05-16 17:40   ` Darrick J. Wong
  0 siblings, 0 replies; 41+ messages in thread
From: Darrick J. Wong @ 2020-05-16 17:40 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: linux-xfs, Brian Foster

On Fri, May 08, 2020 at 08:34:16AM +0200, Christoph Hellwig wrote:
> Keep the code dealing with the dinode together, and also ensure we verify
> the dinode in the owner change log recovery case as well.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> Reviewed-by: Brian Foster <bfoster@redhat.com>

Seems reasonable to me,
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>

--D

> ---
>  .../xfs-self-describing-metadata.txt           | 10 +++++-----
>  fs/xfs/libxfs/xfs_inode_buf.c                  | 18 ++++++++----------
>  2 files changed, 13 insertions(+), 15 deletions(-)
> 
> diff --git a/Documentation/filesystems/xfs-self-describing-metadata.txt b/Documentation/filesystems/xfs-self-describing-metadata.txt
> index 8db0121d0980c..e912699d74301 100644
> --- a/Documentation/filesystems/xfs-self-describing-metadata.txt
> +++ b/Documentation/filesystems/xfs-self-describing-metadata.txt
> @@ -337,11 +337,11 @@ buffer.
>  
>  The structure of the verifiers and the identifiers checks is very similar to the
>  buffer code described above. The only difference is where they are called. For
> -example, inode read verification is done in xfs_iread() when the inode is first
> -read out of the buffer and the struct xfs_inode is instantiated. The inode is
> -already extensively verified during writeback in xfs_iflush_int, so the only
> -addition here is to add the LSN and CRC to the inode as it is copied back into
> -the buffer.
> +example, inode read verification is done in xfs_inode_from_disk() when the inode
> +is first read out of the buffer and the struct xfs_inode is instantiated. The
> +inode is already extensively verified during writeback in xfs_iflush_int, so the
> +only addition here is to add the LSN and CRC to the inode as it is copied back
> +into the buffer.
>  
>  XXX: inode unlinked list modification doesn't recalculate the inode CRC! None of
>  the unlinked list modifications check or update CRCs, neither during unlink nor
> diff --git a/fs/xfs/libxfs/xfs_inode_buf.c b/fs/xfs/libxfs/xfs_inode_buf.c
> index 686a026b5f6ed..3aac22e892985 100644
> --- a/fs/xfs/libxfs/xfs_inode_buf.c
> +++ b/fs/xfs/libxfs/xfs_inode_buf.c
> @@ -188,10 +188,18 @@ xfs_inode_from_disk(
>  	struct xfs_icdinode	*to = &ip->i_d;
>  	struct inode		*inode = VFS_I(ip);
>  	int			error;
> +	xfs_failaddr_t		fa;
>  
>  	ASSERT(ip->i_cowfp == NULL);
>  	ASSERT(ip->i_afp == NULL);
>  
> +	fa = xfs_dinode_verify(ip->i_mount, ip->i_ino, from);
> +	if (fa) {
> +		xfs_inode_verifier_error(ip, -EFSCORRUPTED, "dinode", from,
> +				sizeof(*from), fa);
> +		return -EFSCORRUPTED;
> +	}
> +
>  	/*
>  	 * First get the permanent information that is needed to allocate an
>  	 * inode. If the inode is unused, mode is zero and we shouldn't mess
> @@ -627,7 +635,6 @@ xfs_iread(
>  {
>  	xfs_buf_t	*bp;
>  	xfs_dinode_t	*dip;
> -	xfs_failaddr_t	fa;
>  	int		error;
>  
>  	/*
> @@ -652,15 +659,6 @@ xfs_iread(
>  	if (error)
>  		return error;
>  
> -	/* even unallocated inodes are verified */
> -	fa = xfs_dinode_verify(mp, ip->i_ino, dip);
> -	if (fa) {
> -		xfs_inode_verifier_error(ip, -EFSCORRUPTED, "dinode", dip,
> -				sizeof(*dip), fa);
> -		error = -EFSCORRUPTED;
> -		goto out_brelse;
> -	}
> -
>  	error = xfs_inode_from_disk(ip, dip);
>  	if (error)
>  		goto out_brelse;
> -- 
> 2.26.2
> 

^ permalink raw reply	[flat|nested] 41+ messages in thread

* [PATCH 05/12] xfs: call xfs_dinode_verify from xfs_inode_from_disk
  2020-05-08  6:34 dinode reading cleanups v2 Christoph Hellwig
@ 2020-05-08  6:34 ` Christoph Hellwig
  2020-05-16 17:40   ` Darrick J. Wong
  0 siblings, 1 reply; 41+ messages in thread
From: Christoph Hellwig @ 2020-05-08  6:34 UTC (permalink / raw)
  To: linux-xfs; +Cc: Brian Foster

Keep the code dealing with the dinode together, and also ensure we verify
the dinode in the owner change log recovery case as well.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Brian Foster <bfoster@redhat.com>
---
 .../xfs-self-describing-metadata.txt           | 10 +++++-----
 fs/xfs/libxfs/xfs_inode_buf.c                  | 18 ++++++++----------
 2 files changed, 13 insertions(+), 15 deletions(-)

diff --git a/Documentation/filesystems/xfs-self-describing-metadata.txt b/Documentation/filesystems/xfs-self-describing-metadata.txt
index 8db0121d0980c..e912699d74301 100644
--- a/Documentation/filesystems/xfs-self-describing-metadata.txt
+++ b/Documentation/filesystems/xfs-self-describing-metadata.txt
@@ -337,11 +337,11 @@ buffer.
 
 The structure of the verifiers and the identifiers checks is very similar to the
 buffer code described above. The only difference is where they are called. For
-example, inode read verification is done in xfs_iread() when the inode is first
-read out of the buffer and the struct xfs_inode is instantiated. The inode is
-already extensively verified during writeback in xfs_iflush_int, so the only
-addition here is to add the LSN and CRC to the inode as it is copied back into
-the buffer.
+example, inode read verification is done in xfs_inode_from_disk() when the inode
+is first read out of the buffer and the struct xfs_inode is instantiated. The
+inode is already extensively verified during writeback in xfs_iflush_int, so the
+only addition here is to add the LSN and CRC to the inode as it is copied back
+into the buffer.
 
 XXX: inode unlinked list modification doesn't recalculate the inode CRC! None of
 the unlinked list modifications check or update CRCs, neither during unlink nor
diff --git a/fs/xfs/libxfs/xfs_inode_buf.c b/fs/xfs/libxfs/xfs_inode_buf.c
index 686a026b5f6ed..3aac22e892985 100644
--- a/fs/xfs/libxfs/xfs_inode_buf.c
+++ b/fs/xfs/libxfs/xfs_inode_buf.c
@@ -188,10 +188,18 @@ xfs_inode_from_disk(
 	struct xfs_icdinode	*to = &ip->i_d;
 	struct inode		*inode = VFS_I(ip);
 	int			error;
+	xfs_failaddr_t		fa;
 
 	ASSERT(ip->i_cowfp == NULL);
 	ASSERT(ip->i_afp == NULL);
 
+	fa = xfs_dinode_verify(ip->i_mount, ip->i_ino, from);
+	if (fa) {
+		xfs_inode_verifier_error(ip, -EFSCORRUPTED, "dinode", from,
+				sizeof(*from), fa);
+		return -EFSCORRUPTED;
+	}
+
 	/*
 	 * First get the permanent information that is needed to allocate an
 	 * inode. If the inode is unused, mode is zero and we shouldn't mess
@@ -627,7 +635,6 @@ xfs_iread(
 {
 	xfs_buf_t	*bp;
 	xfs_dinode_t	*dip;
-	xfs_failaddr_t	fa;
 	int		error;
 
 	/*
@@ -652,15 +659,6 @@ xfs_iread(
 	if (error)
 		return error;
 
-	/* even unallocated inodes are verified */
-	fa = xfs_dinode_verify(mp, ip->i_ino, dip);
-	if (fa) {
-		xfs_inode_verifier_error(ip, -EFSCORRUPTED, "dinode", dip,
-				sizeof(*dip), fa);
-		error = -EFSCORRUPTED;
-		goto out_brelse;
-	}
-
 	error = xfs_inode_from_disk(ip, dip);
 	if (error)
 		goto out_brelse;
-- 
2.26.2
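
For readers skimming the thread, the shape of the change above can be
illustrated outside the kernel. The following is only a minimal userspace
sketch, not the XFS code: every demo_* name, the two structures, the
DEMO_EFSCORRUPTED value, and the two checks in the verifier are hypothetical
stand-ins, and the failure "address" is a string here purely for readability.
What it tries to show is that once the decode helper runs the verifier itself,
every caller gets verification without having to remember to call it first.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins; none of these are the real XFS definitions. */
#define DEMO_DINODE_MAGIC	0x494eu
#define DEMO_EFSCORRUPTED	117

struct demo_dinode {		/* simplified "on-disk" inode */
	uint16_t	magic;
	uint16_t	mode;
	uint64_t	size;
};

struct demo_inode {		/* simplified "in-core" inode */
	uint16_t	mode;
	uint64_t	size;
};

/* NULL means the dinode passed; otherwise a description of the failure. */
typedef const char *demo_failaddr_t;

static demo_failaddr_t
demo_dinode_verify(const struct demo_dinode *dip)
{
	if (dip->magic != DEMO_DINODE_MAGIC)
		return "bad magic";
	/* Illustrative rule only: an unused inode should carry no size. */
	if (dip->mode == 0 && dip->size != 0)
		return "unused inode with nonzero size";
	return NULL;
}

/*
 * Verify first, then copy.  On failure the in-core inode is left untouched
 * and a negative error code is returned, so callers need no separate
 * verification step before decoding.
 */
static int
demo_inode_from_disk(struct demo_inode *ip, const struct demo_dinode *from)
{
	demo_failaddr_t	fa;

	assert(ip != NULL && from != NULL);

	fa = demo_dinode_verify(from);
	if (fa)
		return -DEMO_EFSCORRUPTED;

	ip->mode = from->mode;
	ip->size = from->size;
	return 0;
}
```

In this sketch a caller only checks the return value of
demo_inode_from_disk(); a second, hypothetical caller (standing in for the
log recovery path mentioned in the commit message) would get the same checks
for free.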



end of thread, other threads:[~2020-05-16 17:40 UTC | newest]

Thread overview: 41+ messages (download: mbox.gz / follow: Atom feed)
2020-05-01  8:14 dinode reading cleanups Christoph Hellwig
2020-05-01  8:14 ` [PATCH 01/12] xfs: xfs_bmapi_read doesn't take a fork id as the last argument Christoph Hellwig
2020-05-01 13:33   ` Brian Foster
2020-05-01  8:14 ` [PATCH 02/12] xfs: call xfs_iformat_fork from xfs_inode_from_disk Christoph Hellwig
2020-05-01 13:33   ` Brian Foster
2020-05-01  8:14 ` [PATCH 03/12] xfs: split xfs_iformat_fork Christoph Hellwig
2020-05-01 13:34   ` Brian Foster
2020-05-07 12:27     ` Christoph Hellwig
2020-05-07 13:40       ` Brian Foster
2020-05-07 13:42         ` Christoph Hellwig
2020-05-01  8:14 ` [PATCH 04/12] xfs: handle unallocated inodes in xfs_inode_from_disk Christoph Hellwig
2020-05-01 13:34   ` Brian Foster
2020-05-01  8:14 ` [PATCH 05/12] xfs: call xfs_dinode_verify from xfs_inode_from_disk Christoph Hellwig
2020-05-01 13:34   ` Brian Foster
2020-05-01  8:14 ` [PATCH 06/12] xfs: don't reset i_delayed_blks in xfs_iread Christoph Hellwig
2020-05-01 13:34   ` Brian Foster
2020-05-01  8:14 ` [PATCH 07/12] xfs: remove xfs_iread Christoph Hellwig
2020-05-01 15:56   ` Brian Foster
2020-05-01  8:14 ` [PATCH 08/12] xfs: remove xfs_ifork_ops Christoph Hellwig
2020-05-01 15:56   ` Brian Foster
2020-05-01 16:08     ` Darrick J. Wong
2020-05-01 16:38       ` Christoph Hellwig
2020-05-01 16:50         ` Christoph Hellwig
2020-05-01 18:23           ` Brian Foster
2020-05-07 12:34             ` Christoph Hellwig
2020-05-07 13:43               ` Brian Foster
2020-05-07 16:28                 ` Brian Foster
2020-05-07 17:18                   ` Christoph Hellwig
2020-05-12 23:50                     ` Darrick J. Wong
2020-05-01  8:14 ` [PATCH 09/12] xfs: refactor xfs_inode_verify_forks Christoph Hellwig
2020-05-01 15:57   ` Brian Foster
2020-05-01 16:40     ` Christoph Hellwig
2020-05-01 17:02       ` Brian Foster
2020-05-01 17:08         ` Christoph Hellwig
2020-05-01  8:14 ` [PATCH 10/12] xfs: improve local fork verification Christoph Hellwig
2020-05-01  8:14 ` [PATCH 11/12] xfs: remove the special COW fork handling in xfs_bmapi_read Christoph Hellwig
2020-05-01 15:57   ` Brian Foster
2020-05-01  8:14 ` [PATCH 12/12] xfs: remove the NULL " Christoph Hellwig
2020-05-01 15:58   ` Brian Foster
2020-05-08  6:34 dinode reading cleanups v2 Christoph Hellwig
2020-05-08  6:34 ` [PATCH 05/12] xfs: call xfs_dinode_verify from xfs_inode_from_disk Christoph Hellwig
2020-05-16 17:40   ` Darrick J. Wong
