All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCHSET 0/4] xfs: strengthen validation of extent size hints
@ 2021-05-13  1:01 Darrick J. Wong
  2021-05-13  1:01 ` [PATCH 1/4] xfs: standardize extent size hint validation Darrick J. Wong
                   ` (3 more replies)
  0 siblings, 4 replies; 13+ messages in thread
From: Darrick J. Wong @ 2021-05-13  1:01 UTC (permalink / raw)
  To: djwong; +Cc: linux-xfs

Hi all,

While playing around with realtime extent sizes and extent size hints, I
noticed that it was very possible for userspace to trip the inode
verifiers if they tried to set an extent size hint that wasn't aligned
to the rt extent size and then create realtime files.  This series
tightens the existing checks and refactors the ioctls to use the libxfs
validation functions like the verifiers, mkfs, and repair use.

If you're going to start using this mess, you probably ought to just
pull from my git trees, which are linked below.

This is an extraordinary way to destroy everything.  Enjoy!
Comments and questions are, as always, welcome.

--D

kernel git tree:
https://git.kernel.org/cgit/linux/kernel/git/djwong/xfs-linux.git/log/?h=extsize-fixes-5.13
---
 fs/xfs/libxfs/xfs_inode_buf.c |   20 ++++++---
 fs/xfs/xfs_inode.c            |   19 +++++++++
 fs/xfs/xfs_ioctl.c            |   90 +++++++++--------------------------------
 3 files changed, 52 insertions(+), 77 deletions(-)


^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH 1/4] xfs: standardize extent size hint validation
  2021-05-13  1:01 [PATCHSET 0/4] xfs: strengthen validation of extent size hints Darrick J. Wong
@ 2021-05-13  1:01 ` Darrick J. Wong
  2021-05-14 12:38   ` Brian Foster
  2021-05-13  1:01 ` [PATCH 2/4] xfs: don't propagate invalid extent size hints to new files Darrick J. Wong
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 13+ messages in thread
From: Darrick J. Wong @ 2021-05-13  1:01 UTC (permalink / raw)
  To: djwong; +Cc: linux-xfs

From: Darrick J. Wong <djwong@kernel.org>

While chasing a bug involving invalid extent size hints being propagated
into newly created realtime files, I noticed that the xfs_ioctl_setattr
checks for the extent size hints weren't the same as the ones now
encoded in libxfs and used for validation in repair and mkfs.

Because the checks in libxfs are more stringent than the ones in the
ioctl, it's possible for a live system to set inode flags that
immediately result in corruption warnings.  Specifically, it's possible
to set an extent size hint on an rtinherit directory without checking if
the hint is aligned to the realtime extent size, which makes no sense
since that combination is used only to seed new realtime files.

Replace the open-coded and inadequate checks with the libxfs verifier
versions.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
 fs/xfs/xfs_ioctl.c |   90 +++++++++++-----------------------------------------
 1 file changed, 19 insertions(+), 71 deletions(-)


diff --git a/fs/xfs/xfs_ioctl.c b/fs/xfs/xfs_ioctl.c
index 3925bfcb2365..44d55ebdea09 100644
--- a/fs/xfs/xfs_ioctl.c
+++ b/fs/xfs/xfs_ioctl.c
@@ -1266,108 +1266,56 @@ xfs_ioctl_setattr_get_trans(
 	return ERR_PTR(error);
 }
 
-/*
- * extent size hint validation is somewhat cumbersome. Rules are:
- *
- * 1. extent size hint is only valid for directories and regular files
- * 2. FS_XFLAG_EXTSIZE is only valid for regular files
- * 3. FS_XFLAG_EXTSZINHERIT is only valid for directories.
- * 4. can only be changed on regular files if no extents are allocated
- * 5. can be changed on directories at any time
- * 6. extsize hint of 0 turns off hints, clears inode flags.
- * 7. Extent size must be a multiple of the appropriate block size.
- * 8. for non-realtime files, the extent size hint must be limited
- *    to half the AG size to avoid alignment extending the extent beyond the
- *    limits of the AG.
- *
- * Please keep this function in sync with xfs_scrub_inode_extsize.
- */
 static int
 xfs_ioctl_setattr_check_extsize(
 	struct xfs_inode	*ip,
 	struct fileattr		*fa)
 {
 	struct xfs_mount	*mp = ip->i_mount;
-	xfs_extlen_t		size;
-	xfs_fsblock_t		extsize_fsb;
+	xfs_failaddr_t		failaddr;
+	uint16_t		new_diflags;
 
 	if (!fa->fsx_valid)
 		return 0;
 
 	if (S_ISREG(VFS_I(ip)->i_mode) && ip->i_df.if_nextents &&
-	    ((ip->i_extsize << mp->m_sb.sb_blocklog) != fa->fsx_extsize))
+	    XFS_FSB_TO_B(mp, ip->i_extsize) != fa->fsx_extsize)
 		return -EINVAL;
 
-	if (fa->fsx_extsize == 0)
-		return 0;
-
-	extsize_fsb = XFS_B_TO_FSB(mp, fa->fsx_extsize);
-	if (extsize_fsb > MAXEXTLEN)
+	if (fa->fsx_extsize & mp->m_blockmask)
 		return -EINVAL;
 
-	if (XFS_IS_REALTIME_INODE(ip) ||
-	    (fa->fsx_xflags & FS_XFLAG_REALTIME)) {
-		size = mp->m_sb.sb_rextsize << mp->m_sb.sb_blocklog;
-	} else {
-		size = mp->m_sb.sb_blocksize;
-		if (extsize_fsb > mp->m_sb.sb_agblocks / 2)
-			return -EINVAL;
-	}
-
-	if (fa->fsx_extsize % size)
-		return -EINVAL;
+	new_diflags = xfs_flags2diflags(ip, fa->fsx_xflags);
 
-	return 0;
+	failaddr = xfs_inode_validate_extsize(ip->i_mount,
+			XFS_B_TO_FSB(mp, fa->fsx_extsize),
+			VFS_I(ip)->i_mode, new_diflags);
+	return failaddr != NULL ? -EINVAL : 0;
 }
 
-/*
- * CoW extent size hint validation rules are:
- *
- * 1. CoW extent size hint can only be set if reflink is enabled on the fs.
- *    The inode does not have to have any shared blocks, but it must be a v3.
- * 2. FS_XFLAG_COWEXTSIZE is only valid for directories and regular files;
- *    for a directory, the hint is propagated to new files.
- * 3. Can be changed on files & directories at any time.
- * 4. CoW extsize hint of 0 turns off hints, clears inode flags.
- * 5. Extent size must be a multiple of the appropriate block size.
- * 6. The extent size hint must be limited to half the AG size to avoid
- *    alignment extending the extent beyond the limits of the AG.
- *
- * Please keep this function in sync with xfs_scrub_inode_cowextsize.
- */
 static int
 xfs_ioctl_setattr_check_cowextsize(
 	struct xfs_inode	*ip,
 	struct fileattr		*fa)
 {
 	struct xfs_mount	*mp = ip->i_mount;
-	xfs_extlen_t		size;
-	xfs_fsblock_t		cowextsize_fsb;
+	xfs_failaddr_t		failaddr;
+	uint64_t		new_diflags2;
+	uint16_t		new_diflags;
 
 	if (!fa->fsx_valid)
 		return 0;
 
-	if (!(fa->fsx_xflags & FS_XFLAG_COWEXTSIZE))
-		return 0;
-
-	if (!xfs_sb_version_hasreflink(&ip->i_mount->m_sb))
-		return -EINVAL;
-
-	if (fa->fsx_cowextsize == 0)
-		return 0;
-
-	cowextsize_fsb = XFS_B_TO_FSB(mp, fa->fsx_cowextsize);
-	if (cowextsize_fsb > MAXEXTLEN)
+	if (fa->fsx_cowextsize & mp->m_blockmask)
 		return -EINVAL;
 
-	size = mp->m_sb.sb_blocksize;
-	if (cowextsize_fsb > mp->m_sb.sb_agblocks / 2)
-		return -EINVAL;
-
-	if (fa->fsx_cowextsize % size)
-		return -EINVAL;
+	new_diflags = xfs_flags2diflags(ip, fa->fsx_xflags);
+	new_diflags2 = xfs_flags2diflags2(ip, fa->fsx_xflags);
 
-	return 0;
+	failaddr = xfs_inode_validate_cowextsize(ip->i_mount,
+			XFS_B_TO_FSB(mp, fa->fsx_cowextsize),
+			VFS_I(ip)->i_mode, new_diflags, new_diflags2);
+	return failaddr != NULL ? -EINVAL : 0;
 }
 
 static int


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 2/4] xfs: don't propagate invalid extent size hints to new files
  2021-05-13  1:01 [PATCHSET 0/4] xfs: strengthen validation of extent size hints Darrick J. Wong
  2021-05-13  1:01 ` [PATCH 1/4] xfs: standardize extent size hint validation Darrick J. Wong
@ 2021-05-13  1:01 ` Darrick J. Wong
  2021-05-14 12:38   ` Brian Foster
  2021-05-13  1:01 ` [PATCH 3/4] xfs: validate extsz hints against rt extent size when rtinherit is set Darrick J. Wong
  2021-05-13  1:02 ` [PATCH 4/4] xfs: apply rt extent alignment constraints to cow extsize hint Darrick J. Wong
  3 siblings, 1 reply; 13+ messages in thread
From: Darrick J. Wong @ 2021-05-13  1:01 UTC (permalink / raw)
  To: djwong; +Cc: linux-xfs

From: Darrick J. Wong <djwong@kernel.org>

Under the current inode extent size hint validation rules, it's possible
to set extent size hints on directories along with an 'inherit' flag so
that the values will be propagated to newly created regular files.  (The
directories themselves do not care about the hint values.)

For these directories, the alignment of the hint is checked against the
data device even if the directory also has the rtinherit hint set, which
means that one can set a directory's hint value to something that isn't
an integer multiple of the realtime extent size.  This isn't a problem
for the directory itself, but the validation routines require rt extent
alignment for realtime files.

If the unaligned hint value and the realtime bit are both propagated
into a newly created regular realtime file, we end up writing out an
incorrect hint that trips the verifiers the next time we try to read the
inode buffer, and the fs shuts down.  Fix this by cancelling the hint
propagation if it would cause problems.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
 fs/xfs/xfs_inode.c |   19 +++++++++++++++++++
 1 file changed, 19 insertions(+)


diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c
index 0369eb22c1bb..db81e8c22708 100644
--- a/fs/xfs/xfs_inode.c
+++ b/fs/xfs/xfs_inode.c
@@ -689,6 +689,7 @@ xfs_inode_inherit_flags(
 	struct xfs_inode	*ip,
 	const struct xfs_inode	*pip)
 {
+	xfs_failaddr_t		failaddr;
 	unsigned int		di_flags = 0;
 	umode_t			mode = VFS_I(ip)->i_mode;
 
@@ -728,6 +729,14 @@ xfs_inode_inherit_flags(
 	if (pip->i_diflags & XFS_DIFLAG_FILESTREAM)
 		di_flags |= XFS_DIFLAG_FILESTREAM;
 
+	/* Make sure the extsize actually validates properly. */
+	failaddr = xfs_inode_validate_extsize(ip->i_mount, ip->i_extsize,
+			VFS_I(ip)->i_mode, ip->i_diflags);
+	if (failaddr) {
+		di_flags &= ~(XFS_DIFLAG_EXTSIZE | XFS_DIFLAG_EXTSZINHERIT);
+		ip->i_extsize = 0;
+	}
+
 	ip->i_diflags |= di_flags;
 }
 
@@ -737,12 +746,22 @@ xfs_inode_inherit_flags2(
 	struct xfs_inode	*ip,
 	const struct xfs_inode	*pip)
 {
+	xfs_failaddr_t		failaddr;
+
 	if (pip->i_diflags2 & XFS_DIFLAG2_COWEXTSIZE) {
 		ip->i_diflags2 |= XFS_DIFLAG2_COWEXTSIZE;
 		ip->i_cowextsize = pip->i_cowextsize;
 	}
 	if (pip->i_diflags2 & XFS_DIFLAG2_DAX)
 		ip->i_diflags2 |= XFS_DIFLAG2_DAX;
+
+	/* Make sure the cowextsize actually validates properly. */
+	failaddr = xfs_inode_validate_cowextsize(ip->i_mount, ip->i_cowextsize,
+			VFS_I(ip)->i_mode, ip->i_diflags, ip->i_diflags2);
+	if (failaddr) {
+		ip->i_diflags2 &= ~XFS_DIFLAG2_COWEXTSIZE;
+		ip->i_cowextsize = 0;
+	}
 }
 
 /*


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 3/4] xfs: validate extsz hints against rt extent size when rtinherit is set
  2021-05-13  1:01 [PATCHSET 0/4] xfs: strengthen validation of extent size hints Darrick J. Wong
  2021-05-13  1:01 ` [PATCH 1/4] xfs: standardize extent size hint validation Darrick J. Wong
  2021-05-13  1:01 ` [PATCH 2/4] xfs: don't propagate invalid extent size hints to new files Darrick J. Wong
@ 2021-05-13  1:01 ` Darrick J. Wong
  2021-05-14 12:38   ` Brian Foster
  2021-05-13  1:02 ` [PATCH 4/4] xfs: apply rt extent alignment constraints to cow extsize hint Darrick J. Wong
  3 siblings, 1 reply; 13+ messages in thread
From: Darrick J. Wong @ 2021-05-13  1:01 UTC (permalink / raw)
  To: djwong; +Cc: linux-xfs

From: Darrick J. Wong <djwong@kernel.org>

The RTINHERIT bit can be set on a directory so that newly created
regular files will have the REALTIME bit set to store their data on the
realtime volume.  If an extent size hint (and EXTSZINHERIT) are set on
the directory, the hint will also be copied into the new file.

As pointed out in previous patches, for realtime files we require the
extent size hint be an integer multiple of the realtime extent, but we
don't perform the same validation on a directory with both RTINHERIT and
EXTSZINHERIT set, even though the only use-case of that combination is
to propagate extent size hints into new realtime files.  This leads to
inode corruption errors when the bad values are propagated.

Strengthen the validation routine to avoid this situation and fix the
open-coded unit conversion while we're at it.  Note that this is
technically a breaking change to the ondisk format, but the risk should
be minimal because (a) most vendors disable realtime, (b) letting
unaligned hints propagate to new files would immediately crash the
filesystem, and (c) xfs_repair flags such filesystems as corrupt, so
anyone with such a configuration is broken already anyway.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
 fs/xfs/libxfs/xfs_inode_buf.c |    7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)


diff --git a/fs/xfs/libxfs/xfs_inode_buf.c b/fs/xfs/libxfs/xfs_inode_buf.c
index 5c9a7440d9e4..25261dd73290 100644
--- a/fs/xfs/libxfs/xfs_inode_buf.c
+++ b/fs/xfs/libxfs/xfs_inode_buf.c
@@ -569,19 +569,20 @@ xfs_inode_validate_extsize(
 	uint16_t			mode,
 	uint16_t			flags)
 {
-	bool				rt_flag;
+	bool				rt_flag, rtinherit_flag;
 	bool				hint_flag;
 	bool				inherit_flag;
 	uint32_t			extsize_bytes;
 	uint32_t			blocksize_bytes;
 
 	rt_flag = (flags & XFS_DIFLAG_REALTIME);
+	rtinherit_flag = (flags & XFS_DIFLAG_RTINHERIT);
 	hint_flag = (flags & XFS_DIFLAG_EXTSIZE);
 	inherit_flag = (flags & XFS_DIFLAG_EXTSZINHERIT);
 	extsize_bytes = XFS_FSB_TO_B(mp, extsize);
 
-	if (rt_flag)
-		blocksize_bytes = mp->m_sb.sb_rextsize << mp->m_sb.sb_blocklog;
+	if (rt_flag || (rtinherit_flag && inherit_flag))
+		blocksize_bytes = XFS_FSB_TO_B(mp, mp->m_sb.sb_rextsize);
 	else
 		blocksize_bytes = mp->m_sb.sb_blocksize;
 


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH 4/4] xfs: apply rt extent alignment constraints to cow extsize hint
  2021-05-13  1:01 [PATCHSET 0/4] xfs: strengthen validation of extent size hints Darrick J. Wong
                   ` (2 preceding siblings ...)
  2021-05-13  1:01 ` [PATCH 3/4] xfs: validate extsz hints against rt extent size when rtinherit is set Darrick J. Wong
@ 2021-05-13  1:02 ` Darrick J. Wong
  2021-05-14 17:24   ` Darrick J. Wong
  3 siblings, 1 reply; 13+ messages in thread
From: Darrick J. Wong @ 2021-05-13  1:02 UTC (permalink / raw)
  To: djwong; +Cc: linux-xfs

From: Darrick J. Wong <djwong@kernel.org>

Even though reflink and copy-on-write aren't supported on the realtime
volume, if we ever turn that on, we'd still be constrained to the same
rt extent alignment requirements because cow involves remapping, and
we can only allocate and free in rtextsize units on the realtime volume.

At the moment there aren't any filesystems with rt and reflink in the
wild, so this is should be a zero-risk change.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
---
 fs/xfs/libxfs/xfs_inode_buf.c |   13 ++++++++++---
 1 file changed, 10 insertions(+), 3 deletions(-)


diff --git a/fs/xfs/libxfs/xfs_inode_buf.c b/fs/xfs/libxfs/xfs_inode_buf.c
index 25261dd73290..704faf806e46 100644
--- a/fs/xfs/libxfs/xfs_inode_buf.c
+++ b/fs/xfs/libxfs/xfs_inode_buf.c
@@ -628,14 +628,21 @@ xfs_inode_validate_cowextsize(
 	uint16_t			flags,
 	uint64_t			flags2)
 {
-	bool				rt_flag;
+	bool				rt_flag, rtinherit_flag;
 	bool				hint_flag;
 	uint32_t			cowextsize_bytes;
+	uint32_t			blocksize_bytes;
 
 	rt_flag = (flags & XFS_DIFLAG_REALTIME);
+	rtinherit_flag = (flags & XFS_DIFLAG_RTINHERIT);
 	hint_flag = (flags2 & XFS_DIFLAG2_COWEXTSIZE);
 	cowextsize_bytes = XFS_FSB_TO_B(mp, cowextsize);
 
+	if (rt_flag || (rtinherit_flag && hint_flag))
+		blocksize_bytes = XFS_FSB_TO_B(mp, mp->m_sb.sb_rextsize);
+	else
+		blocksize_bytes = mp->m_sb.sb_blocksize;
+
 	if (hint_flag && !xfs_sb_version_hasreflink(&mp->m_sb))
 		return __this_address;
 
@@ -652,13 +659,13 @@ xfs_inode_validate_cowextsize(
 	if (hint_flag && rt_flag)
 		return __this_address;
 
-	if (cowextsize_bytes % mp->m_sb.sb_blocksize)
+	if (cowextsize_bytes % blocksize_bytes)
 		return __this_address;
 
 	if (cowextsize > MAXEXTLEN)
 		return __this_address;
 
-	if (cowextsize > mp->m_sb.sb_agblocks / 2)
+	if (!rt_flag && cowextsize > mp->m_sb.sb_agblocks / 2)
 		return __this_address;
 
 	return NULL;


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH 1/4] xfs: standardize extent size hint validation
  2021-05-13  1:01 ` [PATCH 1/4] xfs: standardize extent size hint validation Darrick J. Wong
@ 2021-05-14 12:38   ` Brian Foster
  0 siblings, 0 replies; 13+ messages in thread
From: Brian Foster @ 2021-05-14 12:38 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs

On Wed, May 12, 2021 at 06:01:47PM -0700, Darrick J. Wong wrote:
> From: Darrick J. Wong <djwong@kernel.org>
> 
> While chasing a bug involving invalid extent size hints being propagated
> into newly created realtime files, I noticed that the xfs_ioctl_setattr
> checks for the extent size hints weren't the same as the ones now
> encoded in libxfs and used for validation in repair and mkfs.
> 
> Because the checks in libxfs are more stringent than the ones in the
> ioctl, it's possible for a live system to set inode flags that
> immediately result in corruption warnings.  Specifically, it's possible
> to set an extent size hint on an rtinherit directory without checking if
> the hint is aligned to the realtime extent size, which makes no sense
> since that combination is used only to seed new realtime files.
> 
> Replace the open-coded and inadequate checks with the libxfs verifier
> versions.
> 
> Signed-off-by: Darrick J. Wong <djwong@kernel.org>
> ---
>  fs/xfs/xfs_ioctl.c |   90 +++++++++++-----------------------------------------
>  1 file changed, 19 insertions(+), 71 deletions(-)
> 
> 
> diff --git a/fs/xfs/xfs_ioctl.c b/fs/xfs/xfs_ioctl.c
> index 3925bfcb2365..44d55ebdea09 100644
> --- a/fs/xfs/xfs_ioctl.c
> +++ b/fs/xfs/xfs_ioctl.c
> @@ -1266,108 +1266,56 @@ xfs_ioctl_setattr_get_trans(
>  	return ERR_PTR(error);
>  }
>  
> -/*
> - * extent size hint validation is somewhat cumbersome. Rules are:
> - *
> - * 1. extent size hint is only valid for directories and regular files
> - * 2. FS_XFLAG_EXTSIZE is only valid for regular files
> - * 3. FS_XFLAG_EXTSZINHERIT is only valid for directories.
> - * 4. can only be changed on regular files if no extents are allocated
> - * 5. can be changed on directories at any time
> - * 6. extsize hint of 0 turns off hints, clears inode flags.
> - * 7. Extent size must be a multiple of the appropriate block size.
> - * 8. for non-realtime files, the extent size hint must be limited
> - *    to half the AG size to avoid alignment extending the extent beyond the
> - *    limits of the AG.
> - *
> - * Please keep this function in sync with xfs_scrub_inode_extsize.
> - */

The comments for the corresponding xfs_inode_validate_*() functions
actually refer back to the comments for these functions by name as
documentation for the hint rules. We should probably fix that side up
one way or another if this comment is going away. Otherwise the patch
looks good to me, so with that addressed:

Reviewed-by: Brian Foster <bfoster@redhat.com>

>  static int
>  xfs_ioctl_setattr_check_extsize(
>  	struct xfs_inode	*ip,
>  	struct fileattr		*fa)
>  {
>  	struct xfs_mount	*mp = ip->i_mount;
> -	xfs_extlen_t		size;
> -	xfs_fsblock_t		extsize_fsb;
> +	xfs_failaddr_t		failaddr;
> +	uint16_t		new_diflags;
>  
>  	if (!fa->fsx_valid)
>  		return 0;
>  
>  	if (S_ISREG(VFS_I(ip)->i_mode) && ip->i_df.if_nextents &&
> -	    ((ip->i_extsize << mp->m_sb.sb_blocklog) != fa->fsx_extsize))
> +	    XFS_FSB_TO_B(mp, ip->i_extsize) != fa->fsx_extsize)
>  		return -EINVAL;
>  
> -	if (fa->fsx_extsize == 0)
> -		return 0;
> -
> -	extsize_fsb = XFS_B_TO_FSB(mp, fa->fsx_extsize);
> -	if (extsize_fsb > MAXEXTLEN)
> +	if (fa->fsx_extsize & mp->m_blockmask)
>  		return -EINVAL;
>  
> -	if (XFS_IS_REALTIME_INODE(ip) ||
> -	    (fa->fsx_xflags & FS_XFLAG_REALTIME)) {
> -		size = mp->m_sb.sb_rextsize << mp->m_sb.sb_blocklog;
> -	} else {
> -		size = mp->m_sb.sb_blocksize;
> -		if (extsize_fsb > mp->m_sb.sb_agblocks / 2)
> -			return -EINVAL;
> -	}
> -
> -	if (fa->fsx_extsize % size)
> -		return -EINVAL;
> +	new_diflags = xfs_flags2diflags(ip, fa->fsx_xflags);
>  
> -	return 0;
> +	failaddr = xfs_inode_validate_extsize(ip->i_mount,
> +			XFS_B_TO_FSB(mp, fa->fsx_extsize),
> +			VFS_I(ip)->i_mode, new_diflags);
> +	return failaddr != NULL ? -EINVAL : 0;
>  }
>  
> -/*
> - * CoW extent size hint validation rules are:
> - *
> - * 1. CoW extent size hint can only be set if reflink is enabled on the fs.
> - *    The inode does not have to have any shared blocks, but it must be a v3.
> - * 2. FS_XFLAG_COWEXTSIZE is only valid for directories and regular files;
> - *    for a directory, the hint is propagated to new files.
> - * 3. Can be changed on files & directories at any time.
> - * 4. CoW extsize hint of 0 turns off hints, clears inode flags.
> - * 5. Extent size must be a multiple of the appropriate block size.
> - * 6. The extent size hint must be limited to half the AG size to avoid
> - *    alignment extending the extent beyond the limits of the AG.
> - *
> - * Please keep this function in sync with xfs_scrub_inode_cowextsize.
> - */
>  static int
>  xfs_ioctl_setattr_check_cowextsize(
>  	struct xfs_inode	*ip,
>  	struct fileattr		*fa)
>  {
>  	struct xfs_mount	*mp = ip->i_mount;
> -	xfs_extlen_t		size;
> -	xfs_fsblock_t		cowextsize_fsb;
> +	xfs_failaddr_t		failaddr;
> +	uint64_t		new_diflags2;
> +	uint16_t		new_diflags;
>  
>  	if (!fa->fsx_valid)
>  		return 0;
>  
> -	if (!(fa->fsx_xflags & FS_XFLAG_COWEXTSIZE))
> -		return 0;
> -
> -	if (!xfs_sb_version_hasreflink(&ip->i_mount->m_sb))
> -		return -EINVAL;
> -
> -	if (fa->fsx_cowextsize == 0)
> -		return 0;
> -
> -	cowextsize_fsb = XFS_B_TO_FSB(mp, fa->fsx_cowextsize);
> -	if (cowextsize_fsb > MAXEXTLEN)
> +	if (fa->fsx_cowextsize & mp->m_blockmask)
>  		return -EINVAL;
>  
> -	size = mp->m_sb.sb_blocksize;
> -	if (cowextsize_fsb > mp->m_sb.sb_agblocks / 2)
> -		return -EINVAL;
> -
> -	if (fa->fsx_cowextsize % size)
> -		return -EINVAL;
> +	new_diflags = xfs_flags2diflags(ip, fa->fsx_xflags);
> +	new_diflags2 = xfs_flags2diflags2(ip, fa->fsx_xflags);
>  
> -	return 0;
> +	failaddr = xfs_inode_validate_cowextsize(ip->i_mount,
> +			XFS_B_TO_FSB(mp, fa->fsx_cowextsize),
> +			VFS_I(ip)->i_mode, new_diflags, new_diflags2);
> +	return failaddr != NULL ? -EINVAL : 0;
>  }
>  
>  static int
> 


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 2/4] xfs: don't propagate invalid extent size hints to new files
  2021-05-13  1:01 ` [PATCH 2/4] xfs: don't propagate invalid extent size hints to new files Darrick J. Wong
@ 2021-05-14 12:38   ` Brian Foster
  2021-05-14 15:55     ` Darrick J. Wong
  0 siblings, 1 reply; 13+ messages in thread
From: Brian Foster @ 2021-05-14 12:38 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs

On Wed, May 12, 2021 at 06:01:53PM -0700, Darrick J. Wong wrote:
> From: Darrick J. Wong <djwong@kernel.org>
> 
> Under the current inode extent size hint validation rules, it's possible
> to set extent size hints on directories along with an 'inherit' flag so
> that the values will be propagated to newly created regular files.  (The
> directories themselves do not care about the hint values.)
> 
> For these directories, the alignment of the hint is checked against the
> data device even if the directory also has the rtinherit hint set, which
> means that one can set a directory's hint value to something that isn't
> an integer multiple of the realtime extent size.  This isn't a problem
> for the directory itself, but the validation routines require rt extent
> alignment for realtime files.
> 
> If the unaligned hint value and the realtime bit are both propagated
> into a newly created regular realtime file, we end up writing out an
> incorrect hint that trips the verifiers the next time we try to read the
> inode buffer, and the fs shuts down.  Fix this by cancelling the hint
> propagation if it would cause problems.
> 
> Signed-off-by: Darrick J. Wong <djwong@kernel.org>
> ---

Hmm.. this seems a bit unfortunate. Is the purpose of this flag
cancellation behavior basically to accommodate existing filesystems that
might have this incompatible combination in place?

Brian

>  fs/xfs/xfs_inode.c |   19 +++++++++++++++++++
>  1 file changed, 19 insertions(+)
> 
> 
> diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c
> index 0369eb22c1bb..db81e8c22708 100644
> --- a/fs/xfs/xfs_inode.c
> +++ b/fs/xfs/xfs_inode.c
> @@ -689,6 +689,7 @@ xfs_inode_inherit_flags(
>  	struct xfs_inode	*ip,
>  	const struct xfs_inode	*pip)
>  {
> +	xfs_failaddr_t		failaddr;
>  	unsigned int		di_flags = 0;
>  	umode_t			mode = VFS_I(ip)->i_mode;
>  
> @@ -728,6 +729,14 @@ xfs_inode_inherit_flags(
>  	if (pip->i_diflags & XFS_DIFLAG_FILESTREAM)
>  		di_flags |= XFS_DIFLAG_FILESTREAM;
>  
> +	/* Make sure the extsize actually validates properly. */
> +	failaddr = xfs_inode_validate_extsize(ip->i_mount, ip->i_extsize,
> +			VFS_I(ip)->i_mode, ip->i_diflags);
> +	if (failaddr) {
> +		di_flags &= ~(XFS_DIFLAG_EXTSIZE | XFS_DIFLAG_EXTSZINHERIT);
> +		ip->i_extsize = 0;
> +	}
> +
>  	ip->i_diflags |= di_flags;
>  }
>  
> @@ -737,12 +746,22 @@ xfs_inode_inherit_flags2(
>  	struct xfs_inode	*ip,
>  	const struct xfs_inode	*pip)
>  {
> +	xfs_failaddr_t		failaddr;
> +
>  	if (pip->i_diflags2 & XFS_DIFLAG2_COWEXTSIZE) {
>  		ip->i_diflags2 |= XFS_DIFLAG2_COWEXTSIZE;
>  		ip->i_cowextsize = pip->i_cowextsize;
>  	}
>  	if (pip->i_diflags2 & XFS_DIFLAG2_DAX)
>  		ip->i_diflags2 |= XFS_DIFLAG2_DAX;
> +
> +	/* Make sure the cowextsize actually validates properly. */
> +	failaddr = xfs_inode_validate_cowextsize(ip->i_mount, ip->i_cowextsize,
> +			VFS_I(ip)->i_mode, ip->i_diflags, ip->i_diflags2);
> +	if (failaddr) {
> +		ip->i_diflags2 &= ~XFS_DIFLAG2_COWEXTSIZE;
> +		ip->i_cowextsize = 0;
> +	}
>  }
>  
>  /*
> 


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 3/4] xfs: validate extsz hints against rt extent size when rtinherit is set
  2021-05-13  1:01 ` [PATCH 3/4] xfs: validate extsz hints against rt extent size when rtinherit is set Darrick J. Wong
@ 2021-05-14 12:38   ` Brian Foster
  2021-05-14 18:22     ` Darrick J. Wong
  0 siblings, 1 reply; 13+ messages in thread
From: Brian Foster @ 2021-05-14 12:38 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs

On Wed, May 12, 2021 at 06:01:58PM -0700, Darrick J. Wong wrote:
> From: Darrick J. Wong <djwong@kernel.org>
> 
> The RTINHERIT bit can be set on a directory so that newly created
> regular files will have the REALTIME bit set to store their data on the
> realtime volume.  If an extent size hint (and EXTSZINHERIT) are set on
> the directory, the hint will also be copied into the new file.
> 
> As pointed out in previous patches, for realtime files we require the
> extent size hint be an integer multiple of the realtime extent, but we
> don't perform the same validation on a directory with both RTINHERIT and
> EXTSZINHERIT set, even though the only use-case of that combination is
> to propagate extent size hints into new realtime files.  This leads to
> inode corruption errors when the bad values are propagated.
> 
> Strengthen the validation routine to avoid this situation and fix the
> open-coded unit conversion while we're at it.  Note that this is
> technically a breaking change to the ondisk format, but the risk should
> be minimal because (a) most vendors disable realtime, (b) letting
> unaligned hints propagate to new files would immediately crash the
> filesystem, and (c) xfs_repair flags such filesystems as corrupt, so
> anyone with such a configuration is broken already anyway.
> 
> Signed-off-by: Darrick J. Wong <djwong@kernel.org>
> ---

Ok, so this looks more like a proper fix, but does this turn an existing
directory with (rtinherit && extszinherit) and a badly aligned extsz
hint into a read validation error?

Brian

>  fs/xfs/libxfs/xfs_inode_buf.c |    7 ++++---
>  1 file changed, 4 insertions(+), 3 deletions(-)
> 
> 
> diff --git a/fs/xfs/libxfs/xfs_inode_buf.c b/fs/xfs/libxfs/xfs_inode_buf.c
> index 5c9a7440d9e4..25261dd73290 100644
> --- a/fs/xfs/libxfs/xfs_inode_buf.c
> +++ b/fs/xfs/libxfs/xfs_inode_buf.c
> @@ -569,19 +569,20 @@ xfs_inode_validate_extsize(
>  	uint16_t			mode,
>  	uint16_t			flags)
>  {
> -	bool				rt_flag;
> +	bool				rt_flag, rtinherit_flag;
>  	bool				hint_flag;
>  	bool				inherit_flag;
>  	uint32_t			extsize_bytes;
>  	uint32_t			blocksize_bytes;
>  
>  	rt_flag = (flags & XFS_DIFLAG_REALTIME);
> +	rtinherit_flag = (flags & XFS_DIFLAG_RTINHERIT);
>  	hint_flag = (flags & XFS_DIFLAG_EXTSIZE);
>  	inherit_flag = (flags & XFS_DIFLAG_EXTSZINHERIT);
>  	extsize_bytes = XFS_FSB_TO_B(mp, extsize);
>  
> -	if (rt_flag)
> -		blocksize_bytes = mp->m_sb.sb_rextsize << mp->m_sb.sb_blocklog;
> +	if (rt_flag || (rtinherit_flag && inherit_flag))
> +		blocksize_bytes = XFS_FSB_TO_B(mp, mp->m_sb.sb_rextsize);
>  	else
>  		blocksize_bytes = mp->m_sb.sb_blocksize;
>  
> 


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 2/4] xfs: don't propagate invalid extent size hints to new files
  2021-05-14 12:38   ` Brian Foster
@ 2021-05-14 15:55     ` Darrick J. Wong
  0 siblings, 0 replies; 13+ messages in thread
From: Darrick J. Wong @ 2021-05-14 15:55 UTC (permalink / raw)
  To: Brian Foster; +Cc: linux-xfs

On Fri, May 14, 2021 at 08:38:27AM -0400, Brian Foster wrote:
> On Wed, May 12, 2021 at 06:01:53PM -0700, Darrick J. Wong wrote:
> > From: Darrick J. Wong <djwong@kernel.org>
> > 
> > Under the current inode extent size hint validation rules, it's possible
> > to set extent size hints on directories along with an 'inherit' flag so
> > that the values will be propagated to newly created regular files.  (The
> > directories themselves do not care about the hint values.)
> > 
> > For these directories, the alignment of the hint is checked against the
> > data device even if the directory also has the rtinherit hint set, which
> > means that one can set a directory's hint value to something that isn't
> > an integer multiple of the realtime extent size.  This isn't a problem
> > for the directory itself, but the validation routines require rt extent
> > alignment for realtime files.
> > 
> > If the unaligned hint value and the realtime bit are both propagated
> > into a newly created regular realtime file, we end up writing out an
> > incorrect hint that trips the verifiers the next time we try to read the
> > inode buffer, and the fs shuts down.  Fix this by cancelling the hint
> > propagation if it would cause problems.
> > 
> > Signed-off-by: Darrick J. Wong <djwong@kernel.org>
> > ---
> 
> Hmm.. this seems a bit unfortunate. Is the purpose of this flag
> cancellation behavior basically to accommodate existing filesystems that
> might have this incompatible combination in place?

Yes.  The incompatible combination when set on a directory is benign,
but setting it on regular files gets us into real trouble, so the goal
here is to end the propagation of the incompatible values.

--D

> Brian
> 
> >  fs/xfs/xfs_inode.c |   19 +++++++++++++++++++
> >  1 file changed, 19 insertions(+)
> > 
> > 
> > diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c
> > index 0369eb22c1bb..db81e8c22708 100644
> > --- a/fs/xfs/xfs_inode.c
> > +++ b/fs/xfs/xfs_inode.c
> > @@ -689,6 +689,7 @@ xfs_inode_inherit_flags(
> >  	struct xfs_inode	*ip,
> >  	const struct xfs_inode	*pip)
> >  {
> > +	xfs_failaddr_t		failaddr;
> >  	unsigned int		di_flags = 0;
> >  	umode_t			mode = VFS_I(ip)->i_mode;
> >  
> > @@ -728,6 +729,14 @@ xfs_inode_inherit_flags(
> >  	if (pip->i_diflags & XFS_DIFLAG_FILESTREAM)
> >  		di_flags |= XFS_DIFLAG_FILESTREAM;
> >  
> > +	/* Make sure the extsize actually validates properly. */
> > +	failaddr = xfs_inode_validate_extsize(ip->i_mount, ip->i_extsize,
> > +			VFS_I(ip)->i_mode, ip->i_diflags);
> > +	if (failaddr) {
> > +		di_flags &= ~(XFS_DIFLAG_EXTSIZE | XFS_DIFLAG_EXTSZINHERIT);
> > +		ip->i_extsize = 0;
> > +	}
> > +
> >  	ip->i_diflags |= di_flags;
> >  }
> >  
> > @@ -737,12 +746,22 @@ xfs_inode_inherit_flags2(
> >  	struct xfs_inode	*ip,
> >  	const struct xfs_inode	*pip)
> >  {
> > +	xfs_failaddr_t		failaddr;
> > +
> >  	if (pip->i_diflags2 & XFS_DIFLAG2_COWEXTSIZE) {
> >  		ip->i_diflags2 |= XFS_DIFLAG2_COWEXTSIZE;
> >  		ip->i_cowextsize = pip->i_cowextsize;
> >  	}
> >  	if (pip->i_diflags2 & XFS_DIFLAG2_DAX)
> >  		ip->i_diflags2 |= XFS_DIFLAG2_DAX;
> > +
> > +	/* Make sure the cowextsize actually validates properly. */
> > +	failaddr = xfs_inode_validate_cowextsize(ip->i_mount, ip->i_cowextsize,
> > +			VFS_I(ip)->i_mode, ip->i_diflags, ip->i_diflags2);
> > +	if (failaddr) {
> > +		ip->i_diflags2 &= ~XFS_DIFLAG2_COWEXTSIZE;
> > +		ip->i_cowextsize = 0;
> > +	}
> >  }
> >  
> >  /*
> > 
> 

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 4/4] xfs: apply rt extent alignment constraints to cow extsize hint
  2021-05-13  1:02 ` [PATCH 4/4] xfs: apply rt extent alignment constraints to cow extsize hint Darrick J. Wong
@ 2021-05-14 17:24   ` Darrick J. Wong
  0 siblings, 0 replies; 13+ messages in thread
From: Darrick J. Wong @ 2021-05-14 17:24 UTC (permalink / raw)
  To: linux-xfs

On Wed, May 12, 2021 at 06:02:04PM -0700, Darrick J. Wong wrote:
> From: Darrick J. Wong <djwong@kernel.org>
> 
> Even though reflink and copy-on-write aren't supported on the realtime
> volume, if we ever turn that on, we'd still be constrained to the same
> rt extent alignment requirements because cow involves remapping, and
> we can only allocate and free in rtextsize units on the realtime volume.
> 
> At the moment there aren't any filesystems with rt and reflink in the
> wild, so this is should be a zero-risk change.
> 
> Signed-off-by: Darrick J. Wong <djwong@kernel.org>

Self NAK; one cannot set a nonzero cowextsize hint on a non-reflink
filesystem, and outside of djwong-dev, reflink on realtime isn't
supported either.  I guess I got too enthusiastic when I was
redistributing patches. :(

--D

> ---
>  fs/xfs/libxfs/xfs_inode_buf.c |   13 ++++++++++---
>  1 file changed, 10 insertions(+), 3 deletions(-)
> 
> 
> diff --git a/fs/xfs/libxfs/xfs_inode_buf.c b/fs/xfs/libxfs/xfs_inode_buf.c
> index 25261dd73290..704faf806e46 100644
> --- a/fs/xfs/libxfs/xfs_inode_buf.c
> +++ b/fs/xfs/libxfs/xfs_inode_buf.c
> @@ -628,14 +628,21 @@ xfs_inode_validate_cowextsize(
>  	uint16_t			flags,
>  	uint64_t			flags2)
>  {
> -	bool				rt_flag;
> +	bool				rt_flag, rtinherit_flag;
>  	bool				hint_flag;
>  	uint32_t			cowextsize_bytes;
> +	uint32_t			blocksize_bytes;
>  
>  	rt_flag = (flags & XFS_DIFLAG_REALTIME);
> +	rtinherit_flag = (flags & XFS_DIFLAG_RTINHERIT);
>  	hint_flag = (flags2 & XFS_DIFLAG2_COWEXTSIZE);
>  	cowextsize_bytes = XFS_FSB_TO_B(mp, cowextsize);
>  
> +	if (rt_flag || (rtinherit_flag && hint_flag))
> +		blocksize_bytes = XFS_FSB_TO_B(mp, mp->m_sb.sb_rextsize);
> +	else
> +		blocksize_bytes = mp->m_sb.sb_blocksize;
> +
>  	if (hint_flag && !xfs_sb_version_hasreflink(&mp->m_sb))
>  		return __this_address;
>  
> @@ -652,13 +659,13 @@ xfs_inode_validate_cowextsize(
>  	if (hint_flag && rt_flag)
>  		return __this_address;
>  
> -	if (cowextsize_bytes % mp->m_sb.sb_blocksize)
> +	if (cowextsize_bytes % blocksize_bytes)
>  		return __this_address;
>  
>  	if (cowextsize > MAXEXTLEN)
>  		return __this_address;
>  
> -	if (cowextsize > mp->m_sb.sb_agblocks / 2)
> +	if (!rt_flag && cowextsize > mp->m_sb.sb_agblocks / 2)
>  		return __this_address;
>  
>  	return NULL;
> 

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 3/4] xfs: validate extsz hints against rt extent size when rtinherit is set
  2021-05-14 12:38   ` Brian Foster
@ 2021-05-14 18:22     ` Darrick J. Wong
  2021-05-14 18:51       ` Brian Foster
  0 siblings, 1 reply; 13+ messages in thread
From: Darrick J. Wong @ 2021-05-14 18:22 UTC (permalink / raw)
  To: Brian Foster; +Cc: linux-xfs

On Fri, May 14, 2021 at 08:38:35AM -0400, Brian Foster wrote:
> On Wed, May 12, 2021 at 06:01:58PM -0700, Darrick J. Wong wrote:
> > From: Darrick J. Wong <djwong@kernel.org>
> > 
> > The RTINHERIT bit can be set on a directory so that newly created
> > regular files will have the REALTIME bit set to store their data on the
> > realtime volume.  If an extent size hint (and EXTSZINHERIT) are set on
> > the directory, the hint will also be copied into the new file.
> > 
> > As pointed out in previous patches, for realtime files we require the
> > extent size hint be an integer multiple of the realtime extent, but we
> > don't perform the same validation on a directory with both RTINHERIT and
> > EXTSZINHERIT set, even though the only use-case of that combination is
> > to propagate extent size hints into new realtime files.  This leads to
> > inode corruption errors when the bad values are propagated.
> > 
> > Strengthen the validation routine to avoid this situation and fix the
> > open-coded unit conversion while we're at it.  Note that this is
> > technically a breaking change to the ondisk format, but the risk should
> > be minimal because (a) most vendors disable realtime, (b) letting
> > unaligned hints propagate to new files would immediately crash the
> > filesystem, and (c) xfs_repair flags such filesystems as corrupt, so
> > anyone with such a configuration is broken already anyway.
> > 
> > Signed-off-by: Darrick J. Wong <djwong@kernel.org>
> > ---
> 
> Ok, so this looks more like a proper fix, but does this turn an existing
> directory with (rtinherit && extszinherit) and a badly aligned extsz
> hint into a read validation error?

Hmm, you're right.  This fix needs to be more targeted in its nature.
For non-rt filesystems, the rtinherit bit being set on a directory is
benign because we won't set the realtime bit on new files, so there's no
need to introduce a new verifier error that will fail existing
filesystems.

We /do/ need to trap the misconfiguration for filesystems with an rt
volume because those filesystems will fail if the propagation happens.

I think the solution here is to change the verifier check here to
prevent the spread of bad extent size hints:

	if (rt_flag || (xfs_sb_version_hasrealtime(&mp->m_sb) &&
			rtinherit_flag && inherit_flag))
		blocksize_bytes = XFS_FSB_TO_B(mp, mp->m_sb.sb_rextsize);
	else
		blocksize_bytes = mp->m_sb.sb_blocksize;

...and add a check to xfs_ioctl_setattr_check_extsize to prevent
sysadmins from misconfiguring directories in the first place.

--D

> Brian
> 
> >  fs/xfs/libxfs/xfs_inode_buf.c |    7 ++++---
> >  1 file changed, 4 insertions(+), 3 deletions(-)
> > 
> > 
> > diff --git a/fs/xfs/libxfs/xfs_inode_buf.c b/fs/xfs/libxfs/xfs_inode_buf.c
> > index 5c9a7440d9e4..25261dd73290 100644
> > --- a/fs/xfs/libxfs/xfs_inode_buf.c
> > +++ b/fs/xfs/libxfs/xfs_inode_buf.c
> > @@ -569,19 +569,20 @@ xfs_inode_validate_extsize(
> >  	uint16_t			mode,
> >  	uint16_t			flags)
> >  {
> > -	bool				rt_flag;
> > +	bool				rt_flag, rtinherit_flag;
> >  	bool				hint_flag;
> >  	bool				inherit_flag;
> >  	uint32_t			extsize_bytes;
> >  	uint32_t			blocksize_bytes;
> >  
> >  	rt_flag = (flags & XFS_DIFLAG_REALTIME);
> > +	rtinherit_flag = (flags & XFS_DIFLAG_RTINHERIT);
> >  	hint_flag = (flags & XFS_DIFLAG_EXTSIZE);
> >  	inherit_flag = (flags & XFS_DIFLAG_EXTSZINHERIT);
> >  	extsize_bytes = XFS_FSB_TO_B(mp, extsize);
> >  
> > -	if (rt_flag)
> > -		blocksize_bytes = mp->m_sb.sb_rextsize << mp->m_sb.sb_blocklog;
> > +	if (rt_flag || (rtinherit_flag && inherit_flag))
> > +		blocksize_bytes = XFS_FSB_TO_B(mp, mp->m_sb.sb_rextsize);
> >  	else
> >  		blocksize_bytes = mp->m_sb.sb_blocksize;
> >  
> > 
> 

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 3/4] xfs: validate extsz hints against rt extent size when rtinherit is set
  2021-05-14 18:22     ` Darrick J. Wong
@ 2021-05-14 18:51       ` Brian Foster
  2021-05-14 20:30         ` Darrick J. Wong
  0 siblings, 1 reply; 13+ messages in thread
From: Brian Foster @ 2021-05-14 18:51 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs

On Fri, May 14, 2021 at 11:22:53AM -0700, Darrick J. Wong wrote:
> On Fri, May 14, 2021 at 08:38:35AM -0400, Brian Foster wrote:
> > On Wed, May 12, 2021 at 06:01:58PM -0700, Darrick J. Wong wrote:
> > > From: Darrick J. Wong <djwong@kernel.org>
> > > 
> > > The RTINHERIT bit can be set on a directory so that newly created
> > > regular files will have the REALTIME bit set to store their data on the
> > > realtime volume.  If an extent size hint (and EXTSZINHERIT) are set on
> > > the directory, the hint will also be copied into the new file.
> > > 
> > > As pointed out in previous patches, for realtime files we require the
> > > extent size hint be an integer multiple of the realtime extent, but we
> > > don't perform the same validation on a directory with both RTINHERIT and
> > > EXTSZINHERIT set, even though the only use-case of that combination is
> > > to propagate extent size hints into new realtime files.  This leads to
> > > inode corruption errors when the bad values are propagated.
> > > 
> > > Strengthen the validation routine to avoid this situation and fix the
> > > open-coded unit conversion while we're at it.  Note that this is
> > > technically a breaking change to the ondisk format, but the risk should
> > > be minimal because (a) most vendors disable realtime, (b) letting
> > > unaligned hints propagate to new files would immediately crash the
> > > filesystem, and (c) xfs_repair flags such filesystems as corrupt, so
> > > anyone with such a configuration is broken already anyway.
> > > 
> > > Signed-off-by: Darrick J. Wong <djwong@kernel.org>
> > > ---
> > 
> > Ok, so this looks more like a proper fix, but does this turn an existing
> > directory with (rtinherit && extszinherit) and a badly aligned extsz
> > hint into a read validation error?
> 
> Hmm, you're right.  This fix needs to be more targeted in its nature.
> For non-rt filesystems, the rtinherit bit being set on a directory is
> benign because we won't set the realtime bit on new files, so there's no
> need to introduce a new verifier error that will fail existing
> filesystems.
> 
> We /do/ need to trap the misconfiguration for filesystems with an rt
> volume because those filesystems will fail if the propagation happens.
> 
> I think the solution here is to change the verifier check here to
> prevent the spread of bad extent size hints:
> 
> 	if (rt_flag || (xfs_sb_version_hasrealtime(&mp->m_sb) &&
> 			rtinherit_flag && inherit_flag))
> 		blocksize_bytes = XFS_FSB_TO_B(mp, mp->m_sb.sb_rextsize);
> 	else
> 		blocksize_bytes = mp->m_sb.sb_blocksize;
> 
> ...and add a check to xfs_ioctl_setattr_check_extsize to prevent
> sysadmins from misconfiguring directories in the first place.
> 

It definitely makes sense to prevent this misconfiguration going
forward, but I'm a little confused on the intended behavior for
filesystems where this is already present (and not benign). ISTM the
previous patch is intended to allow the filesystem to continue running
with the added behavior that we restrict further propagation of
preexisting misconfigured extent size hints, but would this patch
trigger a verifier failure on read of such a misconfigured directory
inode..?

Brian

> --D
> 
> > Brian
> > 
> > >  fs/xfs/libxfs/xfs_inode_buf.c |    7 ++++---
> > >  1 file changed, 4 insertions(+), 3 deletions(-)
> > > 
> > > 
> > > diff --git a/fs/xfs/libxfs/xfs_inode_buf.c b/fs/xfs/libxfs/xfs_inode_buf.c
> > > index 5c9a7440d9e4..25261dd73290 100644
> > > --- a/fs/xfs/libxfs/xfs_inode_buf.c
> > > +++ b/fs/xfs/libxfs/xfs_inode_buf.c
> > > @@ -569,19 +569,20 @@ xfs_inode_validate_extsize(
> > >  	uint16_t			mode,
> > >  	uint16_t			flags)
> > >  {
> > > -	bool				rt_flag;
> > > +	bool				rt_flag, rtinherit_flag;
> > >  	bool				hint_flag;
> > >  	bool				inherit_flag;
> > >  	uint32_t			extsize_bytes;
> > >  	uint32_t			blocksize_bytes;
> > >  
> > >  	rt_flag = (flags & XFS_DIFLAG_REALTIME);
> > > +	rtinherit_flag = (flags & XFS_DIFLAG_RTINHERIT);
> > >  	hint_flag = (flags & XFS_DIFLAG_EXTSIZE);
> > >  	inherit_flag = (flags & XFS_DIFLAG_EXTSZINHERIT);
> > >  	extsize_bytes = XFS_FSB_TO_B(mp, extsize);
> > >  
> > > -	if (rt_flag)
> > > -		blocksize_bytes = mp->m_sb.sb_rextsize << mp->m_sb.sb_blocklog;
> > > +	if (rt_flag || (rtinherit_flag && inherit_flag))
> > > +		blocksize_bytes = XFS_FSB_TO_B(mp, mp->m_sb.sb_rextsize);
> > >  	else
> > >  		blocksize_bytes = mp->m_sb.sb_blocksize;
> > >  
> > > 
> > 
> 


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH 3/4] xfs: validate extsz hints against rt extent size when rtinherit is set
  2021-05-14 18:51       ` Brian Foster
@ 2021-05-14 20:30         ` Darrick J. Wong
  0 siblings, 0 replies; 13+ messages in thread
From: Darrick J. Wong @ 2021-05-14 20:30 UTC (permalink / raw)
  To: Brian Foster; +Cc: linux-xfs

On Fri, May 14, 2021 at 02:51:50PM -0400, Brian Foster wrote:
> On Fri, May 14, 2021 at 11:22:53AM -0700, Darrick J. Wong wrote:
> > On Fri, May 14, 2021 at 08:38:35AM -0400, Brian Foster wrote:
> > > On Wed, May 12, 2021 at 06:01:58PM -0700, Darrick J. Wong wrote:
> > > > From: Darrick J. Wong <djwong@kernel.org>
> > > > 
> > > > The RTINHERIT bit can be set on a directory so that newly created
> > > > regular files will have the REALTIME bit set to store their data on the
> > > > realtime volume.  If an extent size hint (and EXTSZINHERIT) are set on
> > > > the directory, the hint will also be copied into the new file.
> > > > 
> > > > As pointed out in previous patches, for realtime files we require the
> > > > extent size hint be an integer multiple of the realtime extent, but we
> > > > don't perform the same validation on a directory with both RTINHERIT and
> > > > EXTSZINHERIT set, even though the only use-case of that combination is
> > > > to propagate extent size hints into new realtime files.  This leads to
> > > > inode corruption errors when the bad values are propagated.
> > > > 
> > > > Strengthen the validation routine to avoid this situation and fix the
> > > > open-coded unit conversion while we're at it.  Note that this is
> > > > technically a breaking change to the ondisk format, but the risk should
> > > > be minimal because (a) most vendors disable realtime, (b) letting
> > > > unaligned hints propagate to new files would immediately crash the
> > > > filesystem, and (c) xfs_repair flags such filesystems as corrupt, so
> > > > anyone with such a configuration is broken already anyway.
> > > > 
> > > > Signed-off-by: Darrick J. Wong <djwong@kernel.org>
> > > > ---
> > > 
> > > Ok, so this looks more like a proper fix, but does this turn an existing
> > > directory with (rtinherit && extszinherit) and a badly aligned extsz
> > > hint into a read validation error?
> > 
> > Hmm, you're right.  This fix needs to be more targeted in its nature.
> > For non-rt filesystems, the rtinherit bit being set on a directory is
> > benign because we won't set the realtime bit on new files, so there's no
> > need to introduce a new verifier error that will fail existing
> > filesystems.
> > 
> > We /do/ need to trap the misconfiguration for filesystems with an rt
> > volume because those filesystems will fail if the propagation happens.
> > 
> > I think the solution here is to change the verifier check here to
> > prevent the spread of bad extent size hints:
> > 
> > 	if (rt_flag || (xfs_sb_version_hasrealtime(&mp->m_sb) &&
> > 			rtinherit_flag && inherit_flag))
> > 		blocksize_bytes = XFS_FSB_TO_B(mp, mp->m_sb.sb_rextsize);
> > 	else
> > 		blocksize_bytes = mp->m_sb.sb_blocksize;
> > 
> > ...and add a check to xfs_ioctl_setattr_check_extsize to prevent
> > sysadmins from misconfiguring directories in the first place.
> > 
> 
> It definitely makes sense to prevent this misconfiguration going
> forward, but I'm a little confused on the intended behavior for
> filesystems where this is already present (and not benign). ISTM the
> previous patch is intended to allow the filesystem to continue running
> with the added behavior that we restrict further propagation of
> preexisting misconfigured extent size hints, but would this patch
> trigger a verifier failure on read of such a misconfigured directory
> inode..?

Yeah, it would.  In the longer term I think we'd want to make it a part
of the verifier if whatever's the next new new inode-related feature is
set.

--D

> Brian
> 
> > --D
> > 
> > > Brian
> > > 
> > > >  fs/xfs/libxfs/xfs_inode_buf.c |    7 ++++---
> > > >  1 file changed, 4 insertions(+), 3 deletions(-)
> > > > 
> > > > 
> > > > diff --git a/fs/xfs/libxfs/xfs_inode_buf.c b/fs/xfs/libxfs/xfs_inode_buf.c
> > > > index 5c9a7440d9e4..25261dd73290 100644
> > > > --- a/fs/xfs/libxfs/xfs_inode_buf.c
> > > > +++ b/fs/xfs/libxfs/xfs_inode_buf.c
> > > > @@ -569,19 +569,20 @@ xfs_inode_validate_extsize(
> > > >  	uint16_t			mode,
> > > >  	uint16_t			flags)
> > > >  {
> > > > -	bool				rt_flag;
> > > > +	bool				rt_flag, rtinherit_flag;
> > > >  	bool				hint_flag;
> > > >  	bool				inherit_flag;
> > > >  	uint32_t			extsize_bytes;
> > > >  	uint32_t			blocksize_bytes;
> > > >  
> > > >  	rt_flag = (flags & XFS_DIFLAG_REALTIME);
> > > > +	rtinherit_flag = (flags & XFS_DIFLAG_RTINHERIT);
> > > >  	hint_flag = (flags & XFS_DIFLAG_EXTSIZE);
> > > >  	inherit_flag = (flags & XFS_DIFLAG_EXTSZINHERIT);
> > > >  	extsize_bytes = XFS_FSB_TO_B(mp, extsize);
> > > >  
> > > > -	if (rt_flag)
> > > > -		blocksize_bytes = mp->m_sb.sb_rextsize << mp->m_sb.sb_blocklog;
> > > > +	if (rt_flag || (rtinherit_flag && inherit_flag))
> > > > +		blocksize_bytes = XFS_FSB_TO_B(mp, mp->m_sb.sb_rextsize);
> > > >  	else
> > > >  		blocksize_bytes = mp->m_sb.sb_blocksize;
> > > >  
> > > > 
> > > 
> > 
> 

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2021-05-14 20:30 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-05-13  1:01 [PATCHSET 0/4] xfs: strengthen validation of extent size hints Darrick J. Wong
2021-05-13  1:01 ` [PATCH 1/4] xfs: standardize extent size hint validation Darrick J. Wong
2021-05-14 12:38   ` Brian Foster
2021-05-13  1:01 ` [PATCH 2/4] xfs: don't propagate invalid extent size hints to new files Darrick J. Wong
2021-05-14 12:38   ` Brian Foster
2021-05-14 15:55     ` Darrick J. Wong
2021-05-13  1:01 ` [PATCH 3/4] xfs: validate extsz hints against rt extent size when rtinherit is set Darrick J. Wong
2021-05-14 12:38   ` Brian Foster
2021-05-14 18:22     ` Darrick J. Wong
2021-05-14 18:51       ` Brian Foster
2021-05-14 20:30         ` Darrick J. Wong
2021-05-13  1:02 ` [PATCH 4/4] xfs: apply rt extent alignment constraints to cow extsize hint Darrick J. Wong
2021-05-14 17:24   ` Darrick J. Wong

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.