linux-xfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2 0/3] xfs: fix maxbytes problems on 32-bit systems
@ 2020-01-08  4:17 Darrick J. Wong
  2020-01-08  4:17 ` [PATCH 1/3] xfs: introduce XFS_MAX_FILEOFF Darrick J. Wong
                   ` (2 more replies)
  0 siblings, 3 replies; 15+ messages in thread
From: Darrick J. Wong @ 2020-01-08  4:17 UTC (permalink / raw)
  To: darrick.wong; +Cc: linux-xfs

Hi all,

Here's a series of patches to fix some problems related to s_maxbytes
that I found by running fstests in a 32-bit VM.  The first step is to
fix bunmapi during inactivation so that it cleans out /all/ the blocks
of an unlinked file instead of leaking them; and the second is to fix
the s_maxbytes computation to avoid setting a limit larger than what the
pagecache supports.

This has been lightly tested with fstests.  Enjoy!
Comments and questions are, as always, welcome.

--D

^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH 1/3] xfs: introduce XFS_MAX_FILEOFF
  2020-01-08  4:17 [PATCH v2 0/3] xfs: fix maxbytes problems on 32-bit systems Darrick J. Wong
@ 2020-01-08  4:17 ` Darrick J. Wong
  2020-01-08  8:09   ` Christoph Hellwig
  2020-01-08 20:40   ` Dave Chinner
  2020-01-08  4:17 ` [PATCH 2/3] xfs: truncate should remove all blocks, not just to the end of the page cache Darrick J. Wong
  2020-01-08  4:17 ` [PATCH 3/3] xfs: fix s_maxbytes computation on 32-bit kernels Darrick J. Wong
  2 siblings, 2 replies; 15+ messages in thread
From: Darrick J. Wong @ 2020-01-08  4:17 UTC (permalink / raw)
  To: darrick.wong; +Cc: linux-xfs

From: Darrick J. Wong <darrick.wong@oracle.com>

Introduce a new #define for the maximum supported file block offset.
We'll use this in the next patch to make it more obvious that we're
doing some operation for all possible inode fork mappings after a given
offset.  We can't use ULLONG_MAX here because bunmapi uses that to
detect when it's done.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 fs/xfs/libxfs/xfs_format.h |    1 +
 fs/xfs/xfs_reflink.c       |    3 ++-
 2 files changed, 3 insertions(+), 1 deletion(-)


diff --git a/fs/xfs/libxfs/xfs_format.h b/fs/xfs/libxfs/xfs_format.h
index 1b7dcbae051c..c2976e441d43 100644
--- a/fs/xfs/libxfs/xfs_format.h
+++ b/fs/xfs/libxfs/xfs_format.h
@@ -1540,6 +1540,7 @@ typedef struct xfs_bmdr_block {
 #define BMBT_BLOCKCOUNT_BITLEN	21
 
 #define BMBT_STARTOFF_MASK	((1ULL << BMBT_STARTOFF_BITLEN) - 1)
+#define XFS_MAX_FILEOFF		(BMBT_STARTOFF_MASK)
 
 typedef struct xfs_bmbt_rec {
 	__be64			l0, l1;
diff --git a/fs/xfs/xfs_reflink.c b/fs/xfs/xfs_reflink.c
index de451235c4ee..7a6c94295b8a 100644
--- a/fs/xfs/xfs_reflink.c
+++ b/fs/xfs/xfs_reflink.c
@@ -1457,7 +1457,8 @@ xfs_reflink_clear_inode_flag(
 	 * We didn't find any shared blocks so turn off the reflink flag.
 	 * First, get rid of any leftover CoW mappings.
 	 */
-	error = xfs_reflink_cancel_cow_blocks(ip, tpp, 0, NULLFILEOFF, true);
+	error = xfs_reflink_cancel_cow_blocks(ip, tpp, 0, XFS_MAX_FILEOFF,
+			true);
 	if (error)
 		return error;
 


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 2/3] xfs: truncate should remove all blocks, not just to the end of the page cache
  2020-01-08  4:17 [PATCH v2 0/3] xfs: fix maxbytes problems on 32-bit systems Darrick J. Wong
  2020-01-08  4:17 ` [PATCH 1/3] xfs: introduce XFS_MAX_FILEOFF Darrick J. Wong
@ 2020-01-08  4:17 ` Darrick J. Wong
  2020-01-08  8:11   ` Christoph Hellwig
  2020-01-08  4:17 ` [PATCH 3/3] xfs: fix s_maxbytes computation on 32-bit kernels Darrick J. Wong
  2 siblings, 1 reply; 15+ messages in thread
From: Darrick J. Wong @ 2020-01-08  4:17 UTC (permalink / raw)
  To: darrick.wong; +Cc: linux-xfs

From: Darrick J. Wong <darrick.wong@oracle.com>

xfs_itruncate_extents_flags() is supposed to unmap every block in a file
from EOF onwards.  Oddly, it uses s_maxbytes as the upper limit to the
bunmapi range, even though s_maxbytes reflects the highest offset the
pagecache can support, not the highest offset that XFS supports.

The result of this confusion is that if you create a 20T file on a
64-bit machine, mount the filesystem on a 32-bit machine, and remove the
file, we leak everything above 16T.  Fix this by capping the bunmapi
request at the maximum possible block offset, not s_maxbytes.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 fs/xfs/xfs_inode.c |   23 +++++++++++------------
 1 file changed, 11 insertions(+), 12 deletions(-)


diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c
index fc3aec26ef87..79799ab30c93 100644
--- a/fs/xfs/xfs_inode.c
+++ b/fs/xfs/xfs_inode.c
@@ -1518,7 +1518,6 @@ xfs_itruncate_extents_flags(
 	struct xfs_mount	*mp = ip->i_mount;
 	struct xfs_trans	*tp = *tpp;
 	xfs_fileoff_t		first_unmap_block;
-	xfs_fileoff_t		last_block;
 	xfs_filblks_t		unmap_len;
 	int			error = 0;
 
@@ -1540,21 +1539,21 @@ xfs_itruncate_extents_flags(
 	 * the end of the file (in a crash where the space is allocated
 	 * but the inode size is not yet updated), simply remove any
 	 * blocks which show up between the new EOF and the maximum
-	 * possible file size.  If the first block to be removed is
-	 * beyond the maximum file size (ie it is the same as last_block),
-	 * then there is nothing to do.
+	 * possible file size.
+	 *
+	 * We have to free all the blocks to the bmbt maximum offset, even if
+	 * the page cache can't scale that far.
 	 */
 	first_unmap_block = XFS_B_TO_FSB(mp, (xfs_ufsize_t)new_size);
-	last_block = XFS_B_TO_FSB(mp, mp->m_super->s_maxbytes);
-	if (first_unmap_block == last_block)
+	if (first_unmap_block == XFS_MAX_FILEOFF)
 		return 0;
 
-	ASSERT(first_unmap_block < last_block);
-	unmap_len = last_block - first_unmap_block + 1;
-	while (!done) {
+	ASSERT(first_unmap_block < XFS_MAX_FILEOFF);
+	unmap_len = XFS_MAX_FILEOFF - first_unmap_block + 1;
+	while (unmap_len > 0) {
 		ASSERT(tp->t_firstblock == NULLFSBLOCK);
-		error = xfs_bunmapi(tp, ip, first_unmap_block, unmap_len, flags,
-				    XFS_ITRUNC_MAX_EXTENTS, &done);
+		error = __xfs_bunmapi(tp, ip, first_unmap_block, &unmap_len,
+				flags, XFS_ITRUNC_MAX_EXTENTS);
 		if (error)
 			goto out;
 
@@ -1574,7 +1573,7 @@ xfs_itruncate_extents_flags(
 	if (whichfork == XFS_DATA_FORK) {
 		/* Remove all pending CoW reservations. */
 		error = xfs_reflink_cancel_cow_blocks(ip, &tp,
-				first_unmap_block, last_block, true);
+				first_unmap_block, XFS_MAX_FILEOFF, true);
 		if (error)
 			goto out;
 


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 3/3] xfs: fix s_maxbytes computation on 32-bit kernels
  2020-01-08  4:17 [PATCH v2 0/3] xfs: fix maxbytes problems on 32-bit systems Darrick J. Wong
  2020-01-08  4:17 ` [PATCH 1/3] xfs: introduce XFS_MAX_FILEOFF Darrick J. Wong
  2020-01-08  4:17 ` [PATCH 2/3] xfs: truncate should remove all blocks, not just to the end of the page cache Darrick J. Wong
@ 2020-01-08  4:17 ` Darrick J. Wong
  2020-01-08  8:12   ` Christoph Hellwig
  2 siblings, 1 reply; 15+ messages in thread
From: Darrick J. Wong @ 2020-01-08  4:17 UTC (permalink / raw)
  To: darrick.wong; +Cc: linux-xfs

From: Darrick J. Wong <darrick.wong@oracle.com>

I observed a hang in generic/308 while running fstests on a i686 kernel.
The hang occurred when trying to purge the pagecache on a large sparse
file that had a page created past MAX_LFS_FILESIZE, which caused an
integer overflow in the pagecache xarray and resulted in an infinite
loop.

I then noticed that Linus changed the definition of MAX_LFS_FILESIZE in
commit 0cc3b0ec23ce ("Clarify (and fix) MAX_LFS_FILESIZE macros") so
that it is now one page short of the maximum page index on 32-bit
kernels.  Because the XFS function to compute max offset open-codes the
2005-era MAX_LFS_FILESIZE computation and neither the vfs nor mm perform
any sanity checking of s_maxbytes, the code in generic/308 can create a
page above the pagecache's limit and kaboom.

Fix all this by setting s_maxbytes to MAX_LFS_FILESIZE directly and
aborting the mount with a warning if our assumptions ever break.  I have
no answer for why this seems to have been broken for years and nobody
noticed.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 fs/xfs/xfs_super.c |   48 +++++++++++++++++++++---------------------------
 1 file changed, 21 insertions(+), 27 deletions(-)


diff --git a/fs/xfs/xfs_super.c b/fs/xfs/xfs_super.c
index d9ae27ddf253..05ca2a7bbe55 100644
--- a/fs/xfs/xfs_super.c
+++ b/fs/xfs/xfs_super.c
@@ -193,32 +193,6 @@ xfs_fs_show_options(
 	return 0;
 }
 
-static uint64_t
-xfs_max_file_offset(
-	unsigned int		blockshift)
-{
-	unsigned int		pagefactor = 1;
-	unsigned int		bitshift = BITS_PER_LONG - 1;
-
-	/* Figure out maximum filesize, on Linux this can depend on
-	 * the filesystem blocksize (on 32 bit platforms).
-	 * __block_write_begin does this in an [unsigned] long long...
-	 *      page->index << (PAGE_SHIFT - bbits)
-	 * So, for page sized blocks (4K on 32 bit platforms),
-	 * this wraps at around 8Tb (hence MAX_LFS_FILESIZE which is
-	 *      (((u64)PAGE_SIZE << (BITS_PER_LONG-1))-1)
-	 * but for smaller blocksizes it is less (bbits = log2 bsize).
-	 */
-
-#if BITS_PER_LONG == 32
-	ASSERT(sizeof(sector_t) == 8);
-	pagefactor = PAGE_SIZE;
-	bitshift = BITS_PER_LONG;
-#endif
-
-	return (((uint64_t)pagefactor) << bitshift) - 1;
-}
-
 /*
  * Set parameters for inode allocation heuristics, taking into account
  * filesystem size and inode32/inode64 mount options; i.e. specifically
@@ -1424,6 +1398,26 @@ xfs_fc_fill_super(
 	if (error)
 		goto out_free_sb;
 
+	/*
+	 * XFS block mappings use 54 bits to store the logical block offset.
+	 * This should suffice to handle the maximum file size that the VFS
+	 * supports (currently 2^63 bytes on 64-bit and ULONG_MAX << PAGE_SHIFT
+	 * bytes on 32-bit), but as XFS and VFS have gotten the s_maxbytes
+	 * calculation wrong on 32-bit kernels in the past, we'll add a WARN_ON
+	 * to check this assertion.
+	 *
+	 * Avoid integer overflow by comparing the maximum bmbt offset to the
+	 * maximum pagecache offset in units of fs blocks.
+	 */
+	if (XFS_MAX_FILEOFF < XFS_B_TO_FSBT(mp, MAX_LFS_FILESIZE)) {
+		xfs_warn(mp,
+"MAX_LFS_FILESIZE block offset (%llu) exceeds extent map maximum (%llu)!",
+			 XFS_B_TO_FSBT(mp, MAX_LFS_FILESIZE),
+			 XFS_MAX_FILEOFF);
+		error = -EINVAL;
+		goto out_free_sb;
+	}
+
 	error = xfs_filestream_mount(mp);
 	if (error)
 		goto out_free_sb;
@@ -1435,7 +1429,7 @@ xfs_fc_fill_super(
 	sb->s_magic = XFS_SUPER_MAGIC;
 	sb->s_blocksize = mp->m_sb.sb_blocksize;
 	sb->s_blocksize_bits = ffs(sb->s_blocksize) - 1;
-	sb->s_maxbytes = xfs_max_file_offset(sb->s_blocksize_bits);
+	sb->s_maxbytes = MAX_LFS_FILESIZE;
 	sb->s_max_links = XFS_MAXLINK;
 	sb->s_time_gran = 1;
 	sb->s_time_min = S32_MIN;


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/3] xfs: introduce XFS_MAX_FILEOFF
  2020-01-08  4:17 ` [PATCH 1/3] xfs: introduce XFS_MAX_FILEOFF Darrick J. Wong
@ 2020-01-08  8:09   ` Christoph Hellwig
  2020-01-08 16:37     ` Darrick J. Wong
  2020-01-08 20:40   ` Dave Chinner
  1 sibling, 1 reply; 15+ messages in thread
From: Christoph Hellwig @ 2020-01-08  8:09 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs

> --- a/fs/xfs/libxfs/xfs_format.h
> +++ b/fs/xfs/libxfs/xfs_format.h
> @@ -1540,6 +1540,7 @@ typedef struct xfs_bmdr_block {
>  #define BMBT_BLOCKCOUNT_BITLEN	21
>  
>  #define BMBT_STARTOFF_MASK	((1ULL << BMBT_STARTOFF_BITLEN) - 1)
> +#define XFS_MAX_FILEOFF		(BMBT_STARTOFF_MASK)

I think throwing in a comment here would be useful.

Otherwise the patch looks good to me.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 2/3] xfs: truncate should remove all blocks, not just to the end of the page cache
  2020-01-08  4:17 ` [PATCH 2/3] xfs: truncate should remove all blocks, not just to the end of the page cache Darrick J. Wong
@ 2020-01-08  8:11   ` Christoph Hellwig
  2020-01-08 16:37     ` Darrick J. Wong
  0 siblings, 1 reply; 15+ messages in thread
From: Christoph Hellwig @ 2020-01-08  8:11 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs

On Tue, Jan 07, 2020 at 08:17:45PM -0800, Darrick J. Wong wrote:
> From: Darrick J. Wong <darrick.wong@oracle.com>
> 
> xfs_itruncate_extents_flags() is supposed to unmap every block in a file
> from EOF onwards.  Oddly, it uses s_maxbytes as the upper limit to the
> bunmapi range, even though s_maxbytes reflects the highest offset the
> pagecache can support, not the highest offset that XFS supports.
> 
> The result of this confusion is that if you create a 20T file on a
> 64-bit machine, mount the filesystem on a 32-bit machine, and remove the
> file, we leak everything above 16T.  Fix this by capping the bunmapi
> request at the maximum possible block offset, not s_maxbytes.
> 
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> ---
>  fs/xfs/xfs_inode.c |   23 +++++++++++------------
>  1 file changed, 11 insertions(+), 12 deletions(-)
> 
> 
> diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c
> index fc3aec26ef87..79799ab30c93 100644
> --- a/fs/xfs/xfs_inode.c
> +++ b/fs/xfs/xfs_inode.c
> @@ -1518,7 +1518,6 @@ xfs_itruncate_extents_flags(
>  	struct xfs_mount	*mp = ip->i_mount;
>  	struct xfs_trans	*tp = *tpp;
>  	xfs_fileoff_t		first_unmap_block;
> -	xfs_fileoff_t		last_block;
>  	xfs_filblks_t		unmap_len;
>  	int			error = 0;
>  
> @@ -1540,21 +1539,21 @@ xfs_itruncate_extents_flags(
>  	 * the end of the file (in a crash where the space is allocated
>  	 * but the inode size is not yet updated), simply remove any
>  	 * blocks which show up between the new EOF and the maximum
> -	 * possible file size.  If the first block to be removed is
> -	 * beyond the maximum file size (ie it is the same as last_block),
> -	 * then there is nothing to do.
> +	 * possible file size.
> +	 *
> +	 * We have to free all the blocks to the bmbt maximum offset, even if
> +	 * the page cache can't scale that far.
>  	 */
>  	first_unmap_block = XFS_B_TO_FSB(mp, (xfs_ufsize_t)new_size);
> -	last_block = XFS_B_TO_FSB(mp, mp->m_super->s_maxbytes);
> -	if (first_unmap_block == last_block)
> +	if (first_unmap_block == XFS_MAX_FILEOFF)
>  		return 0;
>  
> -	ASSERT(first_unmap_block < last_block);
> -	unmap_len = last_block - first_unmap_block + 1;
> -	while (!done) {
> +	ASSERT(first_unmap_block < XFS_MAX_FILEOFF);

Instead of the assert we could just do the early return for

	first_unmap_block >= XFS_MAX_FILEOFF

and throw in a WARN_ON_ONCE, as that condition really should be nothing
but a sanity check.

Otherwise this looks good to me.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 3/3] xfs: fix s_maxbytes computation on 32-bit kernels
  2020-01-08  4:17 ` [PATCH 3/3] xfs: fix s_maxbytes computation on 32-bit kernels Darrick J. Wong
@ 2020-01-08  8:12   ` Christoph Hellwig
  0 siblings, 0 replies; 15+ messages in thread
From: Christoph Hellwig @ 2020-01-08  8:12 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs

Looks good,

Reviewed-by: Christoph Hellwig <hch@lst.de>

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/3] xfs: introduce XFS_MAX_FILEOFF
  2020-01-08  8:09   ` Christoph Hellwig
@ 2020-01-08 16:37     ` Darrick J. Wong
  0 siblings, 0 replies; 15+ messages in thread
From: Darrick J. Wong @ 2020-01-08 16:37 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: linux-xfs

On Wed, Jan 08, 2020 at 12:09:27AM -0800, Christoph Hellwig wrote:
> > --- a/fs/xfs/libxfs/xfs_format.h
> > +++ b/fs/xfs/libxfs/xfs_format.h
> > @@ -1540,6 +1540,7 @@ typedef struct xfs_bmdr_block {
> >  #define BMBT_BLOCKCOUNT_BITLEN	21
> >  
> >  #define BMBT_STARTOFF_MASK	((1ULL << BMBT_STARTOFF_BITLEN) - 1)
> > +#define XFS_MAX_FILEOFF		(BMBT_STARTOFF_MASK)
> 
> I think throwing in a comment here would be useful.
> 
> Otherwise the patch looks good to me.

Ok, done.

--D

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 2/3] xfs: truncate should remove all blocks, not just to the end of the page cache
  2020-01-08  8:11   ` Christoph Hellwig
@ 2020-01-08 16:37     ` Darrick J. Wong
  0 siblings, 0 replies; 15+ messages in thread
From: Darrick J. Wong @ 2020-01-08 16:37 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: linux-xfs

On Wed, Jan 08, 2020 at 12:11:57AM -0800, Christoph Hellwig wrote:
> On Tue, Jan 07, 2020 at 08:17:45PM -0800, Darrick J. Wong wrote:
> > From: Darrick J. Wong <darrick.wong@oracle.com>
> > 
> > xfs_itruncate_extents_flags() is supposed to unmap every block in a file
> > from EOF onwards.  Oddly, it uses s_maxbytes as the upper limit to the
> > bunmapi range, even though s_maxbytes reflects the highest offset the
> > pagecache can support, not the highest offset that XFS supports.
> > 
> > The result of this confusion is that if you create a 20T file on a
> > 64-bit machine, mount the filesystem on a 32-bit machine, and remove the
> > file, we leak everything above 16T.  Fix this by capping the bunmapi
> > request at the maximum possible block offset, not s_maxbytes.
> > 
> > Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> > ---
> >  fs/xfs/xfs_inode.c |   23 +++++++++++------------
> >  1 file changed, 11 insertions(+), 12 deletions(-)
> > 
> > 
> > diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c
> > index fc3aec26ef87..79799ab30c93 100644
> > --- a/fs/xfs/xfs_inode.c
> > +++ b/fs/xfs/xfs_inode.c
> > @@ -1518,7 +1518,6 @@ xfs_itruncate_extents_flags(
> >  	struct xfs_mount	*mp = ip->i_mount;
> >  	struct xfs_trans	*tp = *tpp;
> >  	xfs_fileoff_t		first_unmap_block;
> > -	xfs_fileoff_t		last_block;
> >  	xfs_filblks_t		unmap_len;
> >  	int			error = 0;
> >  
> > @@ -1540,21 +1539,21 @@ xfs_itruncate_extents_flags(
> >  	 * the end of the file (in a crash where the space is allocated
> >  	 * but the inode size is not yet updated), simply remove any
> >  	 * blocks which show up between the new EOF and the maximum
> > -	 * possible file size.  If the first block to be removed is
> > -	 * beyond the maximum file size (ie it is the same as last_block),
> > -	 * then there is nothing to do.
> > +	 * possible file size.
> > +	 *
> > +	 * We have to free all the blocks to the bmbt maximum offset, even if
> > +	 * the page cache can't scale that far.
> >  	 */
> >  	first_unmap_block = XFS_B_TO_FSB(mp, (xfs_ufsize_t)new_size);
> > -	last_block = XFS_B_TO_FSB(mp, mp->m_super->s_maxbytes);
> > -	if (first_unmap_block == last_block)
> > +	if (first_unmap_block == XFS_MAX_FILEOFF)
> >  		return 0;
> >  
> > -	ASSERT(first_unmap_block < last_block);
> > -	unmap_len = last_block - first_unmap_block + 1;
> > -	while (!done) {
> > +	ASSERT(first_unmap_block < XFS_MAX_FILEOFF);
> 
> Instead of the assert we could just do the early return for
> 
> 	first_unmap_block >= XFS_MAX_FILEOFF
> 
> and throw in a WARN_ON_ONCE, as that condition really should be nothing
> but a sanity check.
> 
> Otherwise this looks good to me.

Ok, done.

--D

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/3] xfs: introduce XFS_MAX_FILEOFF
  2020-01-08  4:17 ` [PATCH 1/3] xfs: introduce XFS_MAX_FILEOFF Darrick J. Wong
  2020-01-08  8:09   ` Christoph Hellwig
@ 2020-01-08 20:40   ` Dave Chinner
  2020-01-08 22:32     ` Darrick J. Wong
  1 sibling, 1 reply; 15+ messages in thread
From: Dave Chinner @ 2020-01-08 20:40 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs

On Tue, Jan 07, 2020 at 08:17:38PM -0800, Darrick J. Wong wrote:
> From: Darrick J. Wong <darrick.wong@oracle.com>
> 
> Introduce a new #define for the maximum supported file block offset.
> We'll use this in the next patch to make it more obvious that we're
> doing some operation for all possible inode fork mappings after a given
> offset.  We can't use ULLONG_MAX here because bunmapi uses that to
> detect when it's done.
> 
> Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> ---
>  fs/xfs/libxfs/xfs_format.h |    1 +
>  fs/xfs/xfs_reflink.c       |    3 ++-
>  2 files changed, 3 insertions(+), 1 deletion(-)
> 
> 
> diff --git a/fs/xfs/libxfs/xfs_format.h b/fs/xfs/libxfs/xfs_format.h
> index 1b7dcbae051c..c2976e441d43 100644
> --- a/fs/xfs/libxfs/xfs_format.h
> +++ b/fs/xfs/libxfs/xfs_format.h
> @@ -1540,6 +1540,7 @@ typedef struct xfs_bmdr_block {
>  #define BMBT_BLOCKCOUNT_BITLEN	21
>  
>  #define BMBT_STARTOFF_MASK	((1ULL << BMBT_STARTOFF_BITLEN) - 1)
> +#define XFS_MAX_FILEOFF		(BMBT_STARTOFF_MASK)

Isn't the maximum file offset in the BMBT the max start offset + the
max length of the extent that is located at BMBT_STARTOFF_MASK?

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/3] xfs: introduce XFS_MAX_FILEOFF
  2020-01-08 20:40   ` Dave Chinner
@ 2020-01-08 22:32     ` Darrick J. Wong
  2020-01-08 22:42       ` Dave Chinner
  0 siblings, 1 reply; 15+ messages in thread
From: Darrick J. Wong @ 2020-01-08 22:32 UTC (permalink / raw)
  To: Dave Chinner; +Cc: linux-xfs

On Thu, Jan 09, 2020 at 07:40:41AM +1100, Dave Chinner wrote:
> On Tue, Jan 07, 2020 at 08:17:38PM -0800, Darrick J. Wong wrote:
> > From: Darrick J. Wong <darrick.wong@oracle.com>
> > 
> > Introduce a new #define for the maximum supported file block offset.
> > We'll use this in the next patch to make it more obvious that we're
> > doing some operation for all possible inode fork mappings after a given
> > offset.  We can't use ULLONG_MAX here because bunmapi uses that to
> > detect when it's done.
> > 
> > Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> > ---
> >  fs/xfs/libxfs/xfs_format.h |    1 +
> >  fs/xfs/xfs_reflink.c       |    3 ++-
> >  2 files changed, 3 insertions(+), 1 deletion(-)
> > 
> > 
> > diff --git a/fs/xfs/libxfs/xfs_format.h b/fs/xfs/libxfs/xfs_format.h
> > index 1b7dcbae051c..c2976e441d43 100644
> > --- a/fs/xfs/libxfs/xfs_format.h
> > +++ b/fs/xfs/libxfs/xfs_format.h
> > @@ -1540,6 +1540,7 @@ typedef struct xfs_bmdr_block {
> >  #define BMBT_BLOCKCOUNT_BITLEN	21
> >  
> >  #define BMBT_STARTOFF_MASK	((1ULL << BMBT_STARTOFF_BITLEN) - 1)
> > +#define XFS_MAX_FILEOFF		(BMBT_STARTOFF_MASK)
> 
> Isn't the maximum file offset in the BMBT the max start offset + the
> max length of the extent that is located at BMBT_STARTOFF_MASK?

Apologies for responding to a question with another question, but has
there ever been an XFS that supported an inode size of more than 8EB?

Linux supports at most a file offset of 8EB, which is 2^63-1, or
0x7FFF,FFFF,FFFF,FFFF.  On a filesystem with 512-byte blocks, the very
last byte in the file would be in block 2^54-1, or 0x3F,FFFF,FFFF,FFFF.
Larger blocksizes decrease that even further (e.g. 2^47-1, or
0x7FFF,FFFF,FFFF on 64k block filesystems).

Therefore, on Linux I conclude that the largest file offset (block)
possible is 2^54-1, which is BMBT_STARTOFF_MASK.  Unless there's an
XFS port that actually supports 16EB files, BMBT_STARTOFF_MASK will
suffice here.

--D

> Cheers,
> 
> Dave.
> -- 
> Dave Chinner
> david@fromorbit.com

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/3] xfs: introduce XFS_MAX_FILEOFF
  2020-01-08 22:32     ` Darrick J. Wong
@ 2020-01-08 22:42       ` Dave Chinner
  2020-01-08 23:04         ` Darrick J. Wong
  0 siblings, 1 reply; 15+ messages in thread
From: Dave Chinner @ 2020-01-08 22:42 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs

On Wed, Jan 08, 2020 at 02:32:38PM -0800, Darrick J. Wong wrote:
> On Thu, Jan 09, 2020 at 07:40:41AM +1100, Dave Chinner wrote:
> > On Tue, Jan 07, 2020 at 08:17:38PM -0800, Darrick J. Wong wrote:
> > > From: Darrick J. Wong <darrick.wong@oracle.com>
> > > 
> > > Introduce a new #define for the maximum supported file block offset.
> > > We'll use this in the next patch to make it more obvious that we're
> > > doing some operation for all possible inode fork mappings after a given
> > > offset.  We can't use ULLONG_MAX here because bunmapi uses that to
> > > detect when it's done.
> > > 
> > > Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> > > ---
> > >  fs/xfs/libxfs/xfs_format.h |    1 +
> > >  fs/xfs/xfs_reflink.c       |    3 ++-
> > >  2 files changed, 3 insertions(+), 1 deletion(-)
> > > 
> > > 
> > > diff --git a/fs/xfs/libxfs/xfs_format.h b/fs/xfs/libxfs/xfs_format.h
> > > index 1b7dcbae051c..c2976e441d43 100644
> > > --- a/fs/xfs/libxfs/xfs_format.h
> > > +++ b/fs/xfs/libxfs/xfs_format.h
> > > @@ -1540,6 +1540,7 @@ typedef struct xfs_bmdr_block {
> > >  #define BMBT_BLOCKCOUNT_BITLEN	21
> > >  
> > >  #define BMBT_STARTOFF_MASK	((1ULL << BMBT_STARTOFF_BITLEN) - 1)
> > > +#define XFS_MAX_FILEOFF		(BMBT_STARTOFF_MASK)
> > 
> > Isn't the maximum file offset in the BMBT the max start offset + the
> > max length of the extent that is located at BMBT_STARTOFF_MASK?
> 
> Apologies for responding to a question with another question, but has
> there ever been an XFS that supported an inode size of more than 8EB?

Doubt it.

> Linux supports at most a file offset of 8EB, which is 2^63-1, or
> 0x7FFF,FFFF,FFFF,FFFF.  On a filesystem with 512-byte blocks, the very
> last byte in the file would be in block 2^54-1, or 0x3F,FFFF,FFFF,FFFF.
> Larger blocksizes decrease that even further (e.g. 2^47-1, or
> 0x7FFF,FFFF,FFFF on 64k block filesystems).
>
> Therefore, on Linux I conclude that the largest file offset (block)
> possible is 2^54-1, which is BMBT_STARTOFF_MASK.  Unless there's an
> XFS port that actually supports 16EB files, BMBT_STARTOFF_MASK will
> suffice here.

Sure, but my point was that checks against the max file offset
as a block count are applied to the startoff field, not the
startoff + blockcount value, so we can potentially get extents on
disk beyond the above definition of XFS_MAX_FILEOFF...

i.e. startoff can be < XFS_MAX_FILEOFF, but startoff + blockcount
can be > XFS_MAX_FILEOFF, and there's nothing in the code that
prevents that from occurring...

e.g. what's preventing speculative delalloc from going beyond
XFS_MAX_FILEOFF, even though the actual file offset that is being
written is within XFS_MAX_FILEOFF?

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/3] xfs: introduce XFS_MAX_FILEOFF
  2020-01-08 22:42       ` Dave Chinner
@ 2020-01-08 23:04         ` Darrick J. Wong
  0 siblings, 0 replies; 15+ messages in thread
From: Darrick J. Wong @ 2020-01-08 23:04 UTC (permalink / raw)
  To: Dave Chinner; +Cc: linux-xfs

On Thu, Jan 09, 2020 at 09:42:16AM +1100, Dave Chinner wrote:
> On Wed, Jan 08, 2020 at 02:32:38PM -0800, Darrick J. Wong wrote:
> > On Thu, Jan 09, 2020 at 07:40:41AM +1100, Dave Chinner wrote:
> > > On Tue, Jan 07, 2020 at 08:17:38PM -0800, Darrick J. Wong wrote:
> > > > From: Darrick J. Wong <darrick.wong@oracle.com>
> > > > 
> > > > Introduce a new #define for the maximum supported file block offset.
> > > > We'll use this in the next patch to make it more obvious that we're
> > > > doing some operation for all possible inode fork mappings after a given
> > > > offset.  We can't use ULLONG_MAX here because bunmapi uses that to
> > > > detect when it's done.
> > > > 
> > > > Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
> > > > ---
> > > >  fs/xfs/libxfs/xfs_format.h |    1 +
> > > >  fs/xfs/xfs_reflink.c       |    3 ++-
> > > >  2 files changed, 3 insertions(+), 1 deletion(-)
> > > > 
> > > > 
> > > > diff --git a/fs/xfs/libxfs/xfs_format.h b/fs/xfs/libxfs/xfs_format.h
> > > > index 1b7dcbae051c..c2976e441d43 100644
> > > > --- a/fs/xfs/libxfs/xfs_format.h
> > > > +++ b/fs/xfs/libxfs/xfs_format.h
> > > > @@ -1540,6 +1540,7 @@ typedef struct xfs_bmdr_block {
> > > >  #define BMBT_BLOCKCOUNT_BITLEN	21
> > > >  
> > > >  #define BMBT_STARTOFF_MASK	((1ULL << BMBT_STARTOFF_BITLEN) - 1)
> > > > +#define XFS_MAX_FILEOFF		(BMBT_STARTOFF_MASK)
> > > 
> > > Isn't the maximum file offset in the BMBT the max start offset + the
> > > max length of the extent that is located at BMBT_STARTOFF_MASK?
> > 
> > Apologies for responding to a question with another question, but has
> > there ever been an XFS that supported an inode size of more than 8EB?
> 
> Doubt it.
> 
> > Linux supports at most a file offset of 8EB, which is 2^63-1, or
> > 0x7FFF,FFFF,FFFF,FFFF.  On a filesystem with 512-byte blocks, the very
> > last byte in the file would be in block 2^54-1, or 0x3F,FFFF,FFFF,FFFF.
> > Larger blocksizes decrease that even further (e.g. 2^47-1, or
> > 0x7FFF,FFFF,FFFF on 64k block filesystems).
> >
> > Therefore, on Linux I conclude that the largest file offset (block)
> > possible is 2^54-1, which is BMBT_STARTOFF_MASK.  Unless there's an
> > XFS port that actually supports 16EB files, BMBT_STARTOFF_MASK will
> > suffice here.
> 
> Sure, but my point was that checks against the max file offset
> as a block count are applied to the startoff field, not the
> startoff + blockcount value, so we can potentially get extents on
> disk beyond the above definition of XFS_MAX_FILEOFF...
> 
> i.e. startoff can be < XFS_MAX_FILEOFF, but startoff + blockcount
> can be > XFS_MAX_FILEOFF, and there's nothing in the code that
> prevents that from occurring...
> 
> e.g. what's preventing speculative delalloc from going beyond
> XFS_MAX_FILEOFF, even though the actual file offset that is being
> written is within XFS_MAX_FILEOFF?

I thought we constrained the @prealloc_blocks argument to
xfs_bmapi_reserve_delalloc based on s_maxbytes:

	p_end_fsb = min(p_end_fsb,
		XFS_B_TO_FSB(mp, mp->m_super->s_maxbytes));

But as there's nothing in the verifiers or anywhere else prohibiting
this, I'll change it to (BMBT_STARTOFF_MASK + MAXEXTLEN) for now.

--D

> Cheers,
> 
> Dave.
> -- 
> Dave Chinner
> david@fromorbit.com

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/3] xfs: introduce XFS_MAX_FILEOFF
  2020-01-09 18:44 ` [PATCH 1/3] xfs: introduce XFS_MAX_FILEOFF Darrick J. Wong
@ 2020-01-10 11:52   ` Christoph Hellwig
  0 siblings, 0 replies; 15+ messages in thread
From: Christoph Hellwig @ 2020-01-10 11:52 UTC (permalink / raw)
  To: Darrick J. Wong; +Cc: linux-xfs

On Thu, Jan 09, 2020 at 10:44:22AM -0800, Darrick J. Wong wrote:
> From: Darrick J. Wong <darrick.wong@oracle.com>
> 
> Introduce a new #define for the maximum supported file block offset.
> We'll use this in the next patch to make it more obvious that we're
> doing some operation for all possible inode fork mappings after a given
> offset.  We can't use ULLONG_MAX here because bunmapi uses that to
> detect when it's done.

Looks good,

Reviewed-by: Christoph Hellwig <hch@lst.de>

^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH 1/3] xfs: introduce XFS_MAX_FILEOFF
  2020-01-09 18:44 [PATCH v3 0/3] xfs: fix maxbytes problems on 32-bit systems Darrick J. Wong
@ 2020-01-09 18:44 ` Darrick J. Wong
  2020-01-10 11:52   ` Christoph Hellwig
  0 siblings, 1 reply; 15+ messages in thread
From: Darrick J. Wong @ 2020-01-09 18:44 UTC (permalink / raw)
  To: darrick.wong; +Cc: linux-xfs

From: Darrick J. Wong <darrick.wong@oracle.com>

Introduce a new #define for the maximum supported file block offset.
We'll use this in the next patch to make it more obvious that we're
doing some operation for all possible inode fork mappings after a given
offset.  We can't use ULLONG_MAX here because bunmapi uses that to
detect when it's done.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 fs/xfs/libxfs/xfs_format.h |    7 +++++++
 fs/xfs/xfs_reflink.c       |    3 ++-
 2 files changed, 9 insertions(+), 1 deletion(-)


diff --git a/fs/xfs/libxfs/xfs_format.h b/fs/xfs/libxfs/xfs_format.h
index 1b7dcbae051c..77e9fa385980 100644
--- a/fs/xfs/libxfs/xfs_format.h
+++ b/fs/xfs/libxfs/xfs_format.h
@@ -1540,6 +1540,13 @@ typedef struct xfs_bmdr_block {
 #define BMBT_BLOCKCOUNT_BITLEN	21
 
 #define BMBT_STARTOFF_MASK	((1ULL << BMBT_STARTOFF_BITLEN) - 1)
+#define BMBT_BLOCKCOUNT_MASK	((1ULL << BMBT_BLOCKCOUNT_BITLEN) - 1)
+
+/*
+ * bmbt records have a file offset (block) field that is 54 bits wide, so this
+ * is the largest xfs_fileoff_t that we ever expect to see.
+ */
+#define XFS_MAX_FILEOFF		(BMBT_STARTOFF_MASK + BMBT_BLOCKCOUNT_MASK)
 
 typedef struct xfs_bmbt_rec {
 	__be64			l0, l1;
diff --git a/fs/xfs/xfs_reflink.c b/fs/xfs/xfs_reflink.c
index de451235c4ee..7a6c94295b8a 100644
--- a/fs/xfs/xfs_reflink.c
+++ b/fs/xfs/xfs_reflink.c
@@ -1457,7 +1457,8 @@ xfs_reflink_clear_inode_flag(
 	 * We didn't find any shared blocks so turn off the reflink flag.
 	 * First, get rid of any leftover CoW mappings.
 	 */
-	error = xfs_reflink_cancel_cow_blocks(ip, tpp, 0, NULLFILEOFF, true);
+	error = xfs_reflink_cancel_cow_blocks(ip, tpp, 0, XFS_MAX_FILEOFF,
+			true);
 	if (error)
 		return error;
 


^ permalink raw reply related	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2020-01-10 11:52 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-01-08  4:17 [PATCH v2 0/3] xfs: fix maxbytes problems on 32-bit systems Darrick J. Wong
2020-01-08  4:17 ` [PATCH 1/3] xfs: introduce XFS_MAX_FILEOFF Darrick J. Wong
2020-01-08  8:09   ` Christoph Hellwig
2020-01-08 16:37     ` Darrick J. Wong
2020-01-08 20:40   ` Dave Chinner
2020-01-08 22:32     ` Darrick J. Wong
2020-01-08 22:42       ` Dave Chinner
2020-01-08 23:04         ` Darrick J. Wong
2020-01-08  4:17 ` [PATCH 2/3] xfs: truncate should remove all blocks, not just to the end of the page cache Darrick J. Wong
2020-01-08  8:11   ` Christoph Hellwig
2020-01-08 16:37     ` Darrick J. Wong
2020-01-08  4:17 ` [PATCH 3/3] xfs: fix s_maxbytes computation on 32-bit kernels Darrick J. Wong
2020-01-08  8:12   ` Christoph Hellwig
2020-01-09 18:44 [PATCH v3 0/3] xfs: fix maxbytes problems on 32-bit systems Darrick J. Wong
2020-01-09 18:44 ` [PATCH 1/3] xfs: introduce XFS_MAX_FILEOFF Darrick J. Wong
2020-01-10 11:52   ` Christoph Hellwig

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).