All of lore.kernel.org
 help / color / mirror / Atom feed
From: Chandan Babu R <chandanrlinux@gmail.com>
To: "Darrick J. Wong" <darrick.wong@oracle.com>
Cc: linux-xfs@vger.kernel.org, hch@lst.de, allison.henderson@oracle.com
Subject: Re: [PATCH V12 04/14] xfs: Check for extent overflow when adding/removing dir entries
Date: Fri, 08 Jan 2021 10:25:35 +0530	[thread overview]
Message-ID: <2055679.3ixLThe9JB@garuda> (raw)
In-Reply-To: <20210108011713.GP38809@magnolia>

On Thu, 07 Jan 2021 17:17:13 -0800, Darrick J. Wong wrote:
> On Mon, Jan 04, 2021 at 04:01:10PM +0530, Chandan Babu R wrote:
> > Directory entry addition can cause the following,
> > 1. Data block can be added/removed.
> >    A new extent can cause extent count to increase by 1.
> > 2. Free disk block can be added/removed.
> >    Same behaviour as described above for Data block.
> > 3. Dabtree blocks.
> >    XFS_DA_NODE_MAXDEPTH blocks can be added. Each of these
> >    can be new extents. Hence extent count can increase by
> >    XFS_DA_NODE_MAXDEPTH.
> > 
> > Directory entry remove and rename (applicable only to the source
> > directory entry) operations are handled specially to allow them to
> > succeed in low extent count availability scenarios
> > i.e. xfs_bmap_del_extent_real() will now return -ENOSPC when a possible
> > extent count overflow is detected. -ENOSPC is already handled by higher
> > layers of XFS by letting,
> > 1. Empty Data/Free space index blocks to linger around until a future
> >    remove operation frees them.
> > 2. Dabtree blocks would be swapped with the last block in the leaf space
> >    followed by unmapping of the new last block.
> > 
> > Also, Extent overflow check is performed for the target directory entry
> > of the rename operation only when the entry does not exist and a
> > non-zero space reservation is obtained successfully.
> > 
> > Signed-off-by: Chandan Babu R <chandanrlinux@gmail.com>
> > ---
> >  fs/xfs/libxfs/xfs_bmap.c       | 15 ++++++++++++
> >  fs/xfs/libxfs/xfs_inode_fork.h | 13 ++++++++++
> >  fs/xfs/xfs_inode.c             | 45 ++++++++++++++++++++++++++++++++++
> >  fs/xfs/xfs_symlink.c           |  5 ++++
> >  4 files changed, 78 insertions(+)
> > 
> > diff --git a/fs/xfs/libxfs/xfs_bmap.c b/fs/xfs/libxfs/xfs_bmap.c
> > index 32aeacf6f055..5fd804534e67 100644
> > --- a/fs/xfs/libxfs/xfs_bmap.c
> > +++ b/fs/xfs/libxfs/xfs_bmap.c
> > @@ -5151,6 +5151,21 @@ xfs_bmap_del_extent_real(
> >  		/*
> >  		 * Deleting the middle of the extent.
> >  		 */
> > +
> > +		/*
> > +		 * For directories, -ENOSPC will be handled by higher layers of
> > +		 * XFS by letting the corresponding empty Data/Free blocks to
> > +		 * linger around until a future remove operation. Dabtree blocks
> > +		 * would be swapped with the last block in the leaf space and
> > +		 * then the new last block will be unmapped.
> > +		 */
> > +		if (S_ISDIR(VFS_I(ip)->i_mode) &&
> > +		    whichfork == XFS_DATA_FORK &&
> > +		    xfs_iext_count_may_overflow(ip, whichfork, 1)) {
> > +			error = -ENOSPC;
> > +			goto done;
> 
> Hmm... it strikes me as a little odd that we're checking file mode and
> fork type in the middle of the bmap code.  However, I think it's the
> case that the only place where anyone would punch a hole in the /middle/
> of an extent is xattr trees and regular files, right?  And both of those
> cases are checked before we end up in the bmap code, right?

Yes, your observation is correct. I will remove the file mode and fork type
checks.

> 
> So we only really need this check to prevent extent count overflows when
> removing dirents from directories, like the comment says, and only
> because directories don't have a hard requirement that the bunmapi
> succeeds.  And I think this logic covers xfs_remove too?  That's a bit
> subtle, but as there's no extent count check in that function, there's
> not much to attach a comment to... :)

Yes, To provide more clarity, I should replace the above comment with
following,

  /*
   * -ENOSPC is returned since a directory entry remove operation must not fail
   * due to low extent count availability. -ENOSPC will be handled by higher
   * layers of XFS by letting the corresponding empty Data/Free blocks to linger
   * around until a future remove operation. Dabtree blocks would be swapped with
   * the last block in the leaf space and then the new last block will be
   * unmapped.
   *
   * The above logic also applies to the source directory entry of a rename
   * operation.
   */

> 
> Hm.  I think I'd like xfs_rename to get a brief comment that we're
> protected from extent count overflows in xfs_remove() by virtue of this
> "leave the dir block in place if we ENOSPC" capability:
> 
> 	/*
> 	 * NOTE: We don't need to check for extent overflows here
> 	 * because the dir removename code will leave the dir block
> 	 * in place if the extent count would overflow.
> 	 */
> 	error = xfs_dir_removename(...);

Sure, I will add that.

> 
> Do xattr trees also have the same ability?  I think they do, at least
> for the dabtree part...?

The following code snippet from xfs_da_shrink_inode() does the special casing
only for the data fork i.e. for blocks holding directory entries.

   for (;;) {
           /*
            * Remove extents.  If we get ENOSPC for a dir we have to move
            * the last block to the place we want to kill.
            */
           error = xfs_bunmapi(tp, dp, dead_blkno, count,
                               xfs_bmapi_aflag(w), 0, &done);
           if (error == -ENOSPC) {
                   if (w != XFS_DATA_FORK)
                           break;
                   error = xfs_da3_swap_lastblock(args, &dead_blkno,
                                                 &dead_buf);

So this facility is available only for directory entries.

Hence for xattrs, if we ever reach the extent count limit, the only way out is
to delete the corresponding file.

> 
> I think I would've split this patch into three pieces:
> 
>  - create, link, and symlink in one patch (adding dirents),
>  - the xfs_bmap_del_extent_real change and a comment for xfs_remove
>    (removing dirents)
>  - all the xfs_rename changes (adding and removing dirents)
> 
> Though I dunno, this series is already 14 patches, and the part that I
> care most about is not leaving that subtlety in xfs_remove(). :)

I think you are right about that. I will split this patch according to what
you have mentioned above.

> 
> Other than that, I follow the logic in this patch and will give it a
> testrun tonight.

Thank you!

> 
> --D
> 
> > +		}
> > +
> >  		old = got;
> >  
> >  		got.br_blockcount = del->br_startoff - got.br_startoff;
> > diff --git a/fs/xfs/libxfs/xfs_inode_fork.h b/fs/xfs/libxfs/xfs_inode_fork.h
> > index bcac769a7df6..ea1a9dd8a763 100644
> > --- a/fs/xfs/libxfs/xfs_inode_fork.h
> > +++ b/fs/xfs/libxfs/xfs_inode_fork.h
> > @@ -47,6 +47,19 @@ struct xfs_ifork {
> >   */
> >  #define XFS_IEXT_PUNCH_HOLE_CNT		(1)
> >  
> > +/*
> > + * Directory entry addition can cause the following,
> > + * 1. Data block can be added/removed.
> > + *    A new extent can cause extent count to increase by 1.
> > + * 2. Free disk block can be added/removed.
> > + *    Same behaviour as described above for Data block.
> > + * 3. Dabtree blocks.
> > + *    XFS_DA_NODE_MAXDEPTH blocks can be added. Each of these can be new
> > + *    extents. Hence extent count can increase by XFS_DA_NODE_MAXDEPTH.
> > + */
> > +#define XFS_IEXT_DIR_MANIP_CNT(mp) \
> > +	((XFS_DA_NODE_MAXDEPTH + 1 + 1) * (mp)->m_dir_geo->fsbcount)
> > +
> >  /*
> >   * Fork handling.
> >   */
> > diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c
> > index b7352bc4c815..0db21368c7e1 100644
> > --- a/fs/xfs/xfs_inode.c
> > +++ b/fs/xfs/xfs_inode.c
> > @@ -1042,6 +1042,11 @@ xfs_create(
> >  	if (error)
> >  		goto out_trans_cancel;
> >  
> > +	error = xfs_iext_count_may_overflow(dp, XFS_DATA_FORK,
> > +			XFS_IEXT_DIR_MANIP_CNT(mp));
> > +	if (error)
> > +		goto out_trans_cancel;
> > +
> >  	/*
> >  	 * A newly created regular or special file just has one directory
> >  	 * entry pointing to them, but a directory also the "." entry
> > @@ -1258,6 +1263,11 @@ xfs_link(
> >  	xfs_trans_ijoin(tp, sip, XFS_ILOCK_EXCL);
> >  	xfs_trans_ijoin(tp, tdp, XFS_ILOCK_EXCL);
> >  
> > +	error = xfs_iext_count_may_overflow(tdp, XFS_DATA_FORK,
> > +			XFS_IEXT_DIR_MANIP_CNT(mp));
> > +	if (error)
> > +		goto error_return;
> > +
> >  	/*
> >  	 * If we are using project inheritance, we only allow hard link
> >  	 * creation in our tree when the project IDs are the same; else
> > @@ -3106,6 +3116,35 @@ xfs_rename(
> >  	/*
> >  	 * Check for expected errors before we dirty the transaction
> >  	 * so we can return an error without a transaction abort.
> > +	 *
> > +	 * Extent count overflow check:
> > +	 *
> > +	 * From the perspective of src_dp, a rename operation is essentially a
> > +	 * directory entry remove operation. Hence the only place where we check
> > +	 * for extent count overflow for src_dp is in
> > +	 * xfs_bmap_del_extent_real(). xfs_bmap_del_extent_real() returns
> > +	 * -ENOSPC when it detects a possible extent count overflow and in
> > +	 * response, the higher layers of directory handling code do the
> > +	 * following:
> > +	 * 1. Data/Free blocks: XFS lets these blocks linger around until a
> > +	 *    future remove operation removes them.
> > +	 * 2. Dabtree blocks: XFS swaps the blocks with the last block in the
> > +	 *    Leaf space and unmaps the last block.
> > +	 *
> > +	 * For target_dp, there are two cases depending on whether the
> > +	 * destination directory entry exists or not.
> > +	 *
> > +	 * When destination directory entry does not exist (i.e. target_ip ==
> > +	 * NULL), extent count overflow check is performed only when transaction
> > +	 * has a non-zero sized space reservation associated with it.  With a
> > +	 * zero-sized space reservation, XFS allows a rename operation to
> > +	 * continue only when the directory has sufficient free space in its
> > +	 * data/leaf/free space blocks to hold the new entry.
> > +	 *
> > +	 * When destination directory entry exists (i.e. target_ip != NULL), all
> > +	 * we need to do is change the inode number associated with the already
> > +	 * existing entry. Hence there is no need to perform an extent count
> > +	 * overflow check.
> >  	 */
> >  	if (target_ip == NULL) {
> >  		/*
> > @@ -3116,6 +3155,12 @@ xfs_rename(
> >  			error = xfs_dir_canenter(tp, target_dp, target_name);
> >  			if (error)
> >  				goto out_trans_cancel;
> > +		} else {
> > +			error = xfs_iext_count_may_overflow(target_dp,
> > +					XFS_DATA_FORK,
> > +					XFS_IEXT_DIR_MANIP_CNT(mp));
> > +			if (error)
> > +				goto out_trans_cancel;
> >  		}
> >  	} else {
> >  		/*
> > diff --git a/fs/xfs/xfs_symlink.c b/fs/xfs/xfs_symlink.c
> > index 1f43fd7f3209..0b8136a32484 100644
> > --- a/fs/xfs/xfs_symlink.c
> > +++ b/fs/xfs/xfs_symlink.c
> > @@ -220,6 +220,11 @@ xfs_symlink(
> >  	if (error)
> >  		goto out_trans_cancel;
> >  
> > +	error = xfs_iext_count_may_overflow(dp, XFS_DATA_FORK,
> > +			XFS_IEXT_DIR_MANIP_CNT(mp));
> > +	if (error)
> > +		goto out_trans_cancel;
> > +
> >  	/*
> >  	 * Allocate an inode for the symlink.
> >  	 */
> 


-- 
chandan




  reply	other threads:[~2021-01-08  4:56 UTC|newest]

Thread overview: 19+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-01-04 10:31 [PATCH V12 00/14] Bail out if transaction can cause extent count to overflow Chandan Babu R
2021-01-04 10:31 ` [PATCH V12 01/14] xfs: Add helper for checking per-inode extent count overflow Chandan Babu R
2021-01-04 10:31 ` [PATCH V12 02/14] xfs: Check for extent overflow when trivally adding a new extent Chandan Babu R
2021-01-04 10:31 ` [PATCH V12 03/14] xfs: Check for extent overflow when punching a hole Chandan Babu R
2021-01-04 10:31 ` [PATCH V12 04/14] xfs: Check for extent overflow when adding/removing dir entries Chandan Babu R
2021-01-08  1:17   ` Darrick J. Wong
2021-01-08  4:55     ` Chandan Babu R [this message]
2021-01-09  1:33       ` Darrick J. Wong
2021-01-09  5:26         ` Chandan Babu R
2021-01-04 10:31 ` [PATCH V12 05/14] xfs: Check for extent overflow when adding/removing xattrs Chandan Babu R
2021-01-04 10:31 ` [PATCH V12 06/14] xfs: Check for extent overflow when writing to unwritten extent Chandan Babu R
2021-01-04 10:31 ` [PATCH V12 07/14] xfs: Check for extent overflow when moving extent from cow to data fork Chandan Babu R
2021-01-04 10:31 ` [PATCH V12 08/14] xfs: Check for extent overflow when remapping an extent Chandan Babu R
2021-01-04 10:31 ` [PATCH V12 09/14] xfs: Check for extent overflow when swapping extents Chandan Babu R
2021-01-04 10:31 ` [PATCH V12 10/14] xfs: Introduce error injection to reduce maximum inode fork extent count Chandan Babu R
2021-01-04 10:31 ` [PATCH V12 11/14] xfs: Remove duplicate assert statement in xfs_bmap_btalloc() Chandan Babu R
2021-01-04 10:31 ` [PATCH V12 12/14] xfs: Compute bmap extent alignments in a separate function Chandan Babu R
2021-01-04 10:31 ` [PATCH V12 13/14] xfs: Process allocated extent " Chandan Babu R
2021-01-04 10:31 ` [PATCH V12 14/14] xfs: Introduce error injection to allocate only minlen size extents for files Chandan Babu R

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=2055679.3ixLThe9JB@garuda \
    --to=chandanrlinux@gmail.com \
    --cc=allison.henderson@oracle.com \
    --cc=darrick.wong@oracle.com \
    --cc=hch@lst.de \
    --cc=linux-xfs@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.