From: "Darrick J. Wong" <darrick.wong@oracle.com>
To: Christoph Hellwig <hch@lst.de>
Cc: linux-xfs@vger.kernel.org, Dave Chinner <dchinner@redhat.com>
Subject: Re: [PATCH 01/10] xfs: fix transaction leak in xfs_reflink_allocate_cow()
Date: Mon, 17 Sep 2018 16:51:10 -0700
Message-ID: <20180917235110.GA20086@magnolia>
In-Reply-To: <20180917205354.15401-2-hch@lst.de>

On Mon, Sep 17, 2018 at 10:53:45PM +0200, Christoph Hellwig wrote:
> From: Dave Chinner <dchinner@redhat.com>
> 
> When xfs_reflink_allocate_cow() allocates a transaction, it drops
> the ILOCK to perform the operation. This introduces a race condition
> where another thread modifying the file can perform the COW
> allocation operation underneath us. This results in the retry loop
> finding an allocated block and jumping straight to the conversion
> code. It does not, however, cancel the transaction it holds and so
> this gets leaked. This results in a lockdep warning:
> 
> ================================================
> WARNING: lock held when returning to user space!
> 4.18.5 #1 Not tainted
> ------------------------------------------------
> worker/6123 is leaving the kernel with locks still held!
> 1 lock held by worker/6123:
>  #0: 000000009eab4f1b (sb_internal#2){.+.+}, at: xfs_trans_alloc+0x17c/0x220
> 
> And eventually the filesystem deadlocks because it runs out of log
> space that is reserved by the leaked transaction and never gets
> released.
> 
> The logic flow in xfs_reflink_allocate_cow() is a convoluted mess of
> gotos - it's no surprise that it has a bug where the flow through
> several goto jumps then fails to clean up context from a non-obvious
> logic path. Clean up the logic flow and make sure every path does
> the right thing.
> 
> Reported-by: Alexander Y. Fomichev <git.user@gmail.com>
> Tested-by: Alexander Y. Fomichev <git.user@gmail.com>
> Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=200981
> Signed-off-by: Dave Chinner <dchinner@redhat.com>
> [hch: slight refactor]
> Signed-off-by: Christoph Hellwig <hch@lst.de>

Looks ok,
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>

--D

> ---
>  fs/xfs/xfs_reflink.c | 127 ++++++++++++++++++++++++++-----------------
>  1 file changed, 77 insertions(+), 50 deletions(-)
> 
> diff --git a/fs/xfs/xfs_reflink.c b/fs/xfs/xfs_reflink.c
> index 38f405415b88..d60d0eeed7b9 100644
> --- a/fs/xfs/xfs_reflink.c
> +++ b/fs/xfs/xfs_reflink.c
> @@ -352,6 +352,47 @@ xfs_reflink_convert_cow(
>  	return error;
>  }
>  
> +/*
> + * Find the extent that maps the given range in the COW fork. Even if the extent
> + * is not shared we might have a preallocation for it in the COW fork. If so we
> + * use that rather than trigger a new allocation.
> + */
> +static int
> +xfs_find_trim_cow_extent(
> +	struct xfs_inode	*ip,
> +	struct xfs_bmbt_irec	*imap,
> +	bool			*shared,
> +	bool			*found)
> +{
> +	xfs_fileoff_t		offset_fsb = imap->br_startoff;
> +	xfs_filblks_t		count_fsb = imap->br_blockcount;
> +	struct xfs_iext_cursor	icur;
> +	struct xfs_bmbt_irec	got;
> +	bool			trimmed;
> +
> +	*found = false;
> +
> +	/*
> +	 * If we don't find an overlapping extent, trim the range we need to
> +	 * allocate to fit the hole we found.
> +	 */
> +	if (!xfs_iext_lookup_extent(ip, ip->i_cowfp, offset_fsb, &icur, &got) ||
> +	    got.br_startoff > offset_fsb)
> +		return xfs_reflink_trim_around_shared(ip, imap, shared, &trimmed);
> +
> +	*shared = true;
> +	if (isnullstartblock(got.br_startblock)) {
> +		xfs_trim_extent(imap, got.br_startoff, got.br_blockcount);
> +		return 0;
> +	}
> +
> +	/* real extent found - no need to allocate */
> +	xfs_trim_extent(&got, offset_fsb, count_fsb);
> +	*imap = got;
> +	*found = true;
> +	return 0;
> +}
> +
>  /* Allocate all CoW reservations covering a range of blocks in a file. */
>  int
>  xfs_reflink_allocate_cow(
> @@ -363,78 +404,64 @@ xfs_reflink_allocate_cow(
>  	struct xfs_mount	*mp = ip->i_mount;
>  	xfs_fileoff_t		offset_fsb = imap->br_startoff;
>  	xfs_filblks_t		count_fsb = imap->br_blockcount;
> -	struct xfs_bmbt_irec	got;
> -	struct xfs_trans	*tp = NULL;
> +	struct xfs_trans	*tp;
>  	int			nimaps, error = 0;
> -	bool			trimmed;
> +	bool			found;
>  	xfs_filblks_t		resaligned;
>  	xfs_extlen_t		resblks = 0;
> -	struct xfs_iext_cursor	icur;
>  
> -retry:
> -	ASSERT(xfs_is_reflink_inode(ip));
>  	ASSERT(xfs_isilocked(ip, XFS_ILOCK_EXCL));
> +	ASSERT(xfs_is_reflink_inode(ip));
>  
> -	/*
> -	 * Even if the extent is not shared we might have a preallocation for
> -	 * it in the COW fork.  If so use it.
> -	 */
> -	if (xfs_iext_lookup_extent(ip, ip->i_cowfp, offset_fsb, &icur, &got) &&
> -	    got.br_startoff <= offset_fsb) {
> -		*shared = true;
> -
> -		/* If we have a real allocation in the COW fork we're done. */
> -		if (!isnullstartblock(got.br_startblock)) {
> -			xfs_trim_extent(&got, offset_fsb, count_fsb);
> -			*imap = got;
> -			goto convert;
> -		}
> +	error = xfs_find_trim_cow_extent(ip, imap, shared, &found);
> +	if (error || !*shared)
> +		return error;
> +	if (found)
> +		goto convert;
>  
> -		xfs_trim_extent(imap, got.br_startoff, got.br_blockcount);
> -	} else {
> -		error = xfs_reflink_trim_around_shared(ip, imap, shared, &trimmed);
> -		if (error || !*shared)
> -			goto out;
> -	}
> +	resaligned = xfs_aligned_fsb_count(imap->br_startoff,
> +		imap->br_blockcount, xfs_get_cowextsz_hint(ip));
> +	resblks = XFS_DIOSTRAT_SPACE_RES(mp, resaligned);
>  
> -	if (!tp) {
> -		resaligned = xfs_aligned_fsb_count(imap->br_startoff,
> -			imap->br_blockcount, xfs_get_cowextsz_hint(ip));
> -		resblks = XFS_DIOSTRAT_SPACE_RES(mp, resaligned);
> +	xfs_iunlock(ip, *lockmode);
> +	error = xfs_trans_alloc(mp, &M_RES(mp)->tr_write, resblks, 0, 0, &tp);
> +	*lockmode = XFS_ILOCK_EXCL;
> +	xfs_ilock(ip, *lockmode);
>  
> -		xfs_iunlock(ip, *lockmode);
> -		error = xfs_trans_alloc(mp, &M_RES(mp)->tr_write, resblks, 0, 0, &tp);
> -		*lockmode = XFS_ILOCK_EXCL;
> -		xfs_ilock(ip, *lockmode);
> +	if (error)
> +		return error;
>  
> -		if (error)
> -			return error;
> +	error = xfs_qm_dqattach_locked(ip, false);
> +	if (error)
> +		goto out_trans_cancel;
>  
> -		error = xfs_qm_dqattach_locked(ip, false);
> -		if (error)
> -			goto out;
> -		goto retry;
> +	/*
> +	 * Check for an overlapping extent again now that we dropped the ilock.
> +	 */
> +	error = xfs_find_trim_cow_extent(ip, imap, shared, &found);
> +	if (error || !*shared)
> +		goto out_trans_cancel;
> +	if (found) {
> +		xfs_trans_cancel(tp);
> +		goto convert;
>  	}
>  
>  	error = xfs_trans_reserve_quota_nblks(tp, ip, resblks, 0,
>  			XFS_QMOPT_RES_REGBLKS);
>  	if (error)
> -		goto out;
> +		goto out_trans_cancel;
>  
>  	xfs_trans_ijoin(tp, ip, 0);
>  
> -	nimaps = 1;
> -
>  	/* Allocate the entire reservation as unwritten blocks. */
> +	nimaps = 1;
>  	error = xfs_bmapi_write(tp, ip, imap->br_startoff, imap->br_blockcount,
>  			XFS_BMAPI_COWFORK | XFS_BMAPI_PREALLOC,
>  			resblks, imap, &nimaps);
>  	if (error)
> -		goto out_trans_cancel;
> +		goto out_unreserve;
>  
>  	xfs_inode_set_cowblocks_tag(ip);
> -
> -	/* Finish up. */
>  	error = xfs_trans_commit(tp);
>  	if (error)
>  		return error;
> @@ -447,12 +474,12 @@ xfs_reflink_allocate_cow(
>  		return -ENOSPC;
>  convert:
>  	return xfs_reflink_convert_cow_extent(ip, imap, offset_fsb, count_fsb);
> -out_trans_cancel:
> +
> +out_unreserve:
>  	xfs_trans_unreserve_quota_nblks(tp, ip, (long)resblks, 0,
>  			XFS_QMOPT_RES_REGBLKS);
> -out:
> -	if (tp)
> -		xfs_trans_cancel(tp);
> +out_trans_cancel:
> +	xfs_trans_cancel(tp);
>  	return error;
>  }
>  
> -- 
> 2.18.0
> 
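
For readers following the fix, the leak and its repair can be modelled
outside the kernel. The sketch below is a userspace illustration only,
not kernel code: struct toy_inode, toy_trans_alloc(), toy_trans_cancel(),
toy_trans_commit() and toy_allocate_cow() are hypothetical stand-ins for
the inode, xfs_trans_alloc(), xfs_trans_cancel(), xfs_trans_commit() and
xfs_reflink_allocate_cow(), and the racing writer is simulated with a
flag rather than a second thread.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for illustration; none of these exist in the kernel. */
struct toy_inode {
	bool	cow_extent_allocated;	/* a real extent exists in the COW fork */
	bool	racer_pending;		/* a racing writer is waiting for the lock */
};

static int live_transactions;		/* reservations currently held */

/* Models xfs_trans_alloc() being called with the ILOCK dropped. */
static void toy_trans_alloc(struct toy_inode *ip)
{
	/* Lock is dropped here, so a racing writer may allocate the extent. */
	if (ip->racer_pending) {
		ip->cow_extent_allocated = true;
		ip->racer_pending = false;
	}
	live_transactions++;
	/* Lock is retaken here. */
}

/* Models xfs_trans_cancel(): release the reservation without committing. */
static void toy_trans_cancel(void)
{
	assert(live_transactions > 0);
	live_transactions--;
}

/* Models xfs_trans_commit(): the reservation is released on commit. */
static void toy_trans_commit(void)
{
	assert(live_transactions > 0);
	live_transactions--;
}

/*
 * Returns the number of transactions still held on return. Anything
 * other than zero is a leak. With cancel_on_recheck == false this
 * models the old behaviour; with true it models the fixed flow.
 */
int toy_allocate_cow(struct toy_inode *ip, bool cancel_on_recheck)
{
	/* First lookup: nothing has been allocated yet, so nothing to undo. */
	if (ip->cow_extent_allocated)
		return live_transactions;

	toy_trans_alloc(ip);

	/* Second lookup: state may have changed while the lock was dropped. */
	if (ip->cow_extent_allocated) {
		if (cancel_on_recheck)
			toy_trans_cancel();
		return live_transactions;	/* the "goto convert" path */
	}

	/* No race: perform the allocation and commit. */
	ip->cow_extent_allocated = true;
	toy_trans_commit();
	return live_transactions;
}
```

Calling toy_allocate_cow() with racer_pending set and cancel_on_recheck
false leaves one transaction held, which corresponds to the leaked
reservation in the lockdep report; with cancel_on_recheck true the count
returns to zero on every path.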
