Linux-Fsdevel Archive on lore.kernel.org
From: "Darrick J. Wong" <darrick.wong@oracle.com>
To: Christoph Hellwig <hch@lst.de>
Cc: Goldwyn Rodrigues <rgoldwyn@suse.com>,
	linux-xfs@vger.kernel.org, linux-fsdevel@vger.kernel.org
Subject: Re: [PATCH 09/19] xfs: remove xfs_reflink_dirty_extents
Date: Wed, 18 Sep 2019 10:17:33 -0700
Message-ID: <20190918171733.GA2229799@magnolia>
In-Reply-To: <20190909182722.16783-10-hch@lst.de>

On Mon, Sep 09, 2019 at 08:27:12PM +0200, Christoph Hellwig wrote:
> Now that iomap_file_unshare is not completely dumb we can just call it
> directly without iterating the extent and reflink btrees ourselves.
> 
> Signed-off-by: Christoph Hellwig <hch@lst.de>
> ---
>  fs/xfs/xfs_reflink.c | 108 ++++---------------------------------------
>  1 file changed, 10 insertions(+), 98 deletions(-)
> 
> diff --git a/fs/xfs/xfs_reflink.c b/fs/xfs/xfs_reflink.c
> index cadc0456804d..73f8cce4722d 100644
> --- a/fs/xfs/xfs_reflink.c
> +++ b/fs/xfs/xfs_reflink.c
> @@ -1381,85 +1381,6 @@ xfs_reflink_remap_prep(
>  	return ret;
>  }
>  
> -/*
> - * The user wants to preemptively CoW all shared blocks in this file,
> - * which enables us to turn off the reflink flag.  Iterate all
> - * extents which are not prealloc/delalloc to see which ranges are
> - * mentioned in the refcount tree, then read those blocks into the
> - * pagecache, dirty them, fsync them back out, and then we can update
> - * the inode flag.  What happens if we run out of memory? :)
> - */
> -STATIC int
> -xfs_reflink_dirty_extents(
> -	struct xfs_inode	*ip,
> -	xfs_fileoff_t		fbno,
> -	xfs_filblks_t		end,
> -	xfs_off_t		isize)
> -{
> -	struct xfs_mount	*mp = ip->i_mount;
> -	xfs_agnumber_t		agno;
> -	xfs_agblock_t		agbno;
> -	xfs_extlen_t		aglen;
> -	xfs_agblock_t		rbno;
> -	xfs_extlen_t		rlen;
> -	xfs_off_t		fpos;
> -	xfs_off_t		flen;
> -	struct xfs_bmbt_irec	map[2];
> -	int			nmaps;
> -	int			error = 0;
> -
> -	while (end - fbno > 0) {
> -		nmaps = 1;
> -		/*
> -		 * Look for extents in the file.  Skip holes, delalloc, or
> -		 * unwritten extents; they can't be reflinked.
> -		 */
> -		error = xfs_bmapi_read(ip, fbno, end - fbno, map, &nmaps, 0);
> -		if (error)
> -			goto out;
> -		if (nmaps == 0)
> -			break;
> -		if (!xfs_bmap_is_real_extent(&map[0]))
> -			goto next;
> -
> -		map[1] = map[0];
> -		while (map[1].br_blockcount) {
> -			agno = XFS_FSB_TO_AGNO(mp, map[1].br_startblock);
> -			agbno = XFS_FSB_TO_AGBNO(mp, map[1].br_startblock);
> -			aglen = map[1].br_blockcount;
> -
> -			error = xfs_reflink_find_shared(mp, NULL, agno, agbno,
> -					aglen, &rbno, &rlen, true);
> -			if (error)
> -				goto out;
> -			if (rbno == NULLAGBLOCK)
> -				break;
> -
> -			/* Dirty the pages */
> -			xfs_iunlock(ip, XFS_ILOCK_EXCL);
> -			fpos = XFS_FSB_TO_B(mp, map[1].br_startoff +
> -					(rbno - agbno));
> -			flen = XFS_FSB_TO_B(mp, rlen);
> -			if (fpos + flen > isize)
> -				flen = isize - fpos;
> -			error = iomap_file_unshare(VFS_I(ip), fpos, flen,
> -					&xfs_iomap_ops);
> -			xfs_ilock(ip, XFS_ILOCK_EXCL);
> -			if (error)
> -				goto out;
> -
> -			map[1].br_blockcount -= (rbno - agbno + rlen);
> -			map[1].br_startoff += (rbno - agbno + rlen);
> -			map[1].br_startblock += (rbno - agbno + rlen);
> -		}
> -
> -next:
> -		fbno = map[0].br_startoff + map[0].br_blockcount;
> -	}
> -out:
> -	return error;
> -}
> -
>  /* Does this inode need the reflink flag? */
>  int
>  xfs_reflink_inode_has_shared_extents(
> @@ -1589,6 +1510,11 @@ xfs_reflink_try_clear_inode_flag(
>  /*
>   * Pre-COW all shared blocks within a given byte range of a file and turn off
>   * the reflink flag if we unshare all of the file's blocks.
> + *
> + * Let iomap iterate all extents to see which are shared and not unwritten or
> + * delalloc and read them into the page cache, dirty them, fsync them back out,
> + * and then we can update the inode flag.  What happens if we run out of
> + * memory? :)

I don't know, what /does/ happen? :)

It /should/ be fine, right?  Writeback will start pushing the dirty
cache pages to disk, and since writeback only takes the ILOCK, it should
be able to perform the COW even while the unshare process sits on the
IOLOCK/MMAPLOCK.  True, the unshare process and writeback will both be
contending on the ILOCK, but that shouldn't be a problem...

...unless I'm missing something?  It sure does look nice to drain all
this other code out.
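
To make the locking argument above concrete, here is a rough userspace
sketch. It is a toy, single-threaded model, not kernel code: the lock
"types", toy_trylock(), toy_unshare() and the "writeback kicks in every
four pages" rule are all made up for illustration. The only things it
takes from the discussion are that the unshare side holds the
IOLOCK/MMAPLOCK for the whole operation, that both sides take the ILOCK
only briefly, and that writeback never needs the IOLOCK.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: these are NOT the kernel's lock types or APIs. */
enum owner { NOBODY, UNSHARE, WRITEBACK };

#define TOY_MAX_PAGES 16

struct toy_inode {
	enum owner	iolock;		/* stands in for IOLOCK/MMAPLOCK */
	enum owner	ilock;		/* stands in for ILOCK */
	bool		dirty[TOY_MAX_PAGES];	/* page awaits COW writeback */
	size_t		npages;
};

static bool toy_trylock(enum owner *lock, enum owner who)
{
	if (*lock != NOBODY)
		return false;
	*lock = who;
	return true;
}

static void toy_unlock(enum owner *lock, enum owner who)
{
	assert(*lock == who);
	*lock = NOBODY;
}

/* Unshare side: dirty one page, holding the ILOCK only for that step. */
static bool unshare_dirty_page(struct toy_inode *ip, size_t pg)
{
	assert(ip->iolock == UNSHARE);
	if (!toy_trylock(&ip->ilock, UNSHARE))
		return false;
	ip->dirty[pg] = true;
	toy_unlock(&ip->ilock, UNSHARE);
	return true;
}

/* Writeback side: takes the ILOCK only, never the IOLOCK. */
static size_t writeback_pass(struct toy_inode *ip)
{
	size_t cleaned = 0;

	for (size_t pg = 0; pg < ip->npages; pg++) {
		if (!ip->dirty[pg])
			continue;
		if (!toy_trylock(&ip->ilock, WRITEBACK))
			break;
		ip->dirty[pg] = false;	/* pretend the COW write completed */
		toy_unlock(&ip->ilock, WRITEBACK);
		cleaned++;
	}
	return cleaned;
}

/*
 * Interleave the two actors.  Returns how many pages writeback cleaned
 * while the unshare side was still holding the IOLOCK.
 */
size_t toy_unshare(size_t npages)
{
	struct toy_inode ino = { .npages = npages };
	size_t cleaned = 0;
	bool ok;

	assert(npages <= TOY_MAX_PAGES);
	ok = toy_trylock(&ino.iolock, UNSHARE);
	assert(ok);
	for (size_t pg = 0; pg < npages; pg++) {
		ok = unshare_dirty_page(&ino, pg);
		assert(ok);
		/* Made-up memory pressure: writeback runs every 4 pages. */
		if ((pg + 1) % 4 == 0)
			cleaned += writeback_pass(&ino);
	}
	/* Final flush, loosely analogous to filemap_write_and_wait(). */
	cleaned += writeback_pass(&ino);
	toy_unlock(&ino.iolock, UNSHARE);
	return cleaned;
}
```

In this model toy_unshare(8) returns 8: every dirtied page gets cleaned
even though the IOLOCK is never dropped, because the writeback path
never asks for it. That says nothing about real ILOCK contention or
allocation failures, which the toy does not model.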

--D

>   */
>  int
>  xfs_reflink_unshare(
> @@ -1596,10 +1522,7 @@ xfs_reflink_unshare(
>  	xfs_off_t		offset,
>  	xfs_off_t		len)
>  {
> -	struct xfs_mount	*mp = ip->i_mount;
> -	xfs_fileoff_t		fbno;
> -	xfs_filblks_t		end;
> -	xfs_off_t		isize;
> +	struct inode		*inode = VFS_I(ip);
>  	int			error;
>  
>  	if (!xfs_is_reflink_inode(ip))
> @@ -1607,20 +1530,12 @@ xfs_reflink_unshare(
>  
>  	trace_xfs_reflink_unshare(ip, offset, len);
>  
> -	inode_dio_wait(VFS_I(ip));
> +	inode_dio_wait(inode);
>  
> -	/* Try to CoW the selected ranges */
> -	xfs_ilock(ip, XFS_ILOCK_EXCL);
> -	fbno = XFS_B_TO_FSBT(mp, offset);
> -	isize = i_size_read(VFS_I(ip));
> -	end = XFS_B_TO_FSB(mp, offset + len);
> -	error = xfs_reflink_dirty_extents(ip, fbno, end, isize);
> +	error = iomap_file_unshare(inode, offset, len, &xfs_iomap_ops);
>  	if (error)
> -		goto out_unlock;
> -	xfs_iunlock(ip, XFS_ILOCK_EXCL);
> -
> -	/* Wait for the IO to finish */
> -	error = filemap_write_and_wait(VFS_I(ip)->i_mapping);
> +		goto out;
> +	error = filemap_write_and_wait(inode->i_mapping);
>  	if (error)
>  		goto out;
>  
> @@ -1628,11 +1543,8 @@ xfs_reflink_unshare(
>  	error = xfs_reflink_try_clear_inode_flag(ip);
>  	if (error)
>  		goto out;
> -
>  	return 0;
>  
> -out_unlock:
> -	xfs_iunlock(ip, XFS_ILOCK_EXCL);
>  out:
>  	trace_xfs_reflink_unshare_error(ip, error, _RET_IP_);
>  	return error;
> -- 
> 2.20.1
> 
