* fix locking for the reflink operation @ 2016-10-17 12:05 Christoph Hellwig 2016-10-17 12:05 ` [PATCH 1/4] xfs: remove the same fs check from xfs_file_share_range Christoph Hellwig ` (4 more replies) 0 siblings, 5 replies; 11+ messages in thread From: Christoph Hellwig @ 2016-10-17 12:05 UTC (permalink / raw) To: linux-xfs; +Cc: darrick.wong When creating a reflink we need to take the iolock much earlier, as various early checks done in xfs_file_share_range currently are racy without it. Patches 1-3 sort that out in a minimal invasive way, but I think we should just merge xfs_file_share_range and xfs_reflink_remap_range, which is what patch 4 does. Patches 1-3 are something I'd like to see in 4.9, patch 4 might not fully qualify, but just getting it in might make everyones life easier. ^ permalink raw reply [flat|nested] 11+ messages in thread
* [PATCH 1/4] xfs: remove the same fs check from xfs_file_share_range 2016-10-17 12:05 fix locking for the reflink operation Christoph Hellwig @ 2016-10-17 12:05 ` Christoph Hellwig 2016-10-17 21:11 ` Darrick J. Wong 2016-10-17 12:05 ` [PATCH 2/4] xfs: fix the same_inode check in xfs_file_share_range Christoph Hellwig ` (3 subsequent siblings) 4 siblings, 1 reply; 11+ messages in thread From: Christoph Hellwig @ 2016-10-17 12:05 UTC (permalink / raw) To: linux-xfs; +Cc: darrick.wong The VFS already does the check, and the placement of this duplicate is in the way of the following locking rework. Signed-off-by: Christoph Hellwig <hch@lst.de> --- fs/xfs/xfs_file.c | 3 --- 1 file changed, 3 deletions(-) diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c index a314fc7..20563f2 100644 --- a/fs/xfs/xfs_file.c +++ b/fs/xfs/xfs_file.c @@ -1003,9 +1003,6 @@ xfs_file_share_range( IS_SWAPFILE(inode_out)) return -ETXTBSY; - /* Reflink only works within this filesystem. */ - if (inode_in->i_sb != inode_out->i_sb) - return -EXDEV; same_inode = (inode_in->i_ino == inode_out->i_ino); /* Don't reflink dirs, pipes, sockets... */ -- 2.1.4 ^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: [PATCH 1/4] xfs: remove the same fs check from xfs_file_share_range 2016-10-17 12:05 ` [PATCH 1/4] xfs: remove the same fs check from xfs_file_share_range Christoph Hellwig @ 2016-10-17 21:11 ` Darrick J. Wong 0 siblings, 0 replies; 11+ messages in thread From: Darrick J. Wong @ 2016-10-17 21:11 UTC (permalink / raw) To: Christoph Hellwig; +Cc: linux-xfs On Mon, Oct 17, 2016 at 02:05:17PM +0200, Christoph Hellwig wrote: > The VFS already does the check, and the placement of this duplicate is in > the way of the following locking rework. > > Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> > --- > fs/xfs/xfs_file.c | 3 --- > 1 file changed, 3 deletions(-) > > diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c > index a314fc7..20563f2 100644 > --- a/fs/xfs/xfs_file.c > +++ b/fs/xfs/xfs_file.c > @@ -1003,9 +1003,6 @@ xfs_file_share_range( > IS_SWAPFILE(inode_out)) > return -ETXTBSY; > > - /* Reflink only works within this filesystem. */ > - if (inode_in->i_sb != inode_out->i_sb) > - return -EXDEV; > same_inode = (inode_in->i_ino == inode_out->i_ino); > > /* Don't reflink dirs, pipes, sockets... */ > -- > 2.1.4 > ^ permalink raw reply [flat|nested] 11+ messages in thread
* [PATCH 2/4] xfs: fix the same_inode check in xfs_file_share_range 2016-10-17 12:05 fix locking for the reflink operation Christoph Hellwig 2016-10-17 12:05 ` [PATCH 1/4] xfs: remove the same fs check from xfs_file_share_range Christoph Hellwig @ 2016-10-17 12:05 ` Christoph Hellwig 2016-10-17 21:12 ` Darrick J. Wong 2016-10-17 12:05 ` [PATCH 3/4] xfs: move inode locking from xfs_reflink_remap_range to xfs_file_share_range Christoph Hellwig ` (2 subsequent siblings) 4 siblings, 1 reply; 11+ messages in thread From: Christoph Hellwig @ 2016-10-17 12:05 UTC (permalink / raw) To: linux-xfs; +Cc: darrick.wong The VFS i_ino is an unsigned long, while XFS inode numbers are 64-bit wide, so checking i_ino for equality could lead to rate false positives on 32-bit architectures. Just compare the inode pointers themselves to be safe. Signed-off-by: Christoph Hellwig <hch@lst.de> --- fs/xfs/xfs_file.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c index 20563f2..012a960 100644 --- a/fs/xfs/xfs_file.c +++ b/fs/xfs/xfs_file.c @@ -1003,7 +1003,7 @@ xfs_file_share_range( IS_SWAPFILE(inode_out)) return -ETXTBSY; - same_inode = (inode_in->i_ino == inode_out->i_ino); + same_inode = (inode_in == inode_out); /* Don't reflink dirs, pipes, sockets... */ if (S_ISDIR(inode_in->i_mode) || S_ISDIR(inode_out->i_mode)) -- 2.1.4 ^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: [PATCH 2/4] xfs: fix the same_inode check in xfs_file_share_range 2016-10-17 12:05 ` [PATCH 2/4] xfs: fix the same_inode check in xfs_file_share_range Christoph Hellwig @ 2016-10-17 21:12 ` Darrick J. Wong 0 siblings, 0 replies; 11+ messages in thread From: Darrick J. Wong @ 2016-10-17 21:12 UTC (permalink / raw) To: Christoph Hellwig; +Cc: linux-xfs On Mon, Oct 17, 2016 at 02:05:18PM +0200, Christoph Hellwig wrote: > The VFS i_ino is an unsigned long, while XFS inode numbers are 64-bit > wide, so checking i_ino for equality could lead to rate false positives > on 32-bit architectures. Just compare the inode pointers themselves > to be safe. > > Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> > --- > fs/xfs/xfs_file.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c > index 20563f2..012a960 100644 > --- a/fs/xfs/xfs_file.c > +++ b/fs/xfs/xfs_file.c > @@ -1003,7 +1003,7 @@ xfs_file_share_range( > IS_SWAPFILE(inode_out)) > return -ETXTBSY; > > - same_inode = (inode_in->i_ino == inode_out->i_ino); > + same_inode = (inode_in == inode_out); > > /* Don't reflink dirs, pipes, sockets... */ > if (S_ISDIR(inode_in->i_mode) || S_ISDIR(inode_out->i_mode)) > -- > 2.1.4 > ^ permalink raw reply [flat|nested] 11+ messages in thread
* [PATCH 3/4] xfs: move inode locking from xfs_reflink_remap_range to xfs_file_share_range 2016-10-17 12:05 fix locking for the reflink operation Christoph Hellwig 2016-10-17 12:05 ` [PATCH 1/4] xfs: remove the same fs check from xfs_file_share_range Christoph Hellwig 2016-10-17 12:05 ` [PATCH 2/4] xfs: fix the same_inode check in xfs_file_share_range Christoph Hellwig @ 2016-10-17 12:05 ` Christoph Hellwig 2016-10-17 21:16 ` Darrick J. Wong 2016-10-17 12:05 ` [PATCH 4/4] xfs: merge xfs_reflink_remap_range and xfs_file_share_range Christoph Hellwig 2016-10-17 21:35 ` fix locking for the reflink operation Darrick J. Wong 4 siblings, 1 reply; 11+ messages in thread From: Christoph Hellwig @ 2016-10-17 12:05 UTC (permalink / raw) To: linux-xfs; +Cc: darrick.wong We need the iolock protection to stabilizie the IS_SWAPFILE and IS_IMMUTABLE values, as well as preventing new buffered writers re-dirtying the file data that we just wrote out. Signed-off-by: Christoph Hellwig <hch@lst.de> --- fs/xfs/xfs_file.c | 62 ++++++++++++++++++++++++++++++++++------------------ fs/xfs/xfs_reflink.c | 15 ------------- 2 files changed, 41 insertions(+), 36 deletions(-) diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c index 012a960..0960264 100644 --- a/fs/xfs/xfs_file.c +++ b/fs/xfs/xfs_file.c @@ -996,38 +996,54 @@ xfs_file_share_range( inode_out = file_inode(file_out); bs = inode_out->i_sb->s_blocksize; + /* Lock both files against IO */ + same_inode = (inode_in == inode_out); + if (same_inode) { + xfs_ilock(XFS_I(inode_in), XFS_IOLOCK_EXCL); + xfs_ilock(XFS_I(inode_in), XFS_MMAPLOCK_EXCL); + } else { + xfs_lock_two_inodes(XFS_I(inode_in), XFS_I(inode_out), + XFS_IOLOCK_EXCL); + xfs_lock_two_inodes(XFS_I(inode_in), XFS_I(inode_out), + XFS_MMAPLOCK_EXCL); + } + /* Don't touch certain kinds of inodes */ + ret = -EPERM; if (IS_IMMUTABLE(inode_out)) - return -EPERM; - if (IS_SWAPFILE(inode_in) || - IS_SWAPFILE(inode_out)) - return -ETXTBSY; - - same_inode = (inode_in == inode_out); + goto out_unlock; + ret = -ETXTBSY; + if (IS_SWAPFILE(inode_in) || IS_SWAPFILE(inode_out)) + goto out_unlock; /* Don't reflink dirs, pipes, sockets... */ + ret = -EISDIR; if (S_ISDIR(inode_in->i_mode) || S_ISDIR(inode_out->i_mode)) - return -EISDIR; + goto out_unlock; + ret = -EINVAL; if (S_ISFIFO(inode_in->i_mode) || S_ISFIFO(inode_out->i_mode)) - return -EINVAL; + goto out_unlock; if (!S_ISREG(inode_in->i_mode) || !S_ISREG(inode_out->i_mode)) - return -EINVAL; + goto out_unlock; /* Don't share DAX file data for now. */ if (IS_DAX(inode_in) || IS_DAX(inode_out)) - return -EINVAL; + goto out_unlock; /* Are we going all the way to the end? */ isize = i_size_read(inode_in); - if (isize == 0) - return 0; + if (isize == 0) { + ret = 0; + goto out_unlock; + } + if (len == 0) len = isize - pos_in; /* Ensure offsets don't wrap and the input is inside i_size */ if (pos_in + len < pos_in || pos_out + len < pos_out || pos_in + len > isize) - return -EINVAL; + goto out_unlock; /* Don't allow dedupe past EOF in the dest file */ if (is_dedupe) { @@ -1035,7 +1051,7 @@ xfs_file_share_range( disize = i_size_read(inode_out); if (pos_out >= disize || pos_out + len > disize) - return -EINVAL; + goto out_unlock; } /* If we're linking to EOF, continue to the block boundary. */ @@ -1047,28 +1063,32 @@ xfs_file_share_range( /* Only reflink if we're aligned to block boundaries */ if (!IS_ALIGNED(pos_in, bs) || !IS_ALIGNED(pos_in + blen, bs) || !IS_ALIGNED(pos_out, bs) || !IS_ALIGNED(pos_out + blen, bs)) - return -EINVAL; + goto out_unlock; /* Don't allow overlapped reflink within the same file */ if (same_inode && pos_out + blen > pos_in && pos_out < pos_in + blen) - return -EINVAL; + goto out_unlock; /* Wait for the completion of any pending IOs on srcfile */ ret = xfs_file_wait_for_io(inode_in, pos_in, len); if (ret) - goto out; + goto out_unlock; ret = xfs_file_wait_for_io(inode_out, pos_out, len); if (ret) - goto out; + goto out_unlock; if (is_dedupe) flags |= XFS_REFLINK_DEDUPE; ret = xfs_reflink_remap_range(XFS_I(inode_in), pos_in, XFS_I(inode_out), pos_out, len, flags); - if (ret < 0) - goto out; -out: +out_unlock: + xfs_iunlock(XFS_I(inode_in), XFS_MMAPLOCK_EXCL); + xfs_iunlock(XFS_I(inode_in), XFS_IOLOCK_EXCL); + if (!same_inode) { + xfs_iunlock(XFS_I(inode_out), XFS_MMAPLOCK_EXCL); + xfs_iunlock(XFS_I(inode_out), XFS_IOLOCK_EXCL); + } return ret; } diff --git a/fs/xfs/xfs_reflink.c b/fs/xfs/xfs_reflink.c index 5965e94..d012746 100644 --- a/fs/xfs/xfs_reflink.c +++ b/fs/xfs/xfs_reflink.c @@ -1341,15 +1341,6 @@ xfs_reflink_remap_range( trace_xfs_reflink_remap_range(src, srcoff, len, dest, destoff); - /* Lock both files against IO */ - if (src->i_ino == dest->i_ino) { - xfs_ilock(src, XFS_IOLOCK_EXCL); - xfs_ilock(src, XFS_MMAPLOCK_EXCL); - } else { - xfs_lock_two_inodes(src, dest, XFS_IOLOCK_EXCL); - xfs_lock_two_inodes(src, dest, XFS_MMAPLOCK_EXCL); - } - /* * Check that the extents are the same. */ @@ -1401,12 +1392,6 @@ xfs_reflink_remap_range( goto out_error; out_error: - xfs_iunlock(src, XFS_MMAPLOCK_EXCL); - xfs_iunlock(src, XFS_IOLOCK_EXCL); - if (src->i_ino != dest->i_ino) { - xfs_iunlock(dest, XFS_MMAPLOCK_EXCL); - xfs_iunlock(dest, XFS_IOLOCK_EXCL); - } if (error) trace_xfs_reflink_remap_range_error(dest, error, _RET_IP_); return error; -- 2.1.4 ^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: [PATCH 3/4] xfs: move inode locking from xfs_reflink_remap_range to xfs_file_share_range 2016-10-17 12:05 ` [PATCH 3/4] xfs: move inode locking from xfs_reflink_remap_range to xfs_file_share_range Christoph Hellwig @ 2016-10-17 21:16 ` Darrick J. Wong 0 siblings, 0 replies; 11+ messages in thread From: Darrick J. Wong @ 2016-10-17 21:16 UTC (permalink / raw) To: Christoph Hellwig; +Cc: linux-xfs On Mon, Oct 17, 2016 at 02:05:19PM +0200, Christoph Hellwig wrote: > We need the iolock protection to stabilizie the IS_SWAPFILE and IS_IMMUTABLE > values, as well as preventing new buffered writers re-dirtying the file data > that we just wrote out. > > Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> > --- > fs/xfs/xfs_file.c | 62 ++++++++++++++++++++++++++++++++++------------------ > fs/xfs/xfs_reflink.c | 15 ------------- > 2 files changed, 41 insertions(+), 36 deletions(-) > > diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c > index 012a960..0960264 100644 > --- a/fs/xfs/xfs_file.c > +++ b/fs/xfs/xfs_file.c > @@ -996,38 +996,54 @@ xfs_file_share_range( > inode_out = file_inode(file_out); > bs = inode_out->i_sb->s_blocksize; > > + /* Lock both files against IO */ > + same_inode = (inode_in == inode_out); > + if (same_inode) { > + xfs_ilock(XFS_I(inode_in), XFS_IOLOCK_EXCL); > + xfs_ilock(XFS_I(inode_in), XFS_MMAPLOCK_EXCL); > + } else { > + xfs_lock_two_inodes(XFS_I(inode_in), XFS_I(inode_out), > + XFS_IOLOCK_EXCL); > + xfs_lock_two_inodes(XFS_I(inode_in), XFS_I(inode_out), > + XFS_MMAPLOCK_EXCL); > + } > + > /* Don't touch certain kinds of inodes */ > + ret = -EPERM; > if (IS_IMMUTABLE(inode_out)) > - return -EPERM; > - if (IS_SWAPFILE(inode_in) || > - IS_SWAPFILE(inode_out)) > - return -ETXTBSY; > - > - same_inode = (inode_in == inode_out); > + goto out_unlock; > + ret = -ETXTBSY; > + if (IS_SWAPFILE(inode_in) || IS_SWAPFILE(inode_out)) > + goto out_unlock; > > /* Don't reflink dirs, pipes, sockets... */ > + ret = -EISDIR; > if (S_ISDIR(inode_in->i_mode) || S_ISDIR(inode_out->i_mode)) > - return -EISDIR; > + goto out_unlock; > + ret = -EINVAL; > if (S_ISFIFO(inode_in->i_mode) || S_ISFIFO(inode_out->i_mode)) > - return -EINVAL; > + goto out_unlock; > if (!S_ISREG(inode_in->i_mode) || !S_ISREG(inode_out->i_mode)) > - return -EINVAL; > + goto out_unlock; > > /* Don't share DAX file data for now. */ > if (IS_DAX(inode_in) || IS_DAX(inode_out)) > - return -EINVAL; > + goto out_unlock; > > /* Are we going all the way to the end? */ > isize = i_size_read(inode_in); > - if (isize == 0) > - return 0; > + if (isize == 0) { > + ret = 0; > + goto out_unlock; > + } > + > if (len == 0) > len = isize - pos_in; > > /* Ensure offsets don't wrap and the input is inside i_size */ > if (pos_in + len < pos_in || pos_out + len < pos_out || > pos_in + len > isize) > - return -EINVAL; > + goto out_unlock; > > /* Don't allow dedupe past EOF in the dest file */ > if (is_dedupe) { > @@ -1035,7 +1051,7 @@ xfs_file_share_range( > > disize = i_size_read(inode_out); > if (pos_out >= disize || pos_out + len > disize) > - return -EINVAL; > + goto out_unlock; > } > > /* If we're linking to EOF, continue to the block boundary. */ > @@ -1047,28 +1063,32 @@ xfs_file_share_range( > /* Only reflink if we're aligned to block boundaries */ > if (!IS_ALIGNED(pos_in, bs) || !IS_ALIGNED(pos_in + blen, bs) || > !IS_ALIGNED(pos_out, bs) || !IS_ALIGNED(pos_out + blen, bs)) > - return -EINVAL; > + goto out_unlock; > > /* Don't allow overlapped reflink within the same file */ > if (same_inode && pos_out + blen > pos_in && pos_out < pos_in + blen) > - return -EINVAL; > + goto out_unlock; > > /* Wait for the completion of any pending IOs on srcfile */ > ret = xfs_file_wait_for_io(inode_in, pos_in, len); > if (ret) > - goto out; > + goto out_unlock; > ret = xfs_file_wait_for_io(inode_out, pos_out, len); > if (ret) > - goto out; > + goto out_unlock; > > if (is_dedupe) > flags |= XFS_REFLINK_DEDUPE; > ret = xfs_reflink_remap_range(XFS_I(inode_in), pos_in, XFS_I(inode_out), > pos_out, len, flags); > - if (ret < 0) > - goto out; > > -out: > +out_unlock: > + xfs_iunlock(XFS_I(inode_in), XFS_MMAPLOCK_EXCL); > + xfs_iunlock(XFS_I(inode_in), XFS_IOLOCK_EXCL); > + if (!same_inode) { > + xfs_iunlock(XFS_I(inode_out), XFS_MMAPLOCK_EXCL); > + xfs_iunlock(XFS_I(inode_out), XFS_IOLOCK_EXCL); > + } > return ret; > } > > diff --git a/fs/xfs/xfs_reflink.c b/fs/xfs/xfs_reflink.c > index 5965e94..d012746 100644 > --- a/fs/xfs/xfs_reflink.c > +++ b/fs/xfs/xfs_reflink.c > @@ -1341,15 +1341,6 @@ xfs_reflink_remap_range( > > trace_xfs_reflink_remap_range(src, srcoff, len, dest, destoff); > > - /* Lock both files against IO */ > - if (src->i_ino == dest->i_ino) { > - xfs_ilock(src, XFS_IOLOCK_EXCL); > - xfs_ilock(src, XFS_MMAPLOCK_EXCL); > - } else { > - xfs_lock_two_inodes(src, dest, XFS_IOLOCK_EXCL); > - xfs_lock_two_inodes(src, dest, XFS_MMAPLOCK_EXCL); > - } > - > /* > * Check that the extents are the same. > */ > @@ -1401,12 +1392,6 @@ xfs_reflink_remap_range( > goto out_error; > > out_error: > - xfs_iunlock(src, XFS_MMAPLOCK_EXCL); > - xfs_iunlock(src, XFS_IOLOCK_EXCL); > - if (src->i_ino != dest->i_ino) { > - xfs_iunlock(dest, XFS_MMAPLOCK_EXCL); > - xfs_iunlock(dest, XFS_IOLOCK_EXCL); > - } > if (error) > trace_xfs_reflink_remap_range_error(dest, error, _RET_IP_); > return error; > -- > 2.1.4 > ^ permalink raw reply [flat|nested] 11+ messages in thread
* [PATCH 4/4] xfs: merge xfs_reflink_remap_range and xfs_file_share_range 2016-10-17 12:05 fix locking for the reflink operation Christoph Hellwig ` (2 preceding siblings ...) 2016-10-17 12:05 ` [PATCH 3/4] xfs: move inode locking from xfs_reflink_remap_range to xfs_file_share_range Christoph Hellwig @ 2016-10-17 12:05 ` Christoph Hellwig 2016-10-17 21:29 ` Darrick J. Wong 2016-10-17 21:35 ` fix locking for the reflink operation Darrick J. Wong 4 siblings, 1 reply; 11+ messages in thread From: Christoph Hellwig @ 2016-10-17 12:05 UTC (permalink / raw) To: linux-xfs; +Cc: darrick.wong There is no clear division of responsibility between those functions, so just merge them into one to keep the code simple. Also move xfs_file_wait_for_io to xfs_reflink.c together with its only caller. Signed-off-by: Christoph Hellwig <hch@lst.de> --- fs/xfs/xfs_file.c | 145 ------------------------------------- fs/xfs/xfs_reflink.c | 199 ++++++++++++++++++++++++++++++++++++++++----------- fs/xfs/xfs_reflink.h | 8 +-- 3 files changed, 161 insertions(+), 191 deletions(-) diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c index 0960264..cd37573 100644 --- a/fs/xfs/xfs_file.c +++ b/fs/xfs/xfs_file.c @@ -947,151 +947,6 @@ xfs_file_fallocate( return error; } -/* - * Flush all file writes out to disk. - */ -static int -xfs_file_wait_for_io( - struct inode *inode, - loff_t offset, - size_t len) -{ - loff_t rounding; - loff_t ioffset; - loff_t iendoffset; - loff_t bs; - int ret; - - bs = inode->i_sb->s_blocksize; - inode_dio_wait(inode); - - rounding = max_t(xfs_off_t, bs, PAGE_SIZE); - ioffset = round_down(offset, rounding); - iendoffset = round_up(offset + len, rounding) - 1; - ret = filemap_write_and_wait_range(inode->i_mapping, ioffset, - iendoffset); - return ret; -} - -/* Hook up to the VFS reflink function */ -STATIC int -xfs_file_share_range( - struct file *file_in, - loff_t pos_in, - struct file *file_out, - loff_t pos_out, - u64 len, - bool is_dedupe) -{ - struct inode *inode_in; - struct inode *inode_out; - ssize_t ret; - loff_t bs; - loff_t isize; - int same_inode; - loff_t blen; - unsigned int flags = 0; - - inode_in = file_inode(file_in); - inode_out = file_inode(file_out); - bs = inode_out->i_sb->s_blocksize; - - /* Lock both files against IO */ - same_inode = (inode_in == inode_out); - if (same_inode) { - xfs_ilock(XFS_I(inode_in), XFS_IOLOCK_EXCL); - xfs_ilock(XFS_I(inode_in), XFS_MMAPLOCK_EXCL); - } else { - xfs_lock_two_inodes(XFS_I(inode_in), XFS_I(inode_out), - XFS_IOLOCK_EXCL); - xfs_lock_two_inodes(XFS_I(inode_in), XFS_I(inode_out), - XFS_MMAPLOCK_EXCL); - } - - /* Don't touch certain kinds of inodes */ - ret = -EPERM; - if (IS_IMMUTABLE(inode_out)) - goto out_unlock; - ret = -ETXTBSY; - if (IS_SWAPFILE(inode_in) || IS_SWAPFILE(inode_out)) - goto out_unlock; - - /* Don't reflink dirs, pipes, sockets... */ - ret = -EISDIR; - if (S_ISDIR(inode_in->i_mode) || S_ISDIR(inode_out->i_mode)) - goto out_unlock; - ret = -EINVAL; - if (S_ISFIFO(inode_in->i_mode) || S_ISFIFO(inode_out->i_mode)) - goto out_unlock; - if (!S_ISREG(inode_in->i_mode) || !S_ISREG(inode_out->i_mode)) - goto out_unlock; - - /* Don't share DAX file data for now. */ - if (IS_DAX(inode_in) || IS_DAX(inode_out)) - goto out_unlock; - - /* Are we going all the way to the end? */ - isize = i_size_read(inode_in); - if (isize == 0) { - ret = 0; - goto out_unlock; - } - - if (len == 0) - len = isize - pos_in; - - /* Ensure offsets don't wrap and the input is inside i_size */ - if (pos_in + len < pos_in || pos_out + len < pos_out || - pos_in + len > isize) - goto out_unlock; - - /* Don't allow dedupe past EOF in the dest file */ - if (is_dedupe) { - loff_t disize; - - disize = i_size_read(inode_out); - if (pos_out >= disize || pos_out + len > disize) - goto out_unlock; - } - - /* If we're linking to EOF, continue to the block boundary. */ - if (pos_in + len == isize) - blen = ALIGN(isize, bs) - pos_in; - else - blen = len; - - /* Only reflink if we're aligned to block boundaries */ - if (!IS_ALIGNED(pos_in, bs) || !IS_ALIGNED(pos_in + blen, bs) || - !IS_ALIGNED(pos_out, bs) || !IS_ALIGNED(pos_out + blen, bs)) - goto out_unlock; - - /* Don't allow overlapped reflink within the same file */ - if (same_inode && pos_out + blen > pos_in && pos_out < pos_in + blen) - goto out_unlock; - - /* Wait for the completion of any pending IOs on srcfile */ - ret = xfs_file_wait_for_io(inode_in, pos_in, len); - if (ret) - goto out_unlock; - ret = xfs_file_wait_for_io(inode_out, pos_out, len); - if (ret) - goto out_unlock; - - if (is_dedupe) - flags |= XFS_REFLINK_DEDUPE; - ret = xfs_reflink_remap_range(XFS_I(inode_in), pos_in, XFS_I(inode_out), - pos_out, len, flags); - -out_unlock: - xfs_iunlock(XFS_I(inode_in), XFS_MMAPLOCK_EXCL); - xfs_iunlock(XFS_I(inode_in), XFS_IOLOCK_EXCL); - if (!same_inode) { - xfs_iunlock(XFS_I(inode_out), XFS_MMAPLOCK_EXCL); - xfs_iunlock(XFS_I(inode_out), XFS_IOLOCK_EXCL); - } - return ret; -} - STATIC ssize_t xfs_file_copy_range( struct file *file_in, diff --git a/fs/xfs/xfs_reflink.c b/fs/xfs/xfs_reflink.c index d012746..11ed2ad 100644 --- a/fs/xfs/xfs_reflink.c +++ b/fs/xfs/xfs_reflink.c @@ -1308,23 +1308,56 @@ xfs_compare_extents( } /* + * Flush all file writes out to disk. + */ +static int +xfs_file_wait_for_io( + struct inode *inode, + loff_t offset, + size_t len) +{ + loff_t rounding; + loff_t ioffset; + loff_t iendoffset; + loff_t bs; + int ret; + + bs = inode->i_sb->s_blocksize; + inode_dio_wait(inode); + + rounding = max_t(xfs_off_t, bs, PAGE_SIZE); + ioffset = round_down(offset, rounding); + iendoffset = round_up(offset + len, rounding) - 1; + ret = filemap_write_and_wait_range(inode->i_mapping, ioffset, + iendoffset); + return ret; +} + +/* * Link a range of blocks from one file to another. */ int -xfs_reflink_remap_range( - struct xfs_inode *src, - xfs_off_t srcoff, - struct xfs_inode *dest, - xfs_off_t destoff, - xfs_off_t len, - unsigned int flags) +xfs_file_share_range( + struct file *file_in, + loff_t pos_in, + struct file *file_out, + loff_t pos_out, + u64 len, + bool is_dedupe) { + struct inode *inode_in = file_inode(file_in); + struct xfs_inode *src = XFS_I(inode_in); + struct inode *inode_out = file_inode(file_out); + struct xfs_inode *dest = XFS_I(inode_out); struct xfs_mount *mp = src->i_mount; + loff_t bs = inode_out->i_sb->s_blocksize; + bool same_inode = (inode_in == inode_out); xfs_fileoff_t sfsbno, dfsbno; xfs_filblks_t fsblen; - int error; xfs_extlen_t cowextsize; - bool is_same; + loff_t isize; + ssize_t ret; + loff_t blen; if (!xfs_sb_version_hasreflink(&mp->m_sb)) return -EOPNOTSUPP; @@ -1332,48 +1365,128 @@ xfs_reflink_remap_range( if (XFS_FORCED_SHUTDOWN(mp)) return -EIO; + /* Lock both files against IO */ + if (same_inode) { + xfs_ilock(src, XFS_IOLOCK_EXCL); + xfs_ilock(src, XFS_MMAPLOCK_EXCL); + } else { + xfs_lock_two_inodes(src, dest, XFS_IOLOCK_EXCL); + xfs_lock_two_inodes(src, dest, XFS_MMAPLOCK_EXCL); + } + + /* Don't touch certain kinds of inodes */ + ret = -EPERM; + if (IS_IMMUTABLE(inode_out)) + goto out_unlock; + + ret = -ETXTBSY; + if (IS_SWAPFILE(inode_in) || IS_SWAPFILE(inode_out)) + goto out_unlock; + + + /* Don't reflink dirs, pipes, sockets... */ + ret = -EISDIR; + if (S_ISDIR(inode_in->i_mode) || S_ISDIR(inode_out->i_mode)) + goto out_unlock; + ret = -EINVAL; + if (S_ISFIFO(inode_in->i_mode) || S_ISFIFO(inode_out->i_mode)) + goto out_unlock; + if (!S_ISREG(inode_in->i_mode) || !S_ISREG(inode_out->i_mode)) + goto out_unlock; + /* Don't reflink realtime inodes */ if (XFS_IS_REALTIME_INODE(src) || XFS_IS_REALTIME_INODE(dest)) - return -EINVAL; + goto out_unlock; + + /* Don't share DAX file data for now. */ + if (IS_DAX(inode_in) || IS_DAX(inode_out)) + goto out_unlock; + + /* Are we going all the way to the end? */ + isize = i_size_read(inode_in); + if (isize == 0) { + ret = 0; + goto out_unlock; + } + + if (len == 0) + len = isize - pos_in; - if (flags & ~XFS_REFLINK_ALL) - return -EINVAL; + /* Ensure offsets don't wrap and the input is inside i_size */ + if (pos_in + len < pos_in || pos_out + len < pos_out || + pos_in + len > isize) + goto out_unlock; + + /* Don't allow dedupe past EOF in the dest file */ + if (is_dedupe) { + loff_t disize; + + disize = i_size_read(inode_out); + if (pos_out >= disize || pos_out + len > disize) + goto out_unlock; + } - trace_xfs_reflink_remap_range(src, srcoff, len, dest, destoff); + /* If we're linking to EOF, continue to the block boundary. */ + if (pos_in + len == isize) + blen = ALIGN(isize, bs) - pos_in; + else + blen = len; + + /* Only reflink if we're aligned to block boundaries */ + if (!IS_ALIGNED(pos_in, bs) || !IS_ALIGNED(pos_in + blen, bs) || + !IS_ALIGNED(pos_out, bs) || !IS_ALIGNED(pos_out + blen, bs)) + goto out_unlock; + + /* Don't allow overlapped reflink within the same file */ + if (same_inode) { + if (pos_out + blen > pos_in && pos_out < pos_in + blen) + goto out_unlock; + } + + /* Wait for the completion of any pending IOs on srcfile */ + ret = xfs_file_wait_for_io(inode_in, pos_in, len); + if (ret) + goto out_unlock; + ret = xfs_file_wait_for_io(inode_out, pos_out, len); + if (ret) + goto out_unlock; + + trace_xfs_reflink_remap_range(src, pos_in, len, dest, pos_out); /* * Check that the extents are the same. */ - if (flags & XFS_REFLINK_DEDUPE) { - is_same = false; - error = xfs_compare_extents(VFS_I(src), srcoff, VFS_I(dest), - destoff, len, &is_same); - if (error) - goto out_error; + if (is_dedupe) { + bool is_same = false; + + ret = xfs_compare_extents(inode_in, pos_in, inode_out, pos_out, + len, &is_same); + if (ret) + goto out_unlock;; if (!is_same) { - error = -EBADE; - goto out_error; + ret = -EBADE; + goto out_unlock; } } - error = xfs_reflink_set_inode_flag(src, dest); - if (error) - goto out_error; + ret = xfs_reflink_set_inode_flag(src, dest); + if (ret) + goto out_unlock; /* * Invalidate the page cache so that we can clear any CoW mappings * in the destination file. */ - truncate_inode_pages_range(&VFS_I(dest)->i_data, destoff, - PAGE_ALIGN(destoff + len) - 1); + truncate_inode_pages_range(&inode_out->i_data, pos_out, + PAGE_ALIGN(pos_out + len) - 1); - dfsbno = XFS_B_TO_FSBT(mp, destoff); - sfsbno = XFS_B_TO_FSBT(mp, srcoff); + dfsbno = XFS_B_TO_FSBT(mp, pos_out); + sfsbno = XFS_B_TO_FSBT(mp, pos_in); fsblen = XFS_B_TO_FSB(mp, len); - error = xfs_reflink_remap_blocks(src, sfsbno, dest, dfsbno, fsblen, - destoff + len); - if (error) - goto out_error; + ret = xfs_reflink_remap_blocks(src, sfsbno, dest, dfsbno, fsblen, + pos_out + len); + if (ret) + goto out_unlock; /* * Carry the cowextsize hint from src to dest if we're sharing the @@ -1381,20 +1494,24 @@ xfs_reflink_remap_range( * has a cowextsize hint, and the destination file does not. */ cowextsize = 0; - if (srcoff == 0 && len == i_size_read(VFS_I(src)) && + if (pos_in == 0 && len == i_size_read(inode_in) && (src->i_d.di_flags2 & XFS_DIFLAG2_COWEXTSIZE) && - destoff == 0 && len >= i_size_read(VFS_I(dest)) && + pos_out == 0 && len >= i_size_read(inode_out) && !(dest->i_d.di_flags2 & XFS_DIFLAG2_COWEXTSIZE)) cowextsize = src->i_d.di_cowextsize; - error = xfs_reflink_update_dest(dest, destoff + len, cowextsize); - if (error) - goto out_error; + ret = xfs_reflink_update_dest(dest, pos_out + len, cowextsize); -out_error: - if (error) - trace_xfs_reflink_remap_range_error(dest, error, _RET_IP_); - return error; +out_unlock: + xfs_iunlock(src, XFS_MMAPLOCK_EXCL); + xfs_iunlock(src, XFS_IOLOCK_EXCL); + if (src->i_ino != dest->i_ino) { + xfs_iunlock(dest, XFS_MMAPLOCK_EXCL); + xfs_iunlock(dest, XFS_IOLOCK_EXCL); + } + if (ret) + trace_xfs_reflink_remap_range_error(dest, ret, _RET_IP_); + return ret; } /* diff --git a/fs/xfs/xfs_reflink.h b/fs/xfs/xfs_reflink.h index 5dc3c8a..25a8956 100644 --- a/fs/xfs/xfs_reflink.h +++ b/fs/xfs/xfs_reflink.h @@ -43,11 +43,6 @@ extern int xfs_reflink_cancel_cow_range(struct xfs_inode *ip, xfs_off_t offset, extern int xfs_reflink_end_cow(struct xfs_inode *ip, xfs_off_t offset, xfs_off_t count); extern int xfs_reflink_recover_cow(struct xfs_mount *mp); -#define XFS_REFLINK_DEDUPE 1 /* only reflink if contents match */ -#define XFS_REFLINK_ALL (XFS_REFLINK_DEDUPE) -extern int xfs_reflink_remap_range(struct xfs_inode *src, xfs_off_t srcoff, - struct xfs_inode *dest, xfs_off_t destoff, xfs_off_t len, - unsigned int flags); extern int xfs_reflink_clear_inode_flag(struct xfs_inode *ip, struct xfs_trans **tpp); extern int xfs_reflink_unshare(struct xfs_inode *ip, xfs_off_t offset, @@ -55,4 +50,7 @@ extern int xfs_reflink_unshare(struct xfs_inode *ip, xfs_off_t offset, extern bool xfs_reflink_has_real_cow_blocks(struct xfs_inode *ip); +int xfs_file_share_range(struct file *file_in, loff_t pos_in, + struct file *file_out, loff_t pos_out, u64 len, bool is_dedupe); + #endif /* __XFS_REFLINK_H */ -- 2.1.4 ^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: [PATCH 4/4] xfs: merge xfs_reflink_remap_range and xfs_file_share_range 2016-10-17 12:05 ` [PATCH 4/4] xfs: merge xfs_reflink_remap_range and xfs_file_share_range Christoph Hellwig @ 2016-10-17 21:29 ` Darrick J. Wong 2016-10-18 5:17 ` Christoph Hellwig 0 siblings, 1 reply; 11+ messages in thread From: Darrick J. Wong @ 2016-10-17 21:29 UTC (permalink / raw) To: Christoph Hellwig; +Cc: linux-xfs On Mon, Oct 17, 2016 at 02:05:20PM +0200, Christoph Hellwig wrote: > There is no clear division of responsibility between those functions, so > just merge them into one to keep the code simple. Also move > xfs_file_wait_for_io to xfs_reflink.c together with its only caller. > > Signed-off-by: Christoph Hellwig <hch@lst.de> > --- > fs/xfs/xfs_file.c | 145 ------------------------------------- > fs/xfs/xfs_reflink.c | 199 ++++++++++++++++++++++++++++++++++++++++----------- > fs/xfs/xfs_reflink.h | 8 +-- > 3 files changed, 161 insertions(+), 191 deletions(-) > > diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c > index 0960264..cd37573 100644 > --- a/fs/xfs/xfs_file.c > +++ b/fs/xfs/xfs_file.c > @@ -947,151 +947,6 @@ xfs_file_fallocate( > return error; > } > > -/* > - * Flush all file writes out to disk. > - */ > -static int > -xfs_file_wait_for_io( > - struct inode *inode, > - loff_t offset, > - size_t len) > -{ > - loff_t rounding; > - loff_t ioffset; > - loff_t iendoffset; > - loff_t bs; > - int ret; > - > - bs = inode->i_sb->s_blocksize; > - inode_dio_wait(inode); > - > - rounding = max_t(xfs_off_t, bs, PAGE_SIZE); > - ioffset = round_down(offset, rounding); > - iendoffset = round_up(offset + len, rounding) - 1; > - ret = filemap_write_and_wait_range(inode->i_mapping, ioffset, > - iendoffset); > - return ret; > -} > - > -/* Hook up to the VFS reflink function */ > -STATIC int > -xfs_file_share_range( > - struct file *file_in, > - loff_t pos_in, > - struct file *file_out, > - loff_t pos_out, > - u64 len, > - bool is_dedupe) > -{ > - struct inode *inode_in; > - struct inode *inode_out; > - ssize_t ret; > - loff_t bs; > - loff_t isize; > - int same_inode; > - loff_t blen; > - unsigned int flags = 0; > - > - inode_in = file_inode(file_in); > - inode_out = file_inode(file_out); > - bs = inode_out->i_sb->s_blocksize; > - > - /* Lock both files against IO */ > - same_inode = (inode_in == inode_out); > - if (same_inode) { > - xfs_ilock(XFS_I(inode_in), XFS_IOLOCK_EXCL); > - xfs_ilock(XFS_I(inode_in), XFS_MMAPLOCK_EXCL); > - } else { > - xfs_lock_two_inodes(XFS_I(inode_in), XFS_I(inode_out), > - XFS_IOLOCK_EXCL); > - xfs_lock_two_inodes(XFS_I(inode_in), XFS_I(inode_out), > - XFS_MMAPLOCK_EXCL); > - } > - > - /* Don't touch certain kinds of inodes */ > - ret = -EPERM; > - if (IS_IMMUTABLE(inode_out)) > - goto out_unlock; > - ret = -ETXTBSY; > - if (IS_SWAPFILE(inode_in) || IS_SWAPFILE(inode_out)) > - goto out_unlock; > - > - /* Don't reflink dirs, pipes, sockets... */ > - ret = -EISDIR; > - if (S_ISDIR(inode_in->i_mode) || S_ISDIR(inode_out->i_mode)) > - goto out_unlock; > - ret = -EINVAL; > - if (S_ISFIFO(inode_in->i_mode) || S_ISFIFO(inode_out->i_mode)) > - goto out_unlock; > - if (!S_ISREG(inode_in->i_mode) || !S_ISREG(inode_out->i_mode)) > - goto out_unlock; > - > - /* Don't share DAX file data for now. */ > - if (IS_DAX(inode_in) || IS_DAX(inode_out)) > - goto out_unlock; > - > - /* Are we going all the way to the end? */ > - isize = i_size_read(inode_in); > - if (isize == 0) { > - ret = 0; > - goto out_unlock; > - } > - > - if (len == 0) > - len = isize - pos_in; > - > - /* Ensure offsets don't wrap and the input is inside i_size */ > - if (pos_in + len < pos_in || pos_out + len < pos_out || > - pos_in + len > isize) > - goto out_unlock; > - > - /* Don't allow dedupe past EOF in the dest file */ > - if (is_dedupe) { > - loff_t disize; > - > - disize = i_size_read(inode_out); > - if (pos_out >= disize || pos_out + len > disize) > - goto out_unlock; > - } > - > - /* If we're linking to EOF, continue to the block boundary. */ > - if (pos_in + len == isize) > - blen = ALIGN(isize, bs) - pos_in; > - else > - blen = len; > - > - /* Only reflink if we're aligned to block boundaries */ > - if (!IS_ALIGNED(pos_in, bs) || !IS_ALIGNED(pos_in + blen, bs) || > - !IS_ALIGNED(pos_out, bs) || !IS_ALIGNED(pos_out + blen, bs)) > - goto out_unlock; > - > - /* Don't allow overlapped reflink within the same file */ > - if (same_inode && pos_out + blen > pos_in && pos_out < pos_in + blen) > - goto out_unlock; > - > - /* Wait for the completion of any pending IOs on srcfile */ > - ret = xfs_file_wait_for_io(inode_in, pos_in, len); > - if (ret) > - goto out_unlock; > - ret = xfs_file_wait_for_io(inode_out, pos_out, len); > - if (ret) > - goto out_unlock; > - > - if (is_dedupe) > - flags |= XFS_REFLINK_DEDUPE; > - ret = xfs_reflink_remap_range(XFS_I(inode_in), pos_in, XFS_I(inode_out), > - pos_out, len, flags); > - > -out_unlock: > - xfs_iunlock(XFS_I(inode_in), XFS_MMAPLOCK_EXCL); > - xfs_iunlock(XFS_I(inode_in), XFS_IOLOCK_EXCL); > - if (!same_inode) { > - xfs_iunlock(XFS_I(inode_out), XFS_MMAPLOCK_EXCL); > - xfs_iunlock(XFS_I(inode_out), XFS_IOLOCK_EXCL); > - } > - return ret; > -} > - > STATIC ssize_t > xfs_file_copy_range( > struct file *file_in, > diff --git a/fs/xfs/xfs_reflink.c b/fs/xfs/xfs_reflink.c > index d012746..11ed2ad 100644 > --- a/fs/xfs/xfs_reflink.c > +++ b/fs/xfs/xfs_reflink.c > @@ -1308,23 +1308,56 @@ xfs_compare_extents( > } > > /* > + * Flush all file writes out to disk. > + */ > +static int > +xfs_file_wait_for_io( > + struct inode *inode, > + loff_t offset, > + size_t len) > +{ > + loff_t rounding; > + loff_t ioffset; > + loff_t iendoffset; > + loff_t bs; > + int ret; > + > + bs = inode->i_sb->s_blocksize; > + inode_dio_wait(inode); > + > + rounding = max_t(xfs_off_t, bs, PAGE_SIZE); > + ioffset = round_down(offset, rounding); > + iendoffset = round_up(offset + len, rounding) - 1; > + ret = filemap_write_and_wait_range(inode->i_mapping, ioffset, > + iendoffset); > + return ret; > +} This seems like a file action not specific to reflink. > + > +/* > * Link a range of blocks from one file to another. > */ > int > -xfs_reflink_remap_range( > - struct xfs_inode *src, > - xfs_off_t srcoff, > - struct xfs_inode *dest, > - xfs_off_t destoff, > - xfs_off_t len, > - unsigned int flags) > +xfs_file_share_range( xfs_reflink_share_file_range() ? I'd like to maintain the convention that all reflink functions start with xfs_reflink_*, particularly since the xfs_file_* functions largely live in xfs_file.c. > + struct file *file_in, > + loff_t pos_in, > + struct file *file_out, > + loff_t pos_out, > + u64 len, > + bool is_dedupe) > { > + struct inode *inode_in = file_inode(file_in); > + struct xfs_inode *src = XFS_I(inode_in); > + struct inode *inode_out = file_inode(file_out); > + struct xfs_inode *dest = XFS_I(inode_out); > struct xfs_mount *mp = src->i_mount; > + loff_t bs = inode_out->i_sb->s_blocksize; > + bool same_inode = (inode_in == inode_out); > xfs_fileoff_t sfsbno, dfsbno; > xfs_filblks_t fsblen; > - int error; > xfs_extlen_t cowextsize; > - bool is_same; > + loff_t isize; > + ssize_t ret; > + loff_t blen; > > if (!xfs_sb_version_hasreflink(&mp->m_sb)) > return -EOPNOTSUPP; > @@ -1332,48 +1365,128 @@ xfs_reflink_remap_range( > if (XFS_FORCED_SHUTDOWN(mp)) > return -EIO; > > + /* Lock both files against IO */ > + if (same_inode) { > + xfs_ilock(src, XFS_IOLOCK_EXCL); > + xfs_ilock(src, XFS_MMAPLOCK_EXCL); > + } else { > + xfs_lock_two_inodes(src, dest, XFS_IOLOCK_EXCL); > + xfs_lock_two_inodes(src, dest, XFS_MMAPLOCK_EXCL); > + } > + > + /* Don't touch certain kinds of inodes */ > + ret = -EPERM; > + if (IS_IMMUTABLE(inode_out)) > + goto out_unlock; > + > + ret = -ETXTBSY; > + if (IS_SWAPFILE(inode_in) || IS_SWAPFILE(inode_out)) > + goto out_unlock; > + > + > + /* Don't reflink dirs, pipes, sockets... */ > + ret = -EISDIR; > + if (S_ISDIR(inode_in->i_mode) || S_ISDIR(inode_out->i_mode)) > + goto out_unlock; > + ret = -EINVAL; > + if (S_ISFIFO(inode_in->i_mode) || S_ISFIFO(inode_out->i_mode)) > + goto out_unlock; > + if (!S_ISREG(inode_in->i_mode) || !S_ISREG(inode_out->i_mode)) > + goto out_unlock; > + > /* Don't reflink realtime inodes */ > if (XFS_IS_REALTIME_INODE(src) || XFS_IS_REALTIME_INODE(dest)) > - return -EINVAL; > + goto out_unlock; > + > + /* Don't share DAX file data for now. */ > + if (IS_DAX(inode_in) || IS_DAX(inode_out)) > + goto out_unlock; > + > + /* Are we going all the way to the end? */ > + isize = i_size_read(inode_in); > + if (isize == 0) { > + ret = 0; > + goto out_unlock; > + } > + > + if (len == 0) > + len = isize - pos_in; > > - if (flags & ~XFS_REFLINK_ALL) > - return -EINVAL; > + /* Ensure offsets don't wrap and the input is inside i_size */ > + if (pos_in + len < pos_in || pos_out + len < pos_out || > + pos_in + len > isize) > + goto out_unlock; > + > + /* Don't allow dedupe past EOF in the dest file */ > + if (is_dedupe) { > + loff_t disize; > + > + disize = i_size_read(inode_out); > + if (pos_out >= disize || pos_out + len > disize) > + goto out_unlock; > + } > > - trace_xfs_reflink_remap_range(src, srcoff, len, dest, destoff); > + /* If we're linking to EOF, continue to the block boundary. */ > + if (pos_in + len == isize) > + blen = ALIGN(isize, bs) - pos_in; > + else > + blen = len; > + > + /* Only reflink if we're aligned to block boundaries */ > + if (!IS_ALIGNED(pos_in, bs) || !IS_ALIGNED(pos_in + blen, bs) || > + !IS_ALIGNED(pos_out, bs) || !IS_ALIGNED(pos_out + blen, bs)) > + goto out_unlock; > + > + /* Don't allow overlapped reflink within the same file */ > + if (same_inode) { > + if (pos_out + blen > pos_in && pos_out < pos_in + blen) > + goto out_unlock; > + } > + > + /* Wait for the completion of any pending IOs on srcfile */ > + ret = xfs_file_wait_for_io(inode_in, pos_in, len); > + if (ret) > + goto out_unlock; > + ret = xfs_file_wait_for_io(inode_out, pos_out, len); > + if (ret) > + goto out_unlock; > + > + trace_xfs_reflink_remap_range(src, pos_in, len, dest, pos_out); > > /* > * Check that the extents are the same. > */ > - if (flags & XFS_REFLINK_DEDUPE) { > - is_same = false; > - error = xfs_compare_extents(VFS_I(src), srcoff, VFS_I(dest), > - destoff, len, &is_same); > - if (error) > - goto out_error; > + if (is_dedupe) { > + bool is_same = false; > + > + ret = xfs_compare_extents(inode_in, pos_in, inode_out, pos_out, > + len, &is_same); > + if (ret) > + goto out_unlock;; Double-semicolon here. > if (!is_same) { > - error = -EBADE; > - goto out_error; > + ret = -EBADE; > + goto out_unlock; > } > } > > - error = xfs_reflink_set_inode_flag(src, dest); > - if (error) > - goto out_error; > + ret = xfs_reflink_set_inode_flag(src, dest); > + if (ret) > + goto out_unlock; > > /* > * Invalidate the page cache so that we can clear any CoW mappings > * in the destination file. > */ > - truncate_inode_pages_range(&VFS_I(dest)->i_data, destoff, > - PAGE_ALIGN(destoff + len) - 1); > + truncate_inode_pages_range(&inode_out->i_data, pos_out, > + PAGE_ALIGN(pos_out + len) - 1); > > - dfsbno = XFS_B_TO_FSBT(mp, destoff); > - sfsbno = XFS_B_TO_FSBT(mp, srcoff); > + dfsbno = XFS_B_TO_FSBT(mp, pos_out); > + sfsbno = XFS_B_TO_FSBT(mp, pos_in); > fsblen = XFS_B_TO_FSB(mp, len); > - error = xfs_reflink_remap_blocks(src, sfsbno, dest, dfsbno, fsblen, > - destoff + len); > - if (error) > - goto out_error; > + ret = xfs_reflink_remap_blocks(src, sfsbno, dest, dfsbno, fsblen, > + pos_out + len); > + if (ret) > + goto out_unlock; > > /* > * Carry the cowextsize hint from src to dest if we're sharing the > @@ -1381,20 +1494,24 @@ xfs_reflink_remap_range( > * has a cowextsize hint, and the destination file does not. > */ > cowextsize = 0; > - if (srcoff == 0 && len == i_size_read(VFS_I(src)) && > + if (pos_in == 0 && len == i_size_read(inode_in) && > (src->i_d.di_flags2 & XFS_DIFLAG2_COWEXTSIZE) && > - destoff == 0 && len >= i_size_read(VFS_I(dest)) && > + pos_out == 0 && len >= i_size_read(inode_out) && > !(dest->i_d.di_flags2 & XFS_DIFLAG2_COWEXTSIZE)) > cowextsize = src->i_d.di_cowextsize; > > - error = xfs_reflink_update_dest(dest, destoff + len, cowextsize); > - if (error) > - goto out_error; > + ret = xfs_reflink_update_dest(dest, pos_out + len, cowextsize); > > -out_error: > - if (error) > - trace_xfs_reflink_remap_range_error(dest, error, _RET_IP_); > - return error; > +out_unlock: > + xfs_iunlock(src, XFS_MMAPLOCK_EXCL); > + xfs_iunlock(src, XFS_IOLOCK_EXCL); > + if (src->i_ino != dest->i_ino) { > + xfs_iunlock(dest, XFS_MMAPLOCK_EXCL); > + xfs_iunlock(dest, XFS_IOLOCK_EXCL); > + } > + if (ret) > + trace_xfs_reflink_remap_range_error(dest, ret, _RET_IP_); > + return ret; > } > > /* > diff --git a/fs/xfs/xfs_reflink.h b/fs/xfs/xfs_reflink.h > index 5dc3c8a..25a8956 100644 > --- a/fs/xfs/xfs_reflink.h > +++ b/fs/xfs/xfs_reflink.h > @@ -43,11 +43,6 @@ extern int xfs_reflink_cancel_cow_range(struct xfs_inode *ip, xfs_off_t offset, > extern int xfs_reflink_end_cow(struct xfs_inode *ip, xfs_off_t offset, > xfs_off_t count); > extern int xfs_reflink_recover_cow(struct xfs_mount *mp); > -#define XFS_REFLINK_DEDUPE 1 /* only reflink if contents match */ > -#define XFS_REFLINK_ALL (XFS_REFLINK_DEDUPE) Hooray, uglyflags went away! :) Excepting the minor things I complained about, the rest is Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> --D > -extern int xfs_reflink_remap_range(struct xfs_inode *src, xfs_off_t srcoff, > - struct xfs_inode *dest, xfs_off_t destoff, xfs_off_t len, > - unsigned int flags); > extern int xfs_reflink_clear_inode_flag(struct xfs_inode *ip, > struct xfs_trans **tpp); > extern int xfs_reflink_unshare(struct xfs_inode *ip, xfs_off_t offset, > @@ -55,4 +50,7 @@ extern int xfs_reflink_unshare(struct xfs_inode *ip, xfs_off_t offset, > > extern bool xfs_reflink_has_real_cow_blocks(struct xfs_inode *ip); > > +int xfs_file_share_range(struct file *file_in, loff_t pos_in, > + struct file *file_out, loff_t pos_out, u64 len, bool is_dedupe); > + > #endif /* __XFS_REFLINK_H */ > -- > 2.1.4 > ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [PATCH 4/4] xfs: merge xfs_reflink_remap_range and xfs_file_share_range 2016-10-17 21:29 ` Darrick J. Wong @ 2016-10-18 5:17 ` Christoph Hellwig 0 siblings, 0 replies; 11+ messages in thread From: Christoph Hellwig @ 2016-10-18 5:17 UTC (permalink / raw) To: Darrick J. Wong; +Cc: Christoph Hellwig, linux-xfs On Mon, Oct 17, 2016 at 02:29:33PM -0700, Darrick J. Wong wrote: > > + bs = inode->i_sb->s_blocksize; > > + inode_dio_wait(inode); > > + > > + rounding = max_t(xfs_off_t, bs, PAGE_SIZE); > > + ioffset = round_down(offset, rounding); > > + iendoffset = round_up(offset + len, rounding) - 1; > > + ret = filemap_write_and_wait_range(inode->i_mapping, ioffset, > > + iendoffset); > > + return ret; > > +} > > This seems like a file action not specific to reflink. But it's only used by the reflink code :) That being said given that filemap_write_and_wait_range operates on pages there is no need for the rounding anyway, and we could just replace it with open coded calls to inode_dio_wait and filemap_write_and_wait_range. Maybe I should do that before this patch so we don't have to bother moving it. > > +xfs_file_share_range( > > xfs_reflink_share_file_range() ? > > I'd like to maintain the convention that all reflink functions > start with xfs_reflink_*, particularly since the xfs_file_* functions > largely live in xfs_file.c. Ok, fine. > > + if (is_dedupe) { > > + bool is_same = false; > > + > > + ret = xfs_compare_extents(inode_in, pos_in, inode_out, pos_out, > > + len, &is_same); > > + if (ret) > > + goto out_unlock;; > > Double-semicolon here. I'll fix it. ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: fix locking for the reflink operation 2016-10-17 12:05 fix locking for the reflink operation Christoph Hellwig ` (3 preceding siblings ...) 2016-10-17 12:05 ` [PATCH 4/4] xfs: merge xfs_reflink_remap_range and xfs_file_share_range Christoph Hellwig @ 2016-10-17 21:35 ` Darrick J. Wong 4 siblings, 0 replies; 11+ messages in thread From: Darrick J. Wong @ 2016-10-17 21:35 UTC (permalink / raw) To: Christoph Hellwig; +Cc: linux-xfs On Mon, Oct 17, 2016 at 02:05:16PM +0200, Christoph Hellwig wrote: > When creating a reflink we need to take the iolock much earlier, as > various early checks done in xfs_file_share_range currently are racy > without it. Patches 1-3 sort that out in a minimal invasive way, > but I think we should just merge xfs_file_share_range and > xfs_reflink_remap_range, which is what patch 4 does. > > Patches 1-3 are something I'd like to see in 4.9, patch 4 might not > fully qualify, but just getting it in might make everyones life easier. This series (+ the CoW optimization series before it) seem to run ok here. I'm ok with (more soak testing and) sending it in for 4.9. --D > > -- > To unsubscribe from this list: send the line "unsubscribe linux-xfs" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html ^ permalink raw reply [flat|nested] 11+ messages in thread
end of thread, other threads:[~2016-10-18 5:18 UTC | newest] Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2016-10-17 12:05 fix locking for the reflink operation Christoph Hellwig 2016-10-17 12:05 ` [PATCH 1/4] xfs: remove the same fs check from xfs_file_share_range Christoph Hellwig 2016-10-17 21:11 ` Darrick J. Wong 2016-10-17 12:05 ` [PATCH 2/4] xfs: fix the same_inode check in xfs_file_share_range Christoph Hellwig 2016-10-17 21:12 ` Darrick J. Wong 2016-10-17 12:05 ` [PATCH 3/4] xfs: move inode locking from xfs_reflink_remap_range to xfs_file_share_range Christoph Hellwig 2016-10-17 21:16 ` Darrick J. Wong 2016-10-17 12:05 ` [PATCH 4/4] xfs: merge xfs_reflink_remap_range and xfs_file_share_range Christoph Hellwig 2016-10-17 21:29 ` Darrick J. Wong 2016-10-18 5:17 ` Christoph Hellwig 2016-10-17 21:35 ` fix locking for the reflink operation Darrick J. Wong
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.