From: "Darrick J. Wong" <darrick.wong@oracle.com>
To: david@fromorbit.com, darrick.wong@oracle.com
Cc: sandeen@redhat.com, linux-nfs@vger.kernel.org,
	linux-cifs@vger.kernel.org, linux-unionfs@vger.kernel.org,
	linux-xfs@vger.kernel.org, linux-mm@kvack.org,
	linux-btrfs@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	ocfs2-devel@oss.oracle.com
Subject: [PATCH 05/25] vfs: avoid problematic remapping requests into partial EOF block
Date: Fri, 12 Oct 2018 17:06:17 -0700
Message-ID: <153938917765.8361.15966712047859994604.stgit@magnolia>
In-Reply-To: <153938912912.8361.13446310416406388958.stgit@magnolia>

From: Darrick J. Wong <darrick.wong@oracle.com>

A deduplication data corruption is exposed in XFS and btrfs. It is
caused by extending the block match range to include the partial EOF
block, but then allowing unknown data beyond EOF to be considered a
"match" to data in the destination file because the comparison is only
made to the end of the source file. This corrupts the destination file
when the source extent is shared with it.

The VFS remapping prep functions only support whole block dedupe, but
we still need to appear to support whole file dedupe correctly. Hence
if the dedupe request includes the last block of the source file, don't
include it in the actual dedupe operation. If the rest of the range
dedupes successfully, then reject the entire request. A subsequent
patch will enable us to shorten dedupe requests correctly.

When reflinking sub-file ranges, a data corruption can occur when the
source file range includes a partial EOF block. This shares the unknown
data beyond EOF into the second file at a position inside EOF, exposing
stale data in the second file.

If the reflink request includes the last block of the source file, only
proceed with the reflink operation if it lands at or past the
destination file's current EOF. If it lands within the destination file
EOF, reject the entire request with -EINVAL and make the caller go the
hard way. A subsequent patch will enable us to shorten reflink requests
correctly.

Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
---
 fs/read_write.c |   17 +++++++++++++++++
 1 file changed, 17 insertions(+)

diff --git a/fs/read_write.c b/fs/read_write.c
index d6e8e242a15f..067ff5698e0b 100644
--- a/fs/read_write.c
+++ b/fs/read_write.c
@@ -1723,6 +1723,7 @@ int vfs_clone_file_prep(struct file *file_in, loff_t pos_in,
 {
 	struct inode *inode_in = file_inode(file_in);
 	struct inode *inode_out = file_inode(file_out);
+	u64 blkmask = i_blocksize(inode_in) - 1;
 	bool same_inode = (inode_in == inode_out);
 	int ret;
 
@@ -1785,6 +1786,22 @@ int vfs_clone_file_prep(struct file *file_in, loff_t pos_in,
 			return -EBADE;
 	}
 
+	/* Are we doing a partial EOF block remapping of some kind? */
+	if (*len & blkmask) {
+		/*
+		 * If the dedupe data matches, chop off the partial EOF block
+		 * from the source file so we don't try to dedupe the partial
+		 * EOF block.
+		 *
+		 * If the user is attempting to remap a partial EOF block and
+		 * it's inside the destination EOF then reject it.
+		 */
+		if (is_dedupe)
+			*len &= ~blkmask;
+		else if (pos_out + *len < i_size_read(inode_out))
+			return -EINVAL;
+	}
+
 	return 1;
 }
 EXPORT_SYMBOL(vfs_clone_file_prep);
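To make the arithmetic in the hunk above concrete, here is a minimal userspace sketch of the same check. It is not kernel code: the function name partial_eof_check() and its flat integer parameters are hypothetical stand-ins for the inode fields the patch reads, and the block size is assumed to be a power of two. The patch's comment treats an unaligned length as meaning the range ends in the source's partial EOF block; the sketch takes that as given.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical userspace model of the check added by this patch.
 *
 * blocksize: stands in for i_blocksize(inode_in); assumed power of two
 * pos_out:   destination offset of the remap request
 * len:       request length, may be shortened in place for dedupe
 * dest_size: stands in for i_size_read(inode_out)
 *
 * Returns 1 to proceed, or -EINVAL to reject the request.
 */
int partial_eof_check(uint64_t blocksize, bool is_dedupe,
		      uint64_t pos_out, uint64_t *len, uint64_t dest_size)
{
	uint64_t blkmask;

	/* The mask trick below only works for power-of-two block sizes. */
	assert(blocksize != 0 && (blocksize & (blocksize - 1)) == 0);
	blkmask = blocksize - 1;

	/* A length that is not block aligned ends in a partial block. */
	if (*len & blkmask) {
		if (is_dedupe)
			/* Dedupe: round down so the partial block is skipped. */
			*len &= ~blkmask;
		else if (pos_out + *len < dest_size)
			/* Clone ending inside the destination's EOF: refuse. */
			return -EINVAL;
	}

	return 1;
}
```

With 4096-byte blocks, a dedupe request of 10000 bytes is trimmed to 8192 bytes. A clone of 10000 bytes to offset 0 of a 20000-byte destination is refused with -EINVAL, while the same clone to offset 16384 is allowed because it ends past the destination's EOF.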