From: Andrew Morton <akpm@linux-foundation.org>
To: akpm@linux-foundation.org, gechangwei@live.cn, ghe@suse.com,
	jack@suse.cz, jlbec@evilplan.org, joseph.qi@linux.alibaba.com,
	junxiao.bi@oracle.com, linux-mm@kvack.org, mark@fasheh.com,
	mm-commits@vger.kernel.org, piaojun@huawei.com,
	stable@vger.kernel.org, torvalds@linux-foundation.org
Subject: [patch 12/13] ocfs2: fix data corruption by fallocate
Date: Fri, 04 Jun 2021 20:01:42 -0700
Message-ID: <20210605030142.pXgHl9E4K%akpm@linux-foundation.org>
In-Reply-To: <20210604200040.d8d0406caf195525620c0f3d@linux-foundation.org>

From: Junxiao Bi <junxiao.bi@oracle.com>
Subject: ocfs2: fix data corruption by fallocate

When fallocate punches a hole beyond the inode size and the original isize
falls in the middle of the last cluster, the range from isize to the end of
that cluster is zeroed with a buffered write.  At that point isize has not
yet been updated to match the new size, so if writeback kicks in it invokes
ocfs2_writepage()->block_write_full_page(), which drops the pages beyond
the inode size.  That causes file corruption.  Fix this by zeroing out the
EOF blocks when extending the inode size.
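
To make the affected range concrete, here is a minimal userspace sketch of
the arithmetic, not the kernel code itself.  It assumes power-of-two block
and cluster sizes; the struct and function names are hypothetical and only
mirror the rounding used by the helper added in the patch below.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical result type: the run of blocks that would be zeroed. */
struct eof_zero_blocks {
	uint64_t start_block;	/* first block at or after the old isize */
	uint64_t nr_blocks;	/* number of blocks to zero, may be 0 */
};

/* Round x up to a multiple of a; a must be a power of two. */
static uint64_t round_up_pow2(uint64_t x, uint64_t a)
{
	return (x + a - 1) & ~(a - 1);
}

/*
 * Compute which blocks lie between the old isize and the end of its
 * cluster (or the new size, whichever comes first).  The start is
 * rounded up to a block boundary, as in the patch.
 */
static struct eof_zero_blocks
compute_eof_zero_blocks(uint64_t orig_isize, uint64_t new_size,
			uint64_t block_size, uint64_t cluster_size)
{
	struct eof_zero_blocks r;
	uint64_t end, end_block;

	assert(block_size && (block_size & (block_size - 1)) == 0);
	assert(cluster_size && (cluster_size & (cluster_size - 1)) == 0);
	assert(cluster_size >= block_size);
	assert(new_size >= orig_isize);

	/* End of the cluster containing orig_isize, trimmed to new_size. */
	end = round_up_pow2(orig_isize, cluster_size);
	if (new_size < end)
		end = new_size;

	r.start_block = round_up_pow2(orig_isize, block_size) / block_size;
	end_block = round_up_pow2(end, block_size) / block_size;
	r.nr_blocks = end_block - r.start_block;
	return r;
}
```

With 4096-byte blocks and 1 MiB clusters (illustrative values), an old
isize of 1000000 extended to 3000000 yields blocks 245 through 255; an
isize that is already cluster aligned yields no blocks at all.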

Running the following command with qemu-img 4.2.1 easily produces a
corrupted converted image file.

    qemu-img convert -p -t none -T none -f qcow2 $qcow_image \
             -O qcow2 -o compat=1.1 $qcow_image.conv

qemu uses fallocate as follows: it first punches a hole beyond the inode
size, then extends the inode size.

    fallocate(11, FALLOC_FL_KEEP_SIZE|FALLOC_FL_PUNCH_HOLE, 2276196352, 65536) = 0
    fallocate(11, 0, 2276196352, 65536) = 0
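
The same two-call sequence can be issued from a small C helper, which may
be handy when trying to reproduce the pattern without qemu.  This is only a
sketch: the function name and the size out-parameter are hypothetical, and
it does not by itself trigger writeback or check the file contents.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE		/* for fallocate() and FALLOC_FL_* */
#endif
#define _FILE_OFFSET_BITS 64	/* offsets above 2 GiB on 32-bit builds */

#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Punch a hole at [offset, offset + len) without changing the file size,
 * then allocate the same range so the file size grows to cover it.
 * Returns 0 and stores the resulting size in *new_size, or -errno.
 */
static int punch_then_extend(int fd, off_t offset, off_t len, off_t *new_size)
{
	struct stat st;

	/* Step 1: hole punch; KEEP_SIZE is mandatory with PUNCH_HOLE. */
	if (fallocate(fd, FALLOC_FL_KEEP_SIZE | FALLOC_FL_PUNCH_HOLE,
		      offset, len) < 0)
		return -errno;

	/* Step 2: mode 0 allocates the range and extends the file size. */
	if (fallocate(fd, 0, offset, len) < 0)
		return -errno;

	if (fstat(fd, &st) < 0)
		return -errno;
	*new_size = st.st_size;
	return 0;
}
```

Calling it with the values from the trace, punch_then_extend(fd,
2276196352, 65536, &size), repeats the two calls shown above.  Filesystems
without hole-punch support return -EOPNOTSUPP from the first step.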

v1: https://www.spinics.net/lists/linux-fsdevel/msg193999.html
v2: https://lore.kernel.org/linux-fsdevel/20210525093034.GB4112@quack2.suse.cz/T/

Link: https://lkml.kernel.org/r/20210528210648.9124-1-junxiao.bi@oracle.com
Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com>
Reviewed-by: Joseph Qi <joseph.qi@linux.alibaba.com>
Cc: Jan Kara <jack@suse.cz>
Cc: Mark Fasheh <mark@fasheh.com>
Cc: Joel Becker <jlbec@evilplan.org>
Cc: Changwei Ge <gechangwei@live.cn>
Cc: Gang He <ghe@suse.com>
Cc: Jun Piao <piaojun@huawei.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---

 fs/ocfs2/file.c |   55 +++++++++++++++++++++++++++++++++++++++++-----
 1 file changed, 50 insertions(+), 5 deletions(-)

--- a/fs/ocfs2/file.c~ocfs2-fix-data-corruption-by-fallocate
+++ a/fs/ocfs2/file.c
@@ -1856,6 +1856,45 @@ out:
 }
 
 /*
+ * zero out partial blocks of one cluster.
+ *
+ * start: file offset where zero starts, will be made upper block aligned.
+ * len: it will be trimmed to the end of current cluster if "start + len"
+ *      is bigger than it.
+ */
+static int ocfs2_zeroout_partial_cluster(struct inode *inode,
+					u64 start, u64 len)
+{
+	int ret;
+	u64 start_block, end_block, nr_blocks;
+	u64 p_block, offset;
+	u32 cluster, p_cluster, nr_clusters;
+	struct super_block *sb = inode->i_sb;
+	u64 end = ocfs2_align_bytes_to_clusters(sb, start);
+
+	if (start + len < end)
+		end = start + len;
+
+	start_block = ocfs2_blocks_for_bytes(sb, start);
+	end_block = ocfs2_blocks_for_bytes(sb, end);
+	nr_blocks = end_block - start_block;
+	if (!nr_blocks)
+		return 0;
+
+	cluster = ocfs2_bytes_to_clusters(sb, start);
+	ret = ocfs2_get_clusters(inode, cluster, &p_cluster,
+				&nr_clusters, NULL);
+	if (ret)
+		return ret;
+	if (!p_cluster)
+		return 0;
+
+	offset = start_block - ocfs2_clusters_to_blocks(sb, cluster);
+	p_block = ocfs2_clusters_to_blocks(sb, p_cluster) + offset;
+	return sb_issue_zeroout(sb, p_block, nr_blocks, GFP_NOFS);
+}
+
+/*
  * Parts of this function taken from xfs_change_file_space()
  */
 static int __ocfs2_change_file_space(struct file *file, struct inode *inode,
@@ -1865,7 +1904,7 @@ static int __ocfs2_change_file_space(str
 {
 	int ret;
 	s64 llen;
-	loff_t size;
+	loff_t size, orig_isize;
 	struct ocfs2_super *osb = OCFS2_SB(inode->i_sb);
 	struct buffer_head *di_bh = NULL;
 	handle_t *handle;
@@ -1896,6 +1935,7 @@ static int __ocfs2_change_file_space(str
 		goto out_inode_unlock;
 	}
 
+	orig_isize = i_size_read(inode);
 	switch (sr->l_whence) {
 	case 0: /*SEEK_SET*/
 		break;
@@ -1903,7 +1943,7 @@ static int __ocfs2_change_file_space(str
 		sr->l_start += f_pos;
 		break;
 	case 2: /*SEEK_END*/
-		sr->l_start += i_size_read(inode);
+		sr->l_start += orig_isize;
 		break;
 	default:
 		ret = -EINVAL;
@@ -1957,6 +1997,14 @@ static int __ocfs2_change_file_space(str
 	default:
 		ret = -EINVAL;
 	}
+
+	/* zeroout eof blocks in the cluster. */
+	if (!ret && change_size && orig_isize < size) {
+		ret = ocfs2_zeroout_partial_cluster(inode, orig_isize,
+					size - orig_isize);
+		if (!ret)
+			i_size_write(inode, size);
+	}
 	up_write(&OCFS2_I(inode)->ip_alloc_sem);
 	if (ret) {
 		mlog_errno(ret);
@@ -1973,9 +2021,6 @@ static int __ocfs2_change_file_space(str
 		goto out_inode_unlock;
 	}
 
-	if (change_size && i_size_read(inode) < size)
-		i_size_write(inode, size);
-
 	inode->i_ctime = inode->i_mtime = current_time(inode);
 	ret = ocfs2_mark_inode_dirty(handle, inode, di_bh);
 	if (ret < 0)
_

