All of lore.kernel.org
 help / color / mirror / Atom feed
From: Zhihao Cheng <chengzhihao1@huawei.com>
To: <tytso@mit.edu>, <adilger.kernel@dilger.ca>, <jack@suse.com>,
	<tudor.ambarus@linaro.org>
Cc: <linux-ext4@vger.kernel.org>, <linux-kernel@vger.kernel.org>,
	<chengzhihao1@huawei.com>, <yi.zhang@huawei.com>
Subject: [PATCH] ext4: Fix i_disksize exceeding i_size problem in paritally written case
Date: Fri, 17 Mar 2023 09:35:53 +0800	[thread overview]
Message-ID: <20230317013553.1009553-1-chengzhihao1@huawei.com> (raw)

Following process makes i_disksize exceed i_size:

generic_perform_write
 copied = iov_iter_copy_from_user_atomic(len) // copied < len
 ext4_da_write_end
 | ext4_update_i_disksize
 |  new_i_size = pos + copied;
 |  WRITE_ONCE(EXT4_I(inode)->i_disksize, newsize) // update i_disksize
 | generic_write_end
 |  copied = block_write_end(copied, len) // copied = 0
 |   if (unlikely(copied < len))
 |    if (!PageUptodate(page))
 |     copied = 0;
 |  if (pos + copied > inode->i_size) // return false
 if (unlikely(copied == 0))
  goto again;
 if (unlikely(iov_iter_fault_in_readable(i, bytes))) {
  status = -EFAULT;
  break;
 }

We get i_disksize greater than i_size here, which could trigger WARNING
check 'i_size_read(inode) < EXT4_I(inode)->i_disksize' while doing dio:

ext4_dio_write_iter
 iomap_dio_rw
  __iomap_dio_rw // return err, length is not aligned to 512
 ext4_handle_inode_extension
  WARN_ON_ONCE(i_size_read(inode) < EXT4_I(inode)->i_disksize) // Oops

 WARNING: CPU: 2 PID: 2609 at fs/ext4/file.c:319
 CPU: 2 PID: 2609 Comm: aa Not tainted 6.3.0-rc2
 RIP: 0010:ext4_file_write_iter+0xbc7
 Call Trace:
  vfs_write+0x3b1
  ksys_write+0x77
  do_syscall_64+0x39

Fix it by putting block_write_end() before i_disksize updating just
like ext4_write_end() does.

Fetch a reproducer in [Link].

Link: https://bugzilla.kernel.org/show_bug.cgi?id=217209
Fixes: 64769240bd07f ("ext4: Add delayed allocation support in data=writeback mode")
Signed-off-by: Zhihao Cheng <chengzhihao1@huawei.com>
---
 fs/ext4/inode.c | 32 +++++++++++++++++++++++++-------
 1 file changed, 25 insertions(+), 7 deletions(-)

diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index bf0b7dea4900..577dc23f3b78 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -3136,6 +3136,8 @@ static int ext4_da_write_end(struct file *file,
 	loff_t new_i_size;
 	unsigned long start, end;
 	int write_mode = (int)(unsigned long)fsdata;
+	bool i_size_changed = false;
+	loff_t old_size = inode->i_size;
 
 	if (write_mode == FALL_BACK_TO_NONDELALLOC)
 		return ext4_write_end(file, mapping, pos,
@@ -3148,6 +3150,8 @@ static int ext4_da_write_end(struct file *file,
 	    ext4_has_inline_data(inode))
 		return ext4_write_inline_data_end(inode, pos, len, copied, page);
 
+	copied = block_write_end(file, mapping, pos, len, copied, page, fsdata);
+
 	start = pos & (PAGE_SIZE - 1);
 	end = start + copied - 1;
 
@@ -3162,16 +3166,30 @@ static int ext4_da_write_end(struct file *file,
 	 * check), we need to update i_disksize here as neither
 	 * ext4_writepage() nor certain ext4_writepages() paths not
 	 * allocating blocks update i_disksize.
-	 *
-	 * Note that we defer inode dirtying to generic_write_end() /
-	 * ext4_da_write_inline_data_end().
 	 */
 	new_i_size = pos + copied;
-	if (copied && new_i_size > inode->i_size &&
-	    ext4_da_should_update_i_disksize(page, end))
-		ext4_update_i_disksize(inode, new_i_size);
+	if (new_i_size > inode->i_size) {
+		i_size_write(inode, new_i_size);
+		i_size_changed = true;
+		if (copied && ext4_da_should_update_i_disksize(page, end))
+			ext4_update_i_disksize(inode, new_i_size);
+	}
+
+	unlock_page(page);
+	put_page(page);
+
+	if (old_size < pos)
+		pagecache_isize_extended(inode, old_size, pos);
+	/*
+	 * Don't mark the inode dirty under page lock. First, it unnecessarily
+	 * makes the holding time of page lock longer. Second, it forces lock
+	 * ordering of page lock and transaction start for journaling
+	 * filesystems.
+	 */
+	if (i_size_changed)
+		mark_inode_dirty(inode);
 
-	return generic_write_end(file, mapping, pos, len, copied, page, fsdata);
+	return copied;
 }
 
 /*
-- 
2.31.1


             reply	other threads:[~2023-03-17  1:36 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-03-17  1:35 Zhihao Cheng [this message]
2023-03-17 12:45 ` [PATCH] ext4: Fix i_disksize exceeding i_size problem in paritally written case Jan Kara
2023-03-18  7:03   ` Zhihao Cheng
2023-03-18  8:51     ` Zhihao Cheng
2023-03-20 11:20       ` Jan Kara
2023-03-20 12:49         ` Zhihao Cheng
2023-03-20 16:24           ` Jan Kara
2023-03-21  1:29             ` Zhihao Cheng

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20230317013553.1009553-1-chengzhihao1@huawei.com \
    --to=chengzhihao1@huawei.com \
    --cc=adilger.kernel@dilger.ca \
    --cc=jack@suse.com \
    --cc=linux-ext4@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=tudor.ambarus@linaro.org \
    --cc=tytso@mit.edu \
    --cc=yi.zhang@huawei.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.