From: Junxiao Bi <junxiao.bi@oracle.com>
To: ocfs2-devel@oss.oracle.com, cluster-devel@redhat.com,
	linux-fsdevel@vger.kernel.org
Cc: junxiao.bi@oracle.com
Subject: [PATCH 2/3] ocfs2: allow writing back pages out of inode size
Date: Mon, 26 Apr 2021 15:05:51 -0700
Message-ID: <20210426220552.45413-2-junxiao.bi@oracle.com>
In-Reply-To: <20210426220552.45413-1-junxiao.bi@oracle.com>

When fallocate/truncate extends the inode size and the original i_size
falls in the middle of the last cluster, the range from i_size to the
end of that cluster must be zeroed with a buffered write. At that point
i_size has not yet been updated to the new size, so if writeback kicks
in, it invokes ocfs2_writepage()->block_write_full_page(), which drops
pages that lie beyond the inode size. This causes file corruption.
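
As a rough illustration of why those pages are dropped, the check that
block_write_full_page() applies can be modelled in userspace as below.
This is a sketch, not kernel code: the 4 KiB page size, the enum, and
the classify_page() helper are assumptions made up for this example.

```c
#include <stdint.h>

/* Assumption for this sketch: 4 KiB pages. */
#define SKETCH_PAGE_SHIFT 12
#define SKETCH_PAGE_SIZE  (1ULL << SKETCH_PAGE_SHIFT)

/* Hypothetical labels for where a page sits relative to i_size. */
enum page_pos {
	PAGE_INSIDE_EOF,    /* fully below i_size: written back */
	PAGE_STRADDLES_EOF, /* contains i_size: tail zeroed, then written */
	PAGE_OUTSIDE_EOF    /* fully beyond i_size: dropped without writing */
};

/* Userspace model of the i_size test, same arithmetic as the kernel. */
static enum page_pos classify_page(uint64_t index, uint64_t i_size)
{
	uint64_t end_index = i_size >> SKETCH_PAGE_SHIFT;
	uint64_t offset = i_size & (SKETCH_PAGE_SIZE - 1);

	if (index < end_index)
		return PAGE_INSIDE_EOF;
	if (index >= end_index + 1 || !offset)
		return PAGE_OUTSIDE_EOF;
	return PAGE_STRADDLES_EOF;
}
```

With i_size = 6000, page 0 is inside, page 1 straddles i_size, and
page 2 is outside. A page zeroed during an extend while i_size is still
6000 lands in the last category if its index is 2 or higher, so its
contents never reach disk.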

Running the following command with qemu-img 4.2.1 easily produces a
corrupted converted image file.

    qemu-img convert -p -t none -T none -f qcow2 $qcow_image \
             -O qcow2 -o compat=1.1 $qcow_image.conv

Cc: <stable@vger.kernel.org>
Signed-off-by: Junxiao Bi <junxiao.bi@oracle.com>
---
 fs/ocfs2/aops.c | 19 ++++++++++++++++++-
 1 file changed, 18 insertions(+), 1 deletion(-)

diff --git a/fs/ocfs2/aops.c b/fs/ocfs2/aops.c
index ad20403b383f..7a3e3d59f6a9 100644
--- a/fs/ocfs2/aops.c
+++ b/fs/ocfs2/aops.c
@@ -402,11 +402,28 @@ static void ocfs2_readahead(struct readahead_control *rac)
  */
 static int ocfs2_writepage(struct page *page, struct writeback_control *wbc)
 {
+	struct inode * const inode = page->mapping->host;
+	loff_t i_size = i_size_read(inode);
+	const pgoff_t end_index = i_size >> PAGE_SHIFT;
+	unsigned int offset;
+
 	trace_ocfs2_writepage(
 		(unsigned long long)OCFS2_I(page->mapping->host)->ip_blkno,
 		page->index);
 
-	return block_write_full_page(page, ocfs2_get_block, wbc);
+	/*
+	 * The page straddles i_size.  It must be zeroed out on each and every
+	 * writepage invocation because it may be mmapped.  "A file is mapped
+	 * in multiples of the page size.  For a file that is not a multiple of
+	 * the  page size, the remaining memory is zeroed when mapped, and
+	 * writes to that region are not written out to the file."
+	 */
+	offset = i_size & (PAGE_SIZE-1);
+	if (page->index == end_index && offset)
+		zero_user_segment(page, offset, PAGE_SIZE);
+
+	return __block_write_full_page_eof(inode, page, ocfs2_get_block, wbc,
+			end_buffer_async_write, true);
 }
 
 /* Taken from ext3. We don't necessarily need the full blown
-- 
2.24.3 (Apple Git-128)
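
For readers following the arithmetic in the hunk above, the tail
zeroing can be sketched in userspace as follows. It assumes 4 KiB pages
and models zero_user_segment() with memset() on a plain buffer; the
zero_tail_beyond_eof() helper is hypothetical and exists only for this
illustration.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumption for this sketch: 4 KiB pages. */
#define TAIL_PAGE_SHIFT 12
#define TAIL_PAGE_SIZE  ((size_t)1 << TAIL_PAGE_SHIFT)

/*
 * Zero the part of a page buffer that lies beyond i_size, but only when
 * the page is the one containing i_size and i_size is not page aligned.
 * Returns the number of bytes zeroed (0 if the page was left untouched).
 */
static size_t zero_tail_beyond_eof(unsigned char *page, uint64_t index,
				   uint64_t i_size)
{
	uint64_t end_index = i_size >> TAIL_PAGE_SHIFT;
	size_t offset = (size_t)(i_size & (TAIL_PAGE_SIZE - 1));

	if (index != end_index || offset == 0)
		return 0;

	/* Stand-in for zero_user_segment(page, offset, PAGE_SIZE). */
	memset(page + offset, 0, TAIL_PAGE_SIZE - offset);
	return TAIL_PAGE_SIZE - offset;
}
```

With i_size = 6000, only page 1 is modified: bytes 1904 through 4095
are cleared and the first 1904 bytes keep their contents. Pages at any
other index are passed on unchanged, which matches the patched
ocfs2_writepage() no longer skipping pages beyond i_size.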


