From: Jeff Layton <jlayton@kernel.org>
To: ceph-devel@vger.kernel.org
Cc: linux-cachefs@redhat.com, pfmeec@rit.edu, willy@infradead.org,
	dhowells@redhat.com, idryomov@gmail.com, stable@vger.kernel.org,
	Andrew W Elble <aweits@rit.edu>
Subject: [PATCH v3] ceph: fix write_begin optimization when write is beyond EOF
Date: Sat, 12 Jun 2021 14:35:31 -0400
Message-ID: <20210612183531.17074-1-jlayton@kernel.org>
In-Reply-To: <YMS4TOw8txQQ7VGr@casper.infradead.org>

It's not sufficient to skip reading when the pos is beyond the EOF.
There may be data at the head of the page that we need to fill in
before the write.

Add a new helper function that corrects and clarifies the logic.

Cc: <stable@vger.kernel.org> # v5.10+
Fixes: 1cc1699070bd ("ceph: fold ceph_update_writeable_page into ceph_write_begin")
Reported-by: Andrew W Elble <aweits@rit.edu>
Signed-off-by: Jeff Layton <jlayton@kernel.org>
---
 fs/ceph/addr.c | 60 +++++++++++++++++++++++++++++++++++++++-----------
 1 file changed, 47 insertions(+), 13 deletions(-)

Willy pointed out that I had missed the i_size == 0 case in my earlier
patch. Also, the whole condition was getting a bit messy. This factors
it out into a new helper (and we can maybe copy this helper into netfs
code).
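
For illustration only, here is a rough userspace sketch of the two
decision paths, the old inline condition and the new helper's checks.
It is not part of the patch. MODEL_PAGE_SIZE and both function names
are hypothetical, the page size is assumed to be 4096, and the zeroing
step is left out so that only the "can we skip the read?" decision is
modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MODEL_PAGE_SIZE 4096u	/* assumed page size, illustration only */

/* Old inline test from ceph_write_begin(): true means "skip the read". */
static bool old_can_skip_read(int64_t pos, unsigned int len, int64_t i_size)
{
	unsigned int pos_in_page = (unsigned int)(pos % MODEL_PAGE_SIZE);

	assert(pos >= 0 && i_size >= 0);
	return (pos_in_page == 0 && len == MODEL_PAGE_SIZE) ||
	       pos >= i_size ||
	       (pos_in_page == 0 && pos + len >= i_size);
}

/* New test, following the order of the checks in prep_noread_page(). */
static bool new_can_skip_read(int64_t pos, unsigned int len, int64_t i_size)
{
	int64_t index = pos / MODEL_PAGE_SIZE;
	unsigned int pos_in_page = (unsigned int)(pos % MODEL_PAGE_SIZE);

	assert(pos >= 0 && i_size >= 0);

	/* full page write */
	if (pos_in_page == 0 && len == MODEL_PAGE_SIZE)
		return true;
	/* zero-length file */
	if (i_size == 0)
		return true;
	/* page lies completely beyond the last page of the file */
	if (index > (i_size - 1) / MODEL_PAGE_SIZE)
		return true;
	/* write covers the page from its start to EOF or beyond */
	if (pos_in_page == 0 && pos + len >= i_size)
		return true;
	return false;
}
```

With a 100-byte file and a 50-byte write at pos 200, the old test
skips the read because pos >= i_size, even though bytes 0-99 of the
same page hold file data. The new test returns false for that case,
so the page is read before the write.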

diff --git a/fs/ceph/addr.c b/fs/ceph/addr.c
index 26e66436f005..ba53e9a3f0c1 100644
--- a/fs/ceph/addr.c
+++ b/fs/ceph/addr.c
@@ -1302,6 +1302,51 @@ ceph_find_incompatible(struct page *page)
 	return NULL;
 }
 
+/**
+ * prep_noread_page - prep a page for writing without reading first
+ * @page: page being prepared
+ * @pos: starting position for the write
+ * @len: length of write
+ *
+ * In some cases we don't need to read at all:
+ * - full page write
+ * - file is currently zero-length
+ * - write that lies in a page that is completely beyond EOF
+ * - write that covers the page from start to EOF or beyond it
+ *
+ * If any of these criteria are met, then zero out the unwritten parts
+ * of the page and return true. Otherwise, return false.
+ */
+static bool prep_noread_page(struct page *page, loff_t pos, unsigned int len)
+{
+	struct inode *inode = page->mapping->host;
+	loff_t i_size = i_size_read(inode);
+	pgoff_t index = pos / PAGE_SIZE;
+	int pos_in_page = pos & ~PAGE_MASK;
+
+	/* full page write */
+	if (pos_in_page == 0 && len == PAGE_SIZE)
+		goto zero_out;
+
+	/* zero-length file */
+	if (i_size == 0)
+		goto zero_out;
+
+	/* position beyond last page in the file */
+	if (index > ((i_size - 1) / PAGE_SIZE))
+		goto zero_out;
+
+	/* write that covers the page from start to EOF or beyond it */
+	if (pos_in_page == 0 && (pos + len) >= i_size)
+		goto zero_out;
+
+	return false;
+zero_out:
+	zero_user_segments(page, 0, pos_in_page,
+			   pos_in_page + len, PAGE_SIZE);
+	return true;
+}
+
 /*
  * We are only allowed to write into/dirty the page if the page is
  * clean, or already dirty within the same snap context.
@@ -1315,7 +1360,6 @@ static int ceph_write_begin(struct file *file, struct address_space *mapping,
 	struct ceph_snap_context *snapc;
 	struct page *page = NULL;
 	pgoff_t index = pos >> PAGE_SHIFT;
-	int pos_in_page = pos & ~PAGE_MASK;
 	int r = 0;
 
 	dout("write_begin file %p inode %p page %p %d~%d\n", file, inode, page, (int)pos, (int)len);
@@ -1350,19 +1394,9 @@ static int ceph_write_begin(struct file *file, struct address_space *mapping,
 			break;
 		}
 
-		/*
-		 * In some cases we don't need to read at all:
-		 * - full page write
-		 * - write that lies completely beyond EOF
-		 * - write that covers the the page from start to EOF or beyond it
-		 */
-		if ((pos_in_page == 0 && len == PAGE_SIZE) ||
-		    (pos >= i_size_read(inode)) ||
-		    (pos_in_page == 0 && (pos + len) >= i_size_read(inode))) {
-			zero_user_segments(page, 0, pos_in_page,
-					   pos_in_page + len, PAGE_SIZE);
+		/* No need to read in some cases */
+		if (prep_noread_page(page, pos, len))
 			break;
-		}
 
 		/*
 		 * We need to read it. If we get back -EINPROGRESS, then the page was
-- 
2.31.1
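
As a reading aid for the zero_out path above, the following userspace
sketch models what the zero_user_segments() call does to the page:
everything outside the bytes about to be written is cleared. The
model_* names and MODEL_PAGE_SIZE are hypothetical stand-ins operating
on a plain buffer, not kernel APIs, and a 4096-byte page is assumed.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MODEL_PAGE_SIZE 4096u	/* assumed page size, illustration only */

/*
 * Hypothetical stand-in for zero_user_segments(): zero the byte ranges
 * [start1, end1) and [start2, end2) of a page-sized buffer.
 */
static void model_zero_segments(unsigned char *page,
				size_t start1, size_t end1,
				size_t start2, size_t end2)
{
	assert(start1 <= end1 && end1 <= MODEL_PAGE_SIZE);
	assert(start2 <= end2 && end2 <= MODEL_PAGE_SIZE);
	memset(page + start1, 0, end1 - start1);
	memset(page + start2, 0, end2 - start2);
}

/* Zero the head and tail of the page around a write of len bytes. */
static void model_zero_around_write(unsigned char *page,
				    size_t pos_in_page, size_t len)
{
	model_zero_segments(page, 0, pos_in_page,
			    pos_in_page + len, MODEL_PAGE_SIZE);
}
```

This shows why taking the no-read path wrongly is destructive: any
file data in the head of the page is overwritten with zeroes rather
than being read in first.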

