From: "Matthew Wilcox (Oracle)" <willy@infradead.org>
To: linux-mm@kvack.org, linux-fsdevel@vger.kernel.org
Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org>
Subject: [PATCH 13/19] mm/filemap: Support readpage splitting a page
Date: Thu, 29 Oct 2020 19:33:59 +0000
Message-ID: <20201029193405.29125-14-willy@infradead.org>
In-Reply-To: <20201029193405.29125-1-willy@infradead.org>

We need to tell readpage which subpage we're actually interested in
(by passing the subpage to gfbr_read_page()), and if it does split the
THP, we need to update the page in the page array to be the subpage.
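
As a rough illustration of the subpage arithmetic involved, here is a
userspace sketch. It is not kernel code: the sketch_* names and the
struct are hypothetical stand-ins, the page size is assumed to be 4KiB,
and the THP is assumed to be naturally aligned in the file so that
masking the file position yields the offset within the THP.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed base page size for this sketch (4KiB). */
#define SKETCH_PAGE_SHIFT 12
#define SKETCH_PAGE_SIZE  ((uint64_t)1 << SKETCH_PAGE_SHIFT)

/* Hypothetical stand-in for the head page of a THP in the page cache. */
struct sketch_thp {
	uint64_t head_index;	/* page cache index of the head page */
	unsigned int order;	/* the THP covers 1 << order base pages */
};

/* Bytes covered by the THP, in the spirit of thp_size(). */
static uint64_t sketch_thp_size(const struct sketch_thp *thp)
{
	return SKETCH_PAGE_SIZE << thp->order;
}

/*
 * Which subpage holds file position pos.  Mirrors the patch's
 * "page += (pos / PAGE_SIZE) - page->index".
 */
static uint64_t sketch_subpage_nr(const struct sketch_thp *thp, uint64_t pos)
{
	uint64_t index = pos / SKETCH_PAGE_SIZE;

	/* pos must fall inside the range this THP covers. */
	assert(index >= thp->head_index);
	assert(index - thp->head_index < ((uint64_t)1 << thp->order));
	return index - thp->head_index;
}

/*
 * Byte offset of pos within the THP.  Mirrors the patch's
 * "pos & (thp_size(page) - 1)"; relies on natural alignment.
 */
static uint64_t sketch_offset_in_thp(const struct sketch_thp *thp, uint64_t pos)
{
	return pos & (sketch_thp_size(thp) - 1);
}
```

For an order-2 THP whose head sits at index 16, a position 100 bytes
into the third base page would select subpage 2 and an in-THP offset of
2 * 4096 + 100 bytes.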

For page splitting to succeed, the thread asking to split the page has
to be the only one holding a reference to it.  If enough threads call
wait_on_page_locked() on the same page while each holds a reference,
the split will effectively never succeed.  Use
put_and_wait_on_page_locked() to sleep without holding a reference to
the page, then retry the page lookup after the page is unlocked.
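
The reference-count argument above can be modelled with a toy sketch.
This is a deliberately simplified userspace model, not the kernel's
actual split accounting: the struct and every sketch_* function are
hypothetical, and the "sole reference" rule is taken directly from the
description above.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for struct page: only the fields this sketch needs. */
struct sketch_page {
	int refcount;	/* references currently held */
	bool locked;	/* page lock held by the splitting thread */
};

/* A lookup takes a reference, as a page cache lookup would. */
static void sketch_lookup(struct sketch_page *page)
{
	page->refcount++;
}

/* Old behaviour: sleep on the page lock but keep the reference. */
static void sketch_wait_holding_ref(struct sketch_page *page)
{
	(void)page;	/* the reference stays pinned while the waiter sleeps */
}

/*
 * New behaviour: drop the reference before sleeping.  The caller must
 * redo the lookup once the page is unlocked.
 */
static void sketch_put_and_wait(struct sketch_page *page)
{
	assert(page->refcount > 0);
	page->refcount--;
}

/* Simplified rule: split only if the splitter holds the sole reference. */
static bool sketch_can_split(const struct sketch_page *page)
{
	return page->locked && page->refcount == 1;
}
```

In this model a single waiter that keeps its reference is enough to
block the split, while a waiter that drops its reference first leaves
the splitter as the only holder.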

Since we now get the page lock a little earlier in gfbr_update_page(),
we can eliminate a number of duplicate checks.  The original intent
(commit ebded02788b5 ("avoid unnecessary calls to lock_page when waiting
for IO to complete during a read")) behind getting the page lock later
was to avoid re-locking the page after it has been brought uptodate by
another thread.  We will still avoid that because we go through the normal
lookup path again after the winning thread has brought the page uptodate.

Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
---
 mm/filemap.c | 76 +++++++++++++++++-----------------------------------
 1 file changed, 24 insertions(+), 52 deletions(-)

diff --git a/mm/filemap.c b/mm/filemap.c
index 215729048cbd..87f89e5dd64e 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -1358,14 +1358,6 @@ static int __wait_on_page_locked_async(struct page *page,
 	return ret;
 }
 
-static int wait_on_page_locked_async(struct page *page,
-				     struct wait_page_queue *wait)
-{
-	if (!PageLocked(page))
-		return 0;
-	return __wait_on_page_locked_async(compound_head(page), wait, false);
-}
-
 /**
  * put_and_wait_on_page_locked - Drop a reference and wait for it to be unlocked
  * @page: The page to wait for.
@@ -2259,6 +2251,7 @@ static struct page *gfbr_read_page(struct kiocb *iocb,
 		return error != AOP_TRUNCATED_PAGE ? ERR_PTR(error) : NULL;
 	}
 
+	page = thp_head(page);
 	if (!PageUptodate(page)) {
 		error = lock_page_for_iocb(iocb, page);
 		if (unlikely(error)) {
@@ -2292,64 +2285,42 @@ static struct page *gfbr_update_page(struct kiocb *iocb,
 	struct inode *inode = mapping->host;
 	int error;
 
-	/*
-	 * See comment in do_read_cache_page on why
-	 * wait_on_page_locked is used to avoid unnecessarily
-	 * serialisations and why it's safe.
-	 */
 	if (iocb->ki_flags & IOCB_WAITQ) {
-		error = wait_on_page_locked_async(page,
-						iocb->ki_waitq);
-	} else {
-		error = wait_on_page_locked_killable(page);
-	}
-	if (unlikely(error)) {
-		put_page(page);
-		return ERR_PTR(error);
+		error = lock_page_async(page, iocb->ki_waitq);
+		if (error) {
+			put_page(page);
+			return ERR_PTR(error);
+		}
+	} else if (!trylock_page(page)) {
+		put_and_wait_on_page_locked(page, TASK_KILLABLE);
+		return NULL;
 	}
+
 	if (PageUptodate(page))
-		return page;
+		goto uptodate;
 
 	if (inode->i_blkbits == PAGE_SHIFT ||
 			!mapping->a_ops->is_partially_uptodate)
-		goto page_not_up_to_date;
+		goto readpage;
 	/* pipes can't handle partially uptodate pages */
 	if (unlikely(iov_iter_is_pipe(iter)))
-		goto page_not_up_to_date;
-	if (!trylock_page(page))
-		goto page_not_up_to_date;
-	/* Did it get truncated before we got the lock? */
+		goto readpage;
 	if (!page->mapping)
-		goto page_not_up_to_date_locked;
+		goto truncated;
 	if (!mapping->a_ops->is_partially_uptodate(page,
-				pos & ~PAGE_MASK, count))
-		goto page_not_up_to_date_locked;
+				pos & (thp_size(page) - 1), count))
+		goto readpage;
+uptodate:
 	unlock_page(page);
 	return page;
 
-page_not_up_to_date:
-	/* Get exclusive access to the page ... */
-	error = lock_page_for_iocb(iocb, page);
-	if (unlikely(error)) {
-		put_page(page);
-		return ERR_PTR(error);
-	}
-
-page_not_up_to_date_locked:
-	/* Did it get truncated before we got the lock? */
-	if (!page->mapping) {
-		unlock_page(page);
-		put_page(page);
-		return NULL;
-	}
-
-	/* Did somebody else fill it already? */
-	if (PageUptodate(page)) {
-		unlock_page(page);
-		return page;
-	}
-
+readpage:
+	page += (pos / PAGE_SIZE) - page->index;
 	return gfbr_read_page(iocb, mapping, page);
+truncated:
+	unlock_page(page);
+	put_page(page);
+	return NULL;
 }
 
 static struct page *gfbr_create_page(struct kiocb *iocb,
@@ -2443,6 +2414,7 @@ static int gfbr_get_pages(struct kiocb *iocb, struct iov_iter *iter,
 				err = PTR_ERR_OR_ZERO(page);
 				break;
 			}
+			pages[i] = page;
 		}
 	}
 
-- 
2.28.0


