From: Matthew Wilcox <willy@infradead.org>
To: "Kirill A. Shutemov" <kirill@shutemov.name>
Cc: linux-mm@kvack.org, Andrew Morton <akpm@linux-foundation.org>,
Huang Ying <ying.huang@intel.com>
Subject: Re: [PATCH 02/11] mm/memory: Remove page fault assumption of compound page size
Date: Wed, 9 Sep 2020 15:50:35 +0100
Message-ID: <20200909145035.GH6583@casper.infradead.org>
In-Reply-To: <20200909142904.acca6gthbffk3jwq@box>
On Wed, Sep 09, 2020 at 05:29:04PM +0300, Kirill A. Shutemov wrote:
> On Tue, Sep 08, 2020 at 08:55:29PM +0100, Matthew Wilcox (Oracle) wrote:
> > A compound page in the page cache will not necessarily be of PMD size,
> > so check explicitly.
> >
> > Signed-off-by: Matthew Wilcox (Oracle) <willy@infradead.org>
> > ---
> > mm/memory.c | 7 ++++---
> > 1 file changed, 4 insertions(+), 3 deletions(-)
> >
> > diff --git a/mm/memory.c b/mm/memory.c
> > index 602f4283122f..4b35b4e71e64 100644
> > --- a/mm/memory.c
> > +++ b/mm/memory.c
> > @@ -3562,13 +3562,14 @@ static vm_fault_t do_set_pmd(struct vm_fault *vmf, struct page *page)
> >  	unsigned long haddr = vmf->address & HPAGE_PMD_MASK;
> >  	pmd_t entry;
> >  	int i;
> > -	vm_fault_t ret;
> > +	vm_fault_t ret = VM_FAULT_FALLBACK;
> >
> >  	if (!transhuge_vma_suitable(vma, haddr))
> > -		return VM_FAULT_FALLBACK;
> > +		return ret;
> >
> > -	ret = VM_FAULT_FALLBACK;
> >  	page = compound_head(page);
> > +	if (page_order(page) != HPAGE_PMD_ORDER)
> > +		return ret;
>
> Maybe also VM_BUG_ON_PAGE(page_order(page) > HPAGE_PMD_ORDER, page)?
> Just in case.

In the patch where I actually start creating THPs, I limit the order to
HPAGE_PMD_ORDER, so we're not going to see this today. At some point
in the future, I can imagine that we allow THPs larger than PMD size,
and what we'd want alloc_set_pte() to look like is:

	if (pud_none(*vmf->pud) && PageTransCompound(page)) {
		ret = do_set_pud(vmf, page);
		if (ret != VM_FAULT_FALLBACK)
			return ret;
	}
	if (pmd_none(*vmf->pmd) && PageTransCompound(page)) {
		ret = do_set_pmd(vmf, page);
		if (ret != VM_FAULT_FALLBACK)
			return ret;
	}

Once we're in that situation, in do_set_pmd(), we'd want to figure out
which sub-page of the >PMD-sized page to insert. But I don't want to
write code for that now.
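
To make the "which sub-page" question concrete, here is a minimal
user-space sketch of the arithmetic. It is not kernel code and not part
of this series. It assumes HPAGE_PMD_ORDER is 9 (4KiB base pages, as on
x86-64) and that the large page is naturally aligned in the file;
pmd_chunk_first_subpage() and its head_index/fault_pgoff parameters are
hypothetical names chosen for illustration.

```c
/* Illustrative user-space model; the value assumes x86-64 with 4KiB pages. */
#define HPAGE_PMD_ORDER	9
#define HPAGE_PMD_NR	(1UL << HPAGE_PMD_ORDER)

/*
 * Hypothetical helper (not a kernel function).
 *
 * head_index:  file index (in PAGE_SIZE units) of the compound page's head,
 *              assumed to be naturally aligned to the compound page size.
 * fault_pgoff: file index of the faulting address; must lie inside the page.
 *
 * Returns the index, relative to the head page, of the first sub-page of
 * the PMD-sized chunk that contains the faulting offset.
 */
static unsigned long pmd_chunk_first_subpage(unsigned long head_index,
					     unsigned long fault_pgoff)
{
	unsigned long subpage = fault_pgoff - head_index;

	/* Round down to a multiple of HPAGE_PMD_NR. */
	return subpage & ~(HPAGE_PMD_NR - 1);
}
```

For an order-10 page whose head sits at file index 1024, a fault at
index 1700 falls in the second PMD-sized chunk, so the helper returns
512 rather than 0.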

So, what's the right approach if somebody does call alloc_set_pte()
with a >PMD sized page? It's not exported, so the only two ways to get
it called with a >PMD sized page are to (1) persuade filemap_map_pages()
to call it, which means putting it in the page cache, or (2) return it
from vm_ops->fault. If someone actually does that (an interesting
device driver, perhaps), I don't think hitting it with a BUG is the
right response. I think it should actually be to map the right PMD-sized
chunk of the page, but we don't even do that today -- we map the first
PMD-sized chunk of the page.

With this patch, we'll simply map the appropriate PAGE_SIZE chunk at the
requested address. So this would be a bugfix for such a demented driver.
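
The effect of the order check can be modelled in user space. The sketch
below is a deliberate simplification, not the kernel's do_set_pmd():
model_do_set_pmd(), enum model_fault and the vma_suitable flag are
made-up stand-ins, and HPAGE_PMD_ORDER is assumed to be 9.

```c
#include <stdbool.h>

/* Assumed value for illustration (x86-64 with 4KiB pages). */
#define HPAGE_PMD_ORDER	9

/* Stand-ins for "mapped with a PMD" and VM_FAULT_FALLBACK. */
enum model_fault { MODEL_MAPPED_PMD, MODEL_FALLBACK };

/*
 * Hypothetical model of the patched decision: only a page of exactly
 * PMD order is mapped with a PMD; any other order falls back.
 */
static enum model_fault model_do_set_pmd(unsigned int order, bool vma_suitable)
{
	if (!vma_suitable)
		return MODEL_FALLBACK;
	if (order != HPAGE_PMD_ORDER)
		return MODEL_FALLBACK;
	return MODEL_MAPPED_PMD;
}

/* True when the caller would go on to install a single PTE instead. */
static bool model_maps_with_pte(unsigned int order, bool vma_suitable)
{
	return model_do_set_pmd(order, vma_suitable) == MODEL_FALLBACK;
}
```

In this model an order-10 page falls back, so the fault is satisfied by
one PTE at the requested address instead of a PMD covering the wrong
chunk.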
At some point, it'd be nice to handle this with a PMD, but I don't want
to write that code without a test-case. We could probably simulate
it with the page cache THP code and be super-aggressive about creating
order-10 pages ... but this is feeling more and more out of scope for
this patch set, which today hit 99 patches.