linux-mm.kvack.org archive mirror
From: "Kirill A. Shutemov" <kirill@shutemov.name>
To: William Kucharski <william.kucharski@oracle.com>
Cc: LKML <linux-kernel@vger.kernel.org>,
	linux-mm@kvack.org, linux-fsdevel@vger.kernel.org,
	Dave Hansen <dave.hansen@linux.intel.com>,
	Song Liu <songliubraving@fb.com>,
	Bob Kasten <robert.a.kasten@intel.com>,
	Mike Kravetz <mike.kravetz@oracle.com>,
	Chad Mynhier <chad.mynhier@oracle.com>,
	"Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>,
	Johannes Weiner <jweiner@fb.com>,
	Matthew Wilcox <willy@infradead.org>
Subject: Re: [PATCH v3 2/2] mm,thp: Add experimental config option RO_EXEC_FILEMAP_HUGE_FAULT_THP
Date: Tue, 6 Aug 2019 14:12:10 +0300
Message-ID: <20190806111210.7xpmjsd4hq54vuml@box>
In-Reply-To: <19A86A16-B440-4B73-98FE-922A09484DFD@oracle.com>

On Mon, Aug 05, 2019 at 09:56:45AM -0600, William Kucharski wrote:
> >> I don't really care if the start of the VMA is suitable, just whether I can map
> >> the current faulting page with a THP. As far as I know, there's nothing wrong
> >> with mapping all the pages before the VMA hits a properly aligned bound with
> >> PAGESIZE pages and then aligned chunks in the middle with THP.
> > 
> > You cannot map any page as huge into a wrongly aligned VMA.
> > 
> > THP's ->index must be aligned to HPAGE_PMD_NR, so if the combination of the
> > VMA's ->vm_start and ->vm_pgoff doesn't allow for this, you must fall back to
> > mapping the page with PTEs. I don't see it handled properly here.
> 
> It was my assumption that if, say, a VMA started at an address one page
> before a large page alignment, you could map that page with a PAGESIZE
> page, but if there was a fault on the next page and the VMA size allowed,
> you could map that next range with a large page, taking the approach of
> mapping chunks of the VMA with the largest page possible.
> 
> Is it that the start of the VMA must always align or that the entire VMA
> must be properly aligned and a multiple of the PMD size (so you either map
> with all large pages or none)?

IIUC, you are missing ->vm_pgoff from the picture. The newly allocated
page must land in the page cache aligned on an HPAGE_PMD_NR boundary. In
other words, you cannot have a huge page with an ->index of, let's say, 1.
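
To illustrate the point about ->index, here is a minimal user-space sketch,
not kernel code. It assumes 4K base pages and 2M PMD mappings (the x86-64
values), and huge_index_aligned() is a hypothetical helper name. It checks
whether the page cache index behind a faulting address sits at the same
offset within a huge page as the address sits within its PMD-sized range:

```c
#include <stdbool.h>

/* Assumed values for x86-64 with 4K base pages and 2M PMD mappings. */
#define PAGE_SHIFT      12
#define HPAGE_PMD_SHIFT 21
#define HPAGE_PMD_NR    (1UL << (HPAGE_PMD_SHIFT - PAGE_SHIFT))

/* Page cache index backing a virtual address inside a file mapping. */
static unsigned long addr_to_index(unsigned long vm_start,
                                   unsigned long vm_pgoff,
                                   unsigned long addr)
{
    return vm_pgoff + ((addr - vm_start) >> PAGE_SHIFT);
}

/*
 * Hypothetical helper: true if the PMD-aligned virtual range around addr
 * corresponds to a page cache range that starts on an HPAGE_PMD_NR boundary.
 * That holds exactly when the low bits of the index match the low bits of
 * the virtual page number.
 */
static bool huge_index_aligned(unsigned long vm_start,
                               unsigned long vm_pgoff,
                               unsigned long addr)
{
    unsigned long mask = HPAGE_PMD_NR - 1;
    unsigned long index = addr_to_index(vm_start, vm_pgoff, addr);

    return (index & mask) == ((addr >> PAGE_SHIFT) & mask);
}
```

With vm_start = 0x200000 and vm_pgoff = 1, a fault at 0x200000 maps to
index 1, so a huge page there would need ->index 1 and the check fails.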

A VMA is only suitable for at least one file-THP page if:

 - (vma->vm_start >> PAGE_SHIFT) % HPAGE_PMD_NR is equal to
    vma->vm_pgoff % HPAGE_PMD_NR

    This guarantees the right alignment in the backing page cache.

 - *and* vma->vm_end - round_up(vma->vm_start, HPAGE_PMD_SIZE) is greater
   than or equal to HPAGE_PMD_SIZE.
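
The two conditions can be sketched roughly as follows. This is a user-space
illustration only: struct vma_sketch stands in for the three fields of
struct vm_area_struct used here, vma_fits_file_thp() and round_up_hpage()
are hypothetical names, and the constants assume 4K pages with 2M PMDs.

```c
#include <stdbool.h>

/* Assumed values for x86-64 with 4K base pages and 2M PMD mappings. */
#define PAGE_SHIFT      12
#define HPAGE_PMD_SHIFT 21
#define HPAGE_PMD_SIZE  (1UL << HPAGE_PMD_SHIFT)
#define HPAGE_PMD_NR    (1UL << (HPAGE_PMD_SHIFT - PAGE_SHIFT))

/* Stand-in for the vm_area_struct fields the check needs. */
struct vma_sketch {
    unsigned long vm_start;  /* first byte of the mapping */
    unsigned long vm_end;    /* first byte after the mapping */
    unsigned long vm_pgoff;  /* file offset of vm_start, in pages */
};

/* Round addr up to the next PMD boundary (no change if already aligned). */
static unsigned long round_up_hpage(unsigned long addr)
{
    return (addr + HPAGE_PMD_SIZE - 1) & ~(HPAGE_PMD_SIZE - 1);
}

/* True if the VMA can hold at least one file-THP page. */
static bool vma_fits_file_thp(const struct vma_sketch *vma)
{
    unsigned long mask = HPAGE_PMD_NR - 1;
    unsigned long first;

    /* Condition 1: virtual and page cache offsets agree modulo HPAGE_PMD_NR. */
    if (((vma->vm_start >> PAGE_SHIFT) & mask) != (vma->vm_pgoff & mask))
        return false;

    /* Condition 2: a whole PMD-sized range fits after the first boundary. */
    first = round_up_hpage(vma->vm_start);
    if (first < vma->vm_start || first > vma->vm_end)
        return false;
    return vma->vm_end - first >= HPAGE_PMD_SIZE;
}
```

For example, a VMA covering 0x1ff000-0x400000 with vm_pgoff = 511 passes
both checks (one PTE-mapped page, then one huge page), while the same range
with vm_pgoff = 0 fails the first one.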

Does it make sense?

> 
> >> This is the page that content was just read to; readpage() will unlock the page
> >> when it is done with I/O, but the page needs to be locked before it's inserted
> >> into the page cache.
> > 
> > Then you must lock the page properly with lock_page().
> > 
> > __SetPageLocked() is fine for just-allocated pages that were not exposed
> > anywhere. After ->readpage() that's no longer the case, and it's not safe
> > to use __SetPageLocked() for them.
> 
> In the current code, it's assumed it is not exposed, because a single read
> of a large page that does no readahead before the page is inserted into the
> cache means there are no external users of the page.

You've exposed the page to the filesystem once you call ->readpage().
It *may* track the page somehow after the call.

-- 
 Kirill A. Shutemov

