linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Matthew Wilcox <willy@infradead.org>
To: Dave Chinner <david@fromorbit.com>
Cc: William Kucharski <william.kucharski@oracle.com>,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	linux-fsdevel@vger.kernel.org,
	Dave Hansen <dave.hansen@linux.intel.com>,
	Song Liu <songliubraving@fb.com>,
	Bob Kasten <robert.a.kasten@intel.com>,
	Mike Kravetz <mike.kravetz@oracle.com>,
	Chad Mynhier <chad.mynhier@oracle.com>,
	"Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>,
	Johannes Weiner <jweiner@fb.com>
Subject: Re: [PATCH v3 0/2] mm,thp: Add filemap_huge_fault() for THP
Date: Wed, 31 Jul 2019 04:32:21 -0700	[thread overview]
Message-ID: <20190731113221.GE4700@bombadil.infradead.org> (raw)
In-Reply-To: <20190731102053.GZ7689@dread.disaster.area>

On Wed, Jul 31, 2019 at 08:20:53PM +1000, Dave Chinner wrote:
> On Wed, Jul 31, 2019 at 02:25:11AM -0600, William Kucharski wrote:
> > This set of patches is the first step towards a mechanism for automatically
> > mapping read-only text areas of appropriate size and alignment to THPs
> > whenever possible.
> > 
> > For now, the central routine, filemap_huge_fault(), amd various support
> > routines are only included if the experimental kernel configuration option
> > 
> > 	RO_EXEC_FILEMAP_HUGE_FAULT_THP
> > 
> > is enabled.
> > 
> > This is because filemap_huge_fault() is dependent upon the
> > address_space_operations vector readpage() pointing to a routine that will
> > read and fill an entire large page at a time without poulluting the page
> > cache with PAGESIZE entries
> 
> How is the readpage code supposed to stuff a THP page into a bio?
> 
> i.e. Do bio's support huge pages, and if not, what is needed to
> stuff a huge page in a bio chain?

I believe that the current BIO code (after Ming Lei's multipage patches
from late last year / earlier this year) is capable of handling a
PMD-sized page.

> Once you can answer that question, you should be able to easily
> convert the iomap_readpage/iomap_readpage_actor code to support THP
> pages without having to care about much else as iomap_readpage()
> is already coded in a way that will iterate IO over the entire THP
> for you....

Christoph drafted a patch which illustrates the changes needed to the
iomap code.  The biggest problem is:

struct iomap_page {
        atomic_t                read_count;
        atomic_t                write_count;
        DECLARE_BITMAP(uptodate, PAGE_SIZE / 512);
};

All of a sudden that needs to go from a single unsigned long bitmap (or
two on 64kB page size machines) to 512 bytes on x86 and even larger on,
eg, POWER.

It's egregious because no sane filesystem is going to fragment a PMD
sized page into that number of discontiguous blocks, so we never need
to allocate the 520 byte data structure this suddenly becomes.  It'd be
nice to have a more efficient data structure (maybe that tracks uptodate
by extent instead of by individual sector?)  But I don't understand the
iomap layer at all, and I never understood buggerheads, so I don't have
a useful contribution here.


  reply	other threads:[~2019-07-31 11:32 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-07-31  8:25 [PATCH v3 0/2] mm,thp: Add filemap_huge_fault() for THP William Kucharski
2019-07-31  8:25 ` [PATCH v3 1/2] mm: Allow the page cache to allocate large pages William Kucharski
2019-07-31  8:25 ` [PATCH v3 2/2] mm,thp: Add experimental config option RO_EXEC_FILEMAP_HUGE_FAULT_THP William Kucharski
2019-08-01 12:36   ` Kirill A. Shutemov
2019-08-03 22:27     ` William Kucharski
2019-08-05 13:28       ` Kirill A. Shutemov
2019-08-05 15:56         ` William Kucharski
2019-08-06 11:12           ` Kirill A. Shutemov
2019-08-07 16:12             ` William Kucharski
2019-07-31  8:35 ` [PATCH v3 0/2] mm,thp: Add filemap_huge_fault() for THP Song Liu
2019-07-31  8:58   ` William Kucharski
2019-07-31 10:20 ` Dave Chinner
2019-07-31 11:32   ` Matthew Wilcox [this message]
2019-07-31 22:19     ` Dave Chinner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190731113221.GE4700@bombadil.infradead.org \
    --to=willy@infradead.org \
    --cc=chad.mynhier@oracle.com \
    --cc=dave.hansen@linux.intel.com \
    --cc=david@fromorbit.com \
    --cc=jweiner@fb.com \
    --cc=kirill.shutemov@linux.intel.com \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mike.kravetz@oracle.com \
    --cc=robert.a.kasten@intel.com \
    --cc=songliubraving@fb.com \
    --cc=william.kucharski@oracle.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).