Linux-Fsdevel Archive on lore.kernel.org
* Re: [LSF/MM TOPIC] Non standard size THP
       [not found] <dcb0b2cf-ba5c-e6ef-0b05-c6006227b6a9@arm.com>
@ 2019-02-08  4:24 ` Matthew Wilcox
  2019-02-08  6:31   ` Anshuman Khandual
  0 siblings, 1 reply; 2+ messages in thread
From: Matthew Wilcox @ 2019-02-08  4:24 UTC (permalink / raw)
  To: Anshuman Khandual
  Cc: lsf-pc, linux-mm, linux-kernel, Andrew Morton, Michal Hocko,
	Kirill A . Shutemov, Vlastimil Babka, linux-fsdevel

On Fri, Feb 08, 2019 at 07:43:57AM +0530, Anshuman Khandual wrote:
> How non-standard huge pages can be supported for THP
> 
> 	- THP starts recognizing non standard huge page (exported by arch) like HPAGE_CONT_(PMD|PTE)_SIZE
> 	- THP starts operating for either on HPAGE_PMD_SIZE or HPAGE_CONT_PMD_SIZE or HPAGE_CONT_PTE_SIZE
> 	- set_pmd_at() only recognizes HPAGE_PMD_SIZE hence replace set_pmd_at() with set_huge_pmd_at()
> 	- set_huge_pmd_at() could differentiate between HPAGE_PMD_SIZE or HPAGE_CONT_PMD_SIZE
> 	- In case for HPAGE_CONT_PTE_SIZE extend page table walker till PTE level
> 	- Use set_huge_pte_at() which can operate on multiple contiguous PTE bits

I think your proposed solution reflects thinking like a hardware person
rather than like a software person.  Or maybe like an MM person rather
than a FS person.  I see the same problem with Kirill's solutions ;-)

Perhaps you don't realise that using larger pages when appropriate
would also benefit filesystems as well as CPUs.  You didn't include
linux-fsdevel on this submission, so that's a plausible explanation.

The XArray currently supports arbitrary power-of-two-naturally-aligned
page sizes, and conveniently so does the page allocator [1].  The problem
is that various bits of the MM have a very fixed mindset that pages are
PTE, PMD or PUD in size.
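
To illustrate what "power-of-two, naturally aligned" means here, a minimal
userspace sketch follows. It is not kernel code: the `sketch_` names are
hypothetical, and the 4 KiB base page size is an assumption (the real value
is per-architecture). A page of order N covers 2^N base pages and may only
start at a base-page index that is a multiple of 2^N.

```c
#include <stdbool.h>
#include <stddef.h>

/* Assumption: 4 KiB base pages; the real value is per-architecture. */
#define SKETCH_PAGE_SHIFT 12

/* Bytes covered by a page of the given order (2^order base pages). */
static inline size_t sketch_order_bytes(unsigned int order)
{
	return (size_t)1 << (SKETCH_PAGE_SHIFT + order);
}

/* True if a page of this order may start at base-page index 'index'. */
static inline bool sketch_naturally_aligned(unsigned long index,
					    unsigned int order)
{
	return (index & ((1UL << order) - 1)) == 0;
}

/* Largest order, no greater than max_order, that is aligned at 'index'. */
static inline unsigned int sketch_max_order_at(unsigned long index,
					       unsigned int max_order)
{
	unsigned int order = max_order;

	while (order > 0 && !sketch_naturally_aligned(index, order))
		order--;
	return order;
}
```

Under these assumptions an order-4 page covers 64 KiB and can start at
index 16 but not at index 24; the largest page that can start at index 24
is order 3.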

We should enhance routines like vmf_insert_page() to handle
arbitrary sized pages rather than having separate vmf_insert_pfn()
and vmf_insert_pfn_pmd().  We probably need to enhance the set_pxx_at()
API to pass in an order, rather than explicitly naming pte/pmd/pud/...
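
One way to picture an order-based interface is a lookup from order to page
table level, done inside the helper rather than by the caller. The sketch
below is a userspace model only. Every `sketch_` name is hypothetical, and
the table assumes arm64 with a 4 KiB granule (16 contiguous entries at the
PTE and PMD levels); other configurations would need different values. It
does not show how any existing kernel function behaves.

```c
#include <stddef.h>

/* Assumed geometry: arm64 with a 4 KiB granule. Other configurations differ. */
enum sketch_level {
	SKETCH_PTE,
	SKETCH_CONT_PTE,
	SKETCH_PMD,
	SKETCH_CONT_PMD,
	SKETCH_PUD,
	SKETCH_UNSUPPORTED,
};

struct sketch_geometry {
	unsigned int order;		/* log2 of the size in base pages */
	enum sketch_level level;	/* level at which entries are written */
	unsigned int nr_entries;	/* entries written at that level */
};

static const struct sketch_geometry sketch_table[] = {
	{  0, SKETCH_PTE,       1 },	/*  4 KiB */
	{  4, SKETCH_CONT_PTE, 16 },	/* 64 KiB */
	{  9, SKETCH_PMD,       1 },	/*  2 MiB */
	{ 13, SKETCH_CONT_PMD, 16 },	/* 32 MiB */
	{ 18, SKETCH_PUD,       1 },	/*  1 GiB */
};

/*
 * Map an order to the level that can represent it directly. Orders with
 * no direct representation are reported as unsupported; a real
 * implementation would presumably fall back to a smaller order.
 */
static enum sketch_level sketch_level_for_order(unsigned int order,
						unsigned int *nr_entries)
{
	for (size_t i = 0; i < sizeof(sketch_table) / sizeof(sketch_table[0]); i++) {
		if (sketch_table[i].order == order) {
			*nr_entries = sketch_table[i].nr_entries;
			return sketch_table[i].level;
		}
	}
	*nr_entries = 0;
	return SKETCH_UNSUPPORTED;
}
```

The point of the design is that a caller names only the order; which level
is used, and how many entries are written, stays an architecture detail.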

First, though, we need to actually get arbitrary sized pages handled
correctly in the page cache.  So if anyone's interested in talking about
this, but hasn't been reviewing or commenting on the patches I've been
sending to make this happen, I'm going to seriously question their actual
commitment to wanting this to happen, rather than wanting a nice holiday
in Puerto Rico.

Sorry to be so blunt about this, but I've only had review from Kirill,
which makes me think that nobody else actually cares about getting
this fixed.

[1] Support for arbitrary sized and aligned entries is in progress for
the XArray, but I don't think there's any appetite for changing the buddy
allocator to let us allocate "pages" that are an arbitrary extent in size.



* Re: [LSF/MM TOPIC] Non standard size THP
  2019-02-08  4:24 ` [LSF/MM TOPIC] Non standard size THP Matthew Wilcox
@ 2019-02-08  6:31   ` Anshuman Khandual
  0 siblings, 0 replies; 2+ messages in thread
From: Anshuman Khandual @ 2019-02-08  6:31 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: lsf-pc, linux-mm, linux-kernel, Andrew Morton, Michal Hocko,
	Kirill A . Shutemov, Vlastimil Babka, linux-fsdevel



On 02/08/2019 09:54 AM, Matthew Wilcox wrote:
> On Fri, Feb 08, 2019 at 07:43:57AM +0530, Anshuman Khandual wrote:
>> How non-standard huge pages can be supported for THP
>>
>> 	- THP starts recognizing non standard huge page (exported by arch) like HPAGE_CONT_(PMD|PTE)_SIZE
>> 	- THP starts operating for either on HPAGE_PMD_SIZE or HPAGE_CONT_PMD_SIZE or HPAGE_CONT_PTE_SIZE
>> 	- set_pmd_at() only recognizes HPAGE_PMD_SIZE hence replace set_pmd_at() with set_huge_pmd_at()
>> 	- set_huge_pmd_at() could differentiate between HPAGE_PMD_SIZE or HPAGE_CONT_PMD_SIZE
>> 	- In case for HPAGE_CONT_PTE_SIZE extend page table walker till PTE level
>> 	- Use set_huge_pte_at() which can operate on multiple contiguous PTE bits
> 
> I think your proposed solution reflects thinking like a hardware person
> rather than like a software person.  Or maybe like an MM person rather
> than a FS person.  I see the same problem with Kirill's solutions ;-)

You might be right on this :) I was trying to derive a solution based on
all existing semantics with limited code addition rather than inventing
something completely different.

> 
> Perhaps you don't realise that using larger pages when appropriate
> would also benefit filesystems as well as CPUs.  You didn't include
> linux-fsdevel on this submission, so that's a plausible explanation.

Yes that was an omission. Thanks for adding linux-fsdevel to the thread.

> 
> The XArray currently supports arbitrary power-of-two-naturally-aligned
> page sizes, and conveniently so does the page allocator [1].  The problem
> is that various bits of the MM have a very fixed mindset that pages are
> PTE, PMD or PUD in size.

I agree. But in general it works, because an allocated page of the required
order does reside at one of these levels in the page table.

> 
> We should enhance routines like vmf_insert_page() to handle
> arbitrary sized pages rather than having separate vmf_insert_pfn()
> and vmf_insert_pfn_pmd().  We probably need to enhance the set_pxx_at()
> API to pass in an order, rather than explicitly naming pte/pmd/pud/...

I agree. set_huge_pte_at() actually does that to some extent on ARM64.
But that's just for HugeTLB.
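
As a rough userspace model of that idea (not the arm64 implementation), a
single call can fill a run of adjacent entries, each pointing at the next
page frame and each carrying a contiguous hint. The `sketch_` names are
hypothetical, and the entry layout is simplified to a frame number shifted
by an assumed 4 KiB page shift plus one hint bit, placed at bit 52 as in
the arm64 descriptor format.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: 4 KiB base pages. */
#define SKETCH_PAGE_SHIFT	12
/* Contiguous hint; bit 52 mirrors the arm64 descriptor layout. */
#define SKETCH_CONT_BIT		(UINT64_C(1) << 52)

/* Build a simplified entry: output address plus optional contiguous hint. */
static inline uint64_t sketch_mk_entry(uint64_t pfn, int contiguous)
{
	uint64_t entry = pfn << SKETCH_PAGE_SHIFT;

	if (contiguous)
		entry |= SKETCH_CONT_BIT;
	return entry;
}

/*
 * Write 'nr' adjacent entries starting at table[first], mapping the page
 * frames pfn, pfn + 1, ... The hint is set only when more than one entry
 * is written, so nr == 1 degenerates to an ordinary single mapping.
 */
static void sketch_set_entries(uint64_t *table, size_t first,
			       uint64_t pfn, size_t nr)
{
	for (size_t i = 0; i < nr; i++)
		table[first + i] = sketch_mk_entry(pfn + i, nr > 1);
}
```

A real implementation also has to handle ordering, TLB maintenance and
access/dirty state across the run; none of that is modelled here.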

> 
> First, though, we need to actually get arbitrary sized pages handled
> correctly in the page cache.  So if anyone's interested in talking about
> this, but hasn't been reviewing or commenting on the patches I've been
> sending to make this happen, I'm going to seriously question their actual
> commitment to wanting this to happen, rather than wanting a nice holiday
> in Puerto Rico.
> 
> Sorry to be so blunt about this, but I've only had review from Kirill,
> which makes me think that nobody else actually cares about getting
> this fixed.

To be honest, I have not been following your work in this regard. I started
looking into this problem late last year, and my goal has been focused more
on a THP solution for huge pages of intermediate page table level sizes.

But I agree with your point that there should be a wider solution which lets
generic MM deal with page sizes of any order, rather than only page table
level ones like PTE/PMD/PUD.

> 
> [1] Support for arbitrary sized and aligned entries is in progress for
> the XArray, but I don't think there's any appetite for changing the buddy
> allocator to let us allocate "pages" that are an arbitrary extent in size.
> 
> 

