linuxppc-dev.lists.ozlabs.org archive mirror
From: Jason Gunthorpe <jgg@nvidia.com>
To: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: "linux-mm@kvack.org" <linux-mm@kvack.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	"linuxppc-dev@lists.ozlabs.org" <linuxppc-dev@lists.ozlabs.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	Peter Xu <peterx@redhat.com>
Subject: Re: [RFC PATCH 1/8] mm: Provide pagesize to pmd_populate()
Date: Wed, 27 Mar 2024 13:57:54 -0300	[thread overview]
Message-ID: <20240327165754.GM946323@nvidia.com> (raw)
In-Reply-To: <9703878c-c0b0-48ff-a356-d43e8f7391f3@csgroup.eu>

On Wed, Mar 27, 2024 at 09:58:35AM +0000, Christophe Leroy wrote:
> > Just general remarks on the ones with huge pages:
> > 
> >   hash 64k and hugepage 16M/16G
> >   radix 64k/radix hugepage 2M/1G
> >   radix 4k/radix hugepage 2M/1G
> >   nohash 32
> >    - I think this is just a normal x86-like scheme? PMD/PUD can be a
> >      leaf with the same size as a next-level table.
> > 
> >      Do any of these cases need to know the higher level to parse the
> >      lower? eg is there a 2M bit in the PUD indicating that the PMD
> >      is a table of 2M leafs or does each PMD entry have a bit
> >      indicating it is a leaf?
> 
> For hash and radix there is a bit that tells it is a leaf (_PAGE_PTE).
> 
> For nohash32/e500 I think the drawing is not fully right: there is a huge
> page directory (hugepd) with a single entry. I think it should be
> possible to change it to a leaf entry; it seems we have bit _PAGE_SW1
> available in the PTE.

It sounds to me like PPC breaks down into only a couple of fundamental
behaviors:
 - x86-like leaf in many page levels. Use the pgd/pud/pmd_leaf() and
   related helpers to implement it
 - ARM-like contig PTE within a single page table level. Use the
   contig stuff to implement it
 - Contig PTE across two page table levels with a bit in the
   PMD. Needs new support like you showed
 - Page table levels with a variable page size. I.e. a PUD can point to
   a directory of 8 entries or one of 512 entries, each covering a
   different page size. Probably needs some new core support, but I
   think your changes to the *_offset go a long way already.
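
A minimal, self-contained toy sketch of the first behaviour, a software
walker dispatching on a per-entry leaf flag (the constants and helper
names below are invented; only the idea of a leaf bit like _PAGE_PTE
comes from the discussion above):

#include <stdint.h>
#include <stdbool.h>
#include <stdio.h>

/* Invented flag bits -- stand-ins for a real leaf marker such as _PAGE_PTE. */
#define ENTRY_VALID (1ull << 0)
#define ENTRY_LEAF  (1ull << 1)

typedef uint64_t pt_entry_t;

static bool entry_is_leaf(pt_entry_t e)
{
        return (e & (ENTRY_VALID | ENTRY_LEAF)) == (ENTRY_VALID | ENTRY_LEAF);
}

/* One step of a software walk: stop at a leaf (a huge mapping at this
 * level) or descend to the next, smaller level. */
static void walk_one_level(pt_entry_t e, int level)
{
        if (!(e & ENTRY_VALID))
                printf("level %d: not present\n", level);
        else if (entry_is_leaf(e))
                printf("level %d: leaf entry, maps memory directly\n", level);
        else
                printf("level %d: table pointer, descend\n", level);
}

int main(void)
{
        walk_one_level(ENTRY_VALID | ENTRY_LEAF, 2); /* e.g. a 2M leaf PMD */
        walk_one_level(ENTRY_VALID, 2);              /* PMD pointing to a PTE table */
        return 0;
}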

> > 
> >   hash 4k and hugepage 16M/16G
> >   nohash 64
> >    - How does this work? I guess since 8xx explicitly calls out
> >      consecutive entries, this actually means the pgd can point to
> >      512 256M entries or 8 16G entries? I.e. the table size at each
> >      level is variable? Or is it the same, the table size is still
> >      512, and each 16G entry is replicated 64 times?
> 
> For those it is using the huge page directory (hugepd), which can be
> hooked at any level and is a directory of huge pages on its own. There
> are no consecutive entries involved here I think, although I'm not
> completely sure.
> 
> For hash4k I'm not sure how it works; this was changed by commit
> e2b3d202d1db ("powerpc: Switch 16GB and 16MB explicit hugepages to a
> different page table format").
> 
> For the nohash/64, a PGD entry points either to a regular PUD directory 
> or to a HUGEPD directory. The size of the HUGEPD directory is encoded in 
> the 6 lower bits of the PGD entry.
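
A toy sketch of that kind of encoding, assuming the low 6 bits of the
directory entry hold a page-size shift (the mask, types and helper names
here are invented for illustration, not the real powerpc definitions):

#include <stdint.h>
#include <stdio.h>

/* Invented layout: assume the low 6 bits of a directory entry carry the
 * shift of the huge page size mapped by the hugepd it points to. */
#define HUGEPD_SIZE_MASK 0x3full

typedef uint64_t dir_entry_t;

static unsigned int hugepd_size_shift(dir_entry_t e)
{
        return (unsigned int)(e & HUGEPD_SIZE_MASK);
}

int main(void)
{
        /* Pretend entry: some pointer bits plus a size shift of 24 (16M). */
        dir_entry_t e = 0xc000000012340000ull | 24;

        printf("huge page size below this entry: %llu bytes\n",
               (unsigned long long)(1ull << hugepd_size_shift(e)));
        return 0;
}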

If it is a software walker there might be value in just aligning to
the contig PTE scheme at all levels and forgetting about the
variable-size page table levels. That quarter-page stuff is a PITA to
manage the memory allocation for on PPC anyhow.
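
To make the "align to the contig PTE scheme" idea concrete, here is a
toy, self-contained sketch: one huge mapping is represented by N
consecutive entries in an ordinary page table, each flagged as part of
a contiguous run (the flag name and layout are invented; the real
arm64/powerpc encodings differ):

#include <stdint.h>
#include <stdio.h>

#define PAGE_SHIFT     12
#define PAGE_SIZE      (1ull << PAGE_SHIFT)
#define PTE_VALID      (1ull << 0)
#define PTE_CONT       (1ull << 1)   /* invented "part of a contiguous run" hint */
#define PTES_PER_TABLE 512

typedef uint64_t pte_t;

/* Represent one huge mapping of npages * PAGE_SIZE by filling npages
 * consecutive slots, each pointing at its own 4K piece of the range. */
static void set_contig_ptes(pte_t *table, unsigned int index,
                            uint64_t phys_base, unsigned int npages)
{
        for (unsigned int i = 0; i < npages; i++)
                table[index + i] = (phys_base + (uint64_t)i * PAGE_SIZE)
                                   | PTE_VALID | PTE_CONT;
}

int main(void)
{
        static pte_t table[PTES_PER_TABLE];

        /* A 2M mapping as 512 x 4K contiguous entries; a bigger size (like
         * 8M on 8xx) would need a run spanning more than one table, the
         * "across two levels" case above. */
        set_contig_ptes(table, 0, 0x40000000ull, PTES_PER_TABLE);
        printf("pte[0]   = 0x%llx\n", (unsigned long long)table[0]);
        printf("pte[511] = 0x%llx\n",
               (unsigned long long)table[PTES_PER_TABLE - 1]);
        return 0;
}

The attraction of such a representation is that a generic walker still
sees ordinary PTEs; the contiguous flag is only a hint that a run of
entries can be treated as one translation.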

Jason


Thread overview: 25+ messages
2024-03-25 14:55 [RFC PATCH 0/8] Reimplement huge pages without hugepd on powerpc 8xx Christophe Leroy
2024-03-25 14:55 ` [RFC PATCH 1/8] mm: Provide pagesize to pmd_populate() Christophe Leroy
2024-03-25 16:19   ` Jason Gunthorpe
2024-03-25 19:05     ` Christophe Leroy
2024-03-26 15:01       ` Jason Gunthorpe
2024-03-27  9:58         ` Christophe Leroy
2024-03-27 16:57           ` Jason Gunthorpe [this message]
2024-04-03 18:24             ` Christophe Leroy
2024-04-04 11:46               ` Jason Gunthorpe
2024-03-25 14:55 ` [RFC PATCH 2/8] mm: Provide page size to pte_alloc_huge() Christophe Leroy
2024-03-25 14:55 ` [RFC PATCH 3/8] mm: Provide pmd to pte_leaf_size() Christophe Leroy
2024-03-25 14:55 ` [RFC PATCH 4/8] mm: Provide mm_struct and address to huge_ptep_get() Christophe Leroy
2024-03-25 16:35   ` Jason Gunthorpe
2024-03-25 14:55 ` [RFC PATCH 5/8] powerpc/mm: Allow hugepages without hugepd Christophe Leroy
2024-03-25 14:55 ` [RFC PATCH 6/8] powerpc/8xx: Fix size given to set_huge_pte_at() Christophe Leroy
2024-03-25 14:56 ` [RFC PATCH 7/8] powerpc/8xx: Remove support for 8M pages Christophe Leroy
2024-03-25 14:56 ` [RFC PATCH 8/8] powerpc/8xx: Add back support for 8M pages using contiguous PTE entries Christophe Leroy
2024-03-25 16:38 ` [RFC PATCH 0/8] Reimplement huge pages without hugepd on powerpc 8xx Jason Gunthorpe
2024-04-11 16:15   ` Peter Xu
2024-04-12 14:08     ` Christophe Leroy
2024-04-12 14:30       ` Peter Xu
2024-04-15 19:12         ` Christophe Leroy
2024-04-16 10:58           ` Christophe Leroy
2024-04-16 19:40             ` Peter Xu
2024-05-17 14:27 ` Oscar Salvador
