From: Mike Kravetz <mike.kravetz@oracle.com>
To: Peter Xu <peterx@redhat.com>
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	Muchun Song <songmuchun@bytedance.com>,
	Michal Hocko <mhocko@suse.com>,
	Naoya Horiguchi <naoya.horiguchi@linux.dev>,
	James Houghton <jthoughton@google.com>,
	Mina Almasry <almasrymina@google.com>,
	"Aneesh Kumar K . V" <aneesh.kumar@linux.vnet.ibm.com>,
	Anshuman Khandual <anshuman.khandual@arm.com>,
	Paul Walmsley <paul.walmsley@sifive.com>,
	Christian Borntraeger <borntraeger@linux.ibm.com>,
	Andrew Morton <akpm@linux-foundation.org>
Subject: Re: [RFC PATCH 1/3] hugetlb: skip to end of PT page mapping when pte not present
Date: Wed, 15 Jun 2022 10:27:25 -0700
Message-ID: <YqoWfcZXJ7NfJaYJ@monkey>
In-Reply-To: <e9456223-c26a-4d49-2920-9a597a817190@oracle.com>

On 05/31/22 10:00, Mike Kravetz wrote:
> On 5/30/22 12:56, Peter Xu wrote:
> > Hi, Mike,
> > 
> > On Fri, May 27, 2022 at 03:58:47PM -0700, Mike Kravetz wrote:
> >> +unsigned long hugetlb_mask_last_hp(struct hstate *h)
> >> +{
> >> +	unsigned long hp_size = huge_page_size(h);
> >> +
> >> +	if (hp_size == P4D_SIZE)
> >> +		return PGDIR_SIZE - P4D_SIZE;
> >> +	else if (hp_size == PUD_SIZE)
> >> +		return P4D_SIZE - PUD_SIZE;
> >> +	else if (hp_size == PMD_SIZE)
> >> +		return PUD_SIZE - PMD_SIZE;
> >> +
> >> +	return ~(0);
> >> +}
> > 
> > How about:
> > 
> > unsigned long hugetlb_mask_last_hp(struct hstate *h)
> > {
> > 	unsigned long hp_size = huge_page_size(h);
> > 
> > 	return hp_size * (PTRS_PER_PTE - 1);
> > }
> > 
> > ?

As mentioned in a followup e-mail, I am a little worried about this
calculation not being accurate for all configurations.  Today,
PTRS_PER_PTE == PTRS_PER_PMD == PTRS_PER_PUD == PTRS_PER_P4D in all
architectures that select CONFIG_ARCH_WANT_GENERAL_HUGETLB.  However, if
we code things as you suggest and that ever changes, the bug might be
hard to find.

In the next version, I will keep the explicit per-size comparisons from
my original patch but move to a switch statement for better readability.
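
To illustrate the concern, here is a minimal userspace sketch (not the
kernel code).  The *_SIZE and PTRS_PER_PTE values are assumed stand-ins
matching x86-64 with 5-level paging, and mask_last_switch(),
mask_last_mult(), next_pt_page_start() and check_masks_agree() are
hypothetical names.  It shows the explicit per-level masks in switch
form, the multiplication form, and how OR-ing the mask into an address
and then stepping by one huge page lands on the start of the next page
table page (assumed here to be how a caller would use the mask).

```c
#include <assert.h>
#include <stdint.h>

/* Assumed stand-ins for the kernel macros (x86-64, 5-level paging). */
#define PTRS_PER_PTE	UINT64_C(512)
#define PMD_SIZE	(UINT64_C(1) << 21)
#define PUD_SIZE	(UINT64_C(1) << 30)
#define P4D_SIZE	(UINT64_C(1) << 39)
#define PGDIR_SIZE	(UINT64_C(1) << 48)

/* Explicit per-level form, written as a switch. */
static inline uint64_t mask_last_switch(uint64_t hp_size)
{
	switch (hp_size) {
	case P4D_SIZE:
		return PGDIR_SIZE - P4D_SIZE;
	case PUD_SIZE:
		return P4D_SIZE - PUD_SIZE;
	case PMD_SIZE:
		return PUD_SIZE - PMD_SIZE;
	default:
		return ~UINT64_C(0);	/* mirrors the fallback in the RFC */
	}
}

/* Multiplication form; correct only while every level has PTRS_PER_PTE entries. */
static inline uint64_t mask_last_mult(uint64_t hp_size)
{
	return hp_size * (PTRS_PER_PTE - 1);
}

/* Assumed usage: set the mask bits, then advance by one huge page. */
static inline uint64_t next_pt_page_start(uint64_t addr, uint64_t hp_size)
{
	return (addr | mask_last_switch(hp_size)) + hp_size;
}

/* Both forms agree for every level under the assumed values. */
static inline void check_masks_agree(void)
{
	assert(mask_last_switch(PMD_SIZE) == mask_last_mult(PMD_SIZE));
	assert(mask_last_switch(PUD_SIZE) == mask_last_mult(PUD_SIZE));
	assert(mask_last_switch(P4D_SIZE) == mask_last_mult(P4D_SIZE));
}
```

With these assumed values the two forms agree at every level.  The
multiplication form would silently diverge only if some PTRS_PER_* value
differed from PTRS_PER_PTE, which is the hard-to-find case.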

> > 
> > This is definitely a good idea, though I'm wondering the possibility to go
> > one step further to make hugetlb pgtable walk just like the normal pages.
> > 
> > Say, would it be non-trivial to bring some of huge_pte_offset() into the
> > walker functions, so that we can jump over even larger than PTRS_PER_PTE
> > entries (e.g. when p4d==NULL for 2m huge pages)?  It's very possible I
> > overlooked something, though.

I briefly looked at this.  To make it work, the walker zapping functions
such as zap_*_range would need an 'is_vm_hugetlb_page(vma)' check and,
if it is true, use hugetlb specific page table routines instead of the
generic routines.
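
The shape of that check can be modeled in userspace.  This is only a
sketch under stated assumptions: struct mock_vma, MOCK_VM_HUGETLB,
MOCK_PAGE_SIZE and every mock_*() function are hypothetical stand-ins,
not kernel definitions, and the "routine" being selected is reduced to
the step size used per leaf entry.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins; none of these are kernel definitions. */
#define MOCK_PAGE_SIZE	(UINT64_C(1) << 12)
#define MOCK_VM_HUGETLB	0x1u

struct mock_vma {
	unsigned int flags;
	uint64_t hp_size;	/* huge page size, used only for hugetlb VMAs */
};

static inline bool mock_is_vm_hugetlb_page(const struct mock_vma *vma)
{
	return (vma->flags & MOCK_VM_HUGETLB) != 0;
}

/* Pick the per-entry step: hugetlb specific or generic. */
static inline uint64_t mock_walk_step(const struct mock_vma *vma)
{
	if (mock_is_vm_hugetlb_page(vma))
		return vma->hp_size;
	return MOCK_PAGE_SIZE;
}

/* Count the leaf entries a walker would visit in [start, end). */
static inline uint64_t mock_walk_range(const struct mock_vma *vma,
				       uint64_t start, uint64_t end)
{
	uint64_t step = mock_walk_step(vma);
	uint64_t visited = 0;

	for (uint64_t addr = start; addr < end; addr += step)
		visited++;
	return visited;
}

/* Usage: a 1 GiB range in a VMA backed by 2 MiB huge pages. */
static inline void mock_demo(void)
{
	struct mock_vma huge = { MOCK_VM_HUGETLB, UINT64_C(1) << 21 };

	assert(mock_walk_range(&huge, 0, UINT64_C(1) << 30) == 512);
}
```

Every walker level would need a branch like the one in mock_walk_step(),
which is why unifying the routines first would keep the walker changes
smaller.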

In many cases, the hugetlb specific page table routines are the same as
the generic routines.  But, there are a few exceptions.  IMO, it would
be better to first try to clean up and unify those routines.  That would
make changes to the walker routines less invasive and easier to
maintain.  I believe there is other code that would benefit from such a
cleanup.  Unless there are strong objections, I suggest we move forward
with the optimization here and move the cleanup and possible walker
changes to a later series.
-- 
Mike Kravetz

