From: Mike Kravetz <mike.kravetz@oracle.com>
To: David Hildenbrand <david@redhat.com>,
"Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>,
akpm@linux-foundation.org, songmuchun@bytedance.com,
linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v2 1/5] mm/hugetlb: fix races when looking up a CONT-PTE size hugetlb page
Date: Thu, 25 Aug 2022 14:13:32 -0700
Message-ID: <Ywfl/HIeO/ZbwYCg@monkey>
In-Reply-To: <887ca2e2-a7c5-93a7-46cb-185daccd4444@redhat.com>

On 08/25/22 09:25, David Hildenbrand wrote:
> > Is the primary concern the locking? If so, I am not sure we have an issue.
> > As mentioned in your commit message, current code will use
> > pte_offset_map_lock(). pte_offset_map_lock uses pte_lockptr, and pte_lockptr
> > will either be the mm wide lock or pmd_page lock. To me, it seems that
> > either would provide correct synchronization for CONT-PTE entries. Am I
> > missing something or misreading the code?
> >
> > I started looking at code cleanup suggested by David. Here is a quick
> > patch (not tested and likely containing errors) to see if this is a step
> > in the right direction.
> >
> > I like it because we get rid of/combine all those follow_huge_p*d
> > routines.
> >
>
> Yes, see comments below.
>
> > From 35d117a707c1567ddf350554298697d40eace0d7 Mon Sep 17 00:00:00 2001
> > From: Mike Kravetz <mike.kravetz@oracle.com>
> > Date: Wed, 24 Aug 2022 15:59:15 -0700
> > Subject: [PATCH] hugetlb: call hugetlb_follow_page_mask for hugetlb pages in
> > follow_page_mask
> >
> > At the beginning of follow_page_mask, there currently is a call to
> > follow_huge_addr which 'may' handle hugetlb pages. ia64 is the only
> > architecture which (incorrectly) provides a follow_huge_addr routine
> > that does not return error. Instead, at each level of the page table a
> > check is made for a hugetlb entry. If a hugetlb entry is found, a call
> > to a routine associated with that page table level such as
> > follow_huge_pmd is made.
> >
> > All the follow_huge_p*d routines are basically the same. In addition
> > huge page size can be derived from the vma, so we know where in the page
> > table a huge page would reside. So, replace follow_huge_addr with a
> > new architecture independent routine which will provide the same
> > functionality as the follow_huge_p*d routines. We can then eliminate
> > the p*d_huge checks in follow_page_mask page table walking as well as
> > the follow_huge_p*d routines themselves.
> >
> > follow_page_mask still has is_hugepd hugetlb checks during page table
> > walking. This is due to these checks and follow_huge_pd being
> > architecture specific. These can be eliminated if
> > hugetlb_follow_page_mask can be overwritten by architectures (powerpc)
> > that need to do follow_huge_pd processing.
>
> But won't the
>
> > +	/* hugetlb is special */
> > +	if (is_vm_hugetlb_page(vma))
> > +		return hugetlb_follow_page_mask(vma, address, flags);
>
> code route everything via hugetlb_follow_page_mask() and all these
> (beloved) hugepd checks would essentially be unreachable?
>
> At least my understanding is that hugepd only applies to hugetlb.
>
> Can't we move the hugepd handling code into hugetlb_follow_page_mask()
> as well?
>
> I mean, doesn't follow_hugetlb_page() also have to handle that hugepd
> stuff already ... ?
>
> [...]

I think so, but I got a little confused looking at the hugepd handling code.
Adding Aneesh, who added hugepd support to follow_page_mask in the series at:
https://lore.kernel.org/linux-mm/1494926612-23928-1-git-send-email-aneesh.kumar@linux.vnet.ibm.com/

I believe you are correct that follow_hugetlb_page must handle it as well.

One source of my confusion is the following in follow_huge_pd:

	/*
	 * hugepage directory entries are protected by mm->page_table_lock
	 * Use this instead of huge_pte_lockptr
	 */
	ptl = &mm->page_table_lock;
	spin_lock(ptl);

Yet, if follow_hugetlb_page handles hugepd, then it is using huge_pte_lockptr
to get the lock pointer. Is that wrong?

Hoping Aneesh can help clear up the confusion.

BTW, I also noticed that the above series added the comment:

	/* make this handle hugepd */

above the call to follow_huge_addr() in follow_page_mask. Perhaps there
was at one time a plan to have follow_huge_addr handle hugepd? That
series removed the powerpc-specific follow_huge_addr routine.
--
Mike Kravetz