linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
To: Zi Yan <zi.yan@cs.rutgers.edu>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>,
	Andrea Arcangeli <aarcange@redhat.com>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"kirill.shutemov@linux.intel.com"
	<kirill.shutemov@linux.intel.com>,
	"akpm@linux-foundation.org" <akpm@linux-foundation.org>,
	"mhocko@suse.com" <mhocko@suse.com>,
	"will.deacon@arm.com" <will.deacon@arm.com>
Subject: Re: [PATCH] mm/thp: Correctly differentiate between mapped THP and PMD migration entry
Date: Thu, 18 Oct 2018 02:17:42 +0000	[thread overview]
Message-ID: <20181018021741.GA3603@hori1.linux.bs1.fc.nec.co.jp> (raw)
In-Reply-To: <5A0A88EF-4B86-4173-A506-DE19BDB786B8@cs.rutgers.edu>

On Tue, Oct 16, 2018 at 10:31:50AM -0400, Zi Yan wrote:
> On 15 Oct 2018, at 0:06, Anshuman Khandual wrote:
> 
> > On 10/15/2018 06:23 AM, Zi Yan wrote:
> >> On 12 Oct 2018, at 4:00, Anshuman Khandual wrote:
> >>
> >>> On 10/10/2018 06:13 PM, Zi Yan wrote:
> >>>> On 10 Oct 2018, at 0:05, Anshuman Khandual wrote:
> >>>>
> >>>>> On 10/09/2018 07:28 PM, Zi Yan wrote:
> >>>>>> cc: Naoya Horiguchi (who proposed to use !_PAGE_PRESENT && !_PAGE_PSE for x86
> >>>>>> PMD migration entry check)
> >>>>>>
> >>>>>> On 8 Oct 2018, at 23:58, Anshuman Khandual wrote:
> >>>>>>
> >>>>>>> A normal mapped THP page at PMD level should be correctly differentiated
> >>>>>>> from a PMD migration entry while walking the page table. A mapped THP would
> >>>>>>> additionally check positive for pmd_present() along with pmd_trans_huge()
> >>>>>>> as compared to a PMD migration entry. This just adds a new conditional test
> >>>>>>> differentiating the two while walking the page table.
> >>>>>>>
> >>>>>>> Fixes: 616b8371539a6 ("mm: thp: enable thp migration in generic path")
> >>>>>>> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
> >>>>>>> ---
> >>>>>>> On X86, pmd_trans_huge() and is_pmd_migration_entry() are always mutually
> >>>>>>> exclusive which makes the current conditional block work for both mapped
> >>>>>>> and migration entries. This is not same with arm64 where pmd_trans_huge()
> >>>>>>
> >>>>>> !pmd_present() && pmd_trans_huge() is used to represent THPs under splitting,
> >>>>>
> >>>>> Not really if we just look at code in the conditional blocks.
> >>>>
> >>>> Yeah, I explained it wrong above. Sorry about that.
> >>>>
> >>>> In x86, pmd_present() checks (_PAGE_PRESENT | _PAGE_PROTNONE | _PAGE_PSE),
> >>>> thus, it returns true even if the present bit is cleared but PSE bit is set.
> >>>
> >>> Okay.
> >>>
> >>>> This is done so, because THPs under splitting are regarded as present in the kernel
> >>>> but not present when a hardware page table walker checks it.
> >>>
> >>> Okay.
> >>>
> >>>>
> >>>> For PMD migration entry, which should be regarded as not present, if PSE bit
> >>>> is set, which makes pmd_trans_huge() returns true, like ARM64 does, all
> >>>> PMD migration entries will be regarded as present
> >>>
> >>> Okay to make pmd_present() return false pmd_trans_huge() has to return false
> >>> as well. Is there anything which can be done to get around this problem on
> >>> X86 ? pmd_trans_huge() returning true for a migration entry sounds logical.
> >>> Otherwise we would revert the condition block order to accommodate both the
> >>> implementation for pmd_trans_huge() as suggested by Kirill before or just
> >>> consider this patch forward.
> >>>
> >>> Because I am not really sure yet about the idea of getting pmd_present()
> >>> check into pmd_trans_huge() on arm64 just to make it fit into this semantics
> >>> as suggested by Will. If a PMD is trans huge page or not should not depend on
> >>> whether it is present or not.
> >>
> >> In terms of THPs, we have three cases: a present THP, a THP under splitting,
> >> and a THP under migration. pmd_present() and pmd_trans_huge() both return true
> >> for a present THP and a THP under splitting, because they discover _PAGE_PSE bit
> >
> > Then how do we differentiate between a mapped THP and a splitting THP.
> 
> AFAIK, in x86, there is no distinction between a mapped THP and a splitting THP
> using helper functions.
> 
> A mapped THP has _PAGE_PRESENT bit and _PAGE_PSE bit set, whereas a splitting THP
> has only _PAGE_PSE bit set. But both pmd_present() and pmd_trans_huge() return
> true as long as _PAGE_PSE bit is set.
> 
> >
> >> is set for both cases, whereas they both return false for a THP under migration.
> >> You want to change them to make pmd_trans_huge() returns true for a THP under migration
> >> instead of false to help ARM64’s support for THP migration.
> > I am just trying to understand the rationale behind this semantics and see where
> > it should be fixed.
> >
> > I think the fundamental problem here is that THP under split has been difficult
> > to be re-presented through the available helper functions and in turn PTE bits.
> >
> > The following checks
> >
> > 1) pmd_present()
> > 2) pmd_trans_huge()
> >
> > Represent three THP states
> >
> > 1) Mapped THP		(pmd_present && pmd_trans_huge)
> > 2) Splitting THP	(pmd_present && pmd_trans_huge)
> > 3) Migrating THP	(!pmd_present && !pmd_trans_huge)
> >
> > The problem is if we make pmd_trans_huge() return true for all the three states
> > which sounds logical because they are all still trans huge PMD, then pmd_present()
> > can only represent two states not three as required.
> 
> We are on the same page about representing three THP states in x86.
> I also agree with you that it is logical to use three distinct representations
> for these three states, i.e. splitting THP could be changed to (!pmd_present && pmd_trans_huge).

I think that the behavior of pmd_trans_huge() for non-present pmd is
undefined by its nature. IOW, it's no use determining whether it's thp or
not for non-existing pages because it does not exist :)

So I think that the right direction is to make sure that pmd_trans_huge() is
never checked for non-present pmd, just like Kirill's suggestion.  And maybe
we have some room for engineering to ensure it (rather than just commenting it).

Thanks,
Naoya Horiguchi

  reply	other threads:[~2018-10-18  2:19 UTC|newest]

Thread overview: 24+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-10-09  3:58 Anshuman Khandual
2018-10-09 13:04 ` Kirill A. Shutemov
2018-10-09 13:18   ` Will Deacon
2018-10-12  8:02     ` Anshuman Khandual
2018-10-15  8:32       ` Aneesh Kumar K.V
2018-10-16 13:16         ` Anshuman Khandual
2018-10-09 13:42   ` Anshuman Khandual
2018-10-09 13:58 ` Zi Yan
2018-10-10  4:05   ` Anshuman Khandual
2018-10-10 12:43     ` Zi Yan
2018-10-12  8:00       ` Anshuman Khandual
2018-10-15  0:53         ` Zi Yan
2018-10-15  4:06           ` Anshuman Khandual
2018-10-16 14:31             ` Zi Yan
2018-10-18  2:17               ` Naoya Horiguchi [this message]
2018-11-02  5:22                 ` Anshuman Khandual
2018-10-25  8:10               ` Anshuman Khandual
2018-10-25 18:45                 ` Zi Yan
2018-10-26  1:39                   ` Anshuman Khandual
2018-10-17  2:09           ` Andrea Arcangeli
2018-10-22 14:00             ` Zi Yan
2018-11-02  6:15             ` Anshuman Khandual
2018-11-06  0:35               ` Will Deacon
2018-11-06  9:51                 ` Anshuman Khandual

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20181018021741.GA3603@hori1.linux.bs1.fc.nec.co.jp \
    --to=n-horiguchi@ah.jp.nec.com \
    --cc=aarcange@redhat.com \
    --cc=akpm@linux-foundation.org \
    --cc=anshuman.khandual@arm.com \
    --cc=kirill.shutemov@linux.intel.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mhocko@suse.com \
    --cc=will.deacon@arm.com \
    --cc=zi.yan@cs.rutgers.edu \
    --subject='Re: [PATCH] mm/thp: Correctly differentiate between mapped THP and PMD migration entry' \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).