linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Anshuman Khandual <anshuman.khandual@arm.com>
To: Zi Yan <zi.yan@cs.rutgers.edu>
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	kirill.shutemov@linux.intel.com, akpm@linux-foundation.org,
	mhocko@suse.com, will.deacon@arm.com,
	Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Subject: Re: [PATCH] mm/thp: Correctly differentiate between mapped THP and PMD migration entry
Date: Fri, 12 Oct 2018 13:30:12 +0530	[thread overview]
Message-ID: <5e0e772c-7eef-e75c-2921-e80d4fbe8324@arm.com> (raw)
In-Reply-To: <1968F276-5D96-426B-823F-38F6A51FB465@cs.rutgers.edu>



On 10/10/2018 06:13 PM, Zi Yan wrote:
> On 10 Oct 2018, at 0:05, Anshuman Khandual wrote:
> 
>> On 10/09/2018 07:28 PM, Zi Yan wrote:
>>> cc: Naoya Horiguchi (who proposed to use !_PAGE_PRESENT && !_PAGE_PSE for x86
>>> PMD migration entry check)
>>>
>>> On 8 Oct 2018, at 23:58, Anshuman Khandual wrote:
>>>
>>>> A normal mapped THP page at PMD level should be correctly differentiated
>>>> from a PMD migration entry while walking the page table. A mapped THP would
>>>> additionally check positive for pmd_present() along with pmd_trans_huge()
>>>> as compared to a PMD migration entry. This just adds a new conditional test
>>>> differentiating the two while walking the page table.
>>>>
>>>> Fixes: 616b8371539a6 ("mm: thp: enable thp migration in generic path")
>>>> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
>>>> ---
>>>> On X86, pmd_trans_huge() and is_pmd_migration_entry() are always mutually
>>>> exclusive which makes the current conditional block work for both mapped
>>>> and migration entries. This is not same with arm64 where pmd_trans_huge()
>>>
>>> !pmd_present() && pmd_trans_huge() is used to represent THPs under splitting,
>>
>> Not really if we just look at code in the conditional blocks.
> 
> Yeah, I explained it wrong above. Sorry about that.
> 
> In x86, pmd_present() checks (_PAGE_PRESENT | _PAGE_PROTNONE | _PAGE_PSE),
> thus, it returns true even if the present bit is cleared but PSE bit is set.

Okay.

> This is done so, because THPs under splitting are regarded as present in the kernel
> but not present when a hardware page table walker checks it.

Okay.

> 
> For PMD migration entry, which should be regarded as not present, if PSE bit
> is set, which makes pmd_trans_huge() returns true, like ARM64 does, all
> PMD migration entries will be regarded as present

Okay to make pmd_present() return false pmd_trans_huge() has to return false
as well. Is there anything which can be done to get around this problem on
X86 ? pmd_trans_huge() returning true for a migration entry sounds logical.
Otherwise we would revert the condition block order to accommodate both the
implementation for pmd_trans_huge() as suggested by Kirill before or just
consider this patch forward.

Because I am not really sure yet about the idea of getting pmd_present()
check into pmd_trans_huge() on arm64 just to make it fit into this semantics
as suggested by Will. If a PMD is trans huge page or not should not depend on
whether it is present or not.

> 
> My concern is that if ARM64’s pmd_trans_huge() returns true for migration
> entries, unlike x86, there might be bugs triggered in the kernel when
> THP migration is enabled in ARM64.

Right and that is exactly what we are trying to fix with this patch.

>
> Let me know if I explain this clear to you.
> 
>>
>>> since _PAGE_PRESENT is cleared during THP splitting but _PAGE_PSE is not.
>>> See the comment in pmd_present() for x86, in arch/x86/include/asm/pgtable.h
>>
>>
>>         if (pmd_trans_huge(pmde) || is_pmd_migration_entry(pmde)) {
>>                 pvmw->ptl = pmd_lock(mm, pvmw->pmd);
>>                 if (likely(pmd_trans_huge(*pvmw->pmd))) {
>>                         if (pvmw->flags & PVMW_MIGRATION)
>>                                 return not_found(pvmw);
>>                         if (pmd_page(*pvmw->pmd) != page)
>>                                 return not_found(pvmw);
>>                         return true;
>>                 } else if (!pmd_present(*pvmw->pmd)) {
>>                         if (thp_migration_supported()) {
>>                                 if (!(pvmw->flags & PVMW_MIGRATION))
>>                                         return not_found(pvmw);
>>                                 if (is_migration_entry(pmd_to_swp_entry(*pvmw->pmd))) {
>>                                         swp_entry_t entry = pmd_to_swp_entry(*pvmw->pmd);
>>
>>                                         if (migration_entry_to_page(entry) != page)
>>                                                 return not_found(pvmw);
>>                                         return true;
>>                                 }
>>                         }
>>                         return not_found(pvmw);
>>                 } else {
>>                         /* THP pmd was split under us: handle on pte level */
>>                         spin_unlock(pvmw->ptl);
>>                         pvmw->ptl = NULL;
>>                 }
>>         } else if (!pmd_present(pmde)) { ---> Outer 'else if'
>>                 return false;
>>         }
>>
>> Looking at the above code, it seems the conditional check for a THP
>> splitting case would be (!pmd_trans_huge && pmd_present) instead as
>> it has skipped the first two conditions. But THP splitting must have
>> been initiated once it has cleared the outer check (else it would not
>> have cleared otherwise)
>>
>> if (pmd_trans_huge(pmde) || is_pmd_migration_entry(pmde)).
> 
> If a THP is under splitting, both pmd_present() and pmd_trans_huge() return
> true in x86. The else part (/* THP pmd was split under us … */) happens
> after splitting is done.

Okay, got it.

> 
>> BTW what PMD state does the outer 'else if' block identify which must
>> have cleared the following condition to get there.
>>
>> (!pmd_present && !pmd_trans_huge && !is_pmd_migration_entry)
> 
> I think it is the case that the PMD is gone or equivalently pmd_none().
> This PMD entry is not in use.

Okay, got it.

  reply	other threads:[~2018-10-12  8:00 UTC|newest]

Thread overview: 24+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-10-09  3:58 [PATCH] mm/thp: Correctly differentiate between mapped THP and PMD migration entry Anshuman Khandual
2018-10-09 13:04 ` Kirill A. Shutemov
2018-10-09 13:18   ` Will Deacon
2018-10-12  8:02     ` Anshuman Khandual
2018-10-15  8:32       ` Aneesh Kumar K.V
2018-10-16 13:16         ` Anshuman Khandual
2018-10-09 13:42   ` Anshuman Khandual
2018-10-09 13:58 ` Zi Yan
2018-10-10  4:05   ` Anshuman Khandual
2018-10-10 12:43     ` Zi Yan
2018-10-12  8:00       ` Anshuman Khandual [this message]
2018-10-15  0:53         ` Zi Yan
2018-10-15  4:06           ` Anshuman Khandual
2018-10-16 14:31             ` Zi Yan
2018-10-18  2:17               ` Naoya Horiguchi
2018-11-02  5:22                 ` Anshuman Khandual
2018-10-25  8:10               ` Anshuman Khandual
2018-10-25 18:45                 ` Zi Yan
2018-10-26  1:39                   ` Anshuman Khandual
2018-10-17  2:09           ` Andrea Arcangeli
2018-10-22 14:00             ` Zi Yan
2018-11-02  6:15             ` Anshuman Khandual
2018-11-06  0:35               ` Will Deacon
2018-11-06  9:51                 ` Anshuman Khandual

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=5e0e772c-7eef-e75c-2921-e80d4fbe8324@arm.com \
    --to=anshuman.khandual@arm.com \
    --cc=akpm@linux-foundation.org \
    --cc=kirill.shutemov@linux.intel.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mhocko@suse.com \
    --cc=n-horiguchi@ah.jp.nec.com \
    --cc=will.deacon@arm.com \
    --cc=zi.yan@cs.rutgers.edu \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).