From: "Zi Yan" <zi.yan@cs.rutgers.edu>
To: "Anshuman Khandual" <anshuman.khandual@arm.com>
Cc: "Andrea Arcangeli" <aarcange@redhat.com>,
linux-mm@kvack.org, linux-kernel@vger.kernel.org,
kirill.shutemov@linux.intel.com, akpm@linux-foundation.org,
mhocko@suse.com, will.deacon@arm.com,
"Naoya Horiguchi" <n-horiguchi@ah.jp.nec.com>
Subject: Re: [PATCH] mm/thp: Correctly differentiate between mapped THP and PMD migration entry
Date: Tue, 16 Oct 2018 10:31:50 -0400 [thread overview]
Message-ID: <5A0A88EF-4B86-4173-A506-DE19BDB786B8@cs.rutgers.edu> (raw)
In-Reply-To: <796cb545-7376-16a2-db3e-bc9a6ca9894d@arm.com>
[-- Attachment #1: Type: text/plain, Size: 8270 bytes --]
On 15 Oct 2018, at 0:06, Anshuman Khandual wrote:
> On 10/15/2018 06:23 AM, Zi Yan wrote:
>> On 12 Oct 2018, at 4:00, Anshuman Khandual wrote:
>>
>>> On 10/10/2018 06:13 PM, Zi Yan wrote:
>>>> On 10 Oct 2018, at 0:05, Anshuman Khandual wrote:
>>>>
>>>>> On 10/09/2018 07:28 PM, Zi Yan wrote:
>>>>>> cc: Naoya Horiguchi (who proposed to use !_PAGE_PRESENT && !_PAGE_PSE for x86
>>>>>> PMD migration entry check)
>>>>>>
>>>>>> On 8 Oct 2018, at 23:58, Anshuman Khandual wrote:
>>>>>>
>>>>>>> A normal mapped THP page at PMD level should be correctly differentiated
>>>>>>> from a PMD migration entry while walking the page table. A mapped THP would
>>>>>>> additionally check positive for pmd_present() along with pmd_trans_huge()
>>>>>>> as compared to a PMD migration entry. This just adds a new conditional test
>>>>>>> differentiating the two while walking the page table.
>>>>>>>
>>>>>>> Fixes: 616b8371539a6 ("mm: thp: enable thp migration in generic path")
>>>>>>> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
>>>>>>> ---
>>>>>>> On X86, pmd_trans_huge() and is_pmd_migration_entry() are always mutually
>>>>>>> exclusive which makes the current conditional block work for both mapped
>>>>>>> and migration entries. This is not same with arm64 where pmd_trans_huge()
>>>>>>
>>>>>> !pmd_present() && pmd_trans_huge() is used to represent THPs under splitting,
>>>>>
>>>>> Not really if we just look at code in the conditional blocks.
>>>>
>>>> Yeah, I explained it wrong above. Sorry about that.
>>>>
>>>> In x86, pmd_present() checks (_PAGE_PRESENT | _PAGE_PROTNONE | _PAGE_PSE),
>>>> thus, it returns true even if the present bit is cleared but PSE bit is set.
>>>
>>> Okay.
>>>
>>>> This is done so, because THPs under splitting are regarded as present in the kernel
>>>> but not present when a hardware page table walker checks it.
>>>
>>> Okay.
>>>
>>>>
>>>> For PMD migration entry, which should be regarded as not present, if PSE bit
>>>> is set, which makes pmd_trans_huge() returns true, like ARM64 does, all
>>>> PMD migration entries will be regarded as present
>>>
>>> Okay to make pmd_present() return false pmd_trans_huge() has to return false
>>> as well. Is there anything which can be done to get around this problem on
>>> X86 ? pmd_trans_huge() returning true for a migration entry sounds logical.
>>> Otherwise we would revert the condition block order to accommodate both the
>>> implementation for pmd_trans_huge() as suggested by Kirill before or just
>>> consider this patch forward.
>>>
>>> Because I am not really sure yet about the idea of getting pmd_present()
>>> check into pmd_trans_huge() on arm64 just to make it fit into this semantics
>>> as suggested by Will. If a PMD is trans huge page or not should not depend on
>>> whether it is present or not.
>>
>> In terms of THPs, we have three cases: a present THP, a THP under splitting,
>> and a THP under migration. pmd_present() and pmd_trans_huge() both return true
>> for a present THP and a THP under splitting, because they discover _PAGE_PSE bit
>
> Then how do we differentiate between a mapped THP and a splitting THP.
AFAIK, in x86, there is no distinction between a mapped THP and a splitting THP
using helper functions.
A mapped THP has _PAGE_PRESENT bit and _PAGE_PSE bit set, whereas a splitting THP
has only _PAGE_PSE bit set. But both pmd_present() and pmd_trans_huge() return
true as long as _PAGE_PSE bit is set.
>
>> is set for both cases, whereas they both return false for a THP under migration.
>> You want to change them to make pmd_trans_huge() returns true for a THP under migration
>> instead of false to help ARM64’s support for THP migration.
> I am just trying to understand the rationale behind this semantics and see where
> it should be fixed.
>
> I think the fundamental problem here is that THP under split has been difficult
> to be re-presented through the available helper functions and in turn PTE bits.
>
> The following checks
>
> 1) pmd_present()
> 2) pmd_trans_huge()
>
> Represent three THP states
>
> 1) Mapped THP (pmd_present && pmd_trans_huge)
> 2) Splitting THP (pmd_present && pmd_trans_huge)
> 3) Migrating THP (!pmd_present && !pmd_trans_huge)
>
> The problem is if we make pmd_trans_huge() return true for all the three states
> which sounds logical because they are all still trans huge PMD, then pmd_present()
> can only represent two states not three as required.
We are on the same page about representing three THP states in x86.
I also agree with you that it is logical to use three distinct representations
for these three states, i.e. splitting THP could be changed to (!pmd_present && pmd_trans_huge).
>>
>> For x86, this change requires:
>> 1. changing the condition in pmd_trans_huge(), so that it returns true for
>> PMD migration entries;
>> 2. changing the code, which calls pmd_trans_huge(), to match the new logic.
> Can those be fixed with an additional check for pmd_present() as suggested here
> in this patch ? Asking because in case we could not get common semantics for
> these helpers on all arch that would be a fall back option for the moment.
It would be OK for x86, since pmd_trans_huge() implies pmd_present() and hence
adding pmd_present() to pmd_trans_huge() makes no difference. But for ARM64,
from my understanding of the code described below, adding pmd_present() to
pmd_trans_huge() seems to exclude splitting THPs from the original semantic.
>>
>> Another problem I see is that x86’s pmd_present() returns true for a THP under
>> splitting but ARM64’s pmd_present() returns false for a THP under splitting.
>
> But how did you conclude this ? I dont see any explicit helper for splitting
> THP. Could you please point me in the code ?
From the code I read for ARM64
(https://elixir.bootlin.com/linux/v4.19-rc8/source/arch/arm64/include/asm/pgtable.h#L360
and https://elixir.bootlin.com/linux/v4.19-rc8/source/arch/arm64/include/asm/pgtable.h#L86),
pmd_present() only checks _PAGE_PRESENT and _PAGE_PROTONE. During a THP splitting,
pmdp_invalidate() clears _PAGE_PRESENT (https://elixir.bootlin.com/linux/v4.19-rc8/source/mm/huge_memory.c#L2130). So pmd_present() returns false in ARM64. Let me know
if I got anything wrong.
>> I do not know if there is any correctness issue with this. So I copy Andrea
>> here, since he made x86’s pmd_present() returns true for a THP under splitting
>> as an optimization. I want to understand more about it and potentially make
>> x86 and ARM64 (maybe all other architectures, too) return the same value
>> for all three cases mentioned above.
>
> I agree. Fixing the semantics is the right thing to do. I am kind of wondering if
> it would be a good idea to have explicit helpers for (1) mapped THP, (2) splitting
> THP like the one for (3) migrating THP (e.g is_pmd_migration_entry) and use them
> in various conditional blocks instead of looking out for multiple checks like
> pmd_trans_huge(), pmd_present() etc. It will help unify the semantics as well.
>
I agree that explicit and distinct helpers for all three THP states would be helpful.
>>
>>
>> Hi Andrea, what is the purpose/benefit of making x86’s pmd_present() returns true
>> for a THP under splitting? Does it cause problems when ARM64’s pmd_present()
>> returns false in the same situation?
>>
>>
>>>>
>>>> My concern is that if ARM64’s pmd_trans_huge() returns true for migration
>>>> entries, unlike x86, there might be bugs triggered in the kernel when
>>>> THP migration is enabled in ARM64.
>>>
>>> Right and that is exactly what we are trying to fix with this patch.
>>>
>>
>> I am not sure this patch can fix the problem in ARM64, because many other places
>> in the kernel, pmd_trans_huge() still returns false for a THP under migration.
>> We may need more comprehensive fixes for ARM64.
> Are there more places where semantics needs to be fixed than what was originally
> added through 616b8371539a ("mm: thp: enable thp migration in generic path").
I guess not, but it would be safer to grep for all pmd_trans_huge() and pmd_present().
--
Best Regards
Yan Zi
[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 557 bytes --]
next prev parent reply other threads:[~2018-10-16 14:31 UTC|newest]
Thread overview: 24+ messages / expand[flat|nested] mbox.gz Atom feed top
2018-10-09 3:58 [PATCH] mm/thp: Correctly differentiate between mapped THP and PMD migration entry Anshuman Khandual
2018-10-09 13:04 ` Kirill A. Shutemov
2018-10-09 13:18 ` Will Deacon
2018-10-12 8:02 ` Anshuman Khandual
2018-10-15 8:32 ` Aneesh Kumar K.V
2018-10-16 13:16 ` Anshuman Khandual
2018-10-09 13:42 ` Anshuman Khandual
2018-10-09 13:58 ` Zi Yan
2018-10-10 4:05 ` Anshuman Khandual
2018-10-10 12:43 ` Zi Yan
2018-10-12 8:00 ` Anshuman Khandual
2018-10-15 0:53 ` Zi Yan
2018-10-15 4:06 ` Anshuman Khandual
2018-10-16 14:31 ` Zi Yan [this message]
2018-10-18 2:17 ` Naoya Horiguchi
2018-11-02 5:22 ` Anshuman Khandual
2018-10-25 8:10 ` Anshuman Khandual
2018-10-25 18:45 ` Zi Yan
2018-10-26 1:39 ` Anshuman Khandual
2018-10-17 2:09 ` Andrea Arcangeli
2018-10-22 14:00 ` Zi Yan
2018-11-02 6:15 ` Anshuman Khandual
2018-11-06 0:35 ` Will Deacon
2018-11-06 9:51 ` Anshuman Khandual
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=5A0A88EF-4B86-4173-A506-DE19BDB786B8@cs.rutgers.edu \
--to=zi.yan@cs.rutgers.edu \
--cc=aarcange@redhat.com \
--cc=akpm@linux-foundation.org \
--cc=anshuman.khandual@arm.com \
--cc=kirill.shutemov@linux.intel.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=mhocko@suse.com \
--cc=n-horiguchi@ah.jp.nec.com \
--cc=will.deacon@arm.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).