From: Yang Shi <yang.shi@linux.alibaba.com>
To: Hugh Dickins <hughd@google.com>
Cc: Michal Hocko <mhocko@kernel.org>,
"Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>,
vbabka@suse.cz, rientjes@google.com, kirill@shutemov.name,
akpm@linux-foundation.org, linux-mm@kvack.org,
linux-kernel@vger.kernel.org
Subject: Re: [v2 PATCH] mm: thp: fix false negative of shmem vma's THP eligibility
Date: Fri, 7 Jun 2019 11:51:22 -0700 [thread overview]
Message-ID: <f5b9e7f5-20e7-76a7-e014-891d34780dc5@linux.alibaba.com> (raw)
In-Reply-To: <alpine.LSU.2.11.1906070314001.1938@eggly.anvils>
On 6/7/19 3:57 AM, Hugh Dickins wrote:
> On Thu, 6 Jun 2019, Yang Shi wrote:
>> On 5/7/19 10:10 AM, Yang Shi wrote:
>>> On 5/7/19 3:47 AM, Michal Hocko wrote:
>>>> [Hmm, I thought, Hugh was CCed]
>>>>
>>>> On Mon 06-05-19 16:37:42, Yang Shi wrote:
>>>>> On 4/28/19 12:13 PM, Yang Shi wrote:
>>>>>> On 4/23/19 10:52 AM, Michal Hocko wrote:
>>>>>>> On Wed 24-04-19 00:43:01, Yang Shi wrote:
>>>>>>>> The commit 7635d9cbe832 ("mm, thp, proc: report THP eligibility
>>>>>>>> for each
>>>>>>>> vma") introduced THPeligible bit for processes' smaps. But, when
>>>>>>>> checking
>>>>>>>> the eligibility for shmem vma, __transparent_hugepage_enabled()
>>>>>>>> is
>>>>>>>> called to override the result from shmem_huge_enabled(). It may
>>>>>>>> result
>>>>>>>> in the anonymous vma's THP flag override shmem's. For example,
>>>>>>>> running a
>>>>>>>> simple test which create THP for shmem, but with anonymous THP
>>>>>>>> disabled,
>>>>>>>> when reading the process's smaps, it may show:
>>>>>>>>
>>>>>>>> 7fc92ec00000-7fc92f000000 rw-s 00000000 00:14 27764 /dev/shm/test
>>>>>>>> Size: 4096 kB
>>>>>>>> ...
>>>>>>>> [snip]
>>>>>>>> ...
>>>>>>>> ShmemPmdMapped: 4096 kB
>>>>>>>> ...
>>>>>>>> [snip]
>>>>>>>> ...
>>>>>>>> THPeligible: 0
>>>>>>>>
>>>>>>>> And, /proc/meminfo does show THP allocated and PMD mapped too:
>>>>>>>>
>>>>>>>> ShmemHugePages: 4096 kB
>>>>>>>> ShmemPmdMapped: 4096 kB
>>>>>>>>
>>>>>>>> This doesn't make too much sense. The anonymous THP flag should
>>>>>>>> not
>>>>>>>> intervene shmem THP. Calling shmem_huge_enabled() with checking
>>>>>>>> MMF_DISABLE_THP sounds good enough. And, we could skip stack and
>>>>>>>> dax vma check since we already checked if the vma is shmem
>>>>>>>> already.
>>>>>>> Kirill, can we get a confirmation that this is really intended
>>>>>>> behavior
>>>>>>> rather than an omission please? Is this documented? What is a
>>>>>>> global
>>>>>>> knob to simply disable THP system wise?
>>>>>> Hi Kirill,
>>>>>>
>>>>>> Ping. Any comment?
>>>>> Talked with Kirill at LSFMM, it sounds this is kind of intended
>>>>> behavior
>>>>> according to him. But, we all agree it looks inconsistent.
>>>>>
>>>>> So, we may have two options:
>>>>> - Just fix the false negative issue as what the patch does
>>>>> - Change the behavior to make it more consistent
>>>>>
>>>>> I'm not sure whether anyone relies on the behavior explicitly or
>>>>> implicitly
>>>>> or not.
>>>> Well, I would be certainly more happy with a more consistent behavior.
>>>> Talked to Hugh at LSFMM about this and he finds treating shmem objects
>>>> separately from the anonymous memory. And that is already the case
>>>> partially when each mount point might have its own setup. So the primary
>>>> question is whether we need a one global knob to controll all THP
>>>> allocations. One argument to have that is that it might be helpful to
>>>> for an admin to simply disable source of THP at a single place rather
>>>> than crawling over all shmem mount points and remount them. Especially
>>>> in environments where shmem points are mounted in a container by a
>>>> non-root. Why would somebody wanted something like that? One example
>>>> would be to temporarily workaround high order allocations issues which
>>>> we have seen non trivial amount of in the past and we are likely not at
>>>> the end of the tunel.
>>> Shmem has a global control for such use. Setting shmem_enabled to "force"
>>> or "deny" would enable or disable THP for shmem globally, including non-fs
>>> objects, i.e. memfd, SYS V shmem, etc.
>>>
>>>> That being said I would be in favor of treating the global sysfs knob to
>>>> be global for all THP allocations. I will not push back on that if there
>>>> is a general consensus that shmem and fs in general are a different
>>>> class of objects and a single global control is not desirable for
>>>> whatever reasons.
>>> OK, we need more inputs from Kirill, Hugh and other folks.
>> [Forgot cc to mailing lists]
>>
>> Hi guys,
>>
>> How should we move forward for this one? Make the sysfs knob
>> (/sys/kernel/mm/transparent_hugepage/enabled) to be global for both anonymous
>> and tmpfs? Or just treat shmem objects separately from anon memory then fix
>> the false-negative of THP eligibility by this patch?
> Sorry for not getting back to you sooner on this.
>
> I don't like to drive design by smaps. I agree with the word "mess" used
> several times of THP tunings in this thread, but it's too easy to make
> that mess worse by unnecessary changes, so I'm very cautious here.
>
> The addition of "THPeligible" without an "Anon" in its name was
> unfortunate. I suppose we're two releases too late to change that.
The smaps shows it is anon vma or shmem vma for the most cases.
>
> Applying process (PR_SET_THP_DISABLE) and mm (MADV_*HUGEPAGE)
> limitations to shared filesystem objects doesn't work all that well.
The THP eligibility indicator is per vma, it just reports whether THP is
eligible for a specific vma. So, I'm supposed it should keep consistent
with MMF_DISABLE_THP and MADV_*HUGEPAGE setting.
The current implementation in shmem and kuhugepaged also checks these.
>
> I recommend that you continue to treat shmem objects separately from
> anon memory, and just make the smaps "THPeligible" more often accurate.
>
> Is your v2 patch earlier in this thread the best for that?
The v2 patch treats shmem objects separately from anon memory and it
makes the "THPeligible" more often accurate.
> No answer tonight, I'll re-examine later in the day.
>
> Hugh
next prev parent reply other threads:[~2019-06-07 18:51 UTC|newest]
Thread overview: 21+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-04-23 16:43 [v2 PATCH] mm: thp: fix false negative of shmem vma's THP eligibility Yang Shi
2019-04-23 17:52 ` Michal Hocko
2019-04-23 18:34 ` Yang Shi
2019-04-24 0:22 ` Yang Shi
2019-04-24 7:58 ` Michal Hocko
2019-04-28 19:13 ` Yang Shi
2019-05-06 23:37 ` Yang Shi
2019-05-07 10:47 ` Michal Hocko
2019-05-07 17:10 ` Yang Shi
2019-06-06 18:59 ` Yang Shi
2019-06-07 10:57 ` Hugh Dickins
2019-06-07 14:25 ` Michal Hocko
2019-06-07 18:51 ` Yang Shi [this message]
2019-04-24 13:10 ` Vlastimil Babka
2019-04-24 15:47 ` Yang Shi
2019-04-24 16:17 ` Vlastimil Babka
2019-04-25 16:44 ` Yang Shi
2019-06-08 3:58 ` Hugh Dickins
2019-06-10 17:33 ` Yang Shi
2019-06-12 18:44 ` Hugh Dickins
2019-06-12 19:59 ` Yang Shi
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=f5b9e7f5-20e7-76a7-e014-891d34780dc5@linux.alibaba.com \
--to=yang.shi@linux.alibaba.com \
--cc=akpm@linux-foundation.org \
--cc=hughd@google.com \
--cc=kirill.shutemov@linux.intel.com \
--cc=kirill@shutemov.name \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=mhocko@kernel.org \
--cc=rientjes@google.com \
--cc=vbabka@suse.cz \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).