From: David Hildenbrand <david@redhat.com>
To: Michal Hocko <mhocko@suse.com>, Muchun Song <songmuchun@bytedance.com>
Cc: "Jonathan Corbet" <corbet@lwn.net>,
"Mike Kravetz" <mike.kravetz@oracle.com>,
"Thomas Gleixner" <tglx@linutronix.de>,
mingo@redhat.com, bp@alien8.de, x86@kernel.org, hpa@zytor.com,
dave.hansen@linux.intel.com, luto@kernel.org,
"Peter Zijlstra" <peterz@infradead.org>,
viro@zeniv.linux.org.uk,
"Andrew Morton" <akpm@linux-foundation.org>,
paulmck@kernel.org, mchehab+huawei@kernel.org,
pawan.kumar.gupta@linux.intel.com,
"Randy Dunlap" <rdunlap@infradead.org>,
oneukum@suse.com, anshuman.khandual@arm.com, jroedel@suse.de,
"Mina Almasry" <almasrymina@google.com>,
"David Rientjes" <rientjes@google.com>,
"Matthew Wilcox" <willy@infradead.org>,
"Oscar Salvador" <osalvador@suse.de>,
"Song Bao Hua (Barry Song)" <song.bao.hua@hisilicon.com>,
"HORIGUCHI NAOYA(堀口 直也)" <naoya.horiguchi@nec.com>,
"Joao Martins" <joao.m.martins@oracle.com>,
"Xiongchun duan" <duanxiongchun@bytedance.com>,
linux-doc@vger.kernel.org, LKML <linux-kernel@vger.kernel.org>,
"Linux Memory Management List" <linux-mm@kvack.org>,
linux-fsdevel <linux-fsdevel@vger.kernel.org>
Subject: Re: [External] Re: [PATCH v15 4/8] mm: hugetlb: alloc the vmemmap pages associated with each HugeTLB page
Date: Tue, 16 Feb 2021 09:13:09 +0100
Message-ID: <4f8664fb-0d65-b7d6-39d6-2ce5fc86623a@redhat.com>
In-Reply-To: <YCrFY4ODu/O9KSND@dhcp22.suse.cz>

On 15.02.21 20:02, Michal Hocko wrote:
> On Tue 16-02-21 01:48:29, Muchun Song wrote:
>> On Tue, Feb 16, 2021 at 12:28 AM Michal Hocko <mhocko@suse.com> wrote:
>>>
>>> On Mon 15-02-21 23:36:49, Muchun Song wrote:
>>> [...]
>>>>> There shouldn't be any real reason why the memory allocation for
>>>>> vmemmaps, or handling vmemmap in general, has to be done from within the
>>>>> hugetlb lock and therefore requiring a non-sleeping semantic. All that
>>>>> can be deferred to a more relaxed context. If you want to make a
>>>>
>>>> Yeah, you are right. We can move the hugetlb freeing routine to a
>>>> workqueue, just as I did in the earlier versions (before v13) of the
>>>> patch. I will pick up these patches.
>>>
>>> I haven't seen your v13 and I am unlikely to have time to revisit that
>>> version. I just wanted to point out that the actual allocation doesn't
>>> have to happen from under the spinlock. There are multiple ways to go
>>> around that. Dropping the lock would be one of them. Preallocation
>>> before the spin lock is taken is another. WQ is certainly an option but
>>> I would take it as the last resort when other paths are not feasible.
>>>
>>
>> "Dropping the lock" and "Preallocation before the spin lock" would both
>> restrict put_page to non-atomic context. I am not sure whether a page
>> is put somewhere under an atomic context, e.g. in compaction.
>> I am not an expert on this.
>
> Then do due research or ask for help from the MM community. Do
> not just try to go around harder problems and somehow duct tape a
> solution. I am sorry for sounding harsh here but this is a repetitive
> pattern.
>
> Now to the merit. put_page can indeed be called from all sorts of
> contexts. And it might be indeed impossible to guarantee that hugetlb
> pages are never freed up from an atomic context. Requiring that would be
> even hard to maintain longterm. There are ways around that, I believe,
> though.
>
> The most simple one that I can think of right now would be using
> an in_atomic() rather than in_task() check in free_huge_page. IIRC recent
> changes would allow in_atomic to be reliable also on !PREEMPT kernels
> (via RCU tree, not sure where this stands right now). That would make
> __free_huge_page always run in a non-atomic context which sounds like an
> easy enough solution.
>
> Another way would be to keep a pool of ready pages to use in case a
> GFP_NOWAIT allocation fails, and to have a means of keeping that pool
> replenished when needed. Would it be feasible to reuse parts of the
> freed page in the worst case?

As already discussed, reusing parts of the freed page is only possible
when the huge page does not reside in ZONE_MOVABLE or a CMA area.

In addition, we could then never form a huge page at that memory
location again.
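
To make the constraint concrete, here is a rough userspace sketch of the
fallback decision. It is only a model under stated assumptions, not kernel
code: the enum names and both helper functions are hypothetical, and the
"defer" outcome stands in for whatever sleepable context ends up being
used. The rule it encodes is that vmemmap pages are unmovable allocations,
so they must not be carved out of a huge page that sits in ZONE_MOVABLE or
a CMA area.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of where a huge page resides; not the kernel's types. */
enum zone_model { MODEL_ZONE_NORMAL, MODEL_ZONE_MOVABLE, MODEL_CMA };

/* Hypothetical outcomes when the vmemmap pages must be restored on free. */
enum vmemmap_source {
	SRC_ALLOCATOR,       /* the non-sleeping allocation succeeded */
	SRC_FREED_HUGE_PAGE, /* reuse parts of the huge page being freed */
	SRC_DEFER            /* retry later from a context that may sleep */
};

/* Reuse is only acceptable when unmovable data may live in that zone. */
static bool may_reuse_freed_page(enum zone_model zone)
{
	return zone == MODEL_ZONE_NORMAL;
}

/* Pick a source for the vmemmap pages, preferring a normal allocation. */
static enum vmemmap_source pick_vmemmap_source(enum zone_model zone,
					       bool nowait_alloc_ok)
{
	assert(zone == MODEL_ZONE_NORMAL || zone == MODEL_ZONE_MOVABLE ||
	       zone == MODEL_CMA);

	if (nowait_alloc_ok)
		return SRC_ALLOCATOR;
	if (may_reuse_freed_page(zone))
		return SRC_FREED_HUGE_PAGE;
	return SRC_DEFER;
}
```

Note that SRC_FREED_HUGE_PAGE is the costly case: once part of the range
holds vmemmap pages, that range can no longer back a huge page.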
--
Thanks,
David / dhildenb