From: David Hildenbrand <david@redhat.com>
To: Michal Hocko <mhocko@suse.com>
Cc: Oscar Salvador <osalvador@suse.de>,
	Andrew Morton <akpm@linux-foundation.org>,
	Mike Kravetz <mike.kravetz@oracle.com>,
	Muchun Song <songmuchun@bytedance.com>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 1/2] mm: Make alloc_contig_range handle free hugetlb pages
Date: Wed, 17 Feb 2021 14:53:37 +0100
Message-ID: <5f50c810-3f49-a162-6d1d-cf621c515f45@redhat.com>
In-Reply-To: <YC0fIhEHRDOVzK8U@dhcp22.suse.cz>

On 17.02.21 14:50, Michal Hocko wrote:
> On Wed 17-02-21 14:36:47, David Hildenbrand wrote:
>> On 17.02.21 14:30, Michal Hocko wrote:
>>> On Wed 17-02-21 11:08:15, Oscar Salvador wrote:
>>>> Free hugetlb pages are tricky to handle: so that no userspace application
>>>> notices disruption, we need to replace the current free hugepage with
>>>> a new one.
>>>>
>>>> In order to do that, a new function called alloc_and_dissolve_huge_page
>>>> is introduced.
>>>> This function will first try to get a new fresh hugetlb page, and if it
>>>> succeeds, it will dissolve the old one.
>>>>
>>>> With regard to the allocation, since we do not know whether the old page
>>>> was allocated on a specific node on request, the node the old page belongs
>>>> to will be tried first, and then we will fall back to all nodes containing
>>>> memory (N_MEMORY).
>>>
>>> I do not think fallback to a different zone is ok. If yes then this
>>> really requires very good reasoning. alloc_contig_range is an
>>> optimistic allocation interface at best and it shouldn't break careful,
>>> node-aware preallocation done by the administrator.
>>
>> What does memory offlining do when migrating in-use hugetlbfs pages? Does it
>> always keep the node?
> 
> No it will break the node pool. The reasoning behind that is that
> offlining is an explicit request from the userspace and it is expected

Userspace? In 99.9996% of cases it's the hardware that triggers the unplug of a DIMM.

> 
>> I think keeping the node is the easiest/simplest approach for now.
>>
>>>
>>>> Note that gigantic hugetlb pages are fenced off since there is a cyclic
>>>> dependency between them and alloc_contig_range.
>>>
>>> Why do we need/want to do all this in the first place?
>>
>> cma and virtio-mem (especially on ZONE_MOVABLE) really want to handle
>> hugetlbfs pages.
> 
> Do we have any real-life examples? Or does this fall more into the
> "let's optimize an existing implementation" category?
> 

It's a big TODO item I have on my list and I am happy that Oscar is 
looking into it. So yes, I noticed it while working on virtio-mem. It's 
real.

-- 
Thanks,

David / dhildenb
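
For readers following the node question above, here is a minimal userspace sketch of the two policies being discussed: replace a free hugetlb page strictly on its own node, or try the old node first and then fall back to other nodes. It is not the code from the patch. Every name in it (struct pool, alloc_fresh_on_node, replace_free_huge_page, NR_NODES, the allow_fallback flag) is hypothetical and exists only for illustration; the real alloc_and_dissolve_huge_page works on struct page and hstate and takes locks that this model ignores. What the sketch does keep from the description is the ordering: the replacement is allocated first, and the old page is only given up once that allocation has succeeded.

```c
#include <stdbool.h>

#define NR_NODES 4  /* hypothetical number of NUMA nodes in this model */

/* Hypothetical userspace model of per-node hugetlb accounting. */
struct pool {
    int free_pages[NR_NODES];  /* fresh pages the allocator can still provide */
    int pool_pages[NR_NODES];  /* free pages currently sitting in the pool */
};

/* Take one fresh page from node nid; returns true on success. */
static bool alloc_fresh_on_node(struct pool *p, int nid)
{
    if (p->free_pages[nid] == 0)
        return false;
    p->free_pages[nid]--;
    return true;
}

/*
 * Replace one free pool page that lives on old_nid.
 * Returns the node of the replacement page, or -1 if no replacement
 * could be allocated (the old page is then left untouched).
 * allow_fallback selects between "keep the node" and "old node first,
 * then any other node with memory".
 */
static int replace_free_huge_page(struct pool *p, int old_nid,
                                  bool allow_fallback)
{
    int new_nid = -1;

    /* Step 1: always try the node the old page belongs to. */
    if (alloc_fresh_on_node(p, old_nid)) {
        new_nid = old_nid;
    } else if (allow_fallback) {
        /* Step 2 (optional policy): try every other node in order. */
        for (int nid = 0; nid < NR_NODES; nid++) {
            if (nid != old_nid && alloc_fresh_on_node(p, nid)) {
                new_nid = nid;
                break;
            }
        }
    }

    if (new_nid < 0)
        return -1;  /* nothing allocated, so do not dissolve the old page */

    /* Step 3: account the new page, then "dissolve" the old one. */
    p->pool_pages[new_nid]++;
    p->pool_pages[old_nid]--;
    return new_nid;
}
```

With allow_fallback set to false, a request on a node that has no fresh pages simply fails and the per-node pool stays as the administrator configured it. With it set to true, the replacement succeeds but the pool count moves from the old node to another one, which is the behaviour the node-aware preallocation objection is about.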

