linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: David Hildenbrand <david@redhat.com>
To: Michal Hocko <mhocko@suse.com>
Cc: Oscar Salvador <osalvador@suse.de>,
	Andrew Morton <akpm@linux-foundation.org>,
	Mike Kravetz <mike.kravetz@oracle.com>,
	David Hildenbrand <david@redhat.com>,
	Muchun Song <songmuchun@bytedance.com>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v3 1/2] mm: Make alloc_contig_range handle free hugetlb pages
Date: Fri, 26 Feb 2021 10:25:46 +0100	[thread overview]
Message-ID: <1F1B32C0-10EA-4A7F-A062-1B8CE8D47C3F@redhat.com> (raw)
In-Reply-To: <YDizi3/TKYnSi5Zo@dhcp22.suse.cz>


> Am 26.02.2021 um 09:38 schrieb Michal Hocko <mhocko@suse.com>:
> 
> On Fri 26-02-21 09:35:10, Michal Hocko wrote:
>>> On Mon 22-02-21 14:51:36, Oscar Salvador wrote:
>>> alloc_contig_range will fail if it ever sees a HugeTLB page within the
>>> range we are trying to allocate, even when that page is free and can be
>>> easily reallocated.
>>> This has proved to be problematic for some users of alloc_contic_range,
>>> e.g: CMA and virtio-mem, where those would fail the call even when those
>>> pages lay in ZONE_MOVABLE and are free.
>>> 
>>> We can do better by trying to replace such page.
>>> 
>>> Free hugepages are tricky to handle so as to no userspace application
>>> notices disruption, we need to replace the current free hugepage with
>>> a new one.
>>> 
>>> In order to do that, a new function called alloc_and_dissolve_huge_page
>>> is introduced.
>>> This function will first try to get a new fresh hugepage, and if it
>>> succeeds, it will replace the old one in the free hugepage pool.
>>> 
>>> All operations are being handled under hugetlb_lock, so no races are
>>> possible. The only exception is when page's refcount is 0, but it still
>>> has not been flagged as PageHugeFreed.
>> 
>> I think it would be helpful to call out that specific case explicitly
>> here. I can see only one scenario (are there more?)
>> __free_huge_page()        isolate_or_dissolve_huge_page
>>                  PageHuge() == T
>>                  alloc_and_dissolve_huge_page
>>                    alloc_fresh_huge_page()
>>                    spin_lock(hugetlb_lock)
>>                    // PageHuge() && !PageHugeFreed &&
>>                    // !PageCount()
>>                    spin_unlock(hugetlb_lock)
>>  spin_lock(hugetlb_lock)
>>  1) update_and_free_page
>>       PageHuge() == F
>>       __free_pages()
>>  2) enqueue_huge_page
>>       SetPageHugeFreed()
>>  spin_unlock(&hugetlb_lock)                
>> 
>>> In this case we retry as the window race is quite small and we have high
>>> chances to succeed next time.
>>> 
>>> With regard to the allocation, we restrict it to the node the page belongs
>>> to with __GFP_THISNODE, meaning we do not fallback on other node's zones.
>>> 
>>> Note that gigantic hugetlb pages are fenced off since there is a cyclic
>>> dependency between them and alloc_contig_range.
>>> 
>>> Signed-off-by: Oscar Salvador <osalvador@suse.de>
>> 
>> Thanks this looks much better than the initial version. One nit below.
>> Acked-by: Michal Hocko <mhocko@suse.com>
> 
> Btw. if David has some numbers it would be great to add them to the
> changelog.

I‘m planning on giving both patches a churn early next week, with

a) free huge pages
b) idle allocated huge pages
c) heavily read huge pages

(Them I‘m also planning on having another brief look at the patches :) )

Thanks!



  reply	other threads:[~2021-02-26  9:25 UTC|newest]

Thread overview: 25+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-02-22 13:51 [PATCH v3 0/2] Make alloc_contig_range handle Hugetlb pages Oscar Salvador
2021-02-22 13:51 ` [PATCH v3 1/2] mm: Make alloc_contig_range handle free hugetlb pages Oscar Salvador
2021-02-25 20:03   ` Mike Kravetz
2021-02-26  9:48     ` Oscar Salvador
2021-02-26  8:35   ` Michal Hocko
2021-02-26  8:38     ` Michal Hocko
2021-02-26  9:25       ` David Hildenbrand [this message]
2021-02-26  9:47         ` Oscar Salvador
2021-02-26  9:45     ` Oscar Salvador
2021-02-26  9:51       ` Michal Hocko
2021-03-01 14:09   ` David Hildenbrand
2021-03-04 10:19     ` Oscar Salvador
2021-03-04 10:32       ` David Hildenbrand
2021-03-04 10:41         ` Oscar Salvador
2021-02-22 13:51 ` [PATCH v3 2/2] mm: Make alloc_contig_range handle in-use " Oscar Salvador
2021-02-25 23:05   ` Mike Kravetz
2021-02-26  8:46   ` Michal Hocko
2021-02-26 10:24     ` Oscar Salvador
2021-02-26 10:27       ` Oscar Salvador
2021-02-26 12:46       ` Michal Hocko
2021-02-28 13:43         ` Oscar Salvador
2021-03-05 17:30   ` David Hildenbrand
2021-03-01 12:43 ` [PATCH v3 0/2] Make alloc_contig_range handle Hugetlb pages David Hildenbrand
2021-03-01 12:57   ` Oscar Salvador
2021-03-01 12:59     ` David Hildenbrand

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1F1B32C0-10EA-4A7F-A062-1B8CE8D47C3F@redhat.com \
    --to=david@redhat.com \
    --cc=akpm@linux-foundation.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mhocko@suse.com \
    --cc=mike.kravetz@oracle.com \
    --cc=osalvador@suse.de \
    --cc=songmuchun@bytedance.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).