From: David Hildenbrand <david@redhat.com>
To: Muchun Song <songmuchun@bytedance.com>
Cc: Jonathan Corbet <corbet@lwn.net>,
	Mike Kravetz <mike.kravetz@oracle.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	mingo@redhat.com, bp@alien8.de, x86@kernel.org, hpa@zytor.com,
	dave.hansen@linux.intel.com, luto@kernel.org,
	Peter Zijlstra <peterz@infradead.org>,
	viro@zeniv.linux.org.uk,
	Andrew Morton <akpm@linux-foundation.org>,
	paulmck@kernel.org, mchehab+huawei@kernel.org,
	pawan.kumar.gupta@linux.intel.com,
	Randy Dunlap <rdunlap@infradead.org>,
	oneukum@suse.com, anshuman.khandual@arm.com, jroedel@suse.de,
	Mina Almasry <almasrymina@google.com>,
	David Rientjes <rientjes@google.com>,
	Matthew Wilcox <willy@infradead.org>,
	Oscar Salvador <osalvador@suse.de>,
	Michal Hocko <mhocko@suse.com>,
	"Song Bao Hua (Barry Song)" <song.bao.hua@hisilicon.com>,
	Xiongchun duan <duanxiongchun@bytedance.com>,
	linux-doc@vger.kernel.org, LKML <linux-kernel@vger.kernel.org>,
	Linux Memory Management List <linux-mm@kvack.org>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>
Subject: Re: [External] Re: [PATCH v7 05/15] mm/bootmem_info: Introduce {free,prepare}_vmemmap_page()
Date: Wed, 9 Dec 2020 10:32:55 +0100
Message-ID: <73832edd-13ec-8032-d8d6-4afc53297fdb@redhat.com>
In-Reply-To: <CAMZfGtU-zpPRkSikcYZUhKvWhpwZ+cspXNhoaok9e6MCE2pk-g@mail.gmail.com>

On 09.12.20 10:25, Muchun Song wrote:
> On Wed, Dec 9, 2020 at 4:50 PM David Hildenbrand <david@redhat.com> wrote:
>>
>> On 09.12.20 08:36, Muchun Song wrote:
>>> On Mon, Dec 7, 2020 at 8:39 PM David Hildenbrand <david@redhat.com> wrote:
>>>>
>>>> On 30.11.20 16:18, Muchun Song wrote:
>>>>> In a later patch, we can use free_vmemmap_page() to free the
>>>>> unused vmemmap pages and prepare_vmemmap_page() to initialize
>>>>> a page for use as a vmemmap page.
>>>>>
>>>>> Signed-off-by: Muchun Song <songmuchun@bytedance.com>
>>>>> ---
>>>>>  include/linux/bootmem_info.h | 24 ++++++++++++++++++++++++
>>>>>  1 file changed, 24 insertions(+)
>>>>>
>>>>> diff --git a/include/linux/bootmem_info.h b/include/linux/bootmem_info.h
>>>>> index 4ed6dee1adc9..239e3cc8f86c 100644
>>>>> --- a/include/linux/bootmem_info.h
>>>>> +++ b/include/linux/bootmem_info.h
>>>>> @@ -3,6 +3,7 @@
>>>>>  #define __LINUX_BOOTMEM_INFO_H
>>>>>
>>>>>  #include <linux/mmzone.h>
>>>>> +#include <linux/mm.h>
>>>>>
>>>>>  /*
>>>>>   * Types for free bootmem stored in page->lru.next. These have to be in
>>>>> @@ -22,6 +23,29 @@ void __init register_page_bootmem_info_node(struct pglist_data *pgdat);
>>>>>  void get_page_bootmem(unsigned long info, struct page *page,
>>>>>                     unsigned long type);
>>>>>  void put_page_bootmem(struct page *page);
>>>>> +
>>>>> +static inline void free_vmemmap_page(struct page *page)
>>>>> +{
>>>>> +     VM_WARN_ON(!PageReserved(page) || page_ref_count(page) != 2);
>>>>> +
>>>>> +     /* bootmem page has reserved flag in the reserve_bootmem_region */
>>>>> +     if (PageReserved(page)) {
>>>>> +             unsigned long magic = (unsigned long)page->freelist;
>>>>> +
>>>>> +             if (magic == SECTION_INFO || magic == MIX_SECTION_INFO)
>>>>> +                     put_page_bootmem(page);
>>>>> +             else
>>>>> +                     WARN_ON(1);
>>>>> +     }
>>>>> +}
>>>>> +
>>>>> +static inline void prepare_vmemmap_page(struct page *page)
>>>>> +{
>>>>> +     unsigned long section_nr = pfn_to_section_nr(page_to_pfn(page));
>>>>> +
>>>>> +     get_page_bootmem(section_nr, page, SECTION_INFO);
>>>>> +     mark_page_reserved(page);
>>>>> +}
>>>>
>>>> Can you clarify in the description when exactly these functions are
>>>> called and on which type of pages?
>>>>
>>>> Would indicating "bootmem" in the function names make it clearer what we
>>>> are dealing with?
>>>>
>>>> E.g., any memory allocated via the memblock allocator and not via the
>>>> buddy will be marked reserved already in the memmap. It's unclear to me
>>>> why we need the mark_page_reserved() here - can you enlighten me? :)
>>>
>>> Sorry for ignoring this question. The vmemmap pages are allocated
>>> from the bootmem allocator, so they are marked PG_reserved. To free those
>>> bootmem pages, we should call put_page_bootmem(). You can see that
>>> put_page_bootmem() clears PG_reserved. To be consistent,
>>> prepare_vmemmap_page() also marks the page as PG_reserved.
>>
>> I don't think that really makes sense.
>>
>> After put_page_bootmem() put the last reference, it clears PG_reserved
>> and hands the page over to the buddy via free_reserved_page(). From that
>> point on, further get_page_bootmem() would be completely wrong and
>> dangerous.
>>
>> Both put_page_bootmem() and get_page_bootmem() rely on the fact that
>> they are dealing with memblock allocations - marked via PG_reserved. If
>> prepare_vmemmap_page() would be called on something that's *not* coming
>> from the memblock allocator, it would be completely broken - or am I
>> missing something?
>>
>> AFAICT, there should rather be a BUG_ON(!PageReserved(page)) in
>> prepare_vmemmap_page() - or proper handling to deal with !memblock
>> allocations.
>>
> 
> I want to allocate some pages as the vmemmap when
> we free a HugeTLB page to the buddy allocator. So I use
> prepare_vmemmap_page() to initialize the page (which is
> allocated from the buddy allocator) and make it the vmemmap
> of the freed HugeTLB page.
> 
> Any suggestions to deal with this case?

If you obtained pages via the buddy, there shouldn't be anything special
to handle, no? What speaks against the following?


prepare_vmemmap_page():
if (!PageReserved(page))
	return;


put_page_bootmem():
if (!PageReserved(page))
	__free_page(page);


Or if we care about multiple references, get_page() and put_page().
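
To make the two checks above concrete, here is a minimal userspace
sketch, not kernel code. Everything in it is an assumption made for
illustration: struct page is reduced to three fields, the model_*
functions are hypothetical stand-ins for the real helpers, and the
"last reference" threshold of 1 is a property of this model only.

```c
/*
 * Userspace model only: struct page, its fields and the model_* helpers
 * are simplified stand-ins, not the kernel's definitions.
 */
#include <assert.h>
#include <stdbool.h>

struct page {
	bool reserved;	/* stands in for PG_reserved */
	int refcount;	/* stands in for the page reference count */
	bool in_buddy;	/* true once the page was handed back to the buddy */
};

/* Model of get_page_bootmem(): take an extra reference on a bootmem page. */
static void model_get_page_bootmem(struct page *page)
{
	assert(page->reserved);
	page->refcount++;
}

/* Suggested prepare_vmemmap_page(): nothing special for buddy pages. */
static void model_prepare_vmemmap_page(struct page *page)
{
	if (!page->reserved)
		return;
	model_get_page_bootmem(page);
}

/* Suggested put_page_bootmem(): buddy pages are freed directly. */
static void model_put_page_bootmem(struct page *page)
{
	if (!page->reserved) {
		page->in_buddy = true;	/* models __free_page(page) */
		return;
	}
	/* In this model, dropping back to 1 is the last bootmem reference. */
	if (--page->refcount == 1) {
		page->reserved = false;	/* models clearing PG_reserved */
		page->in_buddy = true;	/* models free_reserved_page(page) */
	}
}
```

With these checks, a caller can pass either kind of page through the
same prepare/put pair without knowing where the page came from.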

> 
> I have a solution to address this. When pages are allocated
> from the buddy as vmemmap pages, we do not call
> prepare_vmemmap_page().
> 
> When we free some vmemmap pages of a HugeTLB
> page, if PG_reserved is set on the vmemmap page,
> we call free_vmemmap_page() to free it to the buddy;
> otherwise we call free_page(). What is your opinion?
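
For comparison, the alternative quoted above moves the PG_reserved
check into the caller. A minimal userspace sketch of that dispatch
follows; struct mock_page, enum free_path and model_free_vmemmap() are
hypothetical names used only for this illustration, not kernel APIs.

```c
/*
 * Userspace model only: the types and the function below are invented
 * for illustration and do not exist in the kernel.
 */
#include <assert.h>
#include <stdbool.h>

/* Which free path the caller ended up taking. */
enum free_path { FREE_VIA_BOOTMEM, FREE_VIA_BUDDY };

struct mock_page {
	bool reserved;	/* stands in for PG_reserved */
};

/* Caller-side dispatch: pick the free path based on PG_reserved. */
static enum free_path model_free_vmemmap(struct mock_page *page)
{
	assert(page);

	if (page->reserved) {
		/* models free_vmemmap_page(), i.e. the bootmem path */
		page->reserved = false;
		return FREE_VIA_BOOTMEM;
	}

	/* models the plain buddy free for pages that never were bootmem */
	return FREE_VIA_BUDDY;
}
```

In this variant the helper only ever sees bootmem pages, which is why
keeping "bootmem" in its name would describe it accurately.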

That would also work. Then, please include "bootmem" as part of the
function name. If you plan on using my suggestion, you can drop
"bootmem" from the name as it works for both types of pages.


-- 
Thanks,

David / dhildenb



