All of lore.kernel.org
 help / color / mirror / Atom feed
From: Mike Kravetz <mike.kravetz@oracle.com>
To: Oscar Salvador <osalvador@suse.de>
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	David Hildenbrand <david@redhat.com>,
	Michal Hocko <mhocko@suse.com>, Zi Yan <ziy@nvidia.com>,
	Muchun Song <songmuchun@bytedance.com>,
	Naoya Horiguchi <naoya.horiguchi@linux.dev>,
	David Rientjes <rientjes@google.com>,
	"Aneesh Kumar K . V" <aneesh.kumar@linux.ibm.com>,
	Andrew Morton <akpm@linux-foundation.org>
Subject: Re: [PATCH v3 3/5] hugetlb: be sure to free demoted CMA pages to CMA
Date: Wed, 6 Oct 2021 11:27:58 -0700	[thread overview]
Message-ID: <2095ad43-f918-4ea3-9dbd-76a7d1f174bd@oracle.com> (raw)
In-Reply-To: <20211006075451.GA11341@linux>

On 10/6/21 12:54 AM, Oscar Salvador wrote:
> On Tue, Oct 05, 2021 at 11:57:54AM -0700, Mike Kravetz wrote:
>> It is the smallest.
>>
>> CMA uses a per-region bit map to track allocations.  When setting up the
>> region, you specify how many pages each bit represents.  Currently,
>> only gigantic pages are allocated/freed from CMA so the region is set up
>> such that one bit represents a gigantic page size allocation.
>>
>> With demote, a gigantic page (allocation) could be split into smaller
>> size pages.  And, these smaller size pages will be freed to CMA.  So,
>> since the per-region bit map needs to be set up to represent the smallest
>> allocation/free size, it now needs to be set to the smallest huge page
>> size which can be freed to CMA.
>>
>> Unfortunately, we set up the CMA region for huge pages before we set up
>> huge pages sizes (hstates).  So, technically we do not know the smallest
>> huge page size as this can change via command line options and
>> architecture specific code.  Therefore, at region setup time we need some
>> constant value for smallest possible huge page size.  That is why
>> HUGETLB_PAGE_ORDER is used.
> 
> Do you know if that is done for a reason? Setting up CMA for hugetlb before
> initialiting hugetlb itself? Would not make more sense to do it the other way
> around?
> 

One reason is that the initialization sequence is a bit messy.  In most
cases, arch specific code sets up huge pages.  So, we would need to make
sure this is done before the cma initialization.  This might be
possible, but I am not confident in my abilities to understand/modify
and test early init code for all architectures supporting hugetlb cma
allocations.

In addition, not all architectures initialize their huge page sizes.  It
is possible for architectures to only set up huge pages that have been
requested on the command line.  In such cases, it would require some
fancy command line parsing to look for and process a hugetlb_cma argument
before any other hugetlb argument.  Not even sure if this is possible.

The most reasonable way to address this would be to add an arch specific
callback asking for the smallest supported huge page size.  I did not do
this here, as I am not sure this is really going to be an issue.  In
the use case (and architecture) I know of, this is not an issue.  As you
mention, this or something else could be added if the need arises.
-- 
Mike Kravetz

> The way I see it is that it is a bit unfortunate that we cannot only demote
> gigantic pages per se, so 1GB on x86_64 and 16G on arm64 with 64k page size.
> 
> I guess nothing to be worried about now as this is an early stage, but maybe
> something to think about in the future in we case we want to allow for more
> flexibility.
> 
>> I should probably add all that to the changelog for clarity?
> 
> Yes, I think it would be great to have that as a context.
> 
>> After your comment yesterday about rewriting this code for clarity,  this
>> now becomes:
>>
>> 		/*
>> 		 * Set demote order for each hstate.  Note that
>> 		 * h->demote_order is initially 0.
>> 		 * - We can not demote gigantic pages if runtime freeing
>> 		 *   is not supported, so skip this.
>> 		 * - If CMA allocation is possible, we can not demote
>> 		 *   HUGETLB_PAGE_ORDER or smaller size pages.
>> 		 */
>> 		if (hstate_is_gigantic(h) && !gigantic_page_runtime_supported())
>> 			continue;
>> 		if (hugetlb_cma_size && h->order <= HUGETLB_PAGE_ORDER)
>> 			continue;
>> 		for_each_hstate(h2) {
>> 			if (h2 == h)
>> 				continue;
>> 			if (h2->order < h->order &&
>> 			    h2->order > h->demote_order)
>> 				h->demote_order = h2->order;
>> 		}
>>
>> Hopefully, that is more clear.
> 
> Defintiely, this looks better to me.
> 

  reply	other threads:[~2021-10-06 18:28 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-10-01 17:52 [PATCH v3 0/5] hugetlb: add demote/split page functionality Mike Kravetz
2021-10-01 17:52 ` [PATCH v3 1/5] hugetlb: add demote hugetlb page sysfs interfaces Mike Kravetz
2021-10-04 13:00   ` Oscar Salvador
2021-10-04 18:27     ` Mike Kravetz
2021-10-05  8:23   ` Oscar Salvador
2021-10-05 16:58     ` Mike Kravetz
2021-11-12 20:10   ` kernel test robot
2021-10-01 17:52 ` [PATCH v3 2/5] mm/cma: add cma_pages_valid to determine if pages are in CMA Mike Kravetz
2021-10-05  8:35   ` Oscar Salvador
2021-10-05  8:45   ` Oscar Salvador
2021-10-05 17:06     ` Mike Kravetz
2021-10-05  8:48   ` David Hildenbrand
2021-10-01 17:52 ` [PATCH v3 3/5] hugetlb: be sure to free demoted CMA pages to CMA Mike Kravetz
2021-10-05  9:33   ` Oscar Salvador
2021-10-05 18:57     ` Mike Kravetz
2021-10-06  7:54       ` Oscar Salvador
2021-10-06 18:27         ` Mike Kravetz [this message]
2021-10-01 17:52 ` [PATCH v3 4/5] hugetlb: add demote bool to gigantic page routines Mike Kravetz
2021-10-01 17:52 ` [PATCH v3 5/5] hugetlb: add hugetlb demote page support Mike Kravetz
2021-10-06  8:41   ` Oscar Salvador
2021-10-06 18:52     ` Mike Kravetz
2021-10-07  7:52       ` Oscar Salvador

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=2095ad43-f918-4ea3-9dbd-76a7d1f174bd@oracle.com \
    --to=mike.kravetz@oracle.com \
    --cc=akpm@linux-foundation.org \
    --cc=aneesh.kumar@linux.ibm.com \
    --cc=david@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mhocko@suse.com \
    --cc=naoya.horiguchi@linux.dev \
    --cc=osalvador@suse.de \
    --cc=rientjes@google.com \
    --cc=songmuchun@bytedance.com \
    --cc=ziy@nvidia.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.