From: Michal Hocko <mhocko@kernel.org>
To: Mike Kravetz <mike.kravetz@oracle.com>
Cc: linux-mm@kvack.org, Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>,
	LKML <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH RFC 2/2] mm, hugetlb: do not rely on overcommit limit during migration
Date: Thu, 30 Nov 2017 20:57:43 +0100	[thread overview]
Message-ID: <20171130195743.52vc6enr3rnivtdx@dhcp22.suse.cz> (raw)
In-Reply-To: <e23f971e-cd62-afea-6567-0873a3e48db7@oracle.com>

On Thu 30-11-17 11:35:11, Mike Kravetz wrote:
> On 11/29/2017 11:57 PM, Michal Hocko wrote:
> > On Wed 29-11-17 11:52:53, Mike Kravetz wrote:
> >> On 11/29/2017 01:22 AM, Michal Hocko wrote:
> >>> What about this on top. I haven't tested this yet though.
> >>
> >> Yes, this would work.
> >>
> >> However, I think a simple modification to your previous free_huge_page
> >> changes would make this unnecessary.  I was confused in your previous
> >> patch because you decremented the per-node surplus page count, but not
> >> the global count.  I think it would have been correct (and made this
> >> patch unnecessary) if you decremented the global counter there as well.
> > 
> > We cannot really increment the global counter because the overall number
> > of surplus pages doesn't increase during migration.
> 
> I was not suggesting we increment the global surplus count.  Rather,
> your previous patch should have decremented the global surplus count in
> free_huge_page.  Something like:

Sorry, I meant to say decrement. The point is that the overall surplus
count doesn't change after the migration. The only thing that _might_
change is the per-node distribution of surplus pages. That is why I
think we should handle that during the migration.

> @@ -1283,7 +1283,13 @@ void free_huge_page(struct page *page)
>  	if (restore_reserve)
>  		h->resv_huge_pages++;
>  
> -	if (h->surplus_huge_pages_node[nid]) {
> +	if (PageHugeTemporary(page)) {
> +		list_del(&page->lru);
> +		ClearPageHugeTemporary(page);
> +		update_and_free_page(h, page);
> +		if (h->surplus_huge_pages_node[nid]) {
> +			h->surplus_huge_pages--;
> +			h->surplus_huge_pages_node[nid]--;
> +		}
> +	} else if (h->surplus_huge_pages_node[nid]) {
>  		/* remove the page from active list */
>  		list_del(&page->lru);
>  		update_and_free_page(h, page);
> 
> When we allocate one of these 'PageHugeTemporary' pages, we only increment
> the global and node-specific nr_huge_pages counters.  To me, this makes all
> the huge page counters the same as if there were simply one additional
> pre-allocated huge page.  This 'extra' (PageHugeTemporary) page will go
> away when free_huge_page is called.  So, my thought is that it is not
> necessary to transfer per-node counts from the original to the target node.
> Of course, I may be missing something.

The thing is that we do not know whether the original page is surplus
until the deallocation time.
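
For reference, the tail of free_huge_page() currently does roughly the
following (a simplified sketch, not the exact code): the surplus status
is inferred only here, from the per-node counter, because the page
itself carries no surplus marker.

	if (h->surplus_huge_pages_node[nid]) {
		/* treat the page as surplus: give it back to the buddy allocator */
		list_del(&page->lru);
		update_and_free_page(h, page);
		h->surplus_huge_pages--;
		h->surplus_huge_pages_node[nid]--;
	} else {
		/* keep the page in the hugetlb pool */
		arch_clear_hugepage_flags(page);
		enqueue_huge_page(h, page);
	}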

> When thinking about transferring per-node counts as is done in your latest
> patch, I took another look at all the per-node counts.  This may show my
> ignorance of huge page migration, but do we need to handle the case where
> the page being migrated is 'free'?  Is that possible?  If so, there will
> be a count for free_huge_pages_node and the page will be on the per-node
> hugepage_freelists, which must be handled.

I do not understand. What do you mean by free? Sitting in the free
pool? I do not think we ever try to migrate those. They simply do not
have any state to migrate. We could very well just allocate fresh pages
on the remote node and dissolve the free ones. I am not sure whether we
do that during memory hotplug to preserve the pool size, and I am too
tired to check that now. That would be a different topic I guess.
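
In other words, something along these lines (a completely untested
sketch just to illustrate the idea; the allocation helper name is
invented):

	/*
	 * Instead of migrating a free pool page, allocate a fresh huge
	 * page on the target node first and then dissolve the free one,
	 * so that the overall pool size is preserved.
	 */
	new_page = alloc_fresh_huge_page_on_node(h, new_nid);	/* hypothetical */
	if (new_page)
		dissolve_free_huge_page(old_page);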
-- 
Michal Hocko
SUSE Labs


Thread overview: 23+ messages
2017-11-22 15:28 hugetlb page migration vs. overcommit Michal Hocko
2017-11-22 19:11 ` Mike Kravetz
2017-11-23  9:21   ` Michal Hocko
2017-11-27  6:27   ` Naoya Horiguchi
2017-11-28 10:19 ` Michal Hocko
2017-11-28 14:12   ` Michal Hocko
2017-11-28 14:12     ` [PATCH RFC 1/2] mm, hugetlb: unify core page allocation accounting and initialization Michal Hocko
2017-11-28 21:34       ` Mike Kravetz
2017-11-29  6:57         ` Michal Hocko
2017-11-29 19:09           ` Mike Kravetz
2017-11-28 14:12     ` [PATCH RFC 2/2] mm, hugetlb: do not rely on overcommit limit during migration Michal Hocko
2017-11-29  1:39       ` Mike Kravetz
2017-11-29  7:17         ` Michal Hocko
2017-11-29  9:22       ` Michal Hocko
2017-11-29  9:40         ` Michal Hocko
2017-11-29 11:23         ` Michal Hocko
2017-11-29 19:52         ` Mike Kravetz
2017-11-30  7:57           ` Michal Hocko
2017-11-30 19:35             ` Mike Kravetz
2017-11-30 19:57               ` Michal Hocko [this message]
2017-11-30 20:06                 ` Michal Hocko
2017-11-29  9:51       ` Michal Hocko
2017-11-29 11:33       ` [PATCH RFC v2 " Michal Hocko
