linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Michal Hocko <mhocko@suse.com>
To: Baoquan He <bhe@redhat.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>,
	Wei Yang <richard.weiyang@linux.alibaba.com>,
	akpm@linux-foundation.org, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH 10/10] mm/hugetlb: not necessary to abuse temporary page to workaround the nasty free_huge_page
Date: Tue, 11 Aug 2020 08:54:46 +0200	[thread overview]
Message-ID: <20200811065406.GC4793@dhcp22.suse.cz> (raw)
In-Reply-To: <20200811015148.GA10792@MiWiFi-R3L-srv>

On Tue 11-08-20 09:51:48, Baoquan He wrote:
> On 08/10/20 at 05:19pm, Mike Kravetz wrote:
> > On 8/9/20 7:17 PM, Baoquan He wrote:
> > > On 08/07/20 at 05:12pm, Wei Yang wrote:
> > >> Let's always increase surplus_huge_pages and so that free_huge_page
> > >> could decrease it at free time.
> > >>
> > >> Signed-off-by: Wei Yang <richard.weiyang@linux.alibaba.com>
> > >> ---
> > >>  mm/hugetlb.c | 14 ++++++--------
> > >>  1 file changed, 6 insertions(+), 8 deletions(-)
> > >>
> > >> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> > >> index 1f2010c9dd8d..a0eb81e0e4c5 100644
> > >> --- a/mm/hugetlb.c
> > >> +++ b/mm/hugetlb.c
> > >> @@ -1913,21 +1913,19 @@ static struct page *alloc_surplus_huge_page(struct hstate *h, gfp_t gfp_mask,
> > >>  		return NULL;
> > >>  
> > >>  	spin_lock(&hugetlb_lock);
> > >> +
> > >> +	h->surplus_huge_pages++;
> > >> +	h->surplus_huge_pages_node[page_to_nid(page)]++;
> > >> +
> > >>  	/*
> > >>  	 * We could have raced with the pool size change.
> > >>  	 * Double check that and simply deallocate the new page
> > >> -	 * if we would end up overcommiting the surpluses. Abuse
> > >> -	 * temporary page to workaround the nasty free_huge_page
> > >> -	 * codeflow
> > >> +	 * if we would end up overcommiting the surpluses.
> > >>  	 */
> > >> -	if (h->surplus_huge_pages >= h->nr_overcommit_huge_pages) {
> > >> -		SetPageHugeTemporary(page);
> > > 
> > > Hmm, the temporary page way is taken intentionally in
> > > commit 9980d744a0428 ("mm, hugetlb: get rid of surplus page accounting tricks").
> > > From code, this is done inside hugetlb_lock holding, and the code flow
> > > is straightforward, should be safe. Adding Michal to CC.

But the lock is not held during the migration, right?

> > I remember when the temporary page code was added for page migration.
> > The use of temporary page here was added at about the same time.  Temporary
> > page does have one advantage in that it will not CAUSE surplus count to
> > exceed overcommit.  This patch could cause surplus to exceed overcommit
> > for a very short period of time.  However, do note that for this to happen
> > the code needs to race with a pool resize which itself could cause surplus
> > to exceed overcommit.

Correct.

> > IMO both approaches are valid.
> > - Advantage of temporary page is that it can not cause surplus to exceed
> >   overcommit.  Disadvantage is as mentioned in the comment 'abuse of temporary
> >   page'.
> > - Advantage of this patch is that it uses existing counters.  Disadvantage
> >   is that it can momentarily cause surplus to exceed overcommit.

Do I remember correctly that this can cause an allocation failure due to
overcommit check? In other words it would be user space visible thing?

> Yeah, since it's all done inside hugetlb_lock, should be OK even
> though it may cause surplus to exceed overcommit.
> > 
> > Unless someone has a strong opinion, I prefer the changes in this patch.
> 
> Agree, I also prefer the code change in this patch, to remove the
> unnecessary confusion about the temporary page.

I have managed to forgot all the juicy details since I have made that
change. All that remains is that the surplus pages accounting was quite
tricky and back then I didn't figure out a simpler method that would
achieve the consistent look at those counters. As mentioned above I
suspect this could lead to pre-mature allocation failures while the
migration is ongoing. Sure quite unlikely to happen and the race window
is likely very small. Maybe this is even acceptable but I would strongly
recommend to have all this thinking documented in the changelog.
-- 
Michal Hocko
SUSE Labs


  reply	other threads:[~2020-08-11  6:54 UTC|newest]

Thread overview: 45+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-08-07  9:12 [PATCH 00/10] mm/hugetlb: code refine and simplification Wei Yang
2020-08-07  9:12 ` [PATCH 01/10] mm/hugetlb: not necessary to coalesce regions recursively Wei Yang
2020-08-07 12:47   ` Baoquan He
2020-08-10 20:22   ` Mike Kravetz
2020-08-07  9:12 ` [PATCH 02/10] mm/hugetlb: make sure to get NULL when list is empty Wei Yang
2020-08-07 12:49   ` Baoquan He
2020-08-07 14:28     ` Wei Yang
2020-08-10  0:57       ` Baoquan He
2020-08-10 20:28       ` Mike Kravetz
2020-08-10 23:05         ` Wei Yang
2020-08-07  9:12 ` [PATCH 03/10] mm/hugetlb: use list_splice to merge two list at once Wei Yang
2020-08-07 12:53   ` Baoquan He
2020-08-10 21:07   ` Mike Kravetz
2020-08-07  9:12 ` [PATCH 04/10] mm/hugetlb: count file_region to be added when regions_needed != NULL Wei Yang
2020-08-07 12:54   ` Baoquan He
2020-08-10 21:46   ` Mike Kravetz
2020-08-07  9:12 ` [PATCH 05/10] mm/hugetlb: remove the redundant check on non_swap_entry() Wei Yang
2020-08-07 12:55   ` Baoquan He
2020-08-07 14:28     ` Wei Yang
2020-08-07  9:12 ` [PATCH 06/10] mm/hugetlb: remove redundant huge_pte_alloc() in hugetlb_fault() Wei Yang
2020-08-07 12:59   ` Baoquan He
2020-08-10 22:00   ` Mike Kravetz
2020-08-07  9:12 ` [PATCH 07/10] mm/hugetlb: a page from buddy is not on any list Wei Yang
2020-08-07 13:06   ` Baoquan He
2020-08-10 22:25   ` Mike Kravetz
2020-08-07  9:12 ` [PATCH 08/10] mm/hugetlb: return non-isolated page in the loop instead of break and check Wei Yang
2020-08-07 13:09   ` Baoquan He
2020-08-07 14:32     ` Wei Yang
2020-08-10 22:55   ` Mike Kravetz
2020-08-07  9:12 ` [PATCH 09/10] mm/hugetlb: narrow the hugetlb_lock protection area during preparing huge page Wei Yang
2020-08-07 13:12   ` Baoquan He
2020-08-10 23:02   ` Mike Kravetz
2020-08-07  9:12 ` [PATCH 10/10] mm/hugetlb: not necessary to abuse temporary page to workaround the nasty free_huge_page Wei Yang
2020-08-10  2:17   ` Baoquan He
2020-08-11  0:19     ` Mike Kravetz
2020-08-11  1:51       ` Baoquan He
2020-08-11  6:54         ` Michal Hocko [this message]
2020-08-11 21:43           ` Mike Kravetz
2020-08-11 23:19             ` Wei Yang
2020-08-11 23:25               ` Mike Kravetz
2020-08-12  5:40             ` Baoquan He
2020-08-13 11:46             ` Michal Hocko
2020-08-17  3:04               ` Wei Yang
2020-08-11 23:55           ` Baoquan He
2020-08-07 22:25 ` [PATCH 00/10] mm/hugetlb: code refine and simplification Mike Kravetz

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200811065406.GC4793@dhcp22.suse.cz \
    --to=mhocko@suse.com \
    --cc=akpm@linux-foundation.org \
    --cc=bhe@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=mike.kravetz@oracle.com \
    --cc=richard.weiyang@linux.alibaba.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).