From: Mike Kravetz <mike.kravetz@oracle.com>
To: TSUKADA Koutaro <tsukada@ascade.co.jp>, Michal Hocko <mhocko@kernel.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>,
	Vladimir Davydov <vdavydov.dev@gmail.com>,
	Jonathan Corbet <corbet@lwn.net>,
	"Luis R. Rodriguez" <mcgrof@kernel.org>,
	Kees Cook <keescook@chromium.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Roman Gushchin <guro@fb.com>,
	David Rientjes <rientjes@google.com>,
	"Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>,
	Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>,
	Anshuman Khandual <khandual@linux.vnet.ibm.com>,
	Marc-Andre Lureau <marcandre.lureau@redhat.com>,
	Punit Agrawal <punit.agrawal@arm.com>,
	Dan Williams <dan.j.williams@intel.com>,
	Vlastimil Babka <vbabka@suse.cz>,
	linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-fsdevel@vger.kernel.org, linux-mm@kvack.org,
	cgroups@vger.kernel.org
Subject: Re: [PATCH v2 0/7] mm: pages for hugetlb's overcommit may be able to charge to memcg
Date: Thu, 24 May 2018 10:45:08 -0700
Message-ID: <4078bc2d-4aaf-cd1b-0145-5915e382852f@oracle.com>
In-Reply-To: <af1a3050-7365-428a-dfb1-2f3da37dc9ff@ascade.co.jp>

On 05/23/2018 09:26 PM, TSUKADA Koutaro wrote:
> 
> I do not know if it is really a strong use case, but I will explain my
> motive in detail. English is not my native language, so please pardon
> my poor English.
> 
> I am one of the developers of software that manages the resources used
> by user jobs on a Linux HPC cluster. The resource is mainly memory.
> The HPC cluster may be shared by multiple users. Therefore, the
> memory used by each user must be strictly controlled; otherwise a
> user's job can run away, which not only hampers the other users but can
> also crash the entire system with an OOM.
> 
> Some HPC users are very sensitive about performance. Jobs run across
> multiple compute nodes and synchronize through MPI communication.
> Because CPUs wait during synchronization, users want to minimize
> the variation in execution time across nodes to reduce waiting times as
> much as possible. We call this variation noise.
> 
> THP does not guarantee the use of huge pages; it may use normal pages.

Note.  You do not want to use THP because "THP does not guarantee".

> This mechanism is one cause of variation (noise).
> 
> Users who know this mechanism will be hesitant to use THP. However,
> they also know that huge pages improve the TLB hit rate, so huge
> pages remain attractive. It seems natural that these users are
> interested in HugeTLBfs, although I do not know at all whether it is
> the right approach.
> 
> At the very least, our HPC system is pursuing high versatility and we
> have to consider whether we can provide it if users want to use HugeTLBfs.
> 
> In order to use HugeTLBfs we need to create a persistent pool, but in
> our use case, where nodes are shared, it would be impossible to create,
> delete or resize the pool.
> 
> One of the answers I have reached is to use HugeTLBfs by overcommitting
> without creating a pool (that is, by using surplus huge pages).

Using hugetlbfs overcommit also does not provide a guarantee.  Without
doing much research, I would say the failure rate for obtaining a huge
page via THP and hugetlbfs overcommit is about the same.  The most
difficult issue in both cases will be obtaining a "huge page" number of
pages from the buddy allocator.

I really do not think hugetlbfs overcommit will provide any benefit over
THP for your use case.  Also, new user space code is required to "fall back"
to normal pages in the case of hugetlbfs page allocation failure.  This
is not needed in the THP case.
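
To illustrate the "fall back" code mentioned above, here is a minimal
sketch, not code from this thread. The names alloc_buffer, struct buffer
and enum backing are hypothetical. It assumes an anonymous MAP_HUGETLB
mapping of the default huge page size and a length that is a multiple of
that size (2 MiB on many x86-64 systems). For a default (reserving)
mapping the kernel tries to reserve the huge pages at mmap() time, so a
failure to obtain them usually shows up as mmap() returning MAP_FAILED.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical helper types, introduced only for this illustration. */
enum backing { BACKING_NONE, BACKING_HUGETLB, BACKING_NORMAL };

struct buffer {
	void *addr;           /* NULL if both attempts failed */
	size_t len;           /* length passed to mmap()/munmap() */
	enum backing backing; /* which kind of mapping was obtained */
};

/*
 * Try an explicit hugetlb mapping first; if that fails, fall back to a
 * normal anonymous mapping. Assumes len is a multiple of the default
 * huge page size.
 */
static struct buffer alloc_buffer(size_t len)
{
	struct buffer b = { NULL, len, BACKING_NONE };
	void *p;

	/* Attempt 1: hugetlb pages (persistent pool or overcommit). */
	p = mmap(NULL, len, PROT_READ | PROT_WRITE,
		 MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB, -1, 0);
	if (p != MAP_FAILED) {
		b.addr = p;
		b.backing = BACKING_HUGETLB;
		return b;
	}

	/* Attempt 2: normal pages. This is the extra user-space path. */
	p = mmap(NULL, len, PROT_READ | PROT_WRITE,
		 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (p == MAP_FAILED)
		return b;

#ifdef MADV_HUGEPAGE
	/* Optional hint: let THP back the range if it can. Errors ignored. */
	(void)madvise(p, len, MADV_HUGEPAGE);
#endif
	b.addr = p;
	b.backing = BACKING_NORMAL;
	return b;
}
```

A caller would check b.addr for NULL, inspect b.backing if it needs to
know which path was taken, and release the range with
munmap(b.addr, b.len). With THP alone only the second mmap() is needed,
since the kernel falls back to normal pages by itself.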
-- 
Mike Kravetz
