archive mirror
 help / color / mirror / Atom feed
From: Michal Hocko <>
To: TSUKADA Koutaro <>
Cc: Johannes Weiner <>,
	Vladimir Davydov <>,
	Jonathan Corbet <>,
	"Luis R. Rodriguez" <>,
	Kees Cook <>,
	Andrew Morton <>,
	Roman Gushchin <>,
	David Rientjes <>,
	Mike Kravetz <>,
	"Aneesh Kumar K.V" <>,
	Naoya Horiguchi <>,
	Anshuman Khandual <>,
	Marc-Andre Lureau <>,
	Punit Agrawal <>,
	Dan Williams <>,
	Vlastimil Babka <>,,,,,
Subject: Re: [PATCH v2 0/7] mm: pages for hugetlb's overcommit may be able to charge to memcg
Date: Thu, 24 May 2018 10:27:29 +0200	[thread overview]
Message-ID: <> (raw)
In-Reply-To: <>

On Thu 24-05-18 13:26:12, TSUKADA Koutaro wrote:
> I do not know if it is really a strong use case, but I will explain my
> motive in detail. English is not my native language, so please pardon
> my poor English.
> I am one of the developers for software that managing the resource used
> from user job at HPC-Cluster with Linux. The resource is memory mainly.
> The HPC-Cluster may be shared by multiple people and used. Therefore, the
> memory used by each user must be strictly controlled, otherwise the
> user's job will runaway, not only will it hamper the other users, it will
> crash the entire system in OOM.
> Some users of HPC are very nervous about performance. Jobs are executed
> while synchronizing with MPI communication using multiple compute nodes.
> Since CPU wait time will occur when synchronizing, they want to minimize
> the variation in execution time at each node to reduce waiting times as
> much as possible. We call this variation a noise.
> THP does not guarantee to use the Huge Page, but may use the normal page.
> This mechanism is one cause of variation(noise).
> The users who know this mechanism will be hesitant to use THP. However,
> the users also know the benefits of the Huge Page's TLB hit rate
> performance, and the Huge Page seems to be attractive. It seems natural
> that these users are interested in HugeTLBfs, I do not know at all
> whether it is the right approach or not.

Sure, asking for guarantee makes hugetlb pages attractive. But nothing
is really for free, especially any resource _guarantee_, and you have to
pay an additional configuration price usually.
> At the very least, our HPC system is pursuing high versatility and we
> have to consider whether we can provide it if users want to use HugeTLBfs.
> In order to use HugeTLBfs we need to create a persistent pool, but in
> our use case sharing nodes, it would be impossible to create, delete or
> resize the pool.

Why? I can see this would be quite a PITA but not really impossible.

> One of the answers I have reached is to use HugeTLBfs by overcommitting
> without creating a pool(this is the surplus hugepage).
> Surplus hugepages is hugetlb page, but I think at least that consuming
> buddy pool is a decisive difference from hugetlb page of persistent pool.
> If nr_overcommit_hugepages is assumed to be infinite, allocating pages for
> surplus hugepages from buddy pool is all unlimited even if being limited
> by memcg.

Not really, you can specify how much you can overcommit hugetlb pages.

> In extreme cases, overcommitment will allow users to exhaust
> the entire memory of the system. Of course, this can be prevented by the
> hugetlb cgroup, but even if we set the limit for memcg and hugetlb cgroup
> respectively, as I asked in the first mail(set limit to 10GB), the
> control will not work.
Michal Hocko

  reply	other threads:[~2018-05-24  8:27 UTC|newest]

Thread overview: 28+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-05-18  4:27 [PATCH v2 0/7] mm: pages for hugetlb's overcommit may be able to charge to memcg TSUKADA Koutaro
2018-05-18  4:29 ` [PATCH v2 1/7] hugetlb: introduce charge_surplus_huge_pages to struct hstate TSUKADA Koutaro
2018-05-18  4:32 ` [PATCH v2 2/7] hugetlb: support migrate charging for surplus hugepages TSUKADA Koutaro
2018-05-18  4:34 ` [PATCH v2 3/7] memcg: use compound_order rather than hpage_nr_pages TSUKADA Koutaro
2018-05-18 17:46   ` Punit Agrawal
2018-05-18 17:51     ` Punit Agrawal
2018-05-21  3:48       ` TSUKADA Koutaro
2018-05-21 14:53         ` Punit Agrawal
2018-05-18  4:36 ` [PATCH v2 4/7] mm, sysctl: make charging surplus hugepages controllable TSUKADA Koutaro
2018-05-18  4:37 ` [PATCH v2 5/7] hugetlb: add charge_surplus_hugepages attribute TSUKADA Koutaro
2018-05-18  4:39 ` [PATCH v2 6/7] Documentation, hugetlb: describe about charge_surplus_hugepages, TSUKADA Koutaro
2018-05-18  4:41 ` [PATCH v2 7/7] memcg: supports movement of surplus hugepages statistics TSUKADA Koutaro
2018-05-21 14:52 ` [PATCH v2 0/7] mm: pages for hugetlb's overcommit may be able to charge to memcg Punit Agrawal
2018-05-22 12:56   ` TSUKADA Koutaro
2018-05-21 18:07 ` Mike Kravetz
2018-05-22 13:04   ` TSUKADA Koutaro
2018-05-22 18:54     ` Michal Hocko
2018-05-24  4:39       ` TSUKADA Koutaro
2018-05-24  8:20         ` Michal Hocko
2018-05-24 12:58           ` TSUKADA Koutaro
2018-05-24 13:24             ` Michal Hocko
2018-05-25  1:51               ` TSUKADA Koutaro
2018-05-22 20:28     ` Mike Kravetz
2018-05-22 13:51 ` Michal Hocko
2018-05-24  4:26   ` TSUKADA Koutaro
2018-05-24  8:27     ` Michal Hocko [this message]
2018-05-24 17:45     ` Mike Kravetz
2018-05-25  1:55       ` TSUKADA Koutaro

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \ \

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).