From: Michal Hocko <mhocko@kernel.org>
To: TSUKADA Koutaro <tsukada@ascade.co.jp>
Cc: Mike Kravetz <mike.kravetz@oracle.com>,
Johannes Weiner <hannes@cmpxchg.org>,
Vladimir Davydov <vdavydov.dev@gmail.com>,
Jonathan Corbet <corbet@lwn.net>,
"Luis R. Rodriguez" <mcgrof@kernel.org>,
Kees Cook <keescook@chromium.org>,
Andrew Morton <akpm@linux-foundation.org>,
Roman Gushchin <guro@fb.com>,
David Rientjes <rientjes@google.com>,
"Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>,
Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>,
Anshuman Khandual <khandual@linux.vnet.ibm.com>,
Marc-Andre Lureau <marcandre.lureau@redhat.com>,
Punit Agrawal <punit.agrawal@arm.com>,
Dan Williams <dan.j.williams@intel.com>,
Vlastimil Babka <vbabka@suse.cz>,
linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
linux-fsdevel@vger.kernel.org, linux-mm@kvack.org,
cgroups@vger.kernel.org
Subject: Re: [PATCH v2 0/7] mm: pages for hugetlb's overcommit may be able to charge to memcg
Date: Thu, 24 May 2018 15:24:14 +0200
Message-ID: <20180524132414.GI20441@dhcp22.suse.cz>
In-Reply-To: <b2afbff6-b59f-7105-3808-64d41bd4a3a8@ascade.co.jp>
On Thu 24-05-18 21:58:49, TSUKADA Koutaro wrote:
> On 2018/05/24 17:20, Michal Hocko wrote:
> > On Thu 24-05-18 13:39:59, TSUKADA Koutaro wrote:
> >> On 2018/05/23 3:54, Michal Hocko wrote:
> > [...]
> >>> I am also quite confused why you keep distinguishing surplus hugetlb
> >>> pages from regular preallocated ones. Being a surplus page is an
> >>> implementation detail that we use for an internal accounting rather than
> >>> something to exhibit to the userspace even more than we do currently.
> >>
> >> I apologize for having confused you.
> >>
> >> The hugetlb pages obtained from the pool do not waste the buddy pool.
> >
> > Because they have already been allocated from the buddy allocator, so
> > the end result is the very same.
> >
> >> On
> >> the other hand, surplus hugetlb pages waste the buddy pool. Due to this
> >> difference in property, I thought it could be distinguished.
> >
> > But this is simply not correct. Surplus pages are fluid. If you increase
> > the hugetlb size they will become regular persistent hugetlb pages.
>
> I really can not understand what's wrong with this. That page is obviously
> released before being added to the persistent pool, and at that time it is
> uncharged from memcg to which the task belongs(This assumes my patch-set).
> After that, the same page obtained from the pool is not surplus hugepage
> so it will not be charged to memcg again.
I do not see anything like that. adjust_pool_surplus is simply an
accounting thing. At least it was the last time I checked. Maybe your
patchset handles that?
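
To illustrate the accounting point, here is a toy model in plain C. It is
not kernel code: struct toy_hstate and the toy_* helpers are hypothetical
names invented for this sketch, and only the two counters are modelled.
Under that assumption, growing the pool relabels surplus pages as
persistent by moving a counter, without any page passing through a free
path where an uncharge could happen.

```c
#include <assert.h>

/* Hypothetical, simplified stand-in for the kernel's struct hstate. */
struct toy_hstate {
	unsigned long nr_huge_pages;      /* all huge pages, persistent + surplus */
	unsigned long surplus_huge_pages; /* subset currently counted as surplus */
};

/* Persistent pages are whatever is not counted as surplus. */
static unsigned long toy_persistent(const struct toy_hstate *h)
{
	assert(h->surplus_huge_pages <= h->nr_huge_pages);
	return h->nr_huge_pages - h->surplus_huge_pages;
}

/*
 * Grow the persistent pool toward `count` by relabelling surplus pages.
 * Only the counter moves: no page is freed or allocated here, so a hook
 * on the free path would never run for these pages.
 * Returns the number of pages relabelled.
 */
static unsigned long toy_grow_pool(struct toy_hstate *h, unsigned long count)
{
	unsigned long converted = 0;

	while (h->surplus_huge_pages && count > toy_persistent(h)) {
		h->surplus_huge_pages--;
		converted++;
	}
	return converted;
}
```

In this model the total page count never changes while the pool grows,
which is the sense in which a surplus page is "fluid".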
> >> Although my memcg knowledge is extremely limited, memcg is accounting for
> >> various kinds of pages obtained from the buddy pool by the task belonging
> >> to it. I would like to argue that surplus hugepage has specificity in
> >> terms of obtaining from the buddy pool, and that it is specially permitted
> >> charge requirements for memcg.
> >
> > Not really. Memcg accounts primarily for reclaimable memory. We do
> > account for some non-reclaimable slabs but the life time should be at
> > least bound to a process life time. Otherwise the memcg oom killer
> > behavior is not guaranteed to unclutter the situation. Hugetlb pages are
> > simply persistent. Well, to be completely honest tmpfs pages have a
> > similar problem but lacking the swap space for them is kinda
> > configuration bug.
>
> Absolutely you are saying the right thing, but, for example, can mlock(2)ed
> pages be swapped out by reclaim?(What is the difference between mlock(2)ed
> pages and hugetlb page?)
No, mlocked pages cannot be reclaimed, and that is why we restrict them
to a relatively small amount.
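
One place that restriction is visible from userspace is the
RLIMIT_MEMLOCK resource limit. The following is a minimal sketch,
assuming a Linux system; memlock_limit and print_limit are hypothetical
helper names for this example, while getrlimit(2) and RLIMIT_MEMLOCK are
the real interface.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/resource.h>

/* Read the soft and hard RLIMIT_MEMLOCK values; returns 0 on success. */
static int memlock_limit(rlim_t *soft, rlim_t *hard)
{
	struct rlimit rl;

	assert(soft && hard);
	if (getrlimit(RLIMIT_MEMLOCK, &rl) != 0)
		return -1;
	*soft = rl.rlim_cur;
	*hard = rl.rlim_max;
	return 0;
}

/* Print one limit, spelling out the "no limit" case. */
static void print_limit(const char *name, rlim_t v)
{
	if (v == RLIM_INFINITY)
		printf("%s: unlimited\n", name);
	else
		printf("%s: %llu bytes\n", name, (unsigned long long)v);
}
```

Calling memlock_limit() and passing each value to print_limit() shows
how much memory an unprivileged task may lock; the actual numbers depend
on the system configuration.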
> >> It seems very strange that charge hugetlb page to memcg, but essentially
> >> it only charges the usage of the compound page obtained from the buddy pool,
> >> and even if that page is used as hugetlb page after that, memcg is not
> >> interested in that.
> >
> > Ohh, it is very much interested. The primary goal of memcg is to enforce
> > the limit. How are you going to do that in an absence of the reclaimable
> > memory? And quite a lot of it because hugetlb pages usually consume a
> > lot of memory.
>
> Simply kill any of the tasks belonging to that memcg. Maybe no one wants
> reclaim at the time surplus hugepages are accounted.
But that will not release the hugetlb memory, will it?
> [...]
> >> I could not understand the intention of this question, sorry. When resize
> >> the pool, I think that the number of surplus hugepages in use does not
> >> change. Could you explain what you were concerned about?
> >
> > It does change when you change the hugetlb pool size, migrate pages
> > between per-numa pools (have a look at adjust_pool_surplus).
>
> As I looked at, what kind of fatal problem is caused by charging surplus
> hugepages to memcg by just manipulating counter of statistical information?
Fatal? Not sure. It simply tries to add an alien kind of memory to the
memcg concept, so I would presume unexpected behavior (e.g. not being
able to reclaim the memcg, or over-reclaim, thrashing, etc.).
--
Michal Hocko
SUSE Labs
Thread overview (28+ messages):
2018-05-18 4:27 [PATCH v2 0/7] mm: pages for hugetlb's overcommit may be able to charge to memcg TSUKADA Koutaro
2018-05-18 4:29 ` [PATCH v2 1/7] hugetlb: introduce charge_surplus_huge_pages to struct hstate TSUKADA Koutaro
2018-05-18 4:32 ` [PATCH v2 2/7] hugetlb: support migrate charging for surplus hugepages TSUKADA Koutaro
2018-05-18 4:34 ` [PATCH v2 3/7] memcg: use compound_order rather than hpage_nr_pages TSUKADA Koutaro
2018-05-18 17:46 ` Punit Agrawal
2018-05-18 17:51 ` Punit Agrawal
2018-05-21 3:48 ` TSUKADA Koutaro
2018-05-21 14:53 ` Punit Agrawal
2018-05-18 4:36 ` [PATCH v2 4/7] mm, sysctl: make charging surplus hugepages controllable TSUKADA Koutaro
2018-05-18 4:37 ` [PATCH v2 5/7] hugetlb: add charge_surplus_hugepages attribute TSUKADA Koutaro
2018-05-18 4:39 ` [PATCH v2 6/7] Documentation, hugetlb: describe about charge_surplus_hugepages, TSUKADA Koutaro
2018-05-18 4:41 ` [PATCH v2 7/7] memcg: supports movement of surplus hugepages statistics TSUKADA Koutaro
2018-05-21 14:52 ` [PATCH v2 0/7] mm: pages for hugetlb's overcommit may be able to charge to memcg Punit Agrawal
2018-05-22 12:56 ` TSUKADA Koutaro
2018-05-21 18:07 ` Mike Kravetz
2018-05-22 13:04 ` TSUKADA Koutaro
2018-05-22 18:54 ` Michal Hocko
2018-05-24 4:39 ` TSUKADA Koutaro
2018-05-24 8:20 ` Michal Hocko
2018-05-24 12:58 ` TSUKADA Koutaro
2018-05-24 13:24 ` Michal Hocko [this message]
2018-05-25 1:51 ` TSUKADA Koutaro
2018-05-22 20:28 ` Mike Kravetz
2018-05-22 13:51 ` Michal Hocko
2018-05-24 4:26 ` TSUKADA Koutaro
2018-05-24 8:27 ` Michal Hocko
2018-05-24 17:45 ` Mike Kravetz
2018-05-25 1:55 ` TSUKADA Koutaro