linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] memcg: hugetlbfs basic usage accounting
@ 2017-11-14 17:24 Roman Gushchin
  2017-11-14 21:06 ` Tejun Heo
                   ` (2 more replies)
  0 siblings, 3 replies; 8+ messages in thread
From: Roman Gushchin @ 2017-11-14 17:24 UTC (permalink / raw)
  To: linux-mm
  Cc: Roman Gushchin, Johannes Weiner, Michal Hocko, Vladimir Davydov,
	Andrew Morton, Tejun Heo, Mike Kravetz, Dave Hansen, kernel-team,
	cgroups, linux-kernel

This patch implements basic accounting of memory consumption
by hugetlbfs pages for cgroup v2 memory controller.

Cgroup v2 memory controller lacks any visibility into the
hugetlbfs memory consumption. Cgroup v1 implemented a separate
hugetlbfs controller, which provided such stats, and also
provided some control abilities. Although porting of the
hugetlbfs controller to cgroup v2 is arguable a good idea and
is outside of scope of this patch, it's very useful to have
basic stats provided by memory.stat.

As hugetlbfs memory can easily represent a big portion of total
memory, it's important to understand who (which memcg/container)
is using it.

The number is represented in memory.stat as "hugetlb" in bytes and
is printed unconditionally. Accounting code doesn't depend on
cgroup v1 hugetlb controller.

Example:
  $ cat /sys/fs/cgroup/user.slice/user-0.slice/session-1.scope/memory.stat
  anon 1634304
  file 1163264
  kernel_stack 16384
  slab 737280
  sock 0
  shmem 0
  file_mapped 32768
  file_dirty 4096
  file_writeback 0
  inactive_anon 0
  active_anon 1634304
  inactive_file 65536
  active_file 1097728
  unevictable 0
  slab_reclaimable 282624
  slab_unreclaimable 454656
  hugetlb 1073741824
  pgfault 4580
  pgmajfault 13
  ...

Signed-off-by: Roman Gushchin <guro@fb.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: kernel-team@fb.com
Cc: cgroups@vger.kernel.org
Cc: linux-mm@kvack.org
Cc: linux-kernel@vger.kernel.org
---
 include/linux/memcontrol.h | 48 ++++++++++++++++++++++++++++++++++++++++++++++
 mm/hugetlb.c               |  5 +++++
 mm/memcontrol.c            |  3 +++
 3 files changed, 56 insertions(+)

diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index 69966c461d1c..e0dfb64d2918 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -45,6 +45,7 @@ enum memcg_stat_item {
 	MEMCG_SOCK,
 	/* XXX: why are these zone and not node counters? */
 	MEMCG_KERNEL_STACK_KB,
+	MEMCG_HUGETLB,
 	MEMCG_NR_STAT,
 };
 
@@ -664,6 +665,39 @@ static inline void count_memcg_event_mm(struct mm_struct *mm,
 void mem_cgroup_split_huge_fixup(struct page *head);
 #endif
 
+#ifdef CONFIG_HUGETLBFS
+static inline void mem_cgroup_add_hugetlb_page(struct page *page,
+					       unsigned int count)
+{
+	if (mem_cgroup_disabled())
+		return;
+
+	rcu_read_lock();
+	page->mem_cgroup = mem_cgroup_from_task(current);
+	css_get(&page->mem_cgroup->css);
+	rcu_read_unlock();
+
+	mod_memcg_page_state(page, MEMCG_HUGETLB, count);
+}
+
+static inline void mem_cgroup_remove_hugetlb_page(struct page *page,
+						  unsigned int count)
+{
+	if (mem_cgroup_disabled() || !page->mem_cgroup)
+		return;
+
+	mod_memcg_page_state(page, MEMCG_HUGETLB, -count);
+
+	css_put(&page->mem_cgroup->css);
+	page->mem_cgroup = NULL;
+}
+
+static inline void mem_cgroup_reset_hugetlb_page(struct page *page)
+{
+	page->mem_cgroup = NULL;
+}
+#endif
+
 #else /* CONFIG_MEMCG */
 
 #define MEM_CGROUP_ID_SHIFT	0
@@ -936,6 +970,20 @@ static inline
 void count_memcg_event_mm(struct mm_struct *mm, enum vm_event_item idx)
 {
 }
+
+static inline void mem_cgroup_add_hugetlb_page(struct page *page,
+					       unsigned int count)
+{
+}
+
+static inline void mem_cgroup_remove_hugetlb_page(struct page *page,
+						  unsigned int count)
+{
+}
+
+static inline void mem_cgroup_reset_hugetlb_page(struct page *page)
+{
+}
 #endif /* CONFIG_MEMCG */
 
 /* idx can be of type enum memcg_stat_item or node_stat_item */
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 4b3bbd2980bb..d1a2a9fa549a 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -1279,6 +1279,8 @@ void free_huge_page(struct page *page)
 	clear_page_huge_active(page);
 	hugetlb_cgroup_uncharge_page(hstate_index(h),
 				     pages_per_huge_page(h), page);
+	mem_cgroup_remove_hugetlb_page(page, pages_per_huge_page(h));
+
 	if (restore_reserve)
 		h->resv_huge_pages++;
 
@@ -1301,6 +1303,7 @@ static void prep_new_huge_page(struct hstate *h, struct page *page, int nid)
 	set_compound_page_dtor(page, HUGETLB_PAGE_DTOR);
 	spin_lock(&hugetlb_lock);
 	set_hugetlb_cgroup(page, NULL);
+	mem_cgroup_reset_hugetlb_page(page);
 	h->nr_huge_pages++;
 	h->nr_huge_pages_node[nid]++;
 	spin_unlock(&hugetlb_lock);
@@ -1584,6 +1587,7 @@ static struct page *__alloc_buddy_huge_page(struct hstate *h, gfp_t gfp_mask,
 		r_nid = page_to_nid(page);
 		set_compound_page_dtor(page, HUGETLB_PAGE_DTOR);
 		set_hugetlb_cgroup(page, NULL);
+		mem_cgroup_reset_hugetlb_page(page);
 		/*
 		 * We incremented the global counters already
 		 */
@@ -2041,6 +2045,7 @@ struct page *alloc_huge_page(struct vm_area_struct *vma,
 		/* Fall through */
 	}
 	hugetlb_cgroup_commit_charge(idx, pages_per_huge_page(h), h_cg, page);
+	mem_cgroup_add_hugetlb_page(page, pages_per_huge_page(h));
 	spin_unlock(&hugetlb_lock);
 
 	set_page_private(page, (unsigned long)spool);
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 50e6906314f8..f2323a9405a4 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -5338,6 +5338,9 @@ static int memory_stat_show(struct seq_file *m, void *v)
 	seq_printf(m, "slab_unreclaimable %llu\n",
 		   (u64)stat[NR_SLAB_UNRECLAIMABLE] * PAGE_SIZE);
 
+	seq_printf(m, "hugetlb %llu\n",
+		   (u64)stat[MEMCG_HUGETLB] * PAGE_SIZE);
+
 	/* Accumulated memory events */
 
 	seq_printf(m, "pgfault %lu\n", events[PGFAULT]);
-- 
2.13.6

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH] memcg: hugetlbfs basic usage accounting
  2017-11-14 17:24 [PATCH] memcg: hugetlbfs basic usage accounting Roman Gushchin
@ 2017-11-14 21:06 ` Tejun Heo
  2017-11-15  8:35 ` Michal Hocko
  2017-11-15 12:15 ` Johannes Weiner
  2 siblings, 0 replies; 8+ messages in thread
From: Tejun Heo @ 2017-11-14 21:06 UTC (permalink / raw)
  To: Roman Gushchin
  Cc: linux-mm, Johannes Weiner, Michal Hocko, Vladimir Davydov,
	Andrew Morton, Mike Kravetz, Dave Hansen, kernel-team, cgroups,
	linux-kernel

On Tue, Nov 14, 2017 at 05:24:29PM +0000, Roman Gushchin wrote:
> This patch implements basic accounting of memory consumption
> by hugetlbfs pages for cgroup v2 memory controller.
> 
> Cgroup v2 memory controller lacks any visibility into the
> hugetlbfs memory consumption. Cgroup v1 implemented a separate
> hugetlbfs controller, which provided such stats, and also
> provided some control abilities. Although porting of the
> hugetlbfs controller to cgroup v2 is arguable a good idea and
> is outside of scope of this patch, it's very useful to have
> basic stats provided by memory.stat.
> 
> As hugetlbfs memory can easily represent a big portion of total
> memory, it's important to understand who (which memcg/container)
> is using it.
> 
> The number is represented in memory.stat as "hugetlb" in bytes and
> is printed unconditionally. Accounting code doesn't depend on
> cgroup v1 hugetlb controller.
> 
> Example:
>   $ cat /sys/fs/cgroup/user.slice/user-0.slice/session-1.scope/memory.stat
>   anon 1634304
>   file 1163264
>   kernel_stack 16384
>   slab 737280
>   sock 0
>   shmem 0
>   file_mapped 32768
>   file_dirty 4096
>   file_writeback 0
>   inactive_anon 0
>   active_anon 1634304
>   inactive_file 65536
>   active_file 1097728
>   unevictable 0
>   slab_reclaimable 282624
>   slab_unreclaimable 454656
>   hugetlb 1073741824
>   pgfault 4580
>   pgmajfault 13
>   ...
> 
> Signed-off-by: Roman Gushchin <guro@fb.com>
> Cc: Johannes Weiner <hannes@cmpxchg.org>
> Cc: Michal Hocko <mhocko@kernel.org>
> Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Tejun Heo <tj@kernel.org>
> Cc: Mike Kravetz <mike.kravetz@oracle.com>
> Cc: Dave Hansen <dave.hansen@intel.com>
> Cc: kernel-team@fb.com
> Cc: cgroups@vger.kernel.org
> Cc: linux-mm@kvack.org
> Cc: linux-kernel@vger.kernel.org

Acked-by: Tejun Heo <tj@kernel.org>

Thanks.

-- 
tejun

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] memcg: hugetlbfs basic usage accounting
  2017-11-14 17:24 [PATCH] memcg: hugetlbfs basic usage accounting Roman Gushchin
  2017-11-14 21:06 ` Tejun Heo
@ 2017-11-15  8:35 ` Michal Hocko
  2017-11-15 11:18   ` Roman Gushchin
  2017-11-15 12:15 ` Johannes Weiner
  2 siblings, 1 reply; 8+ messages in thread
From: Michal Hocko @ 2017-11-15  8:35 UTC (permalink / raw)
  To: Roman Gushchin
  Cc: linux-mm, Johannes Weiner, Vladimir Davydov, Andrew Morton,
	Tejun Heo, Mike Kravetz, Dave Hansen, kernel-team, cgroups,
	linux-kernel

On Tue 14-11-17 17:24:29, Roman Gushchin wrote:
> This patch implements basic accounting of memory consumption
> by hugetlbfs pages for cgroup v2 memory controller.
> 
> Cgroup v2 memory controller lacks any visibility into the
> hugetlbfs memory consumption. Cgroup v1 implemented a separate
> hugetlbfs controller, which provided such stats, and also
> provided some control abilities. Although porting of the
> hugetlbfs controller to cgroup v2 is arguable a good idea and
> is outside of scope of this patch, it's very useful to have
> basic stats provided by memory.stat.

Separate hugetlb cgroup controller was really a deliberate decision.
We didn't want to mix hugetlb with the reclaimable memory. There is no
reasonable way to enforce memcg limits if hugetlb pages are involved.

AFAICS your patch shouldn't break the hugetlb controller because that
one (ab)uses page[2].private to store the hstate for the accounting.
You also do not really charge those hugetlb pages so the memcg
accounting will work unchaged.

So my primary question is, why don't you simply allow hugetlb controller
rather than tweak stats for memcg? Is there any fundamental reason why
hugetlb controller is not v2 compatible?

It feels really strange to keeps stats of something the controller
doesn't really control. I can imagine confused users claiming that
numbers just do not add up...
-- 
Michal Hocko
SUSE Labs

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] memcg: hugetlbfs basic usage accounting
  2017-11-15  8:35 ` Michal Hocko
@ 2017-11-15 11:18   ` Roman Gushchin
  2017-11-15 11:42     ` Michal Hocko
  0 siblings, 1 reply; 8+ messages in thread
From: Roman Gushchin @ 2017-11-15 11:18 UTC (permalink / raw)
  To: Michal Hocko
  Cc: linux-mm, Johannes Weiner, Vladimir Davydov, Andrew Morton,
	Tejun Heo, Mike Kravetz, Dave Hansen, kernel-team, cgroups,
	linux-kernel

On Wed, Nov 15, 2017 at 09:35:04AM +0100, Michal Hocko wrote:
> On Tue 14-11-17 17:24:29, Roman Gushchin wrote:
> > This patch implements basic accounting of memory consumption
> > by hugetlbfs pages for cgroup v2 memory controller.
> > 
> > Cgroup v2 memory controller lacks any visibility into the
> > hugetlbfs memory consumption. Cgroup v1 implemented a separate
> > hugetlbfs controller, which provided such stats, and also
> > provided some control abilities. Although porting of the
> > hugetlbfs controller to cgroup v2 is arguable a good idea and
> > is outside of scope of this patch, it's very useful to have
> > basic stats provided by memory.stat.

Hi, Michal!

> Separate hugetlb cgroup controller was really a deliberate decision.
> We didn't want to mix hugetlb with the reclaimable memory. There is no
> reasonable way to enforce memcg limits if hugetlb pages are involved.
> 
> AFAICS your patch shouldn't break the hugetlb controller because that
> one (ab)uses page[2].private to store the hstate for the accounting.
> You also do not really charge those hugetlb pages so the memcg
> accounting will work unchaged.

Yes, you are right.

> 
> So my primary question is, why don't you simply allow hugetlb controller
> rather than tweak stats for memcg? Is there any fundamental reason why
> hugetlb controller is not v2 compatible?

I really don't know if the hugetlb controller has enough users to deserve
full support in v2 interface: adding knobs like memory.hugetlb.current,
memory.hugetlb.min, memory.hugetlb.high, memory.hugetlb.max, etc.

I'd be rather conservative here and avoid adding a lot to the interface
without clear demand. Also, hugetlb pages are really special, and it's
at least not obvious how, say, memory.high should work for it.

At the same time we don't really have any accounting of hugetlb page
usage (except system-wide stats in sysfs). And providing such stats
is really useful.
In my particular case, I have some number of pre-allocated hugepages,
and I have several containerized workloads, which are potentially
using them to get performance bonuses. Having these stats allows to
attribute the memory holding by hugetlb pages to one of the workloads.

> It feels really strange to keeps stats of something the controller
> doesn't really control. I can imagine confused users claiming that
> numbers just do not add up...

This is why I do not add this number to memory.current. At the same
time numbers in memory.stat are not intended to be summed (we have
event counters there, dirty pages counter, etc), so I don't see a problem.

Thanks!

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] memcg: hugetlbfs basic usage accounting
  2017-11-15 11:18   ` Roman Gushchin
@ 2017-11-15 11:42     ` Michal Hocko
  2017-11-15 12:23       ` Roman Gushchin
  0 siblings, 1 reply; 8+ messages in thread
From: Michal Hocko @ 2017-11-15 11:42 UTC (permalink / raw)
  To: Roman Gushchin
  Cc: linux-mm, Johannes Weiner, Vladimir Davydov, Andrew Morton,
	Tejun Heo, Mike Kravetz, Dave Hansen, kernel-team, cgroups,
	linux-kernel

On Wed 15-11-17 11:18:13, Roman Gushchin wrote:
> On Wed, Nov 15, 2017 at 09:35:04AM +0100, Michal Hocko wrote:
[...]
> > So my primary question is, why don't you simply allow hugetlb controller
> > rather than tweak stats for memcg? Is there any fundamental reason why
> > hugetlb controller is not v2 compatible?
> 
> I really don't know if the hugetlb controller has enough users to deserve
> full support in v2 interface: adding knobs like memory.hugetlb.current,
> memory.hugetlb.min, memory.hugetlb.high, memory.hugetlb.max, etc.
> 
> I'd be rather conservative here and avoid adding a lot to the interface
> without clear demand. Also, hugetlb pages are really special, and it's
> at least not obvious how, say, memory.high should work for it.

But you clearly want the hugetlb accoutning and that is what hugetlb
controller is for. You might not be interested in the limit enforcement
but that is not strictly required. So my question remains. Why don't we
reuse an existing infrastructure and add a new which might confuse users
in an extreme case?

Please note that I am not saying your patch is wrong, I just do not see
why we should handle hugetlb pages 2 different ways to achieve a common
infrastructure.
-- 
Michal Hocko
SUSE Labs

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] memcg: hugetlbfs basic usage accounting
  2017-11-14 17:24 [PATCH] memcg: hugetlbfs basic usage accounting Roman Gushchin
  2017-11-14 21:06 ` Tejun Heo
  2017-11-15  8:35 ` Michal Hocko
@ 2017-11-15 12:15 ` Johannes Weiner
  2 siblings, 0 replies; 8+ messages in thread
From: Johannes Weiner @ 2017-11-15 12:15 UTC (permalink / raw)
  To: Roman Gushchin
  Cc: linux-mm, Michal Hocko, Vladimir Davydov, Andrew Morton,
	Tejun Heo, Mike Kravetz, Dave Hansen, kernel-team, cgroups,
	linux-kernel

On Tue, Nov 14, 2017 at 05:24:29PM +0000, Roman Gushchin wrote:
> This patch implements basic accounting of memory consumption
> by hugetlbfs pages for cgroup v2 memory controller.
> 
> Cgroup v2 memory controller lacks any visibility into the
> hugetlbfs memory consumption. Cgroup v1 implemented a separate
> hugetlbfs controller, which provided such stats, and also
> provided some control abilities. Although porting of the
> hugetlbfs controller to cgroup v2 is arguable a good idea and
> is outside of scope of this patch, it's very useful to have
> basic stats provided by memory.stat.
> 
> As hugetlbfs memory can easily represent a big portion of total
> memory, it's important to understand who (which memcg/container)
> is using it.

I'm not really buying this argument.

Hugetlb setups tend to be static configurations that require intimate
coordination between booting the kernel with a hugetlb reservation and
precisely setting up the application(s).

In the few cases where you need introspection, you can check the the
HugetlbPages entry in /proc/<pid>/status. The minor convenience
provided by adding an aggregate cgroup counter IMO doesn't outweigh
the weirdness of listing a type of resource in memory.stat that isn't
otherwise acknowledged or controllable as memory.

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] memcg: hugetlbfs basic usage accounting
  2017-11-15 11:42     ` Michal Hocko
@ 2017-11-15 12:23       ` Roman Gushchin
  2017-11-15 12:34         ` Michal Hocko
  0 siblings, 1 reply; 8+ messages in thread
From: Roman Gushchin @ 2017-11-15 12:23 UTC (permalink / raw)
  To: Michal Hocko
  Cc: linux-mm, Johannes Weiner, Vladimir Davydov, Andrew Morton,
	Tejun Heo, Mike Kravetz, Dave Hansen, kernel-team, cgroups,
	linux-kernel

On Wed, Nov 15, 2017 at 12:42:23PM +0100, Michal Hocko wrote:
> On Wed 15-11-17 11:18:13, Roman Gushchin wrote:
> > On Wed, Nov 15, 2017 at 09:35:04AM +0100, Michal Hocko wrote:
> [...]
> > > So my primary question is, why don't you simply allow hugetlb controller
> > > rather than tweak stats for memcg? Is there any fundamental reason why
> > > hugetlb controller is not v2 compatible?
> > 
> > I really don't know if the hugetlb controller has enough users to deserve
> > full support in v2 interface: adding knobs like memory.hugetlb.current,
> > memory.hugetlb.min, memory.hugetlb.high, memory.hugetlb.max, etc.
> > 
> > I'd be rather conservative here and avoid adding a lot to the interface
> > without clear demand. Also, hugetlb pages are really special, and it's
> > at least not obvious how, say, memory.high should work for it.
> 
> But you clearly want the hugetlb accoutning and that is what hugetlb
> controller is for. You might not be interested in the limit enforcement
> but that is not strictly required. So my question remains. Why don't we
> reuse an existing infrastructure and add a new which might confuse users
> in an extreme case?

Hm, but to use a small part of hugetlb controller infrastructure I would
have to add a whole set of cgroup v2 controls.
And control knobs (like memory.hugetlb.current) are much more obligatory
than an entry in memory.stat, where we have some internal stats as well.

So, I don't really know why confusion should come from in this case?
It would be confusing, if we'd add hugetlb stats to the memory.current,
so that it could be larger then memory.max.
But as separate entry in memory.stat it should not confuse anyone,
at least not more than the existing state of things, when hugetlb pages
are a black hole.

> 
> Please note that I am not saying your patch is wrong, I just do not see
> why we should handle hugetlb pages 2 different ways to achieve a common
> infrastructure.

This is perfectly fine, and I do understand it.
My point is that it's a cheap way to solve a real problem, which is also
not binding us too much.

Thanks!

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] memcg: hugetlbfs basic usage accounting
  2017-11-15 12:23       ` Roman Gushchin
@ 2017-11-15 12:34         ` Michal Hocko
  0 siblings, 0 replies; 8+ messages in thread
From: Michal Hocko @ 2017-11-15 12:34 UTC (permalink / raw)
  To: Roman Gushchin
  Cc: linux-mm, Johannes Weiner, Vladimir Davydov, Andrew Morton,
	Tejun Heo, Mike Kravetz, Dave Hansen, kernel-team, cgroups,
	linux-kernel

On Wed 15-11-17 12:23:14, Roman Gushchin wrote:
> On Wed, Nov 15, 2017 at 12:42:23PM +0100, Michal Hocko wrote:
> > On Wed 15-11-17 11:18:13, Roman Gushchin wrote:
> > > On Wed, Nov 15, 2017 at 09:35:04AM +0100, Michal Hocko wrote:
> > [...]
> > > > So my primary question is, why don't you simply allow hugetlb controller
> > > > rather than tweak stats for memcg? Is there any fundamental reason why
> > > > hugetlb controller is not v2 compatible?
> > > 
> > > I really don't know if the hugetlb controller has enough users to deserve
> > > full support in v2 interface: adding knobs like memory.hugetlb.current,
> > > memory.hugetlb.min, memory.hugetlb.high, memory.hugetlb.max, etc.
> > > 
> > > I'd be rather conservative here and avoid adding a lot to the interface
> > > without clear demand. Also, hugetlb pages are really special, and it's
> > > at least not obvious how, say, memory.high should work for it.
> > 
> > But you clearly want the hugetlb accoutning and that is what hugetlb
> > controller is for. You might not be interested in the limit enforcement
> > but that is not strictly required. So my question remains. Why don't we
> > reuse an existing infrastructure and add a new which might confuse users
> > in an extreme case?
> 
> Hm, but to use a small part of hugetlb controller infrastructure I would
> have to add a whole set of cgroup v2 controls.

And? I mean how does that differ from somebody using memcg only for stat
purposes?

> And control knobs (like memory.hugetlb.current) are much more obligatory
> than an entry in memory.stat, where we have some internal stats as well.
> 
> So, I don't really know why confusion should come from in this case?

Because you stat data that is not controlled by the controller. Just
imagine if somebody wanted to sum counters to get the resulting
accounted and enforced memory.

You are simply trying to push a square through circle without a good
reason. Unless there is a fundamental reason to not enable hugetlb
controller to v2 I would rather go that way rather than to have another
hugetlb weirdness. Enabling the controller should be a matter of
exporting knobs. Trivial thing AFAICS.
-- 
Michal Hocko
SUSE Labs

--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2017-11-15 12:34 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-11-14 17:24 [PATCH] memcg: hugetlbfs basic usage accounting Roman Gushchin
2017-11-14 21:06 ` Tejun Heo
2017-11-15  8:35 ` Michal Hocko
2017-11-15 11:18   ` Roman Gushchin
2017-11-15 11:42     ` Michal Hocko
2017-11-15 12:23       ` Roman Gushchin
2017-11-15 12:34         ` Michal Hocko
2017-11-15 12:15 ` Johannes Weiner

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).