* [RFC v1] memcg: add memcg lru for page reclaiming
@ 2019-10-21 11:56 Hillf Danton
  2019-10-21 12:14 ` Michal Hocko
                   ` (3 more replies)
  0 siblings, 4 replies; 8+ messages in thread
From: Hillf Danton @ 2019-10-21 11:56 UTC (permalink / raw)
  To: linux-mm
  Cc: Andrew Morton, linux-kernel, Chris Down, Tejun Heo,
	Roman Gushchin, Michal Hocko, Johannes Weiner, Shakeel Butt,
	Matthew Wilcox, Minchan Kim, Mel Gorman, Hillf Danton


Currently soft limit reclaim is frozen, see
Documentation/admin-guide/cgroup-v2.rst for reasons.

Copying the page lru idea, a memcg lru is added for selecting a victim
memcg to reclaim pages from under memory pressure. It works in parallel
to soft limit reclaim (slr), both because the latter needs some time to
reap and because the coexistence makes it straightforward to add the
lru.

An lru list paired with a spin lock is added, along with a couple of
helpers to add a memcg to the lru and to pick a victim from it; the
existing memcg high_work provides the rest of what is needed.

V1 is based on 5.4-rc3.

Changes since v0
- add MEMCG_LRU in init/Kconfig
- drop changes in mm/vmscan.c
- make memcg lru work in parallel to slr

Cc: Chris Down <chris@chrisdown.name>
Cc: Tejun Heo <tj@kernel.org>
Cc: Roman Gushchin <guro@fb.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Mel Gorman <mgorman@suse.de>
Signed-off-by: Hillf Danton <hdanton@sina.com>
---

--- a/init/Kconfig
+++ b/init/Kconfig
@@ -843,6 +843,14 @@ config MEMCG
 	help
 	  Provides control over the memory footprint of tasks in a cgroup.
 
+config MEMCG_LRU
+	bool
+	depends on MEMCG
+	help
+	  Select victim memcg on lru for page reclaiming.
+
+	  Say N if unsure.
+
 config MEMCG_SWAP
 	bool "Swap controller"
 	depends on MEMCG && SWAP
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -223,6 +223,10 @@ struct mem_cgroup {
 	/* Upper bound of normal memory consumption range */
 	unsigned long high;
 
+#ifdef CONFIG_MEMCG_LRU
+	struct list_head lru_node;
+#endif
+
 	/* Range enforcement for interrupt charges */
 	struct work_struct high_work;
 
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -2338,14 +2338,54 @@ static int memcg_hotplug_cpu_dead(unsign
 	return 0;
 }
 
+#ifdef CONFIG_MEMCG_LRU
+static DEFINE_SPINLOCK(memcg_lru_lock);
+static LIST_HEAD(memcg_lru);	/* a copy of page lru */
+
+static void memcg_add_lru(struct mem_cgroup *memcg)
+{
+	spin_lock_irq(&memcg_lru_lock);
+	if (list_empty(&memcg->lru_node))
+		list_add_tail(&memcg->lru_node, &memcg_lru);
+	spin_unlock_irq(&memcg_lru_lock);
+}
+
+static struct mem_cgroup *memcg_pick_lru(void)
+{
+	struct mem_cgroup *memcg, *next;
+
+	spin_lock_irq(&memcg_lru_lock);
+
+	list_for_each_entry_safe(memcg, next, &memcg_lru, lru_node) {
+		list_del_init(&memcg->lru_node);
+
+		if (page_counter_read(&memcg->memory) > memcg->high) {
+			spin_unlock_irq(&memcg_lru_lock);
+			return memcg;
+		}
+	}
+	spin_unlock_irq(&memcg_lru_lock);
+
+	return NULL;
+}
+#endif
+
 static void reclaim_high(struct mem_cgroup *memcg,
 			 unsigned int nr_pages,
 			 gfp_t gfp_mask)
 {
+#ifdef CONFIG_MEMCG_LRU
+	struct mem_cgroup *start = memcg;
+#endif
 	do {
 		if (page_counter_read(&memcg->memory) <= memcg->high)
 			continue;
 		memcg_memory_event(memcg, MEMCG_HIGH);
+		if (IS_ENABLED(CONFIG_MEMCG_LRU))
+			if (start != memcg) {
+				memcg_add_lru(memcg);
+				return;
+			}
 		try_to_free_mem_cgroup_pages(memcg, nr_pages, gfp_mask, true);
 	} while ((memcg = parent_mem_cgroup(memcg)));
 }
@@ -3158,6 +3198,13 @@ unsigned long mem_cgroup_soft_limit_recl
 	unsigned long excess;
 	unsigned long nr_scanned;
 
+	if (IS_ENABLED(CONFIG_MEMCG_LRU)) {
+		struct mem_cgroup *memcg = memcg_pick_lru();
+		if (memcg)
+			schedule_work(&memcg->high_work);
+		return 0;
+	}
+
 	if (order > 0)
 		return 0;
 
@@ -5068,6 +5115,8 @@ static struct mem_cgroup *mem_cgroup_all
 	if (memcg_wb_domain_init(memcg, GFP_KERNEL))
 		goto fail;
 
+	if (IS_ENABLED(CONFIG_MEMCG_LRU))
+		INIT_LIST_HEAD(&memcg->lru_node);
 	INIT_WORK(&memcg->high_work, high_work_func);
 	memcg->last_scanned_node = MAX_NUMNODES;
 	INIT_LIST_HEAD(&memcg->oom_notify);
--




* Re: [RFC v1] memcg: add memcg lru for page reclaiming
  2019-10-21 11:56 [RFC v1] memcg: add memcg lru for page reclaiming Hillf Danton
@ 2019-10-21 12:14 ` Michal Hocko
  2019-10-22  2:17 ` kbuild test robot
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 8+ messages in thread
From: Michal Hocko @ 2019-10-21 12:14 UTC (permalink / raw)
  To: Hillf Danton
  Cc: linux-mm, Andrew Morton, linux-kernel, Chris Down, Tejun Heo,
	Roman Gushchin, Johannes Weiner, Shakeel Butt, Matthew Wilcox,
	Minchan Kim, Mel Gorman

On Mon 21-10-19 19:56:54, Hillf Danton wrote:
> 
> Currently soft limit reclaim is frozen, see
> Documentation/admin-guide/cgroup-v2.rst for reasons.
> 
> Copying the page lru idea, memcg lru is added for selecting victim
> memcg to reclaim pages from under memory pressure. It now works in
> parallel to slr not only because the latter needs some time to reap
> but the coexistence facilitates it a lot to add the lru in a straight
> forward manner.

This doesn't explain what is the problem/feature you would like to
fix/achieve. It also doesn't explain the overall design. 

> A lru list paired with a spin lock is added, thanks to the current
> memcg high_work that provides other things it needs, and a couple of
> helpers to add memcg to and pick victim from lru.
> 
> V1 is based on 5.4-rc3.
> 
> Changes since v0
> - add MEMCG_LRU in init/Kconfig
> - drop changes in mm/vmscan.c
> - make memcg lru work in parallel to slr
> 
> Cc: Chris Down <chris@chrisdown.name>
> Cc: Tejun Heo <tj@kernel.org>
> Cc: Roman Gushchin <guro@fb.com>
> Cc: Michal Hocko <mhocko@kernel.org>
> Cc: Johannes Weiner <hannes@cmpxchg.org>
> Cc: Shakeel Butt <shakeelb@google.com>
> Cc: Matthew Wilcox <willy@infradead.org>
> Cc: Minchan Kim <minchan@kernel.org>
> Cc: Mel Gorman <mgorman@suse.de>
> Signed-off-by: Hillf Danton <hdanton@sina.com>
> ---
> 
> --- a/init/Kconfig
> +++ b/init/Kconfig
> @@ -843,6 +843,14 @@ config MEMCG
>  	help
>  	  Provides control over the memory footprint of tasks in a cgroup.
>  
> +config MEMCG_LRU
> +	bool
> +	depends on MEMCG
> +	help
> +	  Select victim memcg on lru for page reclaiming.
> +
> +	  Say N if unsure.
> +
>  config MEMCG_SWAP
>  	bool "Swap controller"
>  	depends on MEMCG && SWAP
> --- a/include/linux/memcontrol.h
> +++ b/include/linux/memcontrol.h
> @@ -223,6 +223,10 @@ struct mem_cgroup {
>  	/* Upper bound of normal memory consumption range */
>  	unsigned long high;
>  
> +#ifdef CONFIG_MEMCG_LRU
> +	struct list_head lru_node;
> +#endif
> +
>  	/* Range enforcement for interrupt charges */
>  	struct work_struct high_work;
>  
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -2338,14 +2338,54 @@ static int memcg_hotplug_cpu_dead(unsign
>  	return 0;
>  }
>  
> +#ifdef CONFIG_MEMCG_LRU
> +static DEFINE_SPINLOCK(memcg_lru_lock);
> +static LIST_HEAD(memcg_lru);	/* a copy of page lru */
> +
> +static void memcg_add_lru(struct mem_cgroup *memcg)
> +{
> +	spin_lock_irq(&memcg_lru_lock);
> +	if (list_empty(&memcg->lru_node))
> +		list_add_tail(&memcg->lru_node, &memcg_lru);
> +	spin_unlock_irq(&memcg_lru_lock);
> +}
> +
> +static struct mem_cgroup *memcg_pick_lru(void)
> +{
> +	struct mem_cgroup *memcg, *next;
> +
> +	spin_lock_irq(&memcg_lru_lock);
> +
> +	list_for_each_entry_safe(memcg, next, &memcg_lru, lru_node) {
> +		list_del_init(&memcg->lru_node);
> +
> +		if (page_counter_read(&memcg->memory) > memcg->high) {
> +			spin_unlock_irq(&memcg_lru_lock);
> +			return memcg;
> +		}
> +	}
> +	spin_unlock_irq(&memcg_lru_lock);
> +
> +	return NULL;
> +}
> +#endif
> +
>  static void reclaim_high(struct mem_cgroup *memcg,
>  			 unsigned int nr_pages,
>  			 gfp_t gfp_mask)
>  {
> +#ifdef CONFIG_MEMCG_LRU
> +	struct mem_cgroup *start = memcg;
> +#endif
>  	do {
>  		if (page_counter_read(&memcg->memory) <= memcg->high)
>  			continue;
>  		memcg_memory_event(memcg, MEMCG_HIGH);
> +		if (IS_ENABLED(CONFIG_MEMCG_LRU))
> +			if (start != memcg) {
> +				memcg_add_lru(memcg);
> +				return;
> +			}
>  		try_to_free_mem_cgroup_pages(memcg, nr_pages, gfp_mask, true);
>  	} while ((memcg = parent_mem_cgroup(memcg)));
>  }
> @@ -3158,6 +3198,13 @@ unsigned long mem_cgroup_soft_limit_recl
>  	unsigned long excess;
>  	unsigned long nr_scanned;
>  
> +	if (IS_ENABLED(CONFIG_MEMCG_LRU)) {
> +		struct mem_cgroup *memcg = memcg_pick_lru();
> +		if (memcg)
> +			schedule_work(&memcg->high_work);
> +		return 0;
> +	}
> +
>  	if (order > 0)
>  		return 0;
>  
> @@ -5068,6 +5115,8 @@ static struct mem_cgroup *mem_cgroup_all
>  	if (memcg_wb_domain_init(memcg, GFP_KERNEL))
>  		goto fail;
>  
> +	if (IS_ENABLED(CONFIG_MEMCG_LRU))
> +		INIT_LIST_HEAD(&memcg->lru_node);
>  	INIT_WORK(&memcg->high_work, high_work_func);
>  	memcg->last_scanned_node = MAX_NUMNODES;
>  	INIT_LIST_HEAD(&memcg->oom_notify);
> --
> 

-- 
Michal Hocko
SUSE Labs


* Re: [RFC v1] memcg: add memcg lru for page reclaiming
  2019-10-21 11:56 [RFC v1] memcg: add memcg lru for page reclaiming Hillf Danton
  2019-10-21 12:14 ` Michal Hocko
@ 2019-10-22  2:17 ` kbuild test robot
  2019-10-22  3:37 ` kbuild test robot
  2019-10-22 13:30 ` Hillf Danton
  3 siblings, 0 replies; 8+ messages in thread
From: kbuild test robot @ 2019-10-22  2:17 UTC (permalink / raw)
  To: kbuild-all

[-- Attachment #1: Type: text/plain, Size: 3202 bytes --]

Hi Hillf,

[FYI, it's a private test report for your RFC patch.]
[auto build test ERROR on linus/master]
[cannot apply to v5.4-rc4 next-20191021]
[if your patch is applied to the wrong git tree, please drop us a note to help
improve the system. BTW, we also suggest to use '--base' option to specify the
base tree in git format-patch, please see https://stackoverflow.com/a/37406982]

url:    https://github.com/0day-ci/linux/commits/Hillf-Danton/memcg-add-memcg-lru-for-page-reclaiming/20191022-082625
base:   https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 7d194c2100ad2a6dded545887d02754948ca5241
config: x86_64-lkp (attached as .config)
compiler: gcc-7 (Debian 7.4.0-14) 7.4.0
reproduce:
        # save the attached .config to linux build tree
        make ARCH=x86_64 

If you fix the issue, kindly add following tag
Reported-by: kbuild test robot <lkp@intel.com>

All errors (new ones prefixed by >>):

   mm/memcontrol.c: In function 'reclaim_high':
   mm/memcontrol.c:2385:8: error: 'start' undeclared (first use in this function); did you mean 'stat'?
       if (start != memcg) {
           ^~~~~
           stat
   mm/memcontrol.c:2385:8: note: each undeclared identifier is reported only once for each function it appears in
>> mm/memcontrol.c:2386:5: error: implicit declaration of function 'memcg_add_lru'; did you mean 'numa_add_cpu'? [-Werror=implicit-function-declaration]
        memcg_add_lru(memcg);
        ^~~~~~~~~~~~~
        numa_add_cpu
   mm/memcontrol.c: In function 'mem_cgroup_soft_limit_reclaim':
   mm/memcontrol.c:3202:30: error: implicit declaration of function 'memcg_pick_lru'; did you mean 'lock_page_lru'? [-Werror=implicit-function-declaration]
      struct mem_cgroup *memcg = memcg_pick_lru();
                                 ^~~~~~~~~~~~~~
                                 lock_page_lru
   mm/memcontrol.c:3202:30: warning: initialization makes pointer from integer without a cast [-Wint-conversion]
   mm/memcontrol.c: In function 'mem_cgroup_alloc':
   mm/memcontrol.c:5119:26: error: 'struct mem_cgroup' has no member named 'lru_node'; did you mean 'scan_nodes'?
      INIT_LIST_HEAD(&memcg->lru_node);
                             ^~~~~~~~
                             scan_nodes
   cc1: some warnings being treated as errors

vim +2386 mm/memcontrol.c

  2372	
  2373	static void reclaim_high(struct mem_cgroup *memcg,
  2374				 unsigned int nr_pages,
  2375				 gfp_t gfp_mask)
  2376	{
  2377	#ifdef CONFIG_MEMCG_LRU
  2378		struct mem_cgroup *start = memcg;
  2379	#endif
  2380		do {
  2381			if (page_counter_read(&memcg->memory) <= memcg->high)
  2382				continue;
  2383			memcg_memory_event(memcg, MEMCG_HIGH);
  2384			if (IS_ENABLED(CONFIG_MEMCG_LRU))
> 2385				if (start != memcg) {
> 2386					memcg_add_lru(memcg);
  2387					return;
  2388				}
  2389			try_to_free_mem_cgroup_pages(memcg, nr_pages, gfp_mask, true);
  2390		} while ((memcg = parent_mem_cgroup(memcg)));
  2391	}
  2392	

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

[-- Attachment #2: config.gz --]
[-- Type: application/gzip, Size: 28617 bytes --]


* Re: [RFC v1] memcg: add memcg lru for page reclaiming
  2019-10-21 11:56 [RFC v1] memcg: add memcg lru for page reclaiming Hillf Danton
  2019-10-21 12:14 ` Michal Hocko
  2019-10-22  2:17 ` kbuild test robot
@ 2019-10-22  3:37 ` kbuild test robot
  2019-10-22 13:30 ` Hillf Danton
  3 siblings, 0 replies; 8+ messages in thread
From: kbuild test robot @ 2019-10-22  3:37 UTC (permalink / raw)
  To: kbuild-all

[-- Attachment #1: Type: text/plain, Size: 3297 bytes --]

Hi Hillf,

[FYI, it's a private test report for your RFC patch.]
[auto build test ERROR on linus/master]
[cannot apply to v5.4-rc4 next-20191021]
[if your patch is applied to the wrong git tree, please drop us a note to help
improve the system. BTW, we also suggest to use '--base' option to specify the
base tree in git format-patch, please see https://stackoverflow.com/a/37406982]

url:    https://github.com/0day-ci/linux/commits/Hillf-Danton/memcg-add-memcg-lru-for-page-reclaiming/20191022-082625
base:   https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git 7d194c2100ad2a6dded545887d02754948ca5241
config: parisc-allyesconfig (attached as .config)
compiler: hppa-linux-gcc (GCC) 7.4.0
reproduce:
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # save the attached .config to linux build tree
        GCC_VERSION=7.4.0 make.cross ARCH=parisc 

If you fix the issue, kindly add following tag
Reported-by: kbuild test robot <lkp@intel.com>

All errors (new ones prefixed by >>):

   mm/memcontrol.c: In function 'reclaim_high':
   mm/memcontrol.c:2385:8: error: 'start' undeclared (first use in this function); did you mean 'stat'?
       if (start != memcg) {
           ^~~~~
           stat
   mm/memcontrol.c:2385:8: note: each undeclared identifier is reported only once for each function it appears in
>> mm/memcontrol.c:2386:5: error: implicit declaration of function 'memcg_add_lru'; did you mean 'be64_add_cpu'? [-Werror=implicit-function-declaration]
        memcg_add_lru(memcg);
        ^~~~~~~~~~~~~
        be64_add_cpu
   mm/memcontrol.c: In function 'mem_cgroup_soft_limit_reclaim':
   mm/memcontrol.c:3202:30: error: implicit declaration of function 'memcg_pick_lru'; did you mean 'lock_page_lru'? [-Werror=implicit-function-declaration]
      struct mem_cgroup *memcg = memcg_pick_lru();
                                 ^~~~~~~~~~~~~~
                                 lock_page_lru
   mm/memcontrol.c:3202:30: warning: initialization makes pointer from integer without a cast [-Wint-conversion]
   mm/memcontrol.c: In function 'mem_cgroup_alloc':
   mm/memcontrol.c:5119:24: error: 'struct mem_cgroup' has no member named 'lru_node'
      INIT_LIST_HEAD(&memcg->lru_node);
                           ^~
   cc1: some warnings being treated as errors

vim +2386 mm/memcontrol.c

  2372	
  2373	static void reclaim_high(struct mem_cgroup *memcg,
  2374				 unsigned int nr_pages,
  2375				 gfp_t gfp_mask)
  2376	{
  2377	#ifdef CONFIG_MEMCG_LRU
  2378		struct mem_cgroup *start = memcg;
  2379	#endif
  2380		do {
  2381			if (page_counter_read(&memcg->memory) <= memcg->high)
  2382				continue;
  2383			memcg_memory_event(memcg, MEMCG_HIGH);
  2384			if (IS_ENABLED(CONFIG_MEMCG_LRU))
> 2385				if (start != memcg) {
> 2386					memcg_add_lru(memcg);
  2387					return;
  2388				}
  2389			try_to_free_mem_cgroup_pages(memcg, nr_pages, gfp_mask, true);
  2390		} while ((memcg = parent_mem_cgroup(memcg)));
  2391	}
  2392	

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

[-- Attachment #2: config.gz --]
[-- Type: application/gzip, Size: 58587 bytes --]


* Re: [RFC v1] memcg: add memcg lru for page reclaiming
  2019-10-21 11:56 [RFC v1] memcg: add memcg lru for page reclaiming Hillf Danton
                   ` (2 preceding siblings ...)
  2019-10-22  3:37 ` kbuild test robot
@ 2019-10-22 13:30 ` Hillf Danton
  2019-10-22 13:58   ` Michal Hocko
  2019-10-23  4:44   ` Hillf Danton
  3 siblings, 2 replies; 8+ messages in thread
From: Hillf Danton @ 2019-10-22 13:30 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Hillf Danton, linux-mm, Andrew Morton, linux-kernel, Chris Down,
	Tejun Heo, Roman Gushchin, Johannes Weiner, Shakeel Butt,
	Matthew Wilcox, Minchan Kim, Mel Gorman


On Mon, 21 Oct 2019 14:14:53 +0200 Michal Hocko wrote:
> 
> On Mon 21-10-19 19:56:54, Hillf Danton wrote:
> > 
> > Currently soft limit reclaim is frozen, see
> > Documentation/admin-guide/cgroup-v2.rst for reasons.
> > 
> > Copying the page lru idea, memcg lru is added for selecting victim
> > memcg to reclaim pages from under memory pressure. It now works in
> > parallel to slr not only because the latter needs some time to reap
> > but the coexistence facilitates it a lot to add the lru in a straight
> > forward manner.
> 
> This doesn't explain what is the problem/feature you would like to
> fix/achieve. It also doesn't explain the overall design. 

1, memcg lru makes page reclaiming hierarchy aware

While doing the high work, memcgs are currently reclaimed one after
another up through the hierarchy; in this RFC, after ripping pages off
the first victim, the work finishes by adding the victim's first
ancestor to the lru.

Reclaiming is deferred until kswapd becomes active.

2, memcg lru tries much to avoid overreclaim

Only one memcg is picked off lru in FIFO mode under memory pressure,
and MEMCG_CHARGE_BATCH pages are reclaimed one memcg at a time.

In next version, a new function will be added for kswapd to call,

	void memcg_try_to_free_pages(void)

with CONFIG_MEMCG_LRU dropped and mem_cgroup_soft_limit_reclaim()
untouched.

Thanks
Hillf




* Re: [RFC v1] memcg: add memcg lru for page reclaiming
  2019-10-22 13:30 ` Hillf Danton
@ 2019-10-22 13:58   ` Michal Hocko
  2019-10-23  4:44   ` Hillf Danton
  1 sibling, 0 replies; 8+ messages in thread
From: Michal Hocko @ 2019-10-22 13:58 UTC (permalink / raw)
  To: Hillf Danton
  Cc: linux-mm, Andrew Morton, linux-kernel, Chris Down, Tejun Heo,
	Roman Gushchin, Johannes Weiner, Shakeel Butt, Matthew Wilcox,
	Minchan Kim, Mel Gorman

On Tue 22-10-19 21:30:50, Hillf Danton wrote:
> 
> On Mon, 21 Oct 2019 14:14:53 +0200 Michal Hocko wrote:
> > 
> > On Mon 21-10-19 19:56:54, Hillf Danton wrote:
> > > 
> > > Currently soft limit reclaim is frozen, see
> > > Documentation/admin-guide/cgroup-v2.rst for reasons.
> > > 
> > > Copying the page lru idea, memcg lru is added for selecting victim
> > > memcg to reclaim pages from under memory pressure. It now works in
> > > parallel to slr not only because the latter needs some time to reap
> > > but the coexistence facilitates it a lot to add the lru in a straight
> > > forward manner.
> > 
> > This doesn't explain what is the problem/feature you would like to
> > fix/achieve. It also doesn't explain the overall design. 
> 
> 1, memcg lru makes page reclaiming hierarchy aware

Is that a problem statement or a design goal?

> While doing the high work, memcgs are currently reclaimed one after
> another up through the hierarchy;

Which is the design because it is the memcg where the high limit got
hit. The hierarchical behavior ensures that the subtree of that memcg is
reclaimed and we try to spread the reclaim fairly over the tree.

> in this RFC after ripping pages off
> the first victim, the work finishes with the first ancestor of the victim
> added to lru.
> 
> Recaliming is defered until kswapd becomes active.

This is a wrong assumption because high limit might be configured way
before kswapd is woken up.

> 2, memcg lru tries much to avoid overreclaim

Again, is this a problem statement or a design goal?
 
> Only one memcg is picked off lru in FIFO mode under memory pressure,
> and MEMCG_CHARGE_BATCH pages are reclaimed one memcg at a time.

And why is this preferred over SWAP_CLUSTER_MAX and whole subtree
reclaim that we do currently? 

Please do not set another version until it is actually clear what you
want to achieve and why.
-- 
Michal Hocko
SUSE Labs


* Re: [RFC v1] memcg: add memcg lru for page reclaiming
  2019-10-22 13:30 ` Hillf Danton
  2019-10-22 13:58   ` Michal Hocko
@ 2019-10-23  4:44   ` Hillf Danton
  2019-10-23  8:08     ` Michal Hocko
  1 sibling, 1 reply; 8+ messages in thread
From: Hillf Danton @ 2019-10-23  4:44 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Hillf Danton, linux-mm, Andrew Morton, linux-kernel, Chris Down,
	Tejun Heo, Roman Gushchin, Johannes Weiner, Shakeel Butt,
	Matthew Wilcox, Minchan Kim, Mel Gorman


On Tue, 22 Oct 2019 15:58:32 +0200 Michal Hocko wrote:
> 
> On Tue 22-10-19 21:30:50, Hillf Danton wrote:
> > 
> > On Mon, 21 Oct 2019 14:14:53 +0200 Michal Hocko wrote:
> > > 
> > > On Mon 21-10-19 19:56:54, Hillf Danton wrote:
> > > > 
> > > > Currently soft limit reclaim is frozen, see
> > > > Documentation/admin-guide/cgroup-v2.rst for reasons.
> > > > 
> > > > Copying the page lru idea, memcg lru is added for selecting victim
> > > > memcg to reclaim pages from under memory pressure. It now works in
> > > > parallel to slr not only because the latter needs some time to reap
> > > > but the coexistence facilitates it a lot to add the lru in a straight
> > > > forward manner.
> > > 
> > > This doesn't explain what is the problem/feature you would like to
> > > fix/achieve. It also doesn't explain the overall design. 
> > 
> > 1, memcg lru makes page reclaiming hierarchy aware
> 
> Is that a problem statement or a design goal?

It is a problem in soft limit reclaim, as per cgroup-v2.rst, that this
RFC addresses.

> > While doing the high work, memcgs are currently reclaimed one after
> > another up through the hierarchy;
> 
> Which is the design because it is the the memcg where the high limit got
> hit. The hierarchical behavior ensures that the subtree of that memcg is
> reclaimed and we try to spread the reclaim fairly over the tree.

Yeah, that code can hardly escape a standing ovation. None of its merits
is dropped in this RFC, except that the spiral up the memcg hierarchy is
broken into two parts: the top half, which rips pages off the first
victim, and the bottom half, which queues the victim's first ancestor on
the lru (the ice box storing the cakes baked for kswapd); see below for
the reasons.

> > in this RFC after ripping pages off
> > the first victim, the work finishes with the first ancestor of the victim
> > added to lru.
> > 
> > Recaliming is defered until kswapd becomes active.
> 
> This is a wrong assumption because high limit might be configured way
> before kswapd is woken up.

This change was introduced because a high limit breach does not look
like a serious problem in the absence of memory pressure. Let's do the
hard work, reclaiming one memcg at a time up through the hierarchy, once
kswapd becomes active. That also explains the bottom half introduced
above.

> > 2, memcg lru tries much to avoid overreclaim
> 
> Again, is this a problem statement or a design goal?

It is another problem in SLR, as per cgroup-v2.rst, that this RFC addresses.

> > Only one memcg is picked off lru in FIFO mode under memory pressure,
> > and MEMCG_CHARGE_BATCH pages are reclaimed one memcg at a time.
> 
> And why is this preferred over SWAP_CLUSTER_MAX

No change is made to the current high work behavior in terms of
MEMCG_CHARGE_BATCH; try_to_free_mem_cgroup_pages() takes care of both.

> and whole subtree reclaim that we do currently? 

We terminate the climb up the hierarchy once kswapd snaps its fingers: "Cut. Work done."

Thanks
Hillf




* Re: [RFC v1] memcg: add memcg lru for page reclaiming
  2019-10-23  4:44   ` Hillf Danton
@ 2019-10-23  8:08     ` Michal Hocko
  0 siblings, 0 replies; 8+ messages in thread
From: Michal Hocko @ 2019-10-23  8:08 UTC (permalink / raw)
  To: Hillf Danton
  Cc: linux-mm, Andrew Morton, linux-kernel, Chris Down, Tejun Heo,
	Roman Gushchin, Johannes Weiner, Shakeel Butt, Matthew Wilcox,
	Minchan Kim, Mel Gorman

On Wed 23-10-19 12:44:48, Hillf Danton wrote:
> 
> On Tue, 22 Oct 2019 15:58:32 +0200 Michal Hocko wrote:
> > 
> > On Tue 22-10-19 21:30:50, Hillf Danton wrote:
[...]
> > > in this RFC after ripping pages off
> > > the first victim, the work finishes with the first ancestor of the victim
> > > added to lru.
> > > 
> > > Recaliming is defered until kswapd becomes active.
> > 
> > This is a wrong assumption because high limit might be configured way
> > before kswapd is woken up.
> 
> This change was introduced because high limit breach looks not like a
> serious problem in the absence of memory pressure. Lets do the hard work,
> reclaiming one memcg a time up through the hierarchy, when kswapd becomes
> active. It also explains the BH introduced.

But this goes against the main motivation for the high limit: to
throttle. Whether there is global memory pressure is not all that
important. The preventive high limit reclaim is there to make sure that
the specific memcg is kept in reasonable containment.
-- 
Michal Hocko
SUSE Labs

