linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 1/2] memcg: folding CONFIG_MEMCG_SWAP as default
@ 2020-04-17 14:43 Alex Shi
  2020-04-17 14:43 ` [PATCH 2/2] mm/swap: clean up parameter pass in swapin funcs Alex Shi
                   ` (2 more replies)
  0 siblings, 3 replies; 9+ messages in thread
From: Alex Shi @ 2020-04-17 14:43 UTC (permalink / raw)
  Cc: Alex Shi, Andrew Morton, Johannes Weiner, linux-mm, cgroups

This patch fold MEMCG_SWAP feature into kernel as default function. That
required a short size memcg id for each of page. As Johannes mentioned

"the overhead of tracking is tiny - 512k per G of swap (0.04%).'

So all swapout page could be tracked for its memcg id.

Signed-off-by: Alex Shi <alex.shi@linux.alibaba.com>
Cc: Andrew Morton <akpm@linux-foundation.org> 
Cc: Johannes Weiner <hannes@cmpxchg.org> 
Cc: linux-mm@kvack.org 
Cc: cgroups@vger.kernel.org 
---
 arch/arm/configs/omap2plus_defconfig        |  1 -
 arch/arm64/configs/defconfig                |  1 -
 arch/mips/configs/db1xxx_defconfig          |  1 -
 arch/mips/configs/generic_defconfig         |  1 -
 arch/mips/configs/loongson3_defconfig       |  1 -
 arch/parisc/configs/generic-64bit_defconfig |  1 -
 arch/powerpc/configs/powernv_defconfig      |  1 -
 arch/powerpc/configs/pseries_defconfig      |  1 -
 arch/s390/configs/debug_defconfig           |  1 -
 arch/s390/configs/defconfig                 |  1 -
 arch/sh/configs/sdk7786_defconfig           |  1 -
 arch/sh/configs/urquell_defconfig           |  1 -
 include/linux/memcontrol.h                  |  2 --
 include/linux/swap.h                        | 27 --------------------------
 include/linux/swap_cgroup.h                 | 30 -----------------------------
 init/Kconfig                                | 20 -------------------
 mm/Makefile                                 |  4 ++--
 mm/memcontrol.c                             | 18 -----------------
 18 files changed, 2 insertions(+), 111 deletions(-)

diff --git a/arch/arm/configs/omap2plus_defconfig b/arch/arm/configs/omap2plus_defconfig
index 3cc3ca5fa027..929cfc4c062a 100644
--- a/arch/arm/configs/omap2plus_defconfig
+++ b/arch/arm/configs/omap2plus_defconfig
@@ -10,7 +10,6 @@ CONFIG_IKCONFIG_PROC=y
 CONFIG_LOG_BUF_SHIFT=16
 CONFIG_CGROUPS=y
 CONFIG_MEMCG=y
-CONFIG_MEMCG_SWAP=y
 CONFIG_BLK_CGROUP=y
 CONFIG_CGROUP_SCHED=y
 CONFIG_CFS_BANDWIDTH=y
diff --git a/arch/arm64/configs/defconfig b/arch/arm64/configs/defconfig
index 24e534d85045..3854c6140a04 100644
--- a/arch/arm64/configs/defconfig
+++ b/arch/arm64/configs/defconfig
@@ -13,7 +13,6 @@ CONFIG_IKCONFIG=y
 CONFIG_IKCONFIG_PROC=y
 CONFIG_NUMA_BALANCING=y
 CONFIG_MEMCG=y
-CONFIG_MEMCG_SWAP=y
 CONFIG_BLK_CGROUP=y
 CONFIG_CGROUP_PIDS=y
 CONFIG_CGROUP_HUGETLB=y
diff --git a/arch/mips/configs/db1xxx_defconfig b/arch/mips/configs/db1xxx_defconfig
index e6f3e8e3da39..4604e826d03f 100644
--- a/arch/mips/configs/db1xxx_defconfig
+++ b/arch/mips/configs/db1xxx_defconfig
@@ -9,7 +9,6 @@ CONFIG_HIGH_RES_TIMERS=y
 CONFIG_LOG_BUF_SHIFT=16
 CONFIG_CGROUPS=y
 CONFIG_MEMCG=y
-CONFIG_MEMCG_SWAP=y
 CONFIG_BLK_CGROUP=y
 CONFIG_CGROUP_SCHED=y
 CONFIG_CFS_BANDWIDTH=y
diff --git a/arch/mips/configs/generic_defconfig b/arch/mips/configs/generic_defconfig
index 714169e411cf..48e4e251779b 100644
--- a/arch/mips/configs/generic_defconfig
+++ b/arch/mips/configs/generic_defconfig
@@ -3,7 +3,6 @@ CONFIG_NO_HZ_IDLE=y
 CONFIG_IKCONFIG=y
 CONFIG_IKCONFIG_PROC=y
 CONFIG_MEMCG=y
-CONFIG_MEMCG_SWAP=y
 CONFIG_BLK_CGROUP=y
 CONFIG_CFS_BANDWIDTH=y
 CONFIG_RT_GROUP_SCHED=y
diff --git a/arch/mips/configs/loongson3_defconfig b/arch/mips/configs/loongson3_defconfig
index 51675f5000d6..f91f49595100 100644
--- a/arch/mips/configs/loongson3_defconfig
+++ b/arch/mips/configs/loongson3_defconfig
@@ -13,7 +13,6 @@ CONFIG_TASK_DELAY_ACCT=y
 CONFIG_TASK_XACCT=y
 CONFIG_TASK_IO_ACCOUNTING=y
 CONFIG_MEMCG=y
-CONFIG_MEMCG_SWAP=y
 CONFIG_BLK_CGROUP=y
 CONFIG_CPUSETS=y
 CONFIG_SCHED_AUTOGROUP=y
diff --git a/arch/parisc/configs/generic-64bit_defconfig b/arch/parisc/configs/generic-64bit_defconfig
index 59561e04e659..fab4401dd9f6 100644
--- a/arch/parisc/configs/generic-64bit_defconfig
+++ b/arch/parisc/configs/generic-64bit_defconfig
@@ -10,7 +10,6 @@ CONFIG_TASK_XACCT=y
 CONFIG_TASK_IO_ACCOUNTING=y
 CONFIG_CGROUPS=y
 CONFIG_MEMCG=y
-CONFIG_MEMCG_SWAP=y
 CONFIG_CGROUP_PIDS=y
 CONFIG_CPUSETS=y
 CONFIG_RELAY=y
diff --git a/arch/powerpc/configs/powernv_defconfig b/arch/powerpc/configs/powernv_defconfig
index df8bdbaa5d8f..c60c3a4125f7 100644
--- a/arch/powerpc/configs/powernv_defconfig
+++ b/arch/powerpc/configs/powernv_defconfig
@@ -17,7 +17,6 @@ CONFIG_LOG_CPU_MAX_BUF_SHIFT=13
 CONFIG_NUMA_BALANCING=y
 CONFIG_CGROUPS=y
 CONFIG_MEMCG=y
-CONFIG_MEMCG_SWAP=y
 CONFIG_CGROUP_SCHED=y
 CONFIG_CGROUP_FREEZER=y
 CONFIG_CPUSETS=y
diff --git a/arch/powerpc/configs/pseries_defconfig b/arch/powerpc/configs/pseries_defconfig
index 0bea4d3ffb85..426244f491ec 100644
--- a/arch/powerpc/configs/pseries_defconfig
+++ b/arch/powerpc/configs/pseries_defconfig
@@ -16,7 +16,6 @@ CONFIG_LOG_CPU_MAX_BUF_SHIFT=13
 CONFIG_NUMA_BALANCING=y
 CONFIG_CGROUPS=y
 CONFIG_MEMCG=y
-CONFIG_MEMCG_SWAP=y
 CONFIG_CGROUP_SCHED=y
 CONFIG_CGROUP_FREEZER=y
 CONFIG_CPUSETS=y
diff --git a/arch/s390/configs/debug_defconfig b/arch/s390/configs/debug_defconfig
index 46038bc58c9e..864b32403673 100644
--- a/arch/s390/configs/debug_defconfig
+++ b/arch/s390/configs/debug_defconfig
@@ -14,7 +14,6 @@ CONFIG_IKCONFIG=y
 CONFIG_IKCONFIG_PROC=y
 CONFIG_NUMA_BALANCING=y
 CONFIG_MEMCG=y
-CONFIG_MEMCG_SWAP=y
 CONFIG_BLK_CGROUP=y
 CONFIG_CFS_BANDWIDTH=y
 CONFIG_RT_GROUP_SCHED=y
diff --git a/arch/s390/configs/defconfig b/arch/s390/configs/defconfig
index 7cd0648c1f4e..991e60d7c307 100644
--- a/arch/s390/configs/defconfig
+++ b/arch/s390/configs/defconfig
@@ -13,7 +13,6 @@ CONFIG_IKCONFIG=y
 CONFIG_IKCONFIG_PROC=y
 CONFIG_NUMA_BALANCING=y
 CONFIG_MEMCG=y
-CONFIG_MEMCG_SWAP=y
 CONFIG_BLK_CGROUP=y
 CONFIG_CFS_BANDWIDTH=y
 CONFIG_RT_GROUP_SCHED=y
diff --git a/arch/sh/configs/sdk7786_defconfig b/arch/sh/configs/sdk7786_defconfig
index 7fa116b436c3..4a1d19068a8d 100644
--- a/arch/sh/configs/sdk7786_defconfig
+++ b/arch/sh/configs/sdk7786_defconfig
@@ -17,7 +17,6 @@ CONFIG_CPUSETS=y
 # CONFIG_PROC_PID_CPUSET is not set
 CONFIG_CGROUP_CPUACCT=y
 CONFIG_CGROUP_MEMCG=y
-CONFIG_CGROUP_MEMCG_SWAP=y
 CONFIG_CGROUP_SCHED=y
 CONFIG_RT_GROUP_SCHED=y
 CONFIG_BLK_CGROUP=y
diff --git a/arch/sh/configs/urquell_defconfig b/arch/sh/configs/urquell_defconfig
index cb2f56468fe0..be478f3148f2 100644
--- a/arch/sh/configs/urquell_defconfig
+++ b/arch/sh/configs/urquell_defconfig
@@ -14,7 +14,6 @@ CONFIG_CPUSETS=y
 # CONFIG_PROC_PID_CPUSET is not set
 CONFIG_CGROUP_CPUACCT=y
 CONFIG_CGROUP_MEMCG=y
-CONFIG_CGROUP_MEMCG_SWAP=y
 CONFIG_CGROUP_SCHED=y
 CONFIG_RT_GROUP_SCHED=y
 CONFIG_BLK_DEV_INITRD=y
diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index 1b4150ff64be..45842ed8ba53 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -569,9 +569,7 @@ struct mem_cgroup *mem_cgroup_get_oom_group(struct task_struct *victim,
 					    struct mem_cgroup *oom_domain);
 void mem_cgroup_print_oom_group(struct mem_cgroup *memcg);
 
-#ifdef CONFIG_MEMCG_SWAP
 extern int do_swap_account;
-#endif
 
 struct mem_cgroup *lock_page_memcg(struct page *page);
 void __unlock_page_memcg(struct mem_cgroup *memcg);
diff --git a/include/linux/swap.h b/include/linux/swap.h
index b835d8dbea0e..7e44e1e6ef27 100644
--- a/include/linux/swap.h
+++ b/include/linux/swap.h
@@ -654,38 +654,11 @@ static inline void mem_cgroup_throttle_swaprate(struct mem_cgroup *memcg,
 }
 #endif
 
-#ifdef CONFIG_MEMCG_SWAP
 extern void mem_cgroup_swapout(struct page *page, swp_entry_t entry);
 extern int mem_cgroup_try_charge_swap(struct page *page, swp_entry_t entry);
 extern void mem_cgroup_uncharge_swap(swp_entry_t entry, unsigned int nr_pages);
 extern long mem_cgroup_get_nr_swap_pages(struct mem_cgroup *memcg);
 extern bool mem_cgroup_swap_full(struct page *page);
-#else
-static inline void mem_cgroup_swapout(struct page *page, swp_entry_t entry)
-{
-}
-
-static inline int mem_cgroup_try_charge_swap(struct page *page,
-					     swp_entry_t entry)
-{
-	return 0;
-}
-
-static inline void mem_cgroup_uncharge_swap(swp_entry_t entry,
-					    unsigned int nr_pages)
-{
-}
-
-static inline long mem_cgroup_get_nr_swap_pages(struct mem_cgroup *memcg)
-{
-	return get_nr_swap_pages();
-}
-
-static inline bool mem_cgroup_swap_full(struct page *page)
-{
-	return vm_swap_full();
-}
-#endif
 
 #endif /* __KERNEL__*/
 #endif /* _LINUX_SWAP_H */
diff --git a/include/linux/swap_cgroup.h b/include/linux/swap_cgroup.h
index a12dd1c3966c..5ac87b3b0e08 100644
--- a/include/linux/swap_cgroup.h
+++ b/include/linux/swap_cgroup.h
@@ -4,8 +4,6 @@
 
 #include <linux/swap.h>
 
-#ifdef CONFIG_MEMCG_SWAP
-
 extern unsigned short swap_cgroup_cmpxchg(swp_entry_t ent,
 					unsigned short old, unsigned short new);
 extern unsigned short swap_cgroup_record(swp_entry_t ent, unsigned short id,
@@ -14,32 +12,4 @@ extern unsigned short swap_cgroup_record(swp_entry_t ent, unsigned short id,
 extern int swap_cgroup_swapon(int type, unsigned long max_pages);
 extern void swap_cgroup_swapoff(int type);
 
-#else
-
-static inline
-unsigned short swap_cgroup_record(swp_entry_t ent, unsigned short id,
-				  unsigned int nr_ents)
-{
-	return 0;
-}
-
-static inline
-unsigned short lookup_swap_cgroup_id(swp_entry_t ent)
-{
-	return 0;
-}
-
-static inline int
-swap_cgroup_swapon(int type, unsigned long max_pages)
-{
-	return 0;
-}
-
-static inline void swap_cgroup_swapoff(int type)
-{
-	return;
-}
-
-#endif /* CONFIG_MEMCG_SWAP */
-
 #endif /* __LINUX_SWAP_CGROUP_H */
diff --git a/init/Kconfig b/init/Kconfig
index 9e22ee8fbd75..3b09c71e9b57 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -834,26 +834,6 @@ config MEMCG
 	help
 	  Provides control over the memory footprint of tasks in a cgroup.
 
-config MEMCG_SWAP
-	bool "Swap controller"
-	depends on MEMCG && SWAP
-	help
-	  Provides control over the swap space consumed by tasks in a cgroup.
-
-config MEMCG_SWAP_ENABLED
-	bool "Swap controller enabled by default"
-	depends on MEMCG_SWAP
-	default y
-	help
-	  Memory Resource Controller Swap Extension comes with its price in
-	  a bigger memory consumption. General purpose distribution kernels
-	  which want to enable the feature but keep it disabled by default
-	  and let the user enable it by swapaccount=1 boot command line
-	  parameter should have this option unselected.
-	  For those who want to have the feature enabled by default should
-	  select this option (if, for some reason, they need to disable it
-	  then swapaccount=0 does the trick).
-
 config MEMCG_KMEM
 	bool
 	depends on MEMCG && !SLOB
diff --git a/mm/Makefile b/mm/Makefile
index fccd3756b25f..812c08323355 100644
--- a/mm/Makefile
+++ b/mm/Makefile
@@ -58,7 +58,8 @@ ifdef CONFIG_MMU
 	obj-$(CONFIG_ADVISE_SYSCALLS)	+= madvise.o
 endif
 
-obj-$(CONFIG_SWAP)	+= page_io.o swap_state.o swapfile.o swap_slots.o
+obj-$(CONFIG_SWAP)	+= page_io.o swap_state.o swapfile.o swap_slots.o \
+			   swap_cgroup.o
 obj-$(CONFIG_FRONTSWAP)	+= frontswap.o
 obj-$(CONFIG_ZSWAP)	+= zswap.o
 obj-$(CONFIG_HAS_DMA)	+= dmapool.o
@@ -80,7 +81,6 @@ obj-$(CONFIG_MIGRATION) += migrate.o
 obj-$(CONFIG_TRANSPARENT_HUGEPAGE) += huge_memory.o khugepaged.o
 obj-$(CONFIG_PAGE_COUNTER) += page_counter.o
 obj-$(CONFIG_MEMCG) += memcontrol.o vmpressure.o
-obj-$(CONFIG_MEMCG_SWAP) += swap_cgroup.o
 obj-$(CONFIG_CGROUP_HUGETLB) += hugetlb_cgroup.o
 obj-$(CONFIG_GUP_BENCHMARK) += gup_benchmark.o
 obj-$(CONFIG_MEMORY_FAILURE) += memory-failure.o
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 5beea03dd58a..446141b6597a 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -82,11 +82,7 @@
 static bool cgroup_memory_nokmem;
 
 /* Whether the swap controller is active */
-#ifdef CONFIG_MEMCG_SWAP
 int do_swap_account __read_mostly;
-#else
-#define do_swap_account		0
-#endif
 
 #ifdef CONFIG_CGROUP_WRITEBACK
 static DECLARE_WAIT_QUEUE_HEAD(memcg_cgwb_frn_waitq);
@@ -3020,7 +3016,6 @@ void mem_cgroup_split_huge_fixup(struct page *head)
 }
 #endif /* CONFIG_TRANSPARENT_HUGEPAGE */
 
-#ifdef CONFIG_MEMCG_SWAP
 /**
  * mem_cgroup_move_swap_account - move swap charge and swap_cgroup's record.
  * @entry: swap entry to be moved
@@ -3050,13 +3045,6 @@ static int mem_cgroup_move_swap_account(swp_entry_t entry,
 	}
 	return -EINVAL;
 }
-#else
-static inline int mem_cgroup_move_swap_account(swp_entry_t entry,
-				struct mem_cgroup *from, struct mem_cgroup *to)
-{
-	return -EINVAL;
-}
-#endif
 
 static DEFINE_MUTEX(memcg_max_mutex);
 
@@ -6936,7 +6924,6 @@ static int __init mem_cgroup_init(void)
 }
 subsys_initcall(mem_cgroup_init);
 
-#ifdef CONFIG_MEMCG_SWAP
 static struct mem_cgroup *mem_cgroup_id_get_online(struct mem_cgroup *memcg)
 {
 	while (!refcount_inc_not_zero(&memcg->id.ref)) {
@@ -7137,11 +7124,7 @@ bool mem_cgroup_swap_full(struct page *page)
 }
 
 /* for remember boot option*/
-#ifdef CONFIG_MEMCG_SWAP_ENABLED
 static int really_do_swap_account __initdata = 1;
-#else
-static int really_do_swap_account __initdata;
-#endif
 
 static int __init enable_swap_account(char *s)
 {
@@ -7257,4 +7240,3 @@ static int __init mem_cgroup_swap_init(void)
 }
 subsys_initcall(mem_cgroup_swap_init);
 
-#endif /* CONFIG_MEMCG_SWAP */
-- 
1.8.3.1



^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH 2/2] mm/swap: clean up parameter pass in swapin funcs
  2020-04-17 14:43 [PATCH 1/2] memcg: folding CONFIG_MEMCG_SWAP as default Alex Shi
@ 2020-04-17 14:43 ` Alex Shi
  2020-04-17 15:53 ` [PATCH 1/2] memcg: folding CONFIG_MEMCG_SWAP as default Michal Hocko
  2020-04-17 16:22 ` Shakeel Butt
  2 siblings, 0 replies; 9+ messages in thread
From: Alex Shi @ 2020-04-17 14:43 UTC (permalink / raw)
  Cc: Alex Shi, Andrew Morton, Johannes Weiner, linux-kernel, linux-mm

Folding parameter struct vm_area_struct *vma, unsigned long addr into
struct vm_fault vmf, this makes func path more readble, and give a chance
to pass more parameters as vmf fields on these series funcs.

No functional change yet.

Signed-off-by: Alex Shi <alex.shi@linux.alibaba.com>
Cc: Andrew Morton <akpm@linux-foundation.org> 
Cc: Johannes Weiner <hannes@cmpxchg.org> 
Cc: linux-kernel@vger.kernel.org 
Cc: linux-mm@kvack.org 
---
 include/linux/swap.h |  6 ++----
 mm/madvise.c         | 11 +++++++----
 mm/shmem.c           | 12 ++++++------
 mm/swap_state.c      | 23 ++++++++++-------------
 mm/swapfile.c        |  8 +++++---
 mm/zswap.c           |  3 ++-
 6 files changed, 32 insertions(+), 31 deletions(-)

diff --git a/include/linux/swap.h b/include/linux/swap.h
index 7e44e1e6ef27..5627fb8ca827 100644
--- a/include/linux/swap.h
+++ b/include/linux/swap.h
@@ -417,11 +417,9 @@ extern struct page *lookup_swap_cache(swp_entry_t entry,
 				      struct vm_area_struct *vma,
 				      unsigned long addr);
 extern struct page *read_swap_cache_async(swp_entry_t, gfp_t,
-			struct vm_area_struct *vma, unsigned long addr,
-			bool do_poll);
+				struct vm_fault *vmf, bool do_poll);
 extern struct page *__read_swap_cache_async(swp_entry_t, gfp_t,
-			struct vm_area_struct *vma, unsigned long addr,
-			bool *new_page_allocated);
+				struct vm_fault *vmf, bool *new_page_allocated);
 extern struct page *swap_cluster_readahead(swp_entry_t entry, gfp_t flag,
 				struct vm_fault *vmf);
 extern struct page *swapin_readahead(swp_entry_t entry, gfp_t flag,
diff --git a/mm/madvise.c b/mm/madvise.c
index 4bb30ed6c8d2..e9bd80087dbb 100644
--- a/mm/madvise.c
+++ b/mm/madvise.c
@@ -184,8 +184,8 @@ static int swapin_walk_pmd_entry(pmd_t *pmd, unsigned long start,
 	unsigned long end, struct mm_walk *walk)
 {
 	pte_t *orig_pte;
-	struct vm_area_struct *vma = walk->private;
 	unsigned long index;
+	struct vm_fault vmf = { .vma = walk->private};
 
 	if (pmd_none_or_trans_huge_or_clear_bad(pmd))
 		return 0;
@@ -196,7 +196,8 @@ static int swapin_walk_pmd_entry(pmd_t *pmd, unsigned long start,
 		struct page *page;
 		spinlock_t *ptl;
 
-		orig_pte = pte_offset_map_lock(vma->vm_mm, pmd, start, &ptl);
+		orig_pte = pte_offset_map_lock(vmf.vma->vm_mm,
+						pmd, start, &ptl);
 		pte = *(orig_pte + ((index - start) / PAGE_SIZE));
 		pte_unmap_unlock(orig_pte, ptl);
 
@@ -206,8 +207,9 @@ static int swapin_walk_pmd_entry(pmd_t *pmd, unsigned long start,
 		if (unlikely(non_swap_entry(entry)))
 			continue;
 
+		vmf.address = index;
 		page = read_swap_cache_async(entry, GFP_HIGHUSER_MOVABLE,
-							vma, index, false);
+							&vmf, false);
 		if (page)
 			put_page(page);
 	}
@@ -226,6 +228,7 @@ static void force_shm_swapin_readahead(struct vm_area_struct *vma,
 	pgoff_t index;
 	struct page *page;
 	swp_entry_t swap;
+	struct vm_fault vmf = { .vma = NULL, .address = 0};
 
 	for (; start < end; start += PAGE_SIZE) {
 		index = ((start - vma->vm_start) >> PAGE_SHIFT) + vma->vm_pgoff;
@@ -238,7 +241,7 @@ static void force_shm_swapin_readahead(struct vm_area_struct *vma,
 		}
 		swap = radix_to_swp_entry(page);
 		page = read_swap_cache_async(swap, GFP_HIGHUSER_MOVABLE,
-							NULL, 0, false);
+							&vmf, false);
 		if (page)
 			put_page(page);
 	}
diff --git a/mm/shmem.c b/mm/shmem.c
index d722eb830317..d17dd0e49cef 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -1437,16 +1437,15 @@ static void shmem_pseudo_vma_destroy(struct vm_area_struct *vma)
 }
 
 static struct page *shmem_swapin(swp_entry_t swap, gfp_t gfp,
-			struct shmem_inode_info *info, pgoff_t index)
+	struct vm_fault *vmf, struct shmem_inode_info *info, pgoff_t index)
 {
 	struct vm_area_struct pvma;
 	struct page *page;
-	struct vm_fault vmf;
 
 	shmem_pseudo_vma_init(&pvma, info, index);
-	vmf.vma = &pvma;
-	vmf.address = 0;
-	page = swap_cluster_readahead(swap, gfp, &vmf);
+	vmf->vma = &pvma;
+	vmf->address = 0;
+	page = swap_cluster_readahead(swap, gfp, vmf);
 	shmem_pseudo_vma_destroy(&pvma);
 
 	return page;
@@ -1621,6 +1620,7 @@ static int shmem_swapin_page(struct inode *inode, pgoff_t index,
 	struct mm_struct *charge_mm = vma ? vma->vm_mm : current->mm;
 	struct mem_cgroup *memcg;
 	struct page *page;
+	struct vm_fault vmf;
 	swp_entry_t swap;
 	int error;
 
@@ -1638,7 +1638,7 @@ static int shmem_swapin_page(struct inode *inode, pgoff_t index,
 			count_memcg_event_mm(charge_mm, PGMAJFAULT);
 		}
 		/* Here we actually start the io */
-		page = shmem_swapin(swap, gfp, info, index);
+		page = shmem_swapin(swap, gfp, &vmf, info, index);
 		if (!page) {
 			error = -ENOMEM;
 			goto failed;
diff --git a/mm/swap_state.c b/mm/swap_state.c
index ebed37bbf7a3..b04f9893608d 100644
--- a/mm/swap_state.c
+++ b/mm/swap_state.c
@@ -357,11 +357,12 @@ struct page *lookup_swap_cache(swp_entry_t entry, struct vm_area_struct *vma,
 }
 
 struct page *__read_swap_cache_async(swp_entry_t entry, gfp_t gfp_mask,
-			struct vm_area_struct *vma, unsigned long addr,
-			bool *new_page_allocated)
+			struct vm_fault *vmf, bool *new_page_allocated)
 {
 	struct page *found_page = NULL, *new_page = NULL;
 	struct swap_info_struct *si;
+	struct vm_area_struct *vma = vmf->vma;
+	unsigned long addr = vmf->address;
 	int err;
 	*new_page_allocated = false;
 
@@ -446,11 +447,11 @@ struct page *__read_swap_cache_async(swp_entry_t entry, gfp_t gfp_mask,
  * the swap entry is no longer in use.
  */
 struct page *read_swap_cache_async(swp_entry_t entry, gfp_t gfp_mask,
-		struct vm_area_struct *vma, unsigned long addr, bool do_poll)
+					struct vm_fault *vmf, bool do_poll)
 {
 	bool page_was_allocated;
 	struct page *retpage = __read_swap_cache_async(entry, gfp_mask,
-			vma, addr, &page_was_allocated);
+						vmf, &page_was_allocated);
 
 	if (page_was_allocated)
 		swap_readpage(retpage, do_poll);
@@ -547,8 +548,6 @@ struct page *swap_cluster_readahead(swp_entry_t entry, gfp_t gfp_mask,
 	struct swap_info_struct *si = swp_swap_info(entry);
 	struct blk_plug plug;
 	bool do_poll = true, page_allocated;
-	struct vm_area_struct *vma = vmf->vma;
-	unsigned long addr = vmf->address;
 
 	mask = swapin_nr_pages(offset) - 1;
 	if (!mask)
@@ -575,7 +574,7 @@ struct page *swap_cluster_readahead(swp_entry_t entry, gfp_t gfp_mask,
 		/* Ok, do the async read-ahead now */
 		page = __read_swap_cache_async(
 			swp_entry(swp_type(entry), offset),
-			gfp_mask, vma, addr, &page_allocated);
+			gfp_mask, vmf, &page_allocated);
 		if (!page)
 			continue;
 		if (page_allocated) {
@@ -591,7 +590,7 @@ struct page *swap_cluster_readahead(swp_entry_t entry, gfp_t gfp_mask,
 
 	lru_add_drain();	/* Push any new pages onto the LRU now */
 skip:
-	return read_swap_cache_async(entry, gfp_mask, vma, addr, do_poll);
+	return read_swap_cache_async(entry, gfp_mask, vmf, do_poll);
 }
 
 int init_swap_address_space(unsigned int type, unsigned long nr_pages)
@@ -723,7 +722,6 @@ static struct page *swap_vma_readahead(swp_entry_t fentry, gfp_t gfp_mask,
 				       struct vm_fault *vmf)
 {
 	struct blk_plug plug;
-	struct vm_area_struct *vma = vmf->vma;
 	struct page *page;
 	pte_t *pte, pentry;
 	swp_entry_t entry;
@@ -746,8 +744,8 @@ static struct page *swap_vma_readahead(swp_entry_t fentry, gfp_t gfp_mask,
 		entry = pte_to_swp_entry(pentry);
 		if (unlikely(non_swap_entry(entry)))
 			continue;
-		page = __read_swap_cache_async(entry, gfp_mask, vma,
-					       vmf->address, &page_allocated);
+		page = __read_swap_cache_async(entry, gfp_mask, vmf,
+					       &page_allocated);
 		if (!page)
 			continue;
 		if (page_allocated) {
@@ -762,8 +760,7 @@ static struct page *swap_vma_readahead(swp_entry_t fentry, gfp_t gfp_mask,
 	blk_finish_plug(&plug);
 	lru_add_drain();
 skip:
-	return read_swap_cache_async(fentry, gfp_mask, vma, vmf->address,
-				     ra_info.win == 1);
+	return read_swap_cache_async(fentry, gfp_mask, vmf, ra_info.win == 1);
 }
 
 /**
diff --git a/mm/swapfile.c b/mm/swapfile.c
index 5871a2aa86a5..7b3a4aff7d35 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -1850,11 +1850,13 @@ static inline int pte_same_as_swp(pte_t pte, pte_t swp_pte)
  * just let do_wp_page work it out if a write is requested later - to
  * force COW, vm_page_prot omits write permission from any private vma.
  */
-static int unuse_pte(struct vm_area_struct *vma, pmd_t *pmd,
-		unsigned long addr, swp_entry_t entry, struct page *page)
+static int unuse_pte(struct vm_fault *vmf, swp_entry_t entry, struct page *page)
 {
 	struct page *swapcache;
 	struct mem_cgroup *memcg;
+	struct vm_area_struct *vma = vmf->vma;
+	pmd_t *pmd = vmf->pmd;
+	unsigned long addr = vmf->address;
 	spinlock_t *ptl;
 	pte_t *pte;
 	int ret = 1;
@@ -1949,7 +1951,7 @@ static int unuse_pte_range(struct vm_area_struct *vma, pmd_t *pmd,
 
 		lock_page(page);
 		wait_on_page_writeback(page);
-		ret = unuse_pte(vma, pmd, addr, entry, page);
+		ret = unuse_pte(&vmf, entry, page);
 		if (ret < 0) {
 			unlock_page(page);
 			put_page(page);
diff --git a/mm/zswap.c b/mm/zswap.c
index fbb782924ccc..ef5a3fe442d6 100644
--- a/mm/zswap.c
+++ b/mm/zswap.c
@@ -846,9 +846,10 @@ static int zswap_get_swap_cache_page(swp_entry_t entry,
 				struct page **retpage)
 {
 	bool page_was_allocated;
+	struct vm_fault vmf = { .vma = NULL, .address = 0};
 
 	*retpage = __read_swap_cache_async(entry, GFP_KERNEL,
-			NULL, 0, &page_was_allocated);
+			&vmf, &page_was_allocated);
 	if (page_was_allocated)
 		return ZSWAP_SWAPCACHE_NEW;
 	if (!*retpage)
-- 
1.8.3.1



^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH 1/2] memcg: folding CONFIG_MEMCG_SWAP as default
  2020-04-17 14:43 [PATCH 1/2] memcg: folding CONFIG_MEMCG_SWAP as default Alex Shi
  2020-04-17 14:43 ` [PATCH 2/2] mm/swap: clean up parameter pass in swapin funcs Alex Shi
@ 2020-04-17 15:53 ` Michal Hocko
  2020-04-17 16:41   ` Shakeel Butt
  2020-04-17 16:22 ` Shakeel Butt
  2 siblings, 1 reply; 9+ messages in thread
From: Michal Hocko @ 2020-04-17 15:53 UTC (permalink / raw)
  To: Alex Shi; +Cc: Andrew Morton, Johannes Weiner, linux-mm, cgroups

On Fri 17-04-20 22:43:43, Alex Shi wrote:
> This patch fold MEMCG_SWAP feature into kernel as default function. That
> required a short size memcg id for each of page. As Johannes mentioned
> 
> "the overhead of tracking is tiny - 512k per G of swap (0.04%).'
> 
> So all swapout page could be tracked for its memcg id.

I am perfectly OK with dropping the CONFIG_MEMCG_SWAP. The code that is
guarded by it is negligible and the resulting code is much easier to
read so no objection on that front. I just do not really see any real
reason to flip the default for cgroup v1. Why do we want/need that?
 
> Signed-off-by: Alex Shi <alex.shi@linux.alibaba.com>
> Cc: Andrew Morton <akpm@linux-foundation.org> 
> Cc: Johannes Weiner <hannes@cmpxchg.org> 
> Cc: linux-mm@kvack.org 
> Cc: cgroups@vger.kernel.org 
> ---
>  arch/arm/configs/omap2plus_defconfig        |  1 -
>  arch/arm64/configs/defconfig                |  1 -
>  arch/mips/configs/db1xxx_defconfig          |  1 -
>  arch/mips/configs/generic_defconfig         |  1 -
>  arch/mips/configs/loongson3_defconfig       |  1 -
>  arch/parisc/configs/generic-64bit_defconfig |  1 -
>  arch/powerpc/configs/powernv_defconfig      |  1 -
>  arch/powerpc/configs/pseries_defconfig      |  1 -
>  arch/s390/configs/debug_defconfig           |  1 -
>  arch/s390/configs/defconfig                 |  1 -
>  arch/sh/configs/sdk7786_defconfig           |  1 -
>  arch/sh/configs/urquell_defconfig           |  1 -
>  include/linux/memcontrol.h                  |  2 --
>  include/linux/swap.h                        | 27 --------------------------
>  include/linux/swap_cgroup.h                 | 30 -----------------------------
>  init/Kconfig                                | 20 -------------------
>  mm/Makefile                                 |  4 ++--
>  mm/memcontrol.c                             | 18 -----------------
>  18 files changed, 2 insertions(+), 111 deletions(-)
> 
> diff --git a/arch/arm/configs/omap2plus_defconfig b/arch/arm/configs/omap2plus_defconfig
> index 3cc3ca5fa027..929cfc4c062a 100644
> --- a/arch/arm/configs/omap2plus_defconfig
> +++ b/arch/arm/configs/omap2plus_defconfig
> @@ -10,7 +10,6 @@ CONFIG_IKCONFIG_PROC=y
>  CONFIG_LOG_BUF_SHIFT=16
>  CONFIG_CGROUPS=y
>  CONFIG_MEMCG=y
> -CONFIG_MEMCG_SWAP=y
>  CONFIG_BLK_CGROUP=y
>  CONFIG_CGROUP_SCHED=y
>  CONFIG_CFS_BANDWIDTH=y
> diff --git a/arch/arm64/configs/defconfig b/arch/arm64/configs/defconfig
> index 24e534d85045..3854c6140a04 100644
> --- a/arch/arm64/configs/defconfig
> +++ b/arch/arm64/configs/defconfig
> @@ -13,7 +13,6 @@ CONFIG_IKCONFIG=y
>  CONFIG_IKCONFIG_PROC=y
>  CONFIG_NUMA_BALANCING=y
>  CONFIG_MEMCG=y
> -CONFIG_MEMCG_SWAP=y
>  CONFIG_BLK_CGROUP=y
>  CONFIG_CGROUP_PIDS=y
>  CONFIG_CGROUP_HUGETLB=y
> diff --git a/arch/mips/configs/db1xxx_defconfig b/arch/mips/configs/db1xxx_defconfig
> index e6f3e8e3da39..4604e826d03f 100644
> --- a/arch/mips/configs/db1xxx_defconfig
> +++ b/arch/mips/configs/db1xxx_defconfig
> @@ -9,7 +9,6 @@ CONFIG_HIGH_RES_TIMERS=y
>  CONFIG_LOG_BUF_SHIFT=16
>  CONFIG_CGROUPS=y
>  CONFIG_MEMCG=y
> -CONFIG_MEMCG_SWAP=y
>  CONFIG_BLK_CGROUP=y
>  CONFIG_CGROUP_SCHED=y
>  CONFIG_CFS_BANDWIDTH=y
> diff --git a/arch/mips/configs/generic_defconfig b/arch/mips/configs/generic_defconfig
> index 714169e411cf..48e4e251779b 100644
> --- a/arch/mips/configs/generic_defconfig
> +++ b/arch/mips/configs/generic_defconfig
> @@ -3,7 +3,6 @@ CONFIG_NO_HZ_IDLE=y
>  CONFIG_IKCONFIG=y
>  CONFIG_IKCONFIG_PROC=y
>  CONFIG_MEMCG=y
> -CONFIG_MEMCG_SWAP=y
>  CONFIG_BLK_CGROUP=y
>  CONFIG_CFS_BANDWIDTH=y
>  CONFIG_RT_GROUP_SCHED=y
> diff --git a/arch/mips/configs/loongson3_defconfig b/arch/mips/configs/loongson3_defconfig
> index 51675f5000d6..f91f49595100 100644
> --- a/arch/mips/configs/loongson3_defconfig
> +++ b/arch/mips/configs/loongson3_defconfig
> @@ -13,7 +13,6 @@ CONFIG_TASK_DELAY_ACCT=y
>  CONFIG_TASK_XACCT=y
>  CONFIG_TASK_IO_ACCOUNTING=y
>  CONFIG_MEMCG=y
> -CONFIG_MEMCG_SWAP=y
>  CONFIG_BLK_CGROUP=y
>  CONFIG_CPUSETS=y
>  CONFIG_SCHED_AUTOGROUP=y
> diff --git a/arch/parisc/configs/generic-64bit_defconfig b/arch/parisc/configs/generic-64bit_defconfig
> index 59561e04e659..fab4401dd9f6 100644
> --- a/arch/parisc/configs/generic-64bit_defconfig
> +++ b/arch/parisc/configs/generic-64bit_defconfig
> @@ -10,7 +10,6 @@ CONFIG_TASK_XACCT=y
>  CONFIG_TASK_IO_ACCOUNTING=y
>  CONFIG_CGROUPS=y
>  CONFIG_MEMCG=y
> -CONFIG_MEMCG_SWAP=y
>  CONFIG_CGROUP_PIDS=y
>  CONFIG_CPUSETS=y
>  CONFIG_RELAY=y
> diff --git a/arch/powerpc/configs/powernv_defconfig b/arch/powerpc/configs/powernv_defconfig
> index df8bdbaa5d8f..c60c3a4125f7 100644
> --- a/arch/powerpc/configs/powernv_defconfig
> +++ b/arch/powerpc/configs/powernv_defconfig
> @@ -17,7 +17,6 @@ CONFIG_LOG_CPU_MAX_BUF_SHIFT=13
>  CONFIG_NUMA_BALANCING=y
>  CONFIG_CGROUPS=y
>  CONFIG_MEMCG=y
> -CONFIG_MEMCG_SWAP=y
>  CONFIG_CGROUP_SCHED=y
>  CONFIG_CGROUP_FREEZER=y
>  CONFIG_CPUSETS=y
> diff --git a/arch/powerpc/configs/pseries_defconfig b/arch/powerpc/configs/pseries_defconfig
> index 0bea4d3ffb85..426244f491ec 100644
> --- a/arch/powerpc/configs/pseries_defconfig
> +++ b/arch/powerpc/configs/pseries_defconfig
> @@ -16,7 +16,6 @@ CONFIG_LOG_CPU_MAX_BUF_SHIFT=13
>  CONFIG_NUMA_BALANCING=y
>  CONFIG_CGROUPS=y
>  CONFIG_MEMCG=y
> -CONFIG_MEMCG_SWAP=y
>  CONFIG_CGROUP_SCHED=y
>  CONFIG_CGROUP_FREEZER=y
>  CONFIG_CPUSETS=y
> diff --git a/arch/s390/configs/debug_defconfig b/arch/s390/configs/debug_defconfig
> index 46038bc58c9e..864b32403673 100644
> --- a/arch/s390/configs/debug_defconfig
> +++ b/arch/s390/configs/debug_defconfig
> @@ -14,7 +14,6 @@ CONFIG_IKCONFIG=y
>  CONFIG_IKCONFIG_PROC=y
>  CONFIG_NUMA_BALANCING=y
>  CONFIG_MEMCG=y
> -CONFIG_MEMCG_SWAP=y
>  CONFIG_BLK_CGROUP=y
>  CONFIG_CFS_BANDWIDTH=y
>  CONFIG_RT_GROUP_SCHED=y
> diff --git a/arch/s390/configs/defconfig b/arch/s390/configs/defconfig
> index 7cd0648c1f4e..991e60d7c307 100644
> --- a/arch/s390/configs/defconfig
> +++ b/arch/s390/configs/defconfig
> @@ -13,7 +13,6 @@ CONFIG_IKCONFIG=y
>  CONFIG_IKCONFIG_PROC=y
>  CONFIG_NUMA_BALANCING=y
>  CONFIG_MEMCG=y
> -CONFIG_MEMCG_SWAP=y
>  CONFIG_BLK_CGROUP=y
>  CONFIG_CFS_BANDWIDTH=y
>  CONFIG_RT_GROUP_SCHED=y
> diff --git a/arch/sh/configs/sdk7786_defconfig b/arch/sh/configs/sdk7786_defconfig
> index 7fa116b436c3..4a1d19068a8d 100644
> --- a/arch/sh/configs/sdk7786_defconfig
> +++ b/arch/sh/configs/sdk7786_defconfig
> @@ -17,7 +17,6 @@ CONFIG_CPUSETS=y
>  # CONFIG_PROC_PID_CPUSET is not set
>  CONFIG_CGROUP_CPUACCT=y
>  CONFIG_CGROUP_MEMCG=y
> -CONFIG_CGROUP_MEMCG_SWAP=y
>  CONFIG_CGROUP_SCHED=y
>  CONFIG_RT_GROUP_SCHED=y
>  CONFIG_BLK_CGROUP=y
> diff --git a/arch/sh/configs/urquell_defconfig b/arch/sh/configs/urquell_defconfig
> index cb2f56468fe0..be478f3148f2 100644
> --- a/arch/sh/configs/urquell_defconfig
> +++ b/arch/sh/configs/urquell_defconfig
> @@ -14,7 +14,6 @@ CONFIG_CPUSETS=y
>  # CONFIG_PROC_PID_CPUSET is not set
>  CONFIG_CGROUP_CPUACCT=y
>  CONFIG_CGROUP_MEMCG=y
> -CONFIG_CGROUP_MEMCG_SWAP=y
>  CONFIG_CGROUP_SCHED=y
>  CONFIG_RT_GROUP_SCHED=y
>  CONFIG_BLK_DEV_INITRD=y
> diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
> index 1b4150ff64be..45842ed8ba53 100644
> --- a/include/linux/memcontrol.h
> +++ b/include/linux/memcontrol.h
> @@ -569,9 +569,7 @@ struct mem_cgroup *mem_cgroup_get_oom_group(struct task_struct *victim,
>  					    struct mem_cgroup *oom_domain);
>  void mem_cgroup_print_oom_group(struct mem_cgroup *memcg);
>  
> -#ifdef CONFIG_MEMCG_SWAP
>  extern int do_swap_account;
> -#endif
>  
>  struct mem_cgroup *lock_page_memcg(struct page *page);
>  void __unlock_page_memcg(struct mem_cgroup *memcg);
> diff --git a/include/linux/swap.h b/include/linux/swap.h
> index b835d8dbea0e..7e44e1e6ef27 100644
> --- a/include/linux/swap.h
> +++ b/include/linux/swap.h
> @@ -654,38 +654,11 @@ static inline void mem_cgroup_throttle_swaprate(struct mem_cgroup *memcg,
>  }
>  #endif
>  
> -#ifdef CONFIG_MEMCG_SWAP
>  extern void mem_cgroup_swapout(struct page *page, swp_entry_t entry);
>  extern int mem_cgroup_try_charge_swap(struct page *page, swp_entry_t entry);
>  extern void mem_cgroup_uncharge_swap(swp_entry_t entry, unsigned int nr_pages);
>  extern long mem_cgroup_get_nr_swap_pages(struct mem_cgroup *memcg);
>  extern bool mem_cgroup_swap_full(struct page *page);
> -#else
> -static inline void mem_cgroup_swapout(struct page *page, swp_entry_t entry)
> -{
> -}
> -
> -static inline int mem_cgroup_try_charge_swap(struct page *page,
> -					     swp_entry_t entry)
> -{
> -	return 0;
> -}
> -
> -static inline void mem_cgroup_uncharge_swap(swp_entry_t entry,
> -					    unsigned int nr_pages)
> -{
> -}
> -
> -static inline long mem_cgroup_get_nr_swap_pages(struct mem_cgroup *memcg)
> -{
> -	return get_nr_swap_pages();
> -}
> -
> -static inline bool mem_cgroup_swap_full(struct page *page)
> -{
> -	return vm_swap_full();
> -}
> -#endif
>  
>  #endif /* __KERNEL__*/
>  #endif /* _LINUX_SWAP_H */
> diff --git a/include/linux/swap_cgroup.h b/include/linux/swap_cgroup.h
> index a12dd1c3966c..5ac87b3b0e08 100644
> --- a/include/linux/swap_cgroup.h
> +++ b/include/linux/swap_cgroup.h
> @@ -4,8 +4,6 @@
>  
>  #include <linux/swap.h>
>  
> -#ifdef CONFIG_MEMCG_SWAP
> -
>  extern unsigned short swap_cgroup_cmpxchg(swp_entry_t ent,
>  					unsigned short old, unsigned short new);
>  extern unsigned short swap_cgroup_record(swp_entry_t ent, unsigned short id,
> @@ -14,32 +12,4 @@ extern unsigned short swap_cgroup_record(swp_entry_t ent, unsigned short id,
>  extern int swap_cgroup_swapon(int type, unsigned long max_pages);
>  extern void swap_cgroup_swapoff(int type);
>  
> -#else
> -
> -static inline
> -unsigned short swap_cgroup_record(swp_entry_t ent, unsigned short id,
> -				  unsigned int nr_ents)
> -{
> -	return 0;
> -}
> -
> -static inline
> -unsigned short lookup_swap_cgroup_id(swp_entry_t ent)
> -{
> -	return 0;
> -}
> -
> -static inline int
> -swap_cgroup_swapon(int type, unsigned long max_pages)
> -{
> -	return 0;
> -}
> -
> -static inline void swap_cgroup_swapoff(int type)
> -{
> -	return;
> -}
> -
> -#endif /* CONFIG_MEMCG_SWAP */
> -
>  #endif /* __LINUX_SWAP_CGROUP_H */
> diff --git a/init/Kconfig b/init/Kconfig
> index 9e22ee8fbd75..3b09c71e9b57 100644
> --- a/init/Kconfig
> +++ b/init/Kconfig
> @@ -834,26 +834,6 @@ config MEMCG
>  	help
>  	  Provides control over the memory footprint of tasks in a cgroup.
>  
> -config MEMCG_SWAP
> -	bool "Swap controller"
> -	depends on MEMCG && SWAP
> -	help
> -	  Provides control over the swap space consumed by tasks in a cgroup.
> -
> -config MEMCG_SWAP_ENABLED
> -	bool "Swap controller enabled by default"
> -	depends on MEMCG_SWAP
> -	default y
> -	help
> -	  Memory Resource Controller Swap Extension comes with its price in
> -	  a bigger memory consumption. General purpose distribution kernels
> -	  which want to enable the feature but keep it disabled by default
> -	  and let the user enable it by swapaccount=1 boot command line
> -	  parameter should have this option unselected.
> -	  For those who want to have the feature enabled by default should
> -	  select this option (if, for some reason, they need to disable it
> -	  then swapaccount=0 does the trick).
> -
>  config MEMCG_KMEM
>  	bool
>  	depends on MEMCG && !SLOB
> diff --git a/mm/Makefile b/mm/Makefile
> index fccd3756b25f..812c08323355 100644
> --- a/mm/Makefile
> +++ b/mm/Makefile
> @@ -58,7 +58,8 @@ ifdef CONFIG_MMU
>  	obj-$(CONFIG_ADVISE_SYSCALLS)	+= madvise.o
>  endif
>  
> -obj-$(CONFIG_SWAP)	+= page_io.o swap_state.o swapfile.o swap_slots.o
> +obj-$(CONFIG_SWAP)	+= page_io.o swap_state.o swapfile.o swap_slots.o \
> +			   swap_cgroup.o
>  obj-$(CONFIG_FRONTSWAP)	+= frontswap.o
>  obj-$(CONFIG_ZSWAP)	+= zswap.o
>  obj-$(CONFIG_HAS_DMA)	+= dmapool.o
> @@ -80,7 +81,6 @@ obj-$(CONFIG_MIGRATION) += migrate.o
>  obj-$(CONFIG_TRANSPARENT_HUGEPAGE) += huge_memory.o khugepaged.o
>  obj-$(CONFIG_PAGE_COUNTER) += page_counter.o
>  obj-$(CONFIG_MEMCG) += memcontrol.o vmpressure.o
> -obj-$(CONFIG_MEMCG_SWAP) += swap_cgroup.o
>  obj-$(CONFIG_CGROUP_HUGETLB) += hugetlb_cgroup.o
>  obj-$(CONFIG_GUP_BENCHMARK) += gup_benchmark.o
>  obj-$(CONFIG_MEMORY_FAILURE) += memory-failure.o
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index 5beea03dd58a..446141b6597a 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -82,11 +82,7 @@
>  static bool cgroup_memory_nokmem;
>  
>  /* Whether the swap controller is active */
> -#ifdef CONFIG_MEMCG_SWAP
>  int do_swap_account __read_mostly;
> -#else
> -#define do_swap_account		0
> -#endif
>  
>  #ifdef CONFIG_CGROUP_WRITEBACK
>  static DECLARE_WAIT_QUEUE_HEAD(memcg_cgwb_frn_waitq);
> @@ -3020,7 +3016,6 @@ void mem_cgroup_split_huge_fixup(struct page *head)
>  }
>  #endif /* CONFIG_TRANSPARENT_HUGEPAGE */
>  
> -#ifdef CONFIG_MEMCG_SWAP
>  /**
>   * mem_cgroup_move_swap_account - move swap charge and swap_cgroup's record.
>   * @entry: swap entry to be moved
> @@ -3050,13 +3045,6 @@ static int mem_cgroup_move_swap_account(swp_entry_t entry,
>  	}
>  	return -EINVAL;
>  }
> -#else
> -static inline int mem_cgroup_move_swap_account(swp_entry_t entry,
> -				struct mem_cgroup *from, struct mem_cgroup *to)
> -{
> -	return -EINVAL;
> -}
> -#endif
>  
>  static DEFINE_MUTEX(memcg_max_mutex);
>  
> @@ -6936,7 +6924,6 @@ static int __init mem_cgroup_init(void)
>  }
>  subsys_initcall(mem_cgroup_init);
>  
> -#ifdef CONFIG_MEMCG_SWAP
>  static struct mem_cgroup *mem_cgroup_id_get_online(struct mem_cgroup *memcg)
>  {
>  	while (!refcount_inc_not_zero(&memcg->id.ref)) {
> @@ -7137,11 +7124,7 @@ bool mem_cgroup_swap_full(struct page *page)
>  }
>  
>  /* for remember boot option*/
> -#ifdef CONFIG_MEMCG_SWAP_ENABLED
>  static int really_do_swap_account __initdata = 1;
> -#else
> -static int really_do_swap_account __initdata;
> -#endif
>  
>  static int __init enable_swap_account(char *s)
>  {
> @@ -7257,4 +7240,3 @@ static int __init mem_cgroup_swap_init(void)
>  }
>  subsys_initcall(mem_cgroup_swap_init);
>  
> -#endif /* CONFIG_MEMCG_SWAP */
> -- 
> 1.8.3.1
> 

-- 
Michal Hocko
SUSE Labs


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 1/2] memcg: folding CONFIG_MEMCG_SWAP as default
  2020-04-17 14:43 [PATCH 1/2] memcg: folding CONFIG_MEMCG_SWAP as default Alex Shi
  2020-04-17 14:43 ` [PATCH 2/2] mm/swap: clean up parameter pass in swapin funcs Alex Shi
  2020-04-17 15:53 ` [PATCH 1/2] memcg: folding CONFIG_MEMCG_SWAP as default Michal Hocko
@ 2020-04-17 16:22 ` Shakeel Butt
  2 siblings, 0 replies; 9+ messages in thread
From: Shakeel Butt @ 2020-04-17 16:22 UTC (permalink / raw)
  To: Alex Shi; +Cc: Andrew Morton, Johannes Weiner, Linux MM, Cgroups

On Fri, Apr 17, 2020 at 7:44 AM Alex Shi <alex.shi@linux.alibaba.com> wrote:
>
> This patch fold MEMCG_SWAP feature into kernel as default function. That
> required a short size memcg id for each of page. As Johannes mentioned
>
> "the overhead of tracking is tiny - 512k per G of swap (0.04%).'
>
> So all swapout page could be tracked for its memcg id.
>
> Signed-off-by: Alex Shi <alex.shi@linux.alibaba.com>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Johannes Weiner <hannes@cmpxchg.org>
> Cc: linux-mm@kvack.org
> Cc: cgroups@vger.kernel.org
> ---
>  arch/arm/configs/omap2plus_defconfig        |  1 -
>  arch/arm64/configs/defconfig                |  1 -
>  arch/mips/configs/db1xxx_defconfig          |  1 -
>  arch/mips/configs/generic_defconfig         |  1 -
>  arch/mips/configs/loongson3_defconfig       |  1 -
>  arch/parisc/configs/generic-64bit_defconfig |  1 -
>  arch/powerpc/configs/powernv_defconfig      |  1 -
>  arch/powerpc/configs/pseries_defconfig      |  1 -
>  arch/s390/configs/debug_defconfig           |  1 -
>  arch/s390/configs/defconfig                 |  1 -
>  arch/sh/configs/sdk7786_defconfig           |  1 -
>  arch/sh/configs/urquell_defconfig           |  1 -
>  include/linux/memcontrol.h                  |  2 --
>  include/linux/swap.h                        | 27 --------------------------
>  include/linux/swap_cgroup.h                 | 30 -----------------------------
>  init/Kconfig                                | 20 -------------------
>  mm/Makefile                                 |  4 ++--
>  mm/memcontrol.c                             | 18 -----------------
>  18 files changed, 2 insertions(+), 111 deletions(-)
>
> diff --git a/arch/arm/configs/omap2plus_defconfig b/arch/arm/configs/omap2plus_defconfig
> index 3cc3ca5fa027..929cfc4c062a 100644
> --- a/arch/arm/configs/omap2plus_defconfig
> +++ b/arch/arm/configs/omap2plus_defconfig
> @@ -10,7 +10,6 @@ CONFIG_IKCONFIG_PROC=y
>  CONFIG_LOG_BUF_SHIFT=16
>  CONFIG_CGROUPS=y
>  CONFIG_MEMCG=y
> -CONFIG_MEMCG_SWAP=y
>  CONFIG_BLK_CGROUP=y
>  CONFIG_CGROUP_SCHED=y
>  CONFIG_CFS_BANDWIDTH=y
> diff --git a/arch/arm64/configs/defconfig b/arch/arm64/configs/defconfig
> index 24e534d85045..3854c6140a04 100644
> --- a/arch/arm64/configs/defconfig
> +++ b/arch/arm64/configs/defconfig
> @@ -13,7 +13,6 @@ CONFIG_IKCONFIG=y
>  CONFIG_IKCONFIG_PROC=y
>  CONFIG_NUMA_BALANCING=y
>  CONFIG_MEMCG=y
> -CONFIG_MEMCG_SWAP=y
>  CONFIG_BLK_CGROUP=y
>  CONFIG_CGROUP_PIDS=y
>  CONFIG_CGROUP_HUGETLB=y
> diff --git a/arch/mips/configs/db1xxx_defconfig b/arch/mips/configs/db1xxx_defconfig
> index e6f3e8e3da39..4604e826d03f 100644
> --- a/arch/mips/configs/db1xxx_defconfig
> +++ b/arch/mips/configs/db1xxx_defconfig
> @@ -9,7 +9,6 @@ CONFIG_HIGH_RES_TIMERS=y
>  CONFIG_LOG_BUF_SHIFT=16
>  CONFIG_CGROUPS=y
>  CONFIG_MEMCG=y
> -CONFIG_MEMCG_SWAP=y
>  CONFIG_BLK_CGROUP=y
>  CONFIG_CGROUP_SCHED=y
>  CONFIG_CFS_BANDWIDTH=y
> diff --git a/arch/mips/configs/generic_defconfig b/arch/mips/configs/generic_defconfig
> index 714169e411cf..48e4e251779b 100644
> --- a/arch/mips/configs/generic_defconfig
> +++ b/arch/mips/configs/generic_defconfig
> @@ -3,7 +3,6 @@ CONFIG_NO_HZ_IDLE=y
>  CONFIG_IKCONFIG=y
>  CONFIG_IKCONFIG_PROC=y
>  CONFIG_MEMCG=y
> -CONFIG_MEMCG_SWAP=y
>  CONFIG_BLK_CGROUP=y
>  CONFIG_CFS_BANDWIDTH=y
>  CONFIG_RT_GROUP_SCHED=y
> diff --git a/arch/mips/configs/loongson3_defconfig b/arch/mips/configs/loongson3_defconfig
> index 51675f5000d6..f91f49595100 100644
> --- a/arch/mips/configs/loongson3_defconfig
> +++ b/arch/mips/configs/loongson3_defconfig
> @@ -13,7 +13,6 @@ CONFIG_TASK_DELAY_ACCT=y
>  CONFIG_TASK_XACCT=y
>  CONFIG_TASK_IO_ACCOUNTING=y
>  CONFIG_MEMCG=y
> -CONFIG_MEMCG_SWAP=y
>  CONFIG_BLK_CGROUP=y
>  CONFIG_CPUSETS=y
>  CONFIG_SCHED_AUTOGROUP=y
> diff --git a/arch/parisc/configs/generic-64bit_defconfig b/arch/parisc/configs/generic-64bit_defconfig
> index 59561e04e659..fab4401dd9f6 100644
> --- a/arch/parisc/configs/generic-64bit_defconfig
> +++ b/arch/parisc/configs/generic-64bit_defconfig
> @@ -10,7 +10,6 @@ CONFIG_TASK_XACCT=y
>  CONFIG_TASK_IO_ACCOUNTING=y
>  CONFIG_CGROUPS=y
>  CONFIG_MEMCG=y
> -CONFIG_MEMCG_SWAP=y
>  CONFIG_CGROUP_PIDS=y
>  CONFIG_CPUSETS=y
>  CONFIG_RELAY=y
> diff --git a/arch/powerpc/configs/powernv_defconfig b/arch/powerpc/configs/powernv_defconfig
> index df8bdbaa5d8f..c60c3a4125f7 100644
> --- a/arch/powerpc/configs/powernv_defconfig
> +++ b/arch/powerpc/configs/powernv_defconfig
> @@ -17,7 +17,6 @@ CONFIG_LOG_CPU_MAX_BUF_SHIFT=13
>  CONFIG_NUMA_BALANCING=y
>  CONFIG_CGROUPS=y
>  CONFIG_MEMCG=y
> -CONFIG_MEMCG_SWAP=y
>  CONFIG_CGROUP_SCHED=y
>  CONFIG_CGROUP_FREEZER=y
>  CONFIG_CPUSETS=y
> diff --git a/arch/powerpc/configs/pseries_defconfig b/arch/powerpc/configs/pseries_defconfig
> index 0bea4d3ffb85..426244f491ec 100644
> --- a/arch/powerpc/configs/pseries_defconfig
> +++ b/arch/powerpc/configs/pseries_defconfig
> @@ -16,7 +16,6 @@ CONFIG_LOG_CPU_MAX_BUF_SHIFT=13
>  CONFIG_NUMA_BALANCING=y
>  CONFIG_CGROUPS=y
>  CONFIG_MEMCG=y
> -CONFIG_MEMCG_SWAP=y
>  CONFIG_CGROUP_SCHED=y
>  CONFIG_CGROUP_FREEZER=y
>  CONFIG_CPUSETS=y
> diff --git a/arch/s390/configs/debug_defconfig b/arch/s390/configs/debug_defconfig
> index 46038bc58c9e..864b32403673 100644
> --- a/arch/s390/configs/debug_defconfig
> +++ b/arch/s390/configs/debug_defconfig
> @@ -14,7 +14,6 @@ CONFIG_IKCONFIG=y
>  CONFIG_IKCONFIG_PROC=y
>  CONFIG_NUMA_BALANCING=y
>  CONFIG_MEMCG=y
> -CONFIG_MEMCG_SWAP=y
>  CONFIG_BLK_CGROUP=y
>  CONFIG_CFS_BANDWIDTH=y
>  CONFIG_RT_GROUP_SCHED=y
> diff --git a/arch/s390/configs/defconfig b/arch/s390/configs/defconfig
> index 7cd0648c1f4e..991e60d7c307 100644
> --- a/arch/s390/configs/defconfig
> +++ b/arch/s390/configs/defconfig
> @@ -13,7 +13,6 @@ CONFIG_IKCONFIG=y
>  CONFIG_IKCONFIG_PROC=y
>  CONFIG_NUMA_BALANCING=y
>  CONFIG_MEMCG=y
> -CONFIG_MEMCG_SWAP=y
>  CONFIG_BLK_CGROUP=y
>  CONFIG_CFS_BANDWIDTH=y
>  CONFIG_RT_GROUP_SCHED=y
> diff --git a/arch/sh/configs/sdk7786_defconfig b/arch/sh/configs/sdk7786_defconfig
> index 7fa116b436c3..4a1d19068a8d 100644
> --- a/arch/sh/configs/sdk7786_defconfig
> +++ b/arch/sh/configs/sdk7786_defconfig
> @@ -17,7 +17,6 @@ CONFIG_CPUSETS=y
>  # CONFIG_PROC_PID_CPUSET is not set
>  CONFIG_CGROUP_CPUACCT=y
>  CONFIG_CGROUP_MEMCG=y
> -CONFIG_CGROUP_MEMCG_SWAP=y
>  CONFIG_CGROUP_SCHED=y
>  CONFIG_RT_GROUP_SCHED=y
>  CONFIG_BLK_CGROUP=y
> diff --git a/arch/sh/configs/urquell_defconfig b/arch/sh/configs/urquell_defconfig
> index cb2f56468fe0..be478f3148f2 100644
> --- a/arch/sh/configs/urquell_defconfig
> +++ b/arch/sh/configs/urquell_defconfig
> @@ -14,7 +14,6 @@ CONFIG_CPUSETS=y
>  # CONFIG_PROC_PID_CPUSET is not set
>  CONFIG_CGROUP_CPUACCT=y
>  CONFIG_CGROUP_MEMCG=y
> -CONFIG_CGROUP_MEMCG_SWAP=y
>  CONFIG_CGROUP_SCHED=y
>  CONFIG_RT_GROUP_SCHED=y
>  CONFIG_BLK_DEV_INITRD=y
> diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
> index 1b4150ff64be..45842ed8ba53 100644
> --- a/include/linux/memcontrol.h
> +++ b/include/linux/memcontrol.h
> @@ -569,9 +569,7 @@ struct mem_cgroup *mem_cgroup_get_oom_group(struct task_struct *victim,
>                                             struct mem_cgroup *oom_domain);
>  void mem_cgroup_print_oom_group(struct mem_cgroup *memcg);
>
> -#ifdef CONFIG_MEMCG_SWAP
>  extern int do_swap_account;

This can be removed too.

> -#endif
>
>  struct mem_cgroup *lock_page_memcg(struct page *page);
>  void __unlock_page_memcg(struct mem_cgroup *memcg);
> diff --git a/include/linux/swap.h b/include/linux/swap.h
> index b835d8dbea0e..7e44e1e6ef27 100644
> --- a/include/linux/swap.h
> +++ b/include/linux/swap.h
> @@ -654,38 +654,11 @@ static inline void mem_cgroup_throttle_swaprate(struct mem_cgroup *memcg,
>  }
>  #endif
>
> -#ifdef CONFIG_MEMCG_SWAP
>  extern void mem_cgroup_swapout(struct page *page, swp_entry_t entry);
>  extern int mem_cgroup_try_charge_swap(struct page *page, swp_entry_t entry);
>  extern void mem_cgroup_uncharge_swap(swp_entry_t entry, unsigned int nr_pages);
>  extern long mem_cgroup_get_nr_swap_pages(struct mem_cgroup *memcg);
>  extern bool mem_cgroup_swap_full(struct page *page);
> -#else
> -static inline void mem_cgroup_swapout(struct page *page, swp_entry_t entry)
> -{
> -}
> -
> -static inline int mem_cgroup_try_charge_swap(struct page *page,
> -                                            swp_entry_t entry)
> -{
> -       return 0;
> -}
> -
> -static inline void mem_cgroup_uncharge_swap(swp_entry_t entry,
> -                                           unsigned int nr_pages)
> -{
> -}
> -
> -static inline long mem_cgroup_get_nr_swap_pages(struct mem_cgroup *memcg)
> -{
> -       return get_nr_swap_pages();
> -}
> -
> -static inline bool mem_cgroup_swap_full(struct page *page)
> -{
> -       return vm_swap_full();
> -}
> -#endif
>
>  #endif /* __KERNEL__*/
>  #endif /* _LINUX_SWAP_H */
> diff --git a/include/linux/swap_cgroup.h b/include/linux/swap_cgroup.h
> index a12dd1c3966c..5ac87b3b0e08 100644
> --- a/include/linux/swap_cgroup.h
> +++ b/include/linux/swap_cgroup.h
> @@ -4,8 +4,6 @@
>
>  #include <linux/swap.h>
>
> -#ifdef CONFIG_MEMCG_SWAP
> -
>  extern unsigned short swap_cgroup_cmpxchg(swp_entry_t ent,
>                                         unsigned short old, unsigned short new);
>  extern unsigned short swap_cgroup_record(swp_entry_t ent, unsigned short id,
> @@ -14,32 +12,4 @@ extern unsigned short swap_cgroup_record(swp_entry_t ent, unsigned short id,
>  extern int swap_cgroup_swapon(int type, unsigned long max_pages);
>  extern void swap_cgroup_swapoff(int type);
>
> -#else
> -
> -static inline
> -unsigned short swap_cgroup_record(swp_entry_t ent, unsigned short id,
> -                                 unsigned int nr_ents)
> -{
> -       return 0;
> -}
> -
> -static inline
> -unsigned short lookup_swap_cgroup_id(swp_entry_t ent)
> -{
> -       return 0;
> -}
> -
> -static inline int
> -swap_cgroup_swapon(int type, unsigned long max_pages)
> -{
> -       return 0;
> -}
> -
> -static inline void swap_cgroup_swapoff(int type)
> -{
> -       return;
> -}
> -
> -#endif /* CONFIG_MEMCG_SWAP */
> -
>  #endif /* __LINUX_SWAP_CGROUP_H */
> diff --git a/init/Kconfig b/init/Kconfig
> index 9e22ee8fbd75..3b09c71e9b57 100644
> --- a/init/Kconfig
> +++ b/init/Kconfig
> @@ -834,26 +834,6 @@ config MEMCG
>         help
>           Provides control over the memory footprint of tasks in a cgroup.
>
> -config MEMCG_SWAP
> -       bool "Swap controller"
> -       depends on MEMCG && SWAP
> -       help
> -         Provides control over the swap space consumed by tasks in a cgroup.
> -
> -config MEMCG_SWAP_ENABLED
> -       bool "Swap controller enabled by default"
> -       depends on MEMCG_SWAP
> -       default y
> -       help
> -         Memory Resource Controller Swap Extension comes with its price in
> -         a bigger memory consumption. General purpose distribution kernels
> -         which want to enable the feature but keep it disabled by default
> -         and let the user enable it by swapaccount=1 boot command line
> -         parameter should have this option unselected.
> -         For those who want to have the feature enabled by default should
> -         select this option (if, for some reason, they need to disable it
> -         then swapaccount=0 does the trick).
> -
>  config MEMCG_KMEM
>         bool
>         depends on MEMCG && !SLOB
> diff --git a/mm/Makefile b/mm/Makefile
> index fccd3756b25f..812c08323355 100644
> --- a/mm/Makefile
> +++ b/mm/Makefile
> @@ -58,7 +58,8 @@ ifdef CONFIG_MMU
>         obj-$(CONFIG_ADVISE_SYSCALLS)   += madvise.o
>  endif
>
> -obj-$(CONFIG_SWAP)     += page_io.o swap_state.o swapfile.o swap_slots.o
> +obj-$(CONFIG_SWAP)     += page_io.o swap_state.o swapfile.o swap_slots.o \
> +                          swap_cgroup.o
>  obj-$(CONFIG_FRONTSWAP)        += frontswap.o
>  obj-$(CONFIG_ZSWAP)    += zswap.o
>  obj-$(CONFIG_HAS_DMA)  += dmapool.o
> @@ -80,7 +81,6 @@ obj-$(CONFIG_MIGRATION) += migrate.o
>  obj-$(CONFIG_TRANSPARENT_HUGEPAGE) += huge_memory.o khugepaged.o
>  obj-$(CONFIG_PAGE_COUNTER) += page_counter.o
>  obj-$(CONFIG_MEMCG) += memcontrol.o vmpressure.o
> -obj-$(CONFIG_MEMCG_SWAP) += swap_cgroup.o
>  obj-$(CONFIG_CGROUP_HUGETLB) += hugetlb_cgroup.o
>  obj-$(CONFIG_GUP_BENCHMARK) += gup_benchmark.o
>  obj-$(CONFIG_MEMORY_FAILURE) += memory-failure.o
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index 5beea03dd58a..446141b6597a 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -82,11 +82,7 @@
>  static bool cgroup_memory_nokmem;
>
>  /* Whether the swap controller is active */
> -#ifdef CONFIG_MEMCG_SWAP
>  int do_swap_account __read_mostly;
> -#else
> -#define do_swap_account                0
> -#endif
>
>  #ifdef CONFIG_CGROUP_WRITEBACK
>  static DECLARE_WAIT_QUEUE_HEAD(memcg_cgwb_frn_waitq);
> @@ -3020,7 +3016,6 @@ void mem_cgroup_split_huge_fixup(struct page *head)
>  }
>  #endif /* CONFIG_TRANSPARENT_HUGEPAGE */
>
> -#ifdef CONFIG_MEMCG_SWAP
>  /**
>   * mem_cgroup_move_swap_account - move swap charge and swap_cgroup's record.
>   * @entry: swap entry to be moved
> @@ -3050,13 +3045,6 @@ static int mem_cgroup_move_swap_account(swp_entry_t entry,
>         }
>         return -EINVAL;
>  }
> -#else
> -static inline int mem_cgroup_move_swap_account(swp_entry_t entry,
> -                               struct mem_cgroup *from, struct mem_cgroup *to)
> -{
> -       return -EINVAL;
> -}
> -#endif
>
>  static DEFINE_MUTEX(memcg_max_mutex);
>
> @@ -6936,7 +6924,6 @@ static int __init mem_cgroup_init(void)
>  }
>  subsys_initcall(mem_cgroup_init);
>
> -#ifdef CONFIG_MEMCG_SWAP
>  static struct mem_cgroup *mem_cgroup_id_get_online(struct mem_cgroup *memcg)
>  {
>         while (!refcount_inc_not_zero(&memcg->id.ref)) {
> @@ -7137,11 +7124,7 @@ bool mem_cgroup_swap_full(struct page *page)
>  }
>
>  /* for remember boot option*/
> -#ifdef CONFIG_MEMCG_SWAP_ENABLED
>  static int really_do_swap_account __initdata = 1;
> -#else
> -static int really_do_swap_account __initdata;
> -#endif
>
>  static int __init enable_swap_account(char *s)
>  {
> @@ -7257,4 +7240,3 @@ static int __init mem_cgroup_swap_init(void)
>  }
>  subsys_initcall(mem_cgroup_swap_init);
>
> -#endif /* CONFIG_MEMCG_SWAP */
> --
> 1.8.3.1
>


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 1/2] memcg: folding CONFIG_MEMCG_SWAP as default
  2020-04-17 15:53 ` [PATCH 1/2] memcg: folding CONFIG_MEMCG_SWAP as default Michal Hocko
@ 2020-04-17 16:41   ` Shakeel Butt
  2020-04-17 16:54     ` Michal Hocko
  0 siblings, 1 reply; 9+ messages in thread
From: Shakeel Butt @ 2020-04-17 16:41 UTC (permalink / raw)
  To: Michal Hocko; +Cc: Alex Shi, Andrew Morton, Johannes Weiner, Linux MM, Cgroups

On Fri, Apr 17, 2020 at 9:03 AM Michal Hocko <mhocko@kernel.org> wrote:
>
> On Fri 17-04-20 22:43:43, Alex Shi wrote:
> > This patch fold MEMCG_SWAP feature into kernel as default function. That
> > required a short size memcg id for each of page. As Johannes mentioned
> >
> > "the overhead of tracking is tiny - 512k per G of swap (0.04%).'
> >
> > So all swapout page could be tracked for its memcg id.
>
> I am perfectly OK with dropping the CONFIG_MEMCG_SWAP. The code that is
> guarded by it is negligible and the resulting code is much easier to
> read so no objection on that front. I just do not really see any real
> reason to flip the default for cgroup v1. Why do we want/need that?
>

Yes, the changelog is lacking the motivation of this change. This is
proposed by Johannes and I was actually expecting the patch from him.
The motivation is to make the things simpler for per-memcg LRU locking
and workingset for anon memory (Johannes has described these really
well, lemme find the email). If we keep the differentiation between
cgroup v1 and v2, then there is actually no point of this cleanup as
per-memcg LRU locking and anon workingset still has to handle the
!do_swap_account case.

Shakeel


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 1/2] memcg: folding CONFIG_MEMCG_SWAP as default
  2020-04-17 16:41   ` Shakeel Butt
@ 2020-04-17 16:54     ` Michal Hocko
  2020-04-17 17:35       ` Shakeel Butt
  2020-04-18 13:44       ` Alex Shi
  0 siblings, 2 replies; 9+ messages in thread
From: Michal Hocko @ 2020-04-17 16:54 UTC (permalink / raw)
  To: Shakeel Butt; +Cc: Alex Shi, Andrew Morton, Johannes Weiner, Linux MM, Cgroups

On Fri 17-04-20 09:41:04, Shakeel Butt wrote:
> On Fri, Apr 17, 2020 at 9:03 AM Michal Hocko <mhocko@kernel.org> wrote:
> >
> > On Fri 17-04-20 22:43:43, Alex Shi wrote:
> > > This patch fold MEMCG_SWAP feature into kernel as default function. That
> > > required a short size memcg id for each of page. As Johannes mentioned
> > >
> > > "the overhead of tracking is tiny - 512k per G of swap (0.04%).'
> > >
> > > So all swapout page could be tracked for its memcg id.
> >
> > I am perfectly OK with dropping the CONFIG_MEMCG_SWAP. The code that is
> > guarded by it is negligible and the resulting code is much easier to
> > read so no objection on that front. I just do not really see any real
> > reason to flip the default for cgroup v1. Why do we want/need that?
> >
> 
> Yes, the changelog is lacking the motivation of this change. This is
> proposed by Johannes and I was actually expecting the patch from him.
> The motivation is to make the things simpler for per-memcg LRU locking
> and workingset for anon memory (Johannes has described these really
> well, lemme find the email). If we keep the differentiation between
> cgroup v1 and v2, then there is actually no point of this cleanup as
> per-memcg LRU locking and anon workingset still has to handle the
> !do_swap_account case.

All those details really have to go into the changelog. I have to say
that I still do not understand why the actual accounting swap or not
makes any difference for per per-memcg LRU. Especially when your patch
keeps the kernel command line parameter still in place.

Anyway, it would be much more simpler to have a patch that drops the
CONFIG_MEMCG_SWAP and a separate one which switches the default
beahvior. I am not saying I am ok with the later but if the
justification is convincing then I might change my mind.
-- 
Michal Hocko
SUSE Labs


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 1/2] memcg: folding CONFIG_MEMCG_SWAP as default
  2020-04-17 16:54     ` Michal Hocko
@ 2020-04-17 17:35       ` Shakeel Butt
  2020-04-18 13:44       ` Alex Shi
  1 sibling, 0 replies; 9+ messages in thread
From: Shakeel Butt @ 2020-04-17 17:35 UTC (permalink / raw)
  To: Michal Hocko; +Cc: Alex Shi, Andrew Morton, Johannes Weiner, Linux MM, Cgroups

On Fri, Apr 17, 2020 at 9:54 AM Michal Hocko <mhocko@kernel.org> wrote:
>
> On Fri 17-04-20 09:41:04, Shakeel Butt wrote:
> > On Fri, Apr 17, 2020 at 9:03 AM Michal Hocko <mhocko@kernel.org> wrote:
> > >
> > > On Fri 17-04-20 22:43:43, Alex Shi wrote:
> > > > This patch fold MEMCG_SWAP feature into kernel as default function. That
> > > > required a short size memcg id for each of page. As Johannes mentioned
> > > >
> > > > "the overhead of tracking is tiny - 512k per G of swap (0.04%).'
> > > >
> > > > So all swapout page could be tracked for its memcg id.
> > >
> > > I am perfectly OK with dropping the CONFIG_MEMCG_SWAP. The code that is
> > > guarded by it is negligible and the resulting code is much easier to
> > > read so no objection on that front. I just do not really see any real
> > > reason to flip the default for cgroup v1. Why do we want/need that?
> > >
> >
> > Yes, the changelog is lacking the motivation of this change. This is
> > proposed by Johannes and I was actually expecting the patch from him.
> > The motivation is to make the things simpler for per-memcg LRU locking
> > and workingset for anon memory (Johannes has described these really
> > well, lemme find the email). If we keep the differentiation between
> > cgroup v1 and v2, then there is actually no point of this cleanup as
> > per-memcg LRU locking and anon workingset still has to handle the
> > !do_swap_account case.
>
> All those details really have to go into the changelog. I have to say
> that I still do not understand why the actual accounting swap or not
> makes any difference for per per-memcg LRU.

Here is Johannes explanation:
https://lore.kernel.org/linux-mm/20200413180725.GA99267@cmpxchg.org/

> Especially when your patch

You mean Alex's patch.

> keeps the kernel command line parameter still in place.
>
> Anyway, it would be much more simpler to have a patch that drops the
> CONFIG_MEMCG_SWAP and a separate one which switches the default
> beahvior. I am not saying I am ok with the later but if the
> justification is convincing then I might change my mind.
> --
> Michal Hocko
> SUSE Labs


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 1/2] memcg: folding CONFIG_MEMCG_SWAP as default
  2020-04-17 16:54     ` Michal Hocko
  2020-04-17 17:35       ` Shakeel Butt
@ 2020-04-18 13:44       ` Alex Shi
  2020-04-18 16:17         ` Shakeel Butt
  1 sibling, 1 reply; 9+ messages in thread
From: Alex Shi @ 2020-04-18 13:44 UTC (permalink / raw)
  To: Michal Hocko, Shakeel Butt
  Cc: Andrew Morton, Johannes Weiner, Linux MM, Cgroups



在 2020/4/18 上午12:54, Michal Hocko 写道:
> On Fri 17-04-20 09:41:04, Shakeel Butt wrote:
>> On Fri, Apr 17, 2020 at 9:03 AM Michal Hocko <mhocko@kernel.org> wrote:
>>>
>>> On Fri 17-04-20 22:43:43, Alex Shi wrote:
>>>> This patch fold MEMCG_SWAP feature into kernel as default function. That
>>>> required a short size memcg id for each of page. As Johannes mentioned
>>>>
>>>> "the overhead of tracking is tiny - 512k per G of swap (0.04%).'
>>>>
>>>> So all swapout page could be tracked for its memcg id.
>>>
>>> I am perfectly OK with dropping the CONFIG_MEMCG_SWAP. The code that is
>>> guarded by it is negligible and the resulting code is much easier to
>>> read so no objection on that front. I just do not really see any real
>>> reason to flip the default for cgroup v1. Why do we want/need that?
>>>
>>
>> Yes, the changelog is lacking the motivation of this change. This is
>> proposed by Johannes and I was actually expecting the patch from him.
>> The motivation is to make the things simpler for per-memcg LRU locking
>> and workingset for anon memory (Johannes has described these really
>> well, lemme find the email). If we keep the differentiation between
>> cgroup v1 and v2, then there is actually no point of this cleanup as
>> per-memcg LRU locking and anon workingset still has to handle the
>> !do_swap_account case.
> 
> All those details really have to go into the changelog. I have to say
> that I still do not understand why the actual accounting swap or not
> makes any difference for per per-memcg LRU. Especially when your patch
> keeps the kernel command line parameter still in place.
> 
> Anyway, it would be much more simpler to have a patch that drops the
> CONFIG_MEMCG_SWAP and a separate one which switches the default
> beahvior. I am not saying I am ok with the later but if the
> justification is convincing then I might change my mind.
> 

Hi Shakeel & Michal,

Thanks for all comments!

Yes, we still need to remove swapaccount from cmdline and keep swap_cgroup.id
permanently. Just I don't know if this patch could fit into the details of 
Johannes new solution.

Anyway, I will send out v2 for complete memcg id record patch, just in case
if they are useful.

Thanks
Alex


^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 1/2] memcg: folding CONFIG_MEMCG_SWAP as default
  2020-04-18 13:44       ` Alex Shi
@ 2020-04-18 16:17         ` Shakeel Butt
  0 siblings, 0 replies; 9+ messages in thread
From: Shakeel Butt @ 2020-04-18 16:17 UTC (permalink / raw)
  To: Alex Shi; +Cc: Michal Hocko, Andrew Morton, Johannes Weiner, Linux MM, Cgroups

On Sat, Apr 18, 2020 at 6:45 AM Alex Shi <alex.shi@linux.alibaba.com> wrote:
>
>
>
> 在 2020/4/18 上午12:54, Michal Hocko 写道:
> > On Fri 17-04-20 09:41:04, Shakeel Butt wrote:
> >> On Fri, Apr 17, 2020 at 9:03 AM Michal Hocko <mhocko@kernel.org> wrote:
> >>>
> >>> On Fri 17-04-20 22:43:43, Alex Shi wrote:
> >>>> This patch fold MEMCG_SWAP feature into kernel as default function. That
> >>>> required a short size memcg id for each of page. As Johannes mentioned
> >>>>
> >>>> "the overhead of tracking is tiny - 512k per G of swap (0.04%).'
> >>>>
> >>>> So all swapout page could be tracked for its memcg id.
> >>>
> >>> I am perfectly OK with dropping the CONFIG_MEMCG_SWAP. The code that is
> >>> guarded by it is negligible and the resulting code is much easier to
> >>> read so no objection on that front. I just do not really see any real
> >>> reason to flip the default for cgroup v1. Why do we want/need that?
> >>>
> >>
> >> Yes, the changelog is lacking the motivation of this change. This is
> >> proposed by Johannes and I was actually expecting the patch from him.
> >> The motivation is to make the things simpler for per-memcg LRU locking
> >> and workingset for anon memory (Johannes has described these really
> >> well, lemme find the email). If we keep the differentiation between
> >> cgroup v1 and v2, then there is actually no point of this cleanup as
> >> per-memcg LRU locking and anon workingset still has to handle the
> >> !do_swap_account case.
> >
> > All those details really have to go into the changelog. I have to say
> > that I still do not understand why the actual accounting swap or not
> > makes any difference for per per-memcg LRU. Especially when your patch
> > keeps the kernel command line parameter still in place.
> >
> > Anyway, it would be much more simpler to have a patch that drops the
> > CONFIG_MEMCG_SWAP and a separate one which switches the default
> > beahvior. I am not saying I am ok with the later but if the
> > justification is convincing then I might change my mind.
> >
>
> Hi Shakeel & Michal,
>
> Thanks for all comments!
>
> Yes, we still need to remove swapaccount from cmdline and keep swap_cgroup.id
> permanently. Just I don't know if this patch could fit into the details of
> Johannes new solution.
>
> Anyway, I will send out v2 for complete memcg id record patch, just in case
> if they are useful.
>

I would recommend waiting for Johannes patch series. The cleanup this
patch is doing makes more sense to be part of Johannes's lrucare
cleanup series.

thanks,
Shakeel


^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2020-04-18 16:18 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-04-17 14:43 [PATCH 1/2] memcg: folding CONFIG_MEMCG_SWAP as default Alex Shi
2020-04-17 14:43 ` [PATCH 2/2] mm/swap: clean up parameter pass in swapin funcs Alex Shi
2020-04-17 15:53 ` [PATCH 1/2] memcg: folding CONFIG_MEMCG_SWAP as default Michal Hocko
2020-04-17 16:41   ` Shakeel Butt
2020-04-17 16:54     ` Michal Hocko
2020-04-17 17:35       ` Shakeel Butt
2020-04-18 13:44       ` Alex Shi
2020-04-18 16:17         ` Shakeel Butt
2020-04-17 16:22 ` Shakeel Butt

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).