linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2] mm: Add configuration to control whether vmpressure notifier is enabled
@ 2021-08-19 23:53 yongw.pur
  2021-08-20 11:26 ` Michal Hocko
  2021-08-20 11:42 ` Shakeel Butt
  0 siblings, 2 replies; 11+ messages in thread
From: yongw.pur @ 2021-08-19 23:53 UTC (permalink / raw)
  To: tj, corbet, akpm, mhocko, vdavydov.dev, tglx, peterz, shakeelb,
	guro, alexs, richard.weiyang, sh_def, sfr, wang.yong12, cgroups,
	linux-doc, linux-kernel, linux-mm, yang.yang29
  Cc: wangyong

From: wangyong <wang.yong@zte.com.cn>

Inspired by PSI features, vmpressure inotifier function should
also be configured to decide whether it is used, because it is an
independent feature which notifies the user of memory pressure.

So we add configuration to control whether vmpressure notifier is
enabled, and provide a boot parameter to use vmpressure notifier
flexibly.

Use Christoph Lamenter’s pagefault tool
(https://lkml.org/lkml/2006/8/29/294) for comparative testing.
Test with 5.14.0-rc5-next-20210813 on x86_64 4G Ram
To ensure that the vmpressure function is executed, we enable zram
and let the program occupy memory so that some memory is swapped out

unpatched:
Gb	Rep	Thr	CLine	User(s)	System(s) Wall(s) flt/cpu/s	fault/wsec
2	1	1	1	0.1	0.97	1.13	485490.062	463533.34
2	1	1	1	0.11	0.96	1.12	483086.072	465309.495
2	1	1	1	0.1	0.95	1.11	496687.098	469887.643
2	1	1	1	0.09	0.97	1.11	489711.434	468402.102
2	1	1	1	0.13	0.94	1.12	484159.415	466080.941
average				0.106	0.958	1.118	487826.8162	466642.7042

patched and CONFIG_MEMCG_VMPRESSURE is not set:
Gb	Rep	Thr	CLine	User(s)	System(s) Wall(s) flt/cpu/s	fault/wsec
2	1	1	1	0.1	0.96	1.1	490942.682	473125.98
2	1	1	1	0.08	0.99	1.13	484987.521	463161.975
2	1	1	1	0.09	0.96	1.09	498824.98	476696.066
2	1	1	1	0.1	0.97	1.12	484127.673	465951.238
2	1	1	1	0.1	0.97	1.11	487032		468964.662
average				0.094	0.97	1.11	489182.9712	469579.9842

According to flt/cpu/s, performance improved by 0.2% which is not obvious.

By the way, the patch is mainly to make the vmpressure inotifier function
configurable.

Tested-by: Zeal Robot <zealci@zte.com.cn>
Signed-off-by: wangyong <wang.yong@zte.com.cn>
---

Changes since v1:
-Modify some problems of document format and code

 Documentation/admin-guide/cgroup-v1/memory.rst  |  5 +++--
 Documentation/admin-guide/kernel-parameters.txt |  3 +++
 include/linux/vmpressure.h                      |  7 +++++--
 init/Kconfig                                    | 20 +++++++++++++++++++
 mm/Makefile                                     |  3 ++-
 mm/memcontrol.c                                 |  7 ++++++-
 mm/vmpressure.c                                 | 26 +++++++++++++++++++++++++
 7 files changed, 65 insertions(+), 6 deletions(-)

diff --git a/Documentation/admin-guide/cgroup-v1/memory.rst b/Documentation/admin-guide/cgroup-v1/memory.rst
index 41191b5..d605035 100644
--- a/Documentation/admin-guide/cgroup-v1/memory.rst
+++ b/Documentation/admin-guide/cgroup-v1/memory.rst
@@ -388,6 +388,7 @@ a. Enable CONFIG_CGROUPS
 b. Enable CONFIG_MEMCG
 c. Enable CONFIG_MEMCG_SWAP (to use swap extension)
 d. Enable CONFIG_MEMCG_KMEM (to use kmem extension)
+e. Enable CONFIG_MEMCG_VMPRESSURE (to use vmpressure extension)
 
 3.1. Prepare the cgroups (see cgroups.txt, Why are cgroups needed?)
 -------------------------------------------------------------------
@@ -855,8 +856,8 @@ At reading, current status of OOM is shown.
           The number of processes belonging to this cgroup killed by any
           kind of OOM killer.
 
-11. Memory Pressure
-===================
+11. Memory Pressure (CONFIG_MEMCG_VMPRESSURE)
+=============================================
 
 The pressure level notifications can be used to monitor the memory
 allocation cost; based on the pressure, applications can implement
diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 69ded4b..7cf541d 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -6156,6 +6156,9 @@
 	vmpoff=		[KNL,S390] Perform z/VM CP command after power off.
 			Format: <command>
 
+	vmpressure=	[KNL] Enable or disable vmpressure notifier.
+			Format: <bool>
+
 	vsyscall=	[X86-64]
 			Controls the behavior of vsyscalls (i.e. calls to
 			fixed addresses of 0xffffffffff600x00 from legacy
diff --git a/include/linux/vmpressure.h b/include/linux/vmpressure.h
index 6a2f51e..dcae02e 100644
--- a/include/linux/vmpressure.h
+++ b/include/linux/vmpressure.h
@@ -29,7 +29,8 @@ struct vmpressure {
 
 struct mem_cgroup;
 
-#ifdef CONFIG_MEMCG
+#ifdef CONFIG_MEMCG_VMPRESSURE
+extern bool vmpressure_enable;
 extern void vmpressure(gfp_t gfp, struct mem_cgroup *memcg, bool tree,
 		       unsigned long scanned, unsigned long reclaimed);
 extern void vmpressure_prio(gfp_t gfp, struct mem_cgroup *memcg, int prio);
@@ -48,5 +49,7 @@ static inline void vmpressure(gfp_t gfp, struct mem_cgroup *memcg, bool tree,
 			      unsigned long scanned, unsigned long reclaimed) {}
 static inline void vmpressure_prio(gfp_t gfp, struct mem_cgroup *memcg,
 				   int prio) {}
-#endif /* CONFIG_MEMCG */
+static inline void vmpressure_init(struct vmpressure *vmpr) {}
+static inline void vmpressure_cleanup(struct vmpressure *vmpr) {}
+#endif /* CONFIG_MEMCG_PRESSURE */
 #endif /* __LINUX_VMPRESSURE_H */
diff --git a/init/Kconfig b/init/Kconfig
index 71a028d..d3afeb2 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -948,6 +948,26 @@ config MEMCG_KMEM
 	depends on MEMCG && !SLOB
 	default y
 
+config MEMCG_VMPRESSURE
+	bool "Memory pressure notifier"
+	depends on MEMCG
+	default y
+	help
+	  Vmpressure extension is used to monitor the memory allocation cost.
+	  The pressure level can be set according to the use scenario and
+	  application will be notified through eventfd when memory pressure is at
+	  the specific level (or higher).
+
+config VMPRESSURE_DEFAULT_DISABLED
+	bool "Require boot parameter to enable memory pressure notifier"
+	depends on MEMCG_VMPRESSURE
+	default n
+	help
+	  If set, memory pressure notifier will be disabled  but can be
+	  enabled through passing vmpressure=1 on the kernel commandline
+	  during boot.
+	  For those who want to use memory pressure notifier flexibly.
+
 config BLK_CGROUP
 	bool "IO controller"
 	depends on BLOCK
diff --git a/mm/Makefile b/mm/Makefile
index 970604e..e4f99c1 100644
--- a/mm/Makefile
+++ b/mm/Makefile
@@ -92,7 +92,8 @@ obj-$(CONFIG_MEMTEST)		+= memtest.o
 obj-$(CONFIG_MIGRATION) += migrate.o
 obj-$(CONFIG_TRANSPARENT_HUGEPAGE) += huge_memory.o khugepaged.o
 obj-$(CONFIG_PAGE_COUNTER) += page_counter.o
-obj-$(CONFIG_MEMCG) += memcontrol.o vmpressure.o
+obj-$(CONFIG_MEMCG) += memcontrol.o
+obj-$(CONFIG_MEMCG_VMPRESSURE) += vmpressure.o
 obj-$(CONFIG_MEMCG_SWAP) += swap_cgroup.o
 obj-$(CONFIG_CGROUP_HUGETLB) += hugetlb_cgroup.o
 obj-$(CONFIG_GUP_TEST) += gup_test.o
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index e55f422..975e098 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -248,6 +248,7 @@ static inline bool should_force_charge(void)
 		(current->flags & PF_EXITING);
 }
 
+#ifdef CONFIG_MEMCG_VMPRESSURE
 /* Some nice accessors for the vmpressure. */
 struct vmpressure *memcg_to_vmpressure(struct mem_cgroup *memcg)
 {
@@ -260,6 +261,7 @@ struct mem_cgroup *vmpressure_to_memcg(struct vmpressure *vmpr)
 {
 	return container_of(vmpr, struct mem_cgroup, vmpressure);
 }
+#endif
 
 #ifdef CONFIG_MEMCG_KMEM
 extern spinlock_t css_set_lock;
@@ -4823,9 +4825,12 @@ static ssize_t memcg_write_event_control(struct kernfs_open_file *of,
 	} else if (!strcmp(name, "memory.oom_control")) {
 		event->register_event = mem_cgroup_oom_register_event;
 		event->unregister_event = mem_cgroup_oom_unregister_event;
-	} else if (!strcmp(name, "memory.pressure_level")) {
+#ifdef CONFIG_MEMCG_VMPRESSURE
+	} else if (vmpressure_enable &&
+		   !strcmp(name, "memory.pressure_level")) {
 		event->register_event = vmpressure_register_event;
 		event->unregister_event = vmpressure_unregister_event;
+#endif
 	} else if (!strcmp(name, "memory.memsw.usage_in_bytes")) {
 		event->register_event = memsw_cgroup_usage_register_event;
 		event->unregister_event = memsw_cgroup_usage_unregister_event;
diff --git a/mm/vmpressure.c b/mm/vmpressure.c
index 76518e4..b0d4358 100644
--- a/mm/vmpressure.c
+++ b/mm/vmpressure.c
@@ -67,6 +67,19 @@ static const unsigned int vmpressure_level_critical = 95;
  */
 static const unsigned int vmpressure_level_critical_prio = ilog2(100 / 10);
 
+DEFINE_STATIC_KEY_FALSE(vmpressure_disabled);
+#ifdef CONFIG_VMPRESSURE_DEFAULT_DISABLED
+bool vmpressure_enable;
+#else
+bool vmpressure_enable = true;
+#endif
+static int __init setup_vmpressure(char *str)
+{
+	return kstrtobool(str, &vmpressure_enable) == 0;
+}
+__setup("vmpressure=", setup_vmpressure);
+
+
 static struct vmpressure *work_to_vmpressure(struct work_struct *work)
 {
 	return container_of(work, struct vmpressure, work);
@@ -246,6 +259,9 @@ void vmpressure(gfp_t gfp, struct mem_cgroup *memcg, bool tree,
 
 	vmpr = memcg_to_vmpressure(memcg);
 
+	if (static_branch_likely(&vmpressure_disabled))
+		return;
+
 	/*
 	 * Here we only want to account pressure that userland is able to
 	 * help us with. For example, suppose that DMA zone is under
@@ -326,6 +342,8 @@ void vmpressure(gfp_t gfp, struct mem_cgroup *memcg, bool tree,
  */
 void vmpressure_prio(gfp_t gfp, struct mem_cgroup *memcg, int prio)
 {
+	if (static_branch_likely(&vmpressure_disabled))
+		return;
 	/*
 	 * We only use prio for accounting critical level. For more info
 	 * see comment for vmpressure_level_critical_prio variable above.
@@ -450,6 +468,11 @@ void vmpressure_unregister_event(struct mem_cgroup *memcg,
  */
 void vmpressure_init(struct vmpressure *vmpr)
 {
+	if (!vmpressure_enable) {
+		static_branch_enable(&vmpressure_disabled);
+		return;
+	}
+
 	spin_lock_init(&vmpr->sr_lock);
 	mutex_init(&vmpr->events_lock);
 	INIT_LIST_HEAD(&vmpr->events);
@@ -465,6 +488,9 @@ void vmpressure_init(struct vmpressure *vmpr)
  */
 void vmpressure_cleanup(struct vmpressure *vmpr)
 {
+
+	if (static_branch_likely(&vmpressure_disabled))
+		return;
 	/*
 	 * Make sure there is no pending work before eventfd infrastructure
 	 * goes away.
-- 
2.7.4


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v2] mm: Add configuration to control whether vmpressure notifier is enabled
  2021-08-19 23:53 [PATCH v2] mm: Add configuration to control whether vmpressure notifier is enabled yongw.pur
@ 2021-08-20 11:26 ` Michal Hocko
  2021-08-20 15:20   ` yong w
  2021-08-20 11:42 ` Shakeel Butt
  1 sibling, 1 reply; 11+ messages in thread
From: Michal Hocko @ 2021-08-20 11:26 UTC (permalink / raw)
  To: yongw.pur
  Cc: tj, corbet, akpm, vdavydov.dev, tglx, peterz, shakeelb, guro,
	alexs, richard.weiyang, sh_def, sfr, wang.yong12, cgroups,
	linux-doc, linux-kernel, linux-mm, yang.yang29, wangyong

On Thu 19-08-21 16:53:39, yongw.pur@gmail.com wrote:
> From: wangyong <wang.yong@zte.com.cn>
> 
> Inspired by PSI features, vmpressure inotifier function should
> also be configured to decide whether it is used, because it is an
> independent feature which notifies the user of memory pressure.

Yes, it is an independent feature indeed but what is the actual reason
to put a more configuration space here. Config options are not free both
from the user experience POV as well as the code maintenance. Why do we
need to disable this feature. Who can benefit from such a setup?

> So we add configuration to control whether vmpressure notifier is
> enabled, and provide a boot parameter to use vmpressure notifier
> flexibly.

Flexibility is nice but not free as mentioned above.

> Use Christoph Lamenter’s pagefault tool
> (https://lkml.org/lkml/2006/8/29/294) for comparative testing.
> Test with 5.14.0-rc5-next-20210813 on x86_64 4G Ram
> To ensure that the vmpressure function is executed, we enable zram
> and let the program occupy memory so that some memory is swapped out
> 
> unpatched:
> Gb	Rep	Thr	CLine	User(s)	System(s) Wall(s) flt/cpu/s	fault/wsec
> 2	1	1	1	0.1	0.97	1.13	485490.062	463533.34
> 2	1	1	1	0.11	0.96	1.12	483086.072	465309.495
> 2	1	1	1	0.1	0.95	1.11	496687.098	469887.643
> 2	1	1	1	0.09	0.97	1.11	489711.434	468402.102
> 2	1	1	1	0.13	0.94	1.12	484159.415	466080.941
> average				0.106	0.958	1.118	487826.8162	466642.7042
> 
> patched and CONFIG_MEMCG_VMPRESSURE is not set:
> Gb	Rep	Thr	CLine	User(s)	System(s) Wall(s) flt/cpu/s	fault/wsec
> 2	1	1	1	0.1	0.96	1.1	490942.682	473125.98
> 2	1	1	1	0.08	0.99	1.13	484987.521	463161.975
> 2	1	1	1	0.09	0.96	1.09	498824.98	476696.066
> 2	1	1	1	0.1	0.97	1.12	484127.673	465951.238
> 2	1	1	1	0.1	0.97	1.11	487032		468964.662
> average				0.094	0.97	1.11	489182.9712	469579.9842
> 
> According to flt/cpu/s, performance improved by 0.2% which is not obvious.

I haven't checked how are those numbers calculated but from a very brief
look it seems like the variation between different runs is higher than
0.2%. Have you checked the average against standard deviation to get a
better idea whether the difference is really outside of the noise?
-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v2] mm: Add configuration to control whether vmpressure notifier is enabled
  2021-08-19 23:53 [PATCH v2] mm: Add configuration to control whether vmpressure notifier is enabled yongw.pur
  2021-08-20 11:26 ` Michal Hocko
@ 2021-08-20 11:42 ` Shakeel Butt
  2021-08-20 14:29   ` yong w
  1 sibling, 1 reply; 11+ messages in thread
From: Shakeel Butt @ 2021-08-20 11:42 UTC (permalink / raw)
  To: yongw.pur
  Cc: Tejun Heo, Jonathan Corbet, Andrew Morton, Michal Hocko,
	Vladimir Davydov, Thomas Gleixner, Peter Zijlstra (Intel),
	Roman Gushchin, alexs, Wei Yang, Hui Su, Stephen Rothwell,
	wang.yong12, Cgroups, linux-doc, LKML, Linux MM, yang.yang29,
	wangyong

On Thu, Aug 19, 2021 at 4:54 PM <yongw.pur@gmail.com> wrote:
>
> From: wangyong <wang.yong@zte.com.cn>
>
> Inspired by PSI features, vmpressure inotifier function should
> also be configured to decide whether it is used, because it is an
> independent feature which notifies the user of memory pressure.
>

It is also used by the networking stack to check memory pressure. See
mem_cgroup_under_socket_pressure().

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v2] mm: Add configuration to control whether vmpressure notifier is enabled
  2021-08-20 11:42 ` Shakeel Butt
@ 2021-08-20 14:29   ` yong w
  2021-08-22 10:06     ` yong w
  0 siblings, 1 reply; 11+ messages in thread
From: yong w @ 2021-08-20 14:29 UTC (permalink / raw)
  To: Shakeel Butt
  Cc: Tejun Heo, Jonathan Corbet, Andrew Morton, Michal Hocko,
	Vladimir Davydov, Thomas Gleixner, Peter Zijlstra (Intel),
	Roman Gushchin, alexs, Wei Yang, Hui Su, Stephen Rothwell,
	wang.yong12, Cgroups, linux-doc, LKML, Linux MM, yang.yang29,
	wangyong

Shakeel Butt <shakeelb@google.com> 于2021年8月20日周五 下午7:42写道:
>
> On Thu, Aug 19, 2021 at 4:54 PM <yongw.pur@gmail.com> wrote:
> >
> > From: wangyong <wang.yong@zte.com.cn>
> >
> > Inspired by PSI features, vmpressure inotifier function should
> > also be configured to decide whether it is used, because it is an
> > independent feature which notifies the user of memory pressure.
> >
>
> It is also used by the networking stack to check memory pressure. See
> mem_cgroup_under_socket_pressure().

 Thanks for your replly, mem_cgroup_under_socket_pressure does use vmpressue,
 I'll check it.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v2] mm: Add configuration to control whether vmpressure notifier is enabled
  2021-08-20 11:26 ` Michal Hocko
@ 2021-08-20 15:20   ` yong w
  2021-08-20 15:41     ` Michal Hocko
  0 siblings, 1 reply; 11+ messages in thread
From: yong w @ 2021-08-20 15:20 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Tejun Heo, Jonathan Corbet, Andrew Morton, Vladimir Davydov,
	Thomas Gleixner, Peter Zijlstra (Intel),
	Shakeel Butt, Roman Gushchin, alexs, Wei Yang, Hui Su,
	Stephen Rothwell, wang.yong12, Cgroups, linux-doc, LKML,
	Linux MM, yang.yang29

Michal Hocko <mhocko@suse.com> 于2021年8月20日周五 下午7:26写道:
>
> On Thu 19-08-21 16:53:39, yongw.pur@gmail.com wrote:
> > From: wangyong <wang.yong@zte.com.cn>
> >
> > Inspired by PSI features, vmpressure inotifier function should
> > also be configured to decide whether it is used, because it is an
> > independent feature which notifies the user of memory pressure.
>
> Yes, it is an independent feature indeed but what is the actual reason
> to put a more configuration space here. Config options are not free both
> from the user experience POV as well as the code maintenance. Why do we
> need to disable this feature. Who can benefit from such a setup?
>
> > So we add configuration to control whether vmpressure notifier is
> > enabled, and provide a boot parameter to use vmpressure notifier
> > flexibly.
>
> Flexibility is nice but not free as mentioned above.
>
> > Use Christoph Lamenter’s pagefault tool
> > (https://lkml.org/lkml/2006/8/29/294) for comparative testing.
> > Test with 5.14.0-rc5-next-20210813 on x86_64 4G Ram
> > To ensure that the vmpressure function is executed, we enable zram
> > and let the program occupy memory so that some memory is swapped out
> >
> > unpatched:
> > Gb    Rep     Thr     CLine   User(s) System(s) Wall(s) flt/cpu/s     fault/wsec
> > 2     1       1       1       0.1     0.97    1.13    485490.062      463533.34
> > 2     1       1       1       0.11    0.96    1.12    483086.072      465309.495
> > 2     1       1       1       0.1     0.95    1.11    496687.098      469887.643
> > 2     1       1       1       0.09    0.97    1.11    489711.434      468402.102
> > 2     1       1       1       0.13    0.94    1.12    484159.415      466080.941
> > average                               0.106   0.958   1.118   487826.8162     466642.7042
> >
> > patched and CONFIG_MEMCG_VMPRESSURE is not set:
> > Gb    Rep     Thr     CLine   User(s) System(s) Wall(s) flt/cpu/s     fault/wsec
> > 2     1       1       1       0.1     0.96    1.1     490942.682      473125.98
> > 2     1       1       1       0.08    0.99    1.13    484987.521      463161.975
> > 2     1       1       1       0.09    0.96    1.09    498824.98       476696.066
> > 2     1       1       1       0.1     0.97    1.12    484127.673      465951.238
> > 2     1       1       1       0.1     0.97    1.11    487032          468964.662
> > average                               0.094   0.97    1.11    489182.9712     469579.9842
> >
> > According to flt/cpu/s, performance improved by 0.2% which is not obvious.
>
> I haven't checked how are those numbers calculated but from a very brief
> look it seems like the variation between different runs is higher than
> 0.2%. Have you checked the average against standard deviation to get a
> better idea whether the difference is really outside of the noise?
> --
> Michal Hocko
> SUSE Labs

Thanks for your reply.
The reason for adding configuration is as follows:
1. Referring to [PATCH] psi: make disabling/enabling easier for vendor
kernels, the modification
is also applicable to vmpressure.

2. With the introduction of psi into the kernel, there are two memory
pressure monitoring methods,
it is not necessary to use both and it makes sense to make vmpressure
configurable.

3. In the case where the user does not need vmpressure,  vmpressure
calculation is additional overhead.
In some special scenes with tight memory, vmpressure will be executed
frequently.we use "likely" and "inline"
to improve the performance of the kernel, why not reduce some
unnecessary calculations?

4. This patch is forward compatible, because VMPRESSURE is set by
default and user does not need to
make any changes. For users who do not need to use vmpressure notifier
or users who use psi, they
can choose not to configure this function.

Thanks.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v2] mm: Add configuration to control whether vmpressure notifier is enabled
  2021-08-20 15:20   ` yong w
@ 2021-08-20 15:41     ` Michal Hocko
  2021-08-22  9:46       ` yong w
  0 siblings, 1 reply; 11+ messages in thread
From: Michal Hocko @ 2021-08-20 15:41 UTC (permalink / raw)
  To: yong w
  Cc: Tejun Heo, Jonathan Corbet, Andrew Morton, Vladimir Davydov,
	Thomas Gleixner, Peter Zijlstra (Intel),
	Shakeel Butt, Roman Gushchin, alexs, Wei Yang, Hui Su,
	Stephen Rothwell, wang.yong12, Cgroups, linux-doc, LKML,
	Linux MM, yang.yang29

On Fri 20-08-21 23:20:40, yong w wrote:
> Michal Hocko <mhocko@suse.com> 于2021年8月20日周五 下午7:26写道:
> >
> > On Thu 19-08-21 16:53:39, yongw.pur@gmail.com wrote:
> > > From: wangyong <wang.yong@zte.com.cn>
> > >
> > > Inspired by PSI features, vmpressure inotifier function should
> > > also be configured to decide whether it is used, because it is an
> > > independent feature which notifies the user of memory pressure.
> >
> > Yes, it is an independent feature indeed but what is the actual reason
> > to put a more configuration space here. Config options are not free both
> > from the user experience POV as well as the code maintenance. Why do we
> > need to disable this feature. Who can benefit from such a setup?
> >
> > > So we add configuration to control whether vmpressure notifier is
> > > enabled, and provide a boot parameter to use vmpressure notifier
> > > flexibly.
> >
> > Flexibility is nice but not free as mentioned above.
> >
> > > Use Christoph Lamenter’s pagefault tool
> > > (https://lkml.org/lkml/2006/8/29/294) for comparative testing.
> > > Test with 5.14.0-rc5-next-20210813 on x86_64 4G Ram
> > > To ensure that the vmpressure function is executed, we enable zram
> > > and let the program occupy memory so that some memory is swapped out
> > >
> > > unpatched:
> > > Gb    Rep     Thr     CLine   User(s) System(s) Wall(s) flt/cpu/s     fault/wsec
> > > 2     1       1       1       0.1     0.97    1.13    485490.062      463533.34
> > > 2     1       1       1       0.11    0.96    1.12    483086.072      465309.495
> > > 2     1       1       1       0.1     0.95    1.11    496687.098      469887.643
> > > 2     1       1       1       0.09    0.97    1.11    489711.434      468402.102
> > > 2     1       1       1       0.13    0.94    1.12    484159.415      466080.941
> > > average                               0.106   0.958   1.118   487826.8162     466642.7042
> > >
> > > patched and CONFIG_MEMCG_VMPRESSURE is not set:
> > > Gb    Rep     Thr     CLine   User(s) System(s) Wall(s) flt/cpu/s     fault/wsec
> > > 2     1       1       1       0.1     0.96    1.1     490942.682      473125.98
> > > 2     1       1       1       0.08    0.99    1.13    484987.521      463161.975
> > > 2     1       1       1       0.09    0.96    1.09    498824.98       476696.066
> > > 2     1       1       1       0.1     0.97    1.12    484127.673      465951.238
> > > 2     1       1       1       0.1     0.97    1.11    487032          468964.662
> > > average                               0.094   0.97    1.11    489182.9712     469579.9842
> > >
> > > According to flt/cpu/s, performance improved by 0.2% which is not obvious.
> >
> > I haven't checked how are those numbers calculated but from a very brief
> > look it seems like the variation between different runs is higher than
> > 0.2%. Have you checked the average against standard deviation to get a
> > better idea whether the difference is really outside of the noise?
> > --
> > Michal Hocko
> > SUSE Labs
> 
> Thanks for your reply.
> The reason for adding configuration is as follows:

All those reasons should be a part of the changelog.

> 1. Referring to [PATCH] psi: make disabling/enabling easier for vendor
> kernels, the modification
> is also applicable to vmpressure.
> 
> 2. With the introduction of psi into the kernel, there are two memory
> pressure monitoring methods,
> it is not necessary to use both and it makes sense to make vmpressure
> configurable.

I am not sure these are sufficient justifications but that is something
to discuss. And hence it should be a part of the changelog.

> 3. In the case where the user does not need vmpressure,  vmpressure
> calculation is additional overhead.

You should quantify that and argue why that overhead cannot be further
reduced without config/boot time knobs.

> In some special scenes with tight memory, vmpressure will be executed
> frequently.we use "likely" and "inline"
> to improve the performance of the kernel, why not reduce some
> unnecessary calculations?

I am all for improving the code. Is it possible to do it by other means?
E.g. reduce a potential overhead when there no events registered?
-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v2] mm: Add configuration to control whether vmpressure notifier is enabled
  2021-08-20 15:41     ` Michal Hocko
@ 2021-08-22  9:46       ` yong w
  2021-08-30 13:49         ` Michal Hocko
  0 siblings, 1 reply; 11+ messages in thread
From: yong w @ 2021-08-22  9:46 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Tejun Heo, Jonathan Corbet, Andrew Morton, Vladimir Davydov,
	Thomas Gleixner, Peter Zijlstra (Intel),
	Shakeel Butt, Roman Gushchin, alexs, Wei Yang, Hui Su,
	Stephen Rothwell, wang.yong12, Cgroups, linux-doc, LKML,
	Linux MM, yang.yang29

> All those reasons should be a part of the changelog.
>....
> I am not sure these are sufficient justifications but that is something
> to discuss. And hence it should be a part of the changelog.
>
OK, These reasons will be added to the patch notesin later versions.

> > 3. In the case where the user does not need vmpressure,  vmpressure
> > calculation is additional overhead.
>
> You should quantify that and argue why that overhead cannot be further
> reduced without config/boot time knobs.
>
The test results of the previously used PFT tool may not be obvious.
Is there a better way to quantify it?

> > In some special scenes with tight memory, vmpressure will be executed
> > frequently.we use "likely" and "inline"
> > to improve the performance of the kernel, why not reduce some
> > unnecessary calculations?
>
> I am all for improving the code. Is it possible to do it by other means?
> E.g. reduce a potential overhead when there no events registered?
Yes, the method you mentioned may be feasible, but it does not conflict
with this patch.

Thanks.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v2] mm: Add configuration to control whether vmpressure notifier is enabled
  2021-08-20 14:29   ` yong w
@ 2021-08-22 10:06     ` yong w
  0 siblings, 0 replies; 11+ messages in thread
From: yong w @ 2021-08-22 10:06 UTC (permalink / raw)
  To: Shakeel Butt
  Cc: Tejun Heo, Jonathan Corbet, Andrew Morton, Michal Hocko,
	Vladimir Davydov, Thomas Gleixner, Peter Zijlstra (Intel),
	Roman Gushchin, alexs, Wei Yang, Hui Su, Stephen Rothwell,
	wang.yong12, Cgroups, linux-doc, LKML, Linux MM, yang.yang29,
	wangyong

Hello, I find the code of socket pressure and vmpressure are related.
And there are two ways to be discussed
1. The calculation of socket pressure is handed over to the psi.
2. Vmpresssure and psi update socket pressure respectively, and are
configured separately.

Thanks.

yong w <yongw.pur@gmail.com> 于2021年8月20日周五 下午10:29写道:
>
> Shakeel Butt <shakeelb@google.com> 于2021年8月20日周五 下午7:42写道:
> >
> > On Thu, Aug 19, 2021 at 4:54 PM <yongw.pur@gmail.com> wrote:
> > >
> > > From: wangyong <wang.yong@zte.com.cn>
> > >
> > > Inspired by PSI features, vmpressure inotifier function should
> > > also be configured to decide whether it is used, because it is an
> > > independent feature which notifies the user of memory pressure.
> > >
> >
> > It is also used by the networking stack to check memory pressure. See
> > mem_cgroup_under_socket_pressure().
>
>  Thanks for your replly, mem_cgroup_under_socket_pressure does use vmpressue,
>  I'll check it.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v2] mm: Add configuration to control whether vmpressure notifier is enabled
  2021-08-22  9:46       ` yong w
@ 2021-08-30 13:49         ` Michal Hocko
  2021-09-04 10:41           ` yong w
  0 siblings, 1 reply; 11+ messages in thread
From: Michal Hocko @ 2021-08-30 13:49 UTC (permalink / raw)
  To: yong w
  Cc: Tejun Heo, Jonathan Corbet, Andrew Morton, Vladimir Davydov,
	Thomas Gleixner, Peter Zijlstra (Intel),
	Shakeel Butt, Roman Gushchin, alexs, Wei Yang, Hui Su,
	Stephen Rothwell, wang.yong12, Cgroups, linux-doc, LKML,
	Linux MM, yang.yang29

On Sun 22-08-21 17:46:08, yong w wrote:
> > All those reasons should be a part of the changelog.
> >....
> > I am not sure these are sufficient justifications but that is something
> > to discuss. And hence it should be a part of the changelog.
> >
> OK, These reasons will be added to the patch notesin later versions.
> 
> > > 3. In the case where the user does not need vmpressure,  vmpressure
> > > calculation is additional overhead.
> >
> > You should quantify that and argue why that overhead cannot be further
> > reduced without config/boot time knobs.
> >
> The test results of the previously used PFT tool may not be obvious.
> Is there a better way to quantify it?

This is a question for you to answer I am afraid. You want to add a
configuration option and (as explained) that is not free of cost from
the maintenance POV. There must a very good reason to do that.

> > > In some special scenes with tight memory, vmpressure will be executed
> > > frequently.we use "likely" and "inline"
> > > to improve the performance of the kernel, why not reduce some
> > > unnecessary calculations?
> >
> > I am all for improving the code. Is it possible to do it by other means?
> > E.g. reduce a potential overhead when there no events registered?
> Yes, the method you mentioned may be feasible, but it does not conflict
> with this patch.

It is not in conflict but runtime overhead reduction without more burden
on the configurability is usually a preferred approach.
-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v2] mm: Add configuration to control whether vmpressure notifier is enabled
  2021-08-30 13:49         ` Michal Hocko
@ 2021-09-04 10:41           ` yong w
  2021-09-06  6:37             ` Michal Hocko
  0 siblings, 1 reply; 11+ messages in thread
From: yong w @ 2021-09-04 10:41 UTC (permalink / raw)
  To: Michal Hocko
  Cc: Tejun Heo, Jonathan Corbet, Andrew Morton, Vladimir Davydov,
	Thomas Gleixner, Peter Zijlstra (Intel),
	Shakeel Butt, Roman Gushchin, alexs, Wei Yang, Hui Su,
	Stephen Rothwell, wang.yong12, Cgroups, linux-doc, LKML,
	Linux MM, yang.yang29

Michal Hocko <mhocko@suse.com> 于2021年8月30日周一 下午9:49写道:
>
> On Sun 22-08-21 17:46:08, yong w wrote:
> > > All those reasons should be a part of the changelog.
> > >....
> > > I am not sure these are sufficient justifications but that is something
> > > to discuss. And hence it should be a part of the changelog.
> > >
> > OK, These reasons will be added to the patch notesin later versions.
> >
> > > > 3. In the case where the user does not need vmpressure,  vmpressure
> > > > calculation is additional overhead.
> > >
> > > You should quantify that and argue why that overhead cannot be further
> > > reduced without config/boot time knobs.
> > >
> > The test results of the previously used PFT tool may not be obvious.
> > Is there a better way to quantify it?
>
> This is a question for you to answer I am afraid. You want to add a
> configuration option and (as explained) that is not free of cost from
> the maintenance POV. There must a very good reason to do that.
Sorry for the late reply.The previous email mentions some reasons.
and several tools were used to test, but the data did not meet the expectations.
I'll try other test methods later.

> > > > In some special scenes with tight memory, vmpressure will be executed
> > > > frequently.we use "likely" and "inline"
> > > > to improve the performance of the kernel, why not reduce some
> > > > unnecessary calculations?
> > >
> > > I am all for improving the code. Is it possible to do it by other means?
> > > E.g. reduce a potential overhead when there no events registered?
> > Yes, the method you mentioned may be feasible, but it does not conflict
> > with this patch.
>
> It is not in conflict but runtime overhead reduction without more burden
> on the configurability is usually a preferred approach.
I agree with you.I had an idea that we use global variables to identify whether
there is event registration,however, global variables need to be
protected with locks.
I feel that there is little room for optimization in the code.

Thanks.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v2] mm: Add configuration to control whether vmpressure notifier is enabled
  2021-09-04 10:41           ` yong w
@ 2021-09-06  6:37             ` Michal Hocko
  0 siblings, 0 replies; 11+ messages in thread
From: Michal Hocko @ 2021-09-06  6:37 UTC (permalink / raw)
  To: yong w
  Cc: Tejun Heo, Jonathan Corbet, Andrew Morton, Vladimir Davydov,
	Thomas Gleixner, Peter Zijlstra (Intel),
	Shakeel Butt, Roman Gushchin, alexs, Wei Yang, Hui Su,
	Stephen Rothwell, wang.yong12, Cgroups, linux-doc, LKML,
	Linux MM, yang.yang29

On Sat 04-09-21 18:41:00, yong w wrote:
[..]
> > It is not in conflict but runtime overhead reduction without more burden
> > on the configurability is usually a preferred approach.
> I agree with you.I had an idea that we use global variables to identify whether
> there is event registration,however, global variables need to be
> protected with locks.

Have a look at static keys which are usual tool to provide effectivelly
zero overhead disabled branch.
-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2021-09-06  6:37 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-08-19 23:53 [PATCH v2] mm: Add configuration to control whether vmpressure notifier is enabled yongw.pur
2021-08-20 11:26 ` Michal Hocko
2021-08-20 15:20   ` yong w
2021-08-20 15:41     ` Michal Hocko
2021-08-22  9:46       ` yong w
2021-08-30 13:49         ` Michal Hocko
2021-09-04 10:41           ` yong w
2021-09-06  6:37             ` Michal Hocko
2021-08-20 11:42 ` Shakeel Butt
2021-08-20 14:29   ` yong w
2021-08-22 10:06     ` yong w

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).