All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2 0/5] Fix bugs in memcontroller cgroup tests
@ 2022-04-23 15:56 ` David Vernet
  0 siblings, 0 replies; 70+ messages in thread
From: David Vernet @ 2022-04-23 15:56 UTC (permalink / raw)
  To: akpm
  Cc: tj, roman.gushchin, linux-kernel, linux-mm, cgroups, hannes,
	mhocko, shakeelb, kernel-team, void

tools/testing/selftests/cgroup/test_memcontrol.c contains a set of
testcases which validate expected behavior of the cgroup memory controller.
Roman Gushchin recently sent out a patchset that fixed a few issues in the
test. This patchset continues that effort by fixing a few more issues that
were causing non-deterministic failures in the suite. With this patchset,
I'm unable to reproduce any more errors after running the tests in a
continuous loop for many iterations. Before, I was able to reproduce at
least one of the errors fixed in this patchset with just one or two runs.

Changelog:
v2:
  - Fixed the comment headers in test_memcg_min() and test_memcg_low() to
    reflect the new ordering of child cgroups in those tests.
  - Fixed the comment I added in test_memcg_oom_group_leaf_events() to use /* */
    for multiline comments, as is the norm according to the kernel style guide.
  - Changed some of the conditional logic in test_memcg_oom_group_leaf_events()
    that checks for OOM event counts based on memory_localevents to be more
    intuitive.

David Vernet (5):
  cgroups: Refactor children cgroups in memcg tests
  cgroup: Account for memory_recursiveprot in test_memcg_low()
  cgroup: Account for memory_localevents in
    test_memcg_oom_group_leaf_events()
  cgroup: Removing racy check in test_memcg_sock()
  cgroup: Fix racy check in alloc_pagecache_max_30M() helper function

 tools/testing/selftests/cgroup/cgroup_util.c  | 12 +++
 tools/testing/selftests/cgroup/cgroup_util.h  |  1 +
 .../selftests/cgroup/test_memcontrol.c        | 77 ++++++++++++-------
 3 files changed, 64 insertions(+), 26 deletions(-)

-- 
2.30.2


^ permalink raw reply	[flat|nested] 70+ messages in thread

* [PATCH v2 0/5] Fix bugs in memcontroller cgroup tests
@ 2022-04-23 15:56 ` David Vernet
  0 siblings, 0 replies; 70+ messages in thread
From: David Vernet @ 2022-04-23 15:56 UTC (permalink / raw)
  To: akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b
  Cc: tj-DgEjT+Ai2ygdnm+yROfE0A, roman.gushchin-fxUVXftIFDnyG1zEObXtfA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	linux-mm-Bw31MaZKKs3YtjvyW6yDsg, cgroups-u79uwXL29TY76Z2rM5mHXA,
	hannes-druUgvl0LCNAfugRpC6u6w, mhocko-DgEjT+Ai2ygdnm+yROfE0A,
	shakeelb-hpIqsD4AKlfQT0dZR+AlfA, kernel-team-b10kYP2dOMg,
	void-gq6j2QGBifHby3iVrkZq2A

tools/testing/selftests/cgroup/test_memcontrol.c contains a set of
testcases which validate expected behavior of the cgroup memory controller.
Roman Gushchin recently sent out a patchset that fixed a few issues in the
test. This patchset continues that effort by fixing a few more issues that
were causing non-deterministic failures in the suite. With this patchset,
I'm unable to reproduce any more errors after running the tests in a
continuous loop for many iterations. Before, I was able to reproduce at
least one of the errors fixed in this patchset with just one or two runs.

Changelog:
v2:
  - Fixed the comment headers in test_memcg_min() and test_memcg_low() to
    reflect the new ordering of child cgroups in those tests.
  - Fixed the comment I added in test_memcg_oom_group_leaf_events() to use /* */
    for multiline comments, as is the norm according to the kernel style guide.
  - Changed some of the conditional logic in test_memcg_oom_group_leaf_events()
    that checks for OOM event counts based on memory_localevents to be more
    intuitive.

David Vernet (5):
  cgroups: Refactor children cgroups in memcg tests
  cgroup: Account for memory_recursiveprot in test_memcg_low()
  cgroup: Account for memory_localevents in
    test_memcg_oom_group_leaf_events()
  cgroup: Removing racy check in test_memcg_sock()
  cgroup: Fix racy check in alloc_pagecache_max_30M() helper function

 tools/testing/selftests/cgroup/cgroup_util.c  | 12 +++
 tools/testing/selftests/cgroup/cgroup_util.h  |  1 +
 .../selftests/cgroup/test_memcontrol.c        | 77 ++++++++++++-------
 3 files changed, 64 insertions(+), 26 deletions(-)

-- 
2.30.2


^ permalink raw reply	[flat|nested] 70+ messages in thread

* [PATCH v2 1/5] cgroups: Refactor children cgroups in memcg tests
  2022-04-23 15:56 ` David Vernet
  (?)
@ 2022-04-23 15:56 ` David Vernet
  2022-04-26  1:56     ` Roman Gushchin
  -1 siblings, 1 reply; 70+ messages in thread
From: David Vernet @ 2022-04-23 15:56 UTC (permalink / raw)
  To: akpm
  Cc: tj, roman.gushchin, linux-kernel, linux-mm, cgroups, hannes,
	mhocko, shakeelb, kernel-team, void

In test_memcg_min() and test_memcg_low(), there is an array of four sibling
cgroups. All but one of these sibling groups does a 50MB allocation, and
the group that does no allocation is the third of four in the array.  This
is not a problem per se, but makes it a bit tricky to do some assertions in
test_memcg_low(), as we want to make assertions on the siblings based on
whether or not they performed allocations. Having a static index before
which all groups have performed an allocation makes this cleaner.

This patch therefore reorders the sibling groups so that the group that
performs no allocations is the last in the array. A follow-on patch will
leverage this to fix a bug in the test that incorrectly asserts that a
sibling group that had performed an allocation, but only had protection
from its parent, will not observe any memory.events.low events during
reclaim.

Signed-off-by: David Vernet <void@manifault.com>
---
 .../selftests/cgroup/test_memcontrol.c        | 28 +++++++++----------
 1 file changed, 14 insertions(+), 14 deletions(-)

diff --git a/tools/testing/selftests/cgroup/test_memcontrol.c b/tools/testing/selftests/cgroup/test_memcontrol.c
index 6b5259394e68..284d912e7d3e 100644
--- a/tools/testing/selftests/cgroup/test_memcontrol.c
+++ b/tools/testing/selftests/cgroup/test_memcontrol.c
@@ -244,8 +244,8 @@ static int cg_test_proc_killed(const char *cgroup)
  * A/B     memory.min = 50M,  memory.current = 50M
  * A/B/C   memory.min = 75M,  memory.current = 50M
  * A/B/D   memory.min = 25M,  memory.current = 50M
- * A/B/E   memory.min = 500M, memory.current = 0
- * A/B/F   memory.min = 0,    memory.current = 50M
+ * A/B/E   memory.min = 0,    memory.current = 50M
+ * A/B/F   memory.min = 500M, memory.current = 0
  *
  * Usages are pagecache, but the test keeps a running
  * process in every leaf cgroup.
@@ -255,7 +255,7 @@ static int cg_test_proc_killed(const char *cgroup)
  * A/B    memory.current ~= 50M
  * A/B/C  memory.current ~= 33M
  * A/B/D  memory.current ~= 17M
- * A/B/E  memory.current ~= 0
+ * A/B/F  memory.current ~= 0
  *
  * After that it tries to allocate more than there is
  * unprotected memory in A available, and checks
@@ -321,7 +321,7 @@ static int test_memcg_min(const char *root)
 		if (cg_create(children[i]))
 			goto cleanup;
 
-		if (i == 2)
+		if (i > 2)
 			continue;
 
 		cg_run_nowait(children[i], alloc_pagecache_50M_noexit,
@@ -336,9 +336,9 @@ static int test_memcg_min(const char *root)
 		goto cleanup;
 	if (cg_write(children[1], "memory.min", "25M"))
 		goto cleanup;
-	if (cg_write(children[2], "memory.min", "500M"))
+	if (cg_write(children[2], "memory.min", "0"))
 		goto cleanup;
-	if (cg_write(children[3], "memory.min", "0"))
+	if (cg_write(children[3], "memory.min", "500M"))
 		goto cleanup;
 
 	attempts = 0;
@@ -364,7 +364,7 @@ static int test_memcg_min(const char *root)
 	if (!values_close(c[1], MB(17), 20))
 		goto cleanup;
 
-	if (!values_close(c[2], 0, 1))
+	if (c[3] != 0)
 		goto cleanup;
 
 	if (!cg_run(parent[2], alloc_anon, (void *)MB(170)))
@@ -401,8 +401,8 @@ static int test_memcg_min(const char *root)
  * A/B     memory.low = 50M,  memory.current = 50M
  * A/B/C   memory.low = 75M,  memory.current = 50M
  * A/B/D   memory.low = 25M,  memory.current = 50M
- * A/B/E   memory.low = 500M, memory.current = 0
- * A/B/F   memory.low = 0,    memory.current = 50M
+ * A/B/E   memory.low = 0,    memory.current = 50M
+ * A/B/F   memory.low = 500M, memory.current = 0
  *
  * Usages are pagecache.
  * Then it creates A/G an creates a significant
@@ -412,7 +412,7 @@ static int test_memcg_min(const char *root)
  * A/B    memory.current ~= 50M
  * A/B/   memory.current ~= 33M
  * A/B/D  memory.current ~= 17M
- * A/B/E  memory.current ~= 0
+ * A/B/F  memory.current ~= 0
  *
  * After that it tries to allocate more than there is
  * unprotected memory in A available,
@@ -476,7 +476,7 @@ static int test_memcg_low(const char *root)
 		if (cg_create(children[i]))
 			goto cleanup;
 
-		if (i == 2)
+		if (i > 2)
 			continue;
 
 		if (cg_run(children[i], alloc_pagecache_50M, (void *)(long)fd))
@@ -491,9 +491,9 @@ static int test_memcg_low(const char *root)
 		goto cleanup;
 	if (cg_write(children[1], "memory.low", "25M"))
 		goto cleanup;
-	if (cg_write(children[2], "memory.low", "500M"))
+	if (cg_write(children[2], "memory.low", "0"))
 		goto cleanup;
-	if (cg_write(children[3], "memory.low", "0"))
+	if (cg_write(children[3], "memory.low", "500M"))
 		goto cleanup;
 
 	if (cg_run(parent[2], alloc_anon, (void *)MB(148)))
@@ -511,7 +511,7 @@ static int test_memcg_low(const char *root)
 	if (!values_close(c[1], MB(17), 20))
 		goto cleanup;
 
-	if (!values_close(c[2], 0, 1))
+	if (c[3] != 0)
 		goto cleanup;
 
 	if (cg_run(parent[2], alloc_anon, (void *)MB(166))) {
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 70+ messages in thread

* [PATCH v2 2/5] cgroup: Account for memory_recursiveprot in test_memcg_low()
@ 2022-04-23 15:56   ` David Vernet
  0 siblings, 0 replies; 70+ messages in thread
From: David Vernet @ 2022-04-23 15:56 UTC (permalink / raw)
  To: akpm
  Cc: tj, roman.gushchin, linux-kernel, linux-mm, cgroups, hannes,
	mhocko, shakeelb, kernel-team, void

The test_memcg_low() testcase in test_memcontrol.c verifies the expected
behavior of groups using the memory.low knob. Part of the testcase verifies
that a group with memory.low that experiences reclaim due to memory
pressure elsewhere in the system, observes memory.events.low events as a
result of that reclaim.

In commit 8a931f801340 ("mm: memcontrol: recursive memory.low protection"),
the memory controller was updated to propagate memory.low and memory.min
protection from a parent group to its children via a configurable
memory_recursiveprot mount option. This unfortunately broke the memcg
tests, which asserts that a sibling that experienced reclaim but had a
memory.low value of 0, would not observe any memory.low events. This patch
updates test_memcg_low() to account for the new behavior introduced by
memory_recursiveprot.

So as to make the test resilient to multiple configurations, the patch also
adds a new proc_mount_contains() helper that checks for a string in
/proc/mounts, and is used to toggle behavior based on whether the default
memory_recursiveprot was present.

Signed-off-by: David Vernet <void@manifault.com>
Acked-by: Roman Gushchin <roman.gushchin@linux.dev>
---
 tools/testing/selftests/cgroup/cgroup_util.c     | 12 ++++++++++++
 tools/testing/selftests/cgroup/cgroup_util.h     |  1 +
 tools/testing/selftests/cgroup/test_memcontrol.c | 16 +++++++++++++---
 3 files changed, 26 insertions(+), 3 deletions(-)

diff --git a/tools/testing/selftests/cgroup/cgroup_util.c b/tools/testing/selftests/cgroup/cgroup_util.c
index dbaa7aabbb4a..e5d8d727bdcf 100644
--- a/tools/testing/selftests/cgroup/cgroup_util.c
+++ b/tools/testing/selftests/cgroup/cgroup_util.c
@@ -535,6 +535,18 @@ int set_oom_adj_score(int pid, int score)
 	return 0;
 }
 
+int proc_mount_contains(const char *option)
+{
+	char buf[4 * PAGE_SIZE];
+	ssize_t read;
+
+	read = read_text("/proc/mounts", buf, sizeof(buf));
+	if (read < 0)
+		return read;
+
+	return strstr(buf, option) != NULL;
+}
+
 ssize_t proc_read_text(int pid, bool thread, const char *item, char *buf, size_t size)
 {
 	char path[PATH_MAX];
diff --git a/tools/testing/selftests/cgroup/cgroup_util.h b/tools/testing/selftests/cgroup/cgroup_util.h
index 628738532ac9..756f76052b44 100644
--- a/tools/testing/selftests/cgroup/cgroup_util.h
+++ b/tools/testing/selftests/cgroup/cgroup_util.h
@@ -48,6 +48,7 @@ extern int is_swap_enabled(void);
 extern int set_oom_adj_score(int pid, int score);
 extern int cg_wait_for_proc_count(const char *cgroup, int count);
 extern int cg_killall(const char *cgroup);
+int proc_mount_contains(const char *option);
 extern ssize_t proc_read_text(int pid, bool thread, const char *item, char *buf, size_t size);
 extern int proc_read_strstr(int pid, bool thread, const char *item, const char *needle);
 extern pid_t clone_into_cgroup(int cgroup_fd);
diff --git a/tools/testing/selftests/cgroup/test_memcontrol.c b/tools/testing/selftests/cgroup/test_memcontrol.c
index 284d912e7d3e..d37e8dfb1248 100644
--- a/tools/testing/selftests/cgroup/test_memcontrol.c
+++ b/tools/testing/selftests/cgroup/test_memcontrol.c
@@ -21,6 +21,8 @@
 #include "../kselftest.h"
 #include "cgroup_util.h"
 
+static bool has_recursiveprot;
+
 /*
  * This test creates two nested cgroups with and without enabling
  * the memory controller.
@@ -521,15 +523,18 @@ static int test_memcg_low(const char *root)
 	}
 
 	for (i = 0; i < ARRAY_SIZE(children); i++) {
+		int no_low_events_index = has_recursiveprot ? 2 : 1;
+
 		oom = cg_read_key_long(children[i], "memory.events", "oom ");
 		low = cg_read_key_long(children[i], "memory.events", "low ");
 
 		if (oom)
 			goto cleanup;
-		if (i < 2 && low <= 0)
+		if (i <= no_low_events_index && low <= 0)
 			goto cleanup;
-		if (i >= 2 && low)
+		if (i > no_low_events_index && low)
 			goto cleanup;
+
 	}
 
 	ret = KSFT_PASS;
@@ -1272,7 +1277,7 @@ struct memcg_test {
 int main(int argc, char **argv)
 {
 	char root[PATH_MAX];
-	int i, ret = EXIT_SUCCESS;
+	int i, proc_status, ret = EXIT_SUCCESS;
 
 	if (cg_find_unified_root(root, sizeof(root)))
 		ksft_exit_skip("cgroup v2 isn't mounted\n");
@@ -1288,6 +1293,11 @@ int main(int argc, char **argv)
 		if (cg_write(root, "cgroup.subtree_control", "+memory"))
 			ksft_exit_skip("Failed to set memory controller\n");
 
+	proc_status = proc_mount_contains("memory_recursiveprot");
+	if (proc_status < 0)
+		ksft_exit_skip("Failed to query cgroup mount option\n");
+	has_recursiveprot = proc_status;
+
 	for (i = 0; i < ARRAY_SIZE(tests); i++) {
 		switch (tests[i].fn(root)) {
 		case KSFT_PASS:
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 70+ messages in thread

* [PATCH v2 2/5] cgroup: Account for memory_recursiveprot in test_memcg_low()
@ 2022-04-23 15:56   ` David Vernet
  0 siblings, 0 replies; 70+ messages in thread
From: David Vernet @ 2022-04-23 15:56 UTC (permalink / raw)
  To: akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b
  Cc: tj-DgEjT+Ai2ygdnm+yROfE0A, roman.gushchin-fxUVXftIFDnyG1zEObXtfA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	linux-mm-Bw31MaZKKs3YtjvyW6yDsg, cgroups-u79uwXL29TY76Z2rM5mHXA,
	hannes-druUgvl0LCNAfugRpC6u6w, mhocko-DgEjT+Ai2ygdnm+yROfE0A,
	shakeelb-hpIqsD4AKlfQT0dZR+AlfA, kernel-team-b10kYP2dOMg,
	void-gq6j2QGBifHby3iVrkZq2A

The test_memcg_low() testcase in test_memcontrol.c verifies the expected
behavior of groups using the memory.low knob. Part of the testcase verifies
that a group with memory.low that experiences reclaim due to memory
pressure elsewhere in the system, observes memory.events.low events as a
result of that reclaim.

In commit 8a931f801340 ("mm: memcontrol: recursive memory.low protection"),
the memory controller was updated to propagate memory.low and memory.min
protection from a parent group to its children via a configurable
memory_recursiveprot mount option. This unfortunately broke the memcg
tests, which asserts that a sibling that experienced reclaim but had a
memory.low value of 0, would not observe any memory.low events. This patch
updates test_memcg_low() to account for the new behavior introduced by
memory_recursiveprot.

So as to make the test resilient to multiple configurations, the patch also
adds a new proc_mount_contains() helper that checks for a string in
/proc/mounts, and is used to toggle behavior based on whether the default
memory_recursiveprot was present.

Signed-off-by: David Vernet <void-gq6j2QGBifHby3iVrkZq2A@public.gmane.org>
Acked-by: Roman Gushchin <roman.gushchin-fxUVXftIFDnyG1zEObXtfA@public.gmane.org>
---
 tools/testing/selftests/cgroup/cgroup_util.c     | 12 ++++++++++++
 tools/testing/selftests/cgroup/cgroup_util.h     |  1 +
 tools/testing/selftests/cgroup/test_memcontrol.c | 16 +++++++++++++---
 3 files changed, 26 insertions(+), 3 deletions(-)

diff --git a/tools/testing/selftests/cgroup/cgroup_util.c b/tools/testing/selftests/cgroup/cgroup_util.c
index dbaa7aabbb4a..e5d8d727bdcf 100644
--- a/tools/testing/selftests/cgroup/cgroup_util.c
+++ b/tools/testing/selftests/cgroup/cgroup_util.c
@@ -535,6 +535,18 @@ int set_oom_adj_score(int pid, int score)
 	return 0;
 }
 
+int proc_mount_contains(const char *option)
+{
+	char buf[4 * PAGE_SIZE];
+	ssize_t read;
+
+	read = read_text("/proc/mounts", buf, sizeof(buf));
+	if (read < 0)
+		return read;
+
+	return strstr(buf, option) != NULL;
+}
+
 ssize_t proc_read_text(int pid, bool thread, const char *item, char *buf, size_t size)
 {
 	char path[PATH_MAX];
diff --git a/tools/testing/selftests/cgroup/cgroup_util.h b/tools/testing/selftests/cgroup/cgroup_util.h
index 628738532ac9..756f76052b44 100644
--- a/tools/testing/selftests/cgroup/cgroup_util.h
+++ b/tools/testing/selftests/cgroup/cgroup_util.h
@@ -48,6 +48,7 @@ extern int is_swap_enabled(void);
 extern int set_oom_adj_score(int pid, int score);
 extern int cg_wait_for_proc_count(const char *cgroup, int count);
 extern int cg_killall(const char *cgroup);
+int proc_mount_contains(const char *option);
 extern ssize_t proc_read_text(int pid, bool thread, const char *item, char *buf, size_t size);
 extern int proc_read_strstr(int pid, bool thread, const char *item, const char *needle);
 extern pid_t clone_into_cgroup(int cgroup_fd);
diff --git a/tools/testing/selftests/cgroup/test_memcontrol.c b/tools/testing/selftests/cgroup/test_memcontrol.c
index 284d912e7d3e..d37e8dfb1248 100644
--- a/tools/testing/selftests/cgroup/test_memcontrol.c
+++ b/tools/testing/selftests/cgroup/test_memcontrol.c
@@ -21,6 +21,8 @@
 #include "../kselftest.h"
 #include "cgroup_util.h"
 
+static bool has_recursiveprot;
+
 /*
  * This test creates two nested cgroups with and without enabling
  * the memory controller.
@@ -521,15 +523,18 @@ static int test_memcg_low(const char *root)
 	}
 
 	for (i = 0; i < ARRAY_SIZE(children); i++) {
+		int no_low_events_index = has_recursiveprot ? 2 : 1;
+
 		oom = cg_read_key_long(children[i], "memory.events", "oom ");
 		low = cg_read_key_long(children[i], "memory.events", "low ");
 
 		if (oom)
 			goto cleanup;
-		if (i < 2 && low <= 0)
+		if (i <= no_low_events_index && low <= 0)
 			goto cleanup;
-		if (i >= 2 && low)
+		if (i > no_low_events_index && low)
 			goto cleanup;
+
 	}
 
 	ret = KSFT_PASS;
@@ -1272,7 +1277,7 @@ struct memcg_test {
 int main(int argc, char **argv)
 {
 	char root[PATH_MAX];
-	int i, ret = EXIT_SUCCESS;
+	int i, proc_status, ret = EXIT_SUCCESS;
 
 	if (cg_find_unified_root(root, sizeof(root)))
 		ksft_exit_skip("cgroup v2 isn't mounted\n");
@@ -1288,6 +1293,11 @@ int main(int argc, char **argv)
 		if (cg_write(root, "cgroup.subtree_control", "+memory"))
 			ksft_exit_skip("Failed to set memory controller\n");
 
+	proc_status = proc_mount_contains("memory_recursiveprot");
+	if (proc_status < 0)
+		ksft_exit_skip("Failed to query cgroup mount option\n");
+	has_recursiveprot = proc_status;
+
 	for (i = 0; i < ARRAY_SIZE(tests); i++) {
 		switch (tests[i].fn(root)) {
 		case KSFT_PASS:
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 70+ messages in thread

* [PATCH v2 3/5] cgroup: Account for memory_localevents in test_memcg_oom_group_leaf_events()
@ 2022-04-23 15:56   ` David Vernet
  0 siblings, 0 replies; 70+ messages in thread
From: David Vernet @ 2022-04-23 15:56 UTC (permalink / raw)
  To: akpm
  Cc: tj, roman.gushchin, linux-kernel, linux-mm, cgroups, hannes,
	mhocko, shakeelb, kernel-team, void

The test_memcg_oom_group_leaf_events() testcase in the cgroup memcg tests
validates that processes in a group that perform allocations exceeding
memory.oom.group are killed. It also validates that the
memory.events.oom_kill events are properly propagated in this case.  Commit
06e11c907ea4 ("kselftests: memcg: update the oom group leaf events test")
fixed test_memcg_oom_group_leaf_events() to account for the fact that the
memory.events.oom_kill events in a child cgroup is propagated up to its
parent. This behavior can actually be configured by the memory_localevents
mount option, so this patch updates the testcase to properly account for
the possible presence of this mount option.

Signed-off-by: David Vernet <void@manifault.com>
Acked-by: Roman Gushchin <roman.gushchin@linux.dev>
---
 .../selftests/cgroup/test_memcontrol.c        | 21 +++++++++++++++----
 1 file changed, 17 insertions(+), 4 deletions(-)

diff --git a/tools/testing/selftests/cgroup/test_memcontrol.c b/tools/testing/selftests/cgroup/test_memcontrol.c
index d37e8dfb1248..e899b3f28c22 100644
--- a/tools/testing/selftests/cgroup/test_memcontrol.c
+++ b/tools/testing/selftests/cgroup/test_memcontrol.c
@@ -21,6 +21,7 @@
 #include "../kselftest.h"
 #include "cgroup_util.h"
 
+static bool has_localevents;
 static bool has_recursiveprot;
 
 /*
@@ -1091,6 +1092,7 @@ static int test_memcg_oom_group_leaf_events(const char *root)
 {
 	int ret = KSFT_FAIL;
 	char *parent, *child;
+	long parent_oom_events;
 
 	parent = cg_name(root, "memcg_test_0");
 	child = cg_name(root, "memcg_test_0/memcg_test_1");
@@ -1128,10 +1130,16 @@ static int test_memcg_oom_group_leaf_events(const char *root)
 	if (cg_read_key_long(child, "memory.events", "oom_kill ") <= 0)
 		goto cleanup;
 
-	if (cg_read_key_long(parent, "memory.events", "oom_kill ") <= 0)
-		goto cleanup;
-
-	ret = KSFT_PASS;
+	parent_oom_events = cg_read_key_long(
+			parent, "memory.events", "oom_kill ");
+	/*
+	 * If memory_localevents is not enabled (the default), the parent should
+	 * count OOM events in its children groups. Otherwise, it should not
+	 * have observed any events.
+	 */
+	if ((has_localevents && parent_oom_events == 0) ||
+	    parent_oom_events > 0)
+		ret = KSFT_PASS;
 
 cleanup:
 	if (child)
@@ -1298,6 +1306,11 @@ int main(int argc, char **argv)
 		ksft_exit_skip("Failed to query cgroup mount option\n");
 	has_recursiveprot = proc_status;
 
+	proc_status = proc_mount_contains("memory_localevents");
+	if (proc_status < 0)
+		ksft_exit_skip("Failed to query cgroup mount option\n");
+	has_localevents = proc_status;
+
 	for (i = 0; i < ARRAY_SIZE(tests); i++) {
 		switch (tests[i].fn(root)) {
 		case KSFT_PASS:
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 70+ messages in thread

* [PATCH v2 3/5] cgroup: Account for memory_localevents in test_memcg_oom_group_leaf_events()
@ 2022-04-23 15:56   ` David Vernet
  0 siblings, 0 replies; 70+ messages in thread
From: David Vernet @ 2022-04-23 15:56 UTC (permalink / raw)
  To: akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b
  Cc: tj-DgEjT+Ai2ygdnm+yROfE0A, roman.gushchin-fxUVXftIFDnyG1zEObXtfA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	linux-mm-Bw31MaZKKs3YtjvyW6yDsg, cgroups-u79uwXL29TY76Z2rM5mHXA,
	hannes-druUgvl0LCNAfugRpC6u6w, mhocko-DgEjT+Ai2ygdnm+yROfE0A,
	shakeelb-hpIqsD4AKlfQT0dZR+AlfA, kernel-team-b10kYP2dOMg,
	void-gq6j2QGBifHby3iVrkZq2A

The test_memcg_oom_group_leaf_events() testcase in the cgroup memcg tests
validates that processes in a group that perform allocations exceeding
memory.oom.group are killed. It also validates that the
memory.events.oom_kill events are properly propagated in this case.  Commit
06e11c907ea4 ("kselftests: memcg: update the oom group leaf events test")
fixed test_memcg_oom_group_leaf_events() to account for the fact that the
memory.events.oom_kill events in a child cgroup is propagated up to its
parent. This behavior can actually be configured by the memory_localevents
mount option, so this patch updates the testcase to properly account for
the possible presence of this mount option.

Signed-off-by: David Vernet <void-gq6j2QGBifHby3iVrkZq2A@public.gmane.org>
Acked-by: Roman Gushchin <roman.gushchin-fxUVXftIFDnyG1zEObXtfA@public.gmane.org>
---
 .../selftests/cgroup/test_memcontrol.c        | 21 +++++++++++++++----
 1 file changed, 17 insertions(+), 4 deletions(-)

diff --git a/tools/testing/selftests/cgroup/test_memcontrol.c b/tools/testing/selftests/cgroup/test_memcontrol.c
index d37e8dfb1248..e899b3f28c22 100644
--- a/tools/testing/selftests/cgroup/test_memcontrol.c
+++ b/tools/testing/selftests/cgroup/test_memcontrol.c
@@ -21,6 +21,7 @@
 #include "../kselftest.h"
 #include "cgroup_util.h"
 
+static bool has_localevents;
 static bool has_recursiveprot;
 
 /*
@@ -1091,6 +1092,7 @@ static int test_memcg_oom_group_leaf_events(const char *root)
 {
 	int ret = KSFT_FAIL;
 	char *parent, *child;
+	long parent_oom_events;
 
 	parent = cg_name(root, "memcg_test_0");
 	child = cg_name(root, "memcg_test_0/memcg_test_1");
@@ -1128,10 +1130,16 @@ static int test_memcg_oom_group_leaf_events(const char *root)
 	if (cg_read_key_long(child, "memory.events", "oom_kill ") <= 0)
 		goto cleanup;
 
-	if (cg_read_key_long(parent, "memory.events", "oom_kill ") <= 0)
-		goto cleanup;
-
-	ret = KSFT_PASS;
+	parent_oom_events = cg_read_key_long(
+			parent, "memory.events", "oom_kill ");
+	/*
+	 * If memory_localevents is not enabled (the default), the parent should
+	 * count OOM events in its children groups. Otherwise, it should not
+	 * have observed any events.
+	 */
+	if ((has_localevents && parent_oom_events == 0) ||
+	    parent_oom_events > 0)
+		ret = KSFT_PASS;
 
 cleanup:
 	if (child)
@@ -1298,6 +1306,11 @@ int main(int argc, char **argv)
 		ksft_exit_skip("Failed to query cgroup mount option\n");
 	has_recursiveprot = proc_status;
 
+	proc_status = proc_mount_contains("memory_localevents");
+	if (proc_status < 0)
+		ksft_exit_skip("Failed to query cgroup mount option\n");
+	has_localevents = proc_status;
+
 	for (i = 0; i < ARRAY_SIZE(tests); i++) {
 		switch (tests[i].fn(root)) {
 		case KSFT_PASS:
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 70+ messages in thread

* [PATCH v2 4/5] cgroup: Removing racy check in test_memcg_sock()
@ 2022-04-23 15:56   ` David Vernet
  0 siblings, 0 replies; 70+ messages in thread
From: David Vernet @ 2022-04-23 15:56 UTC (permalink / raw)
  To: akpm
  Cc: tj, roman.gushchin, linux-kernel, linux-mm, cgroups, hannes,
	mhocko, shakeelb, kernel-team, void

test_memcg_sock() in the cgroup memcg tests, verifies expected memory
accounting for sockets. The test forks a process which functions as a TCP
server, and sends large buffers back and forth between itself (as the TCP
client) and the forked TCP server. While doing so, it verifies that
memory.current and memory.stat.sock look correct.

There is currently a check in tcp_client() which asserts memory.current >=
memory.stat.sock. This check is racy, as between memory.current and
memory.stat.sock being queried, a packet could come in which causes
mem_cgroup_charge_skmem() to be invoked. This could cause memory.stat.sock
to exceed memory.current. Reversing the order of querying doesn't address
the problem either, as memory may be reclaimed between the two calls.
Instead, this patch just removes that assertion altogether, and instead
relies on the values_close() check that follows to validate the expected
accounting.

Signed-off-by: David Vernet <void@manifault.com>
Acked-by: Roman Gushchin <roman.gushchin@linux.dev>
---
 tools/testing/selftests/cgroup/test_memcontrol.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/tools/testing/selftests/cgroup/test_memcontrol.c b/tools/testing/selftests/cgroup/test_memcontrol.c
index e899b3f28c22..38d2054eefe6 100644
--- a/tools/testing/selftests/cgroup/test_memcontrol.c
+++ b/tools/testing/selftests/cgroup/test_memcontrol.c
@@ -993,9 +993,6 @@ static int tcp_client(const char *cgroup, unsigned short port)
 		if (current < 0 || sock < 0)
 			goto close_sk;
 
-		if (current < sock)
-			goto close_sk;
-
 		if (values_close(current, sock, 10)) {
 			ret = KSFT_PASS;
 			break;
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 70+ messages in thread

* [PATCH v2 4/5] cgroup: Removing racy check in test_memcg_sock()
@ 2022-04-23 15:56   ` David Vernet
  0 siblings, 0 replies; 70+ messages in thread
From: David Vernet @ 2022-04-23 15:56 UTC (permalink / raw)
  To: akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b
  Cc: tj-DgEjT+Ai2ygdnm+yROfE0A, roman.gushchin-fxUVXftIFDnyG1zEObXtfA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	linux-mm-Bw31MaZKKs3YtjvyW6yDsg, cgroups-u79uwXL29TY76Z2rM5mHXA,
	hannes-druUgvl0LCNAfugRpC6u6w, mhocko-DgEjT+Ai2ygdnm+yROfE0A,
	shakeelb-hpIqsD4AKlfQT0dZR+AlfA, kernel-team-b10kYP2dOMg,
	void-gq6j2QGBifHby3iVrkZq2A

test_memcg_sock() in the cgroup memcg tests, verifies expected memory
accounting for sockets. The test forks a process which functions as a TCP
server, and sends large buffers back and forth between itself (as the TCP
client) and the forked TCP server. While doing so, it verifies that
memory.current and memory.stat.sock look correct.

There is currently a check in tcp_client() which asserts memory.current >=
memory.stat.sock. This check is racy, as between memory.current and
memory.stat.sock being queried, a packet could come in which causes
mem_cgroup_charge_skmem() to be invoked. This could cause memory.stat.sock
to exceed memory.current. Reversing the order of querying doesn't address
the problem either, as memory may be reclaimed between the two calls.
Instead, this patch just removes that assertion altogether, and instead
relies on the values_close() check that follows to validate the expected
accounting.

Signed-off-by: David Vernet <void-gq6j2QGBifHby3iVrkZq2A@public.gmane.org>
Acked-by: Roman Gushchin <roman.gushchin-fxUVXftIFDnyG1zEObXtfA@public.gmane.org>
---
 tools/testing/selftests/cgroup/test_memcontrol.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/tools/testing/selftests/cgroup/test_memcontrol.c b/tools/testing/selftests/cgroup/test_memcontrol.c
index e899b3f28c22..38d2054eefe6 100644
--- a/tools/testing/selftests/cgroup/test_memcontrol.c
+++ b/tools/testing/selftests/cgroup/test_memcontrol.c
@@ -993,9 +993,6 @@ static int tcp_client(const char *cgroup, unsigned short port)
 		if (current < 0 || sock < 0)
 			goto close_sk;
 
-		if (current < sock)
-			goto close_sk;
-
 		if (values_close(current, sock, 10)) {
 			ret = KSFT_PASS;
 			break;
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 70+ messages in thread

* [PATCH v2 5/5] cgroup: Fix racy check in alloc_pagecache_max_30M() helper function
@ 2022-04-23 15:56   ` David Vernet
  0 siblings, 0 replies; 70+ messages in thread
From: David Vernet @ 2022-04-23 15:56 UTC (permalink / raw)
  To: akpm
  Cc: tj, roman.gushchin, linux-kernel, linux-mm, cgroups, hannes,
	mhocko, shakeelb, kernel-team, void

alloc_pagecache_max_30M() in the cgroup memcg tests performs a 50MB
pagecache allocation, which it expects to be capped at 30MB due to the
calling process having a memory.high setting of 30MB. After the allocation,
the function contains a check that verifies that MB(29) < memory.current <=
MB(30). This check can actually fail non-deterministically.

The testcases that use this function are test_memcg_high() and
test_memcg_max(), which set memory.min and memory.max to 30MB respectively
for the cgroup under test. The allocation can slightly exceed this number
in both cases, and for memory.max, the process performing the allocation
will not have the OOM killer invoked as it's performing a pagecache
allocation.  This patchset therefore updates the above check to instead use
the verify_close() helper function.

Signed-off-by: David Vernet <void@manifault.com>
Acked-by: Roman Gushchin <roman.gushchin@linux.dev>
---
 tools/testing/selftests/cgroup/test_memcontrol.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/tools/testing/selftests/cgroup/test_memcontrol.c b/tools/testing/selftests/cgroup/test_memcontrol.c
index 38d2054eefe6..3bac06999354 100644
--- a/tools/testing/selftests/cgroup/test_memcontrol.c
+++ b/tools/testing/selftests/cgroup/test_memcontrol.c
@@ -564,9 +564,14 @@ static int alloc_pagecache_max_30M(const char *cgroup, void *arg)
 {
 	size_t size = MB(50);
 	int ret = -1;
-	long current;
+	long current, high, max;
 	int fd;
 
+	high = cg_read_long(cgroup, "memory.high");
+	max = cg_read_long(cgroup, "memory.max");
+	if (high != MB(30) && max != MB(30))
+		goto cleanup;
+
 	fd = get_temp_fd();
 	if (fd < 0)
 		return -1;
@@ -575,7 +580,7 @@ static int alloc_pagecache_max_30M(const char *cgroup, void *arg)
 		goto cleanup;
 
 	current = cg_read_long(cgroup, "memory.current");
-	if (current <= MB(29) || current > MB(30))
+	if (!values_close(current, MB(30), 5))
 		goto cleanup;
 
 	ret = 0;
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 70+ messages in thread

* [PATCH v2 5/5] cgroup: Fix racy check in alloc_pagecache_max_30M() helper function
@ 2022-04-23 15:56   ` David Vernet
  0 siblings, 0 replies; 70+ messages in thread
From: David Vernet @ 2022-04-23 15:56 UTC (permalink / raw)
  To: akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b
  Cc: tj-DgEjT+Ai2ygdnm+yROfE0A, roman.gushchin-fxUVXftIFDnyG1zEObXtfA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	linux-mm-Bw31MaZKKs3YtjvyW6yDsg, cgroups-u79uwXL29TY76Z2rM5mHXA,
	hannes-druUgvl0LCNAfugRpC6u6w, mhocko-DgEjT+Ai2ygdnm+yROfE0A,
	shakeelb-hpIqsD4AKlfQT0dZR+AlfA, kernel-team-b10kYP2dOMg,
	void-gq6j2QGBifHby3iVrkZq2A

alloc_pagecache_max_30M() in the cgroup memcg tests performs a 50MB
pagecache allocation, which it expects to be capped at 30MB due to the
calling process having a memory.high setting of 30MB. After the allocation,
the function contains a check that verifies that MB(29) < memory.current <=
MB(30). This check can actually fail non-deterministically.

The testcases that use this function are test_memcg_high() and
test_memcg_max(), which set memory.min and memory.max to 30MB respectively
for the cgroup under test. The allocation can slightly exceed this number
in both cases, and for memory.max, the process performing the allocation
will not have the OOM killer invoked as it's performing a pagecache
allocation.  This patchset therefore updates the above check to instead use
the verify_close() helper function.

Signed-off-by: David Vernet <void-gq6j2QGBifHby3iVrkZq2A@public.gmane.org>
Acked-by: Roman Gushchin <roman.gushchin-fxUVXftIFDnyG1zEObXtfA@public.gmane.org>
---
 tools/testing/selftests/cgroup/test_memcontrol.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/tools/testing/selftests/cgroup/test_memcontrol.c b/tools/testing/selftests/cgroup/test_memcontrol.c
index 38d2054eefe6..3bac06999354 100644
--- a/tools/testing/selftests/cgroup/test_memcontrol.c
+++ b/tools/testing/selftests/cgroup/test_memcontrol.c
@@ -564,9 +564,14 @@ static int alloc_pagecache_max_30M(const char *cgroup, void *arg)
 {
 	size_t size = MB(50);
 	int ret = -1;
-	long current;
+	long current, high, max;
 	int fd;
 
+	high = cg_read_long(cgroup, "memory.high");
+	max = cg_read_long(cgroup, "memory.max");
+	if (high != MB(30) && max != MB(30))
+		goto cleanup;
+
 	fd = get_temp_fd();
 	if (fd < 0)
 		return -1;
@@ -575,7 +580,7 @@ static int alloc_pagecache_max_30M(const char *cgroup, void *arg)
 		goto cleanup;
 
 	current = cg_read_long(cgroup, "memory.current");
-	if (current <= MB(29) || current > MB(30))
+	if (!values_close(current, MB(30), 5))
 		goto cleanup;
 
 	ret = 0;
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 70+ messages in thread

* Re: [PATCH v2 1/5] cgroups: Refactor children cgroups in memcg tests
@ 2022-04-26  1:56     ` Roman Gushchin
  0 siblings, 0 replies; 70+ messages in thread
From: Roman Gushchin @ 2022-04-26  1:56 UTC (permalink / raw)
  To: David Vernet
  Cc: akpm, tj, linux-kernel, linux-mm, cgroups, hannes, mhocko,
	shakeelb, kernel-team

On Sat, Apr 23, 2022 at 08:56:17AM -0700, David Vernet wrote:
> In test_memcg_min() and test_memcg_low(), there is an array of four sibling
> cgroups. All but one of these sibling groups does a 50MB allocation, and
> the group that does no allocation is the third of four in the array.  This
> is not a problem per se, but makes it a bit tricky to do some assertions in
> test_memcg_low(), as we want to make assertions on the siblings based on
> whether or not they performed allocations. Having a static index before
> which all groups have performed an allocation makes this cleaner.
> 
> This patch therefore reorders the sibling groups so that the group that
> performs no allocations is the last in the array. A follow-on patch will
> leverage this to fix a bug in the test that incorrectly asserts that a
> sibling group that had performed an allocation, but only had protection
> from its parent, will not observe any memory.events.low events during
> reclaim.
> 
> Signed-off-by: David Vernet <void@manifault.com>

Acked-by: Roman Gushchin <roman.gushchin@linux.dev>

Thanks!

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v2 1/5] cgroups: Refactor children cgroups in memcg tests
@ 2022-04-26  1:56     ` Roman Gushchin
  0 siblings, 0 replies; 70+ messages in thread
From: Roman Gushchin @ 2022-04-26  1:56 UTC (permalink / raw)
  To: David Vernet
  Cc: akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b, tj-DgEjT+Ai2ygdnm+yROfE0A,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	linux-mm-Bw31MaZKKs3YtjvyW6yDsg, cgroups-u79uwXL29TY76Z2rM5mHXA,
	hannes-druUgvl0LCNAfugRpC6u6w, mhocko-DgEjT+Ai2ygdnm+yROfE0A,
	shakeelb-hpIqsD4AKlfQT0dZR+AlfA, kernel-team-b10kYP2dOMg

On Sat, Apr 23, 2022 at 08:56:17AM -0700, David Vernet wrote:
> In test_memcg_min() and test_memcg_low(), there is an array of four sibling
> cgroups. All but one of these sibling groups does a 50MB allocation, and
> the group that does no allocation is the third of four in the array.  This
> is not a problem per se, but makes it a bit tricky to do some assertions in
> test_memcg_low(), as we want to make assertions on the siblings based on
> whether or not they performed allocations. Having a static index before
> which all groups have performed an allocation makes this cleaner.
> 
> This patch therefore reorders the sibling groups so that the group that
> performs no allocations is the last in the array. A follow-on patch will
> leverage this to fix a bug in the test that incorrectly asserts that a
> sibling group that had performed an allocation, but only had protection
> from its parent, will not observe any memory.events.low events during
> reclaim.
> 
> Signed-off-by: David Vernet <void-gq6j2QGBifHby3iVrkZq2A@public.gmane.org>

Acked-by: Roman Gushchin <roman.gushchin-fxUVXftIFDnyG1zEObXtfA@public.gmane.org>

Thanks!

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v2 2/5] cgroup: Account for memory_recursiveprot in test_memcg_low()
@ 2022-04-27 14:09     ` Michal Koutný
  0 siblings, 0 replies; 70+ messages in thread
From: Michal Koutný @ 2022-04-27 14:09 UTC (permalink / raw)
  To: David Vernet
  Cc: akpm, tj, roman.gushchin, linux-kernel, linux-mm, cgroups,
	hannes, mhocko, shakeelb, kernel-team, Richard Palethorpe

Hello David.

On Sat, Apr 23, 2022 at 08:56:19AM -0700, David Vernet <void@manifault.com> wrote:
> This unfortunately broke the memcg tests, which asserts that a sibling
> that experienced reclaim but had a memory.low value of 0, would not
> observe any memory.low events. This patch updates test_memcg_low() to
> account for the new behavior introduced by memory_recursiveprot.

I think the test is correct, there should be no (not even recursive)
protection in this particular case (when the remaining siblings consume
all of parental protection).

This should be fixed in the kernel (see also [1], no updates from me yet
:-/)

Michal

[1] https://lore.kernel.org/lkml/20220322182248.29121-1-mkoutny@suse.com/

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v2 2/5] cgroup: Account for memory_recursiveprot in test_memcg_low()
@ 2022-04-27 14:09     ` Michal Koutný
  0 siblings, 0 replies; 70+ messages in thread
From: Michal Koutný @ 2022-04-27 14:09 UTC (permalink / raw)
  To: David Vernet
  Cc: akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b, tj-DgEjT+Ai2ygdnm+yROfE0A,
	roman.gushchin-fxUVXftIFDnyG1zEObXtfA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	linux-mm-Bw31MaZKKs3YtjvyW6yDsg, cgroups-u79uwXL29TY76Z2rM5mHXA,
	hannes-druUgvl0LCNAfugRpC6u6w, mhocko-DgEjT+Ai2ygdnm+yROfE0A,
	shakeelb-hpIqsD4AKlfQT0dZR+AlfA, kernel-team-b10kYP2dOMg,
	Richard Palethorpe

Hello David.

On Sat, Apr 23, 2022 at 08:56:19AM -0700, David Vernet <void-gq6j2QGBifHby3iVrkZq2A@public.gmane.org> wrote:
> This unfortunately broke the memcg tests, which asserts that a sibling
> that experienced reclaim but had a memory.low value of 0, would not
> observe any memory.low events. This patch updates test_memcg_low() to
> account for the new behavior introduced by memory_recursiveprot.

I think the test is correct, there should be no (not even recursive)
protection in this particular case (when the remaining siblings consume
all of parental protection).

This should be fixed in the kernel (see also [1], no updates from me yet
:-/)

Michal

[1] https://lore.kernel.org/lkml/20220322182248.29121-1-mkoutny-IBi9RG/b67k@public.gmane.org/

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v2 2/5] cgroup: Account for memory_recursiveprot in test_memcg_low()
@ 2022-04-29  1:03       ` David Vernet
  0 siblings, 0 replies; 70+ messages in thread
From: David Vernet @ 2022-04-29  1:03 UTC (permalink / raw)
  To: Michal Koutný
  Cc: akpm, tj, roman.gushchin, linux-kernel, linux-mm, cgroups,
	hannes, mhocko, shakeelb, kernel-team, Richard Palethorpe

Hi Michal,

On Wed, Apr 27, 2022 at 04:09:28PM +0200, Michal Koutný wrote:
> Hello David.
> 
> On Sat, Apr 23, 2022 at 08:56:19AM -0700, David Vernet <void@manifault.com> wrote:
> > This unfortunately broke the memcg tests, which asserts that a sibling
> > that experienced reclaim but had a memory.low value of 0, would not
> > observe any memory.low events. This patch updates test_memcg_low() to
> > account for the new behavior introduced by memory_recursiveprot.
> 
> I think the test is correct, there should be no (not even recursive)
> protection in this particular case (when the remaining siblings consume
> all of parental protection).
> 
> This should be fixed in the kernel (see also [1], no updates from me yet
> :-/)
> 
> Michal
> 
> [1] https://lore.kernel.org/lkml/20220322182248.29121-1-mkoutny@suse.com/
> 

I see, thanks for sharing that context. I think I see your point about the
implementation of the reclaim mechanism potentially overcounting, but my
interpretation of the rest of that discussion with Roman is that we haven't
yet decided whether we don't want to propagate memory.low events from
children cgroups with memory.low == 0. Or at the very least, some more
justification was requested on why not counting such events was prudent.

Would you be ok with merging this patch so that the cgroup selftests can
pass again based on the current behavior of the kernel, and we can then
revert the changes to test_memcg_low() later on if and when we decide that
we don't want to propagate memory.low events for memory.low == 0 children?

Thanks,
David

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v2 2/5] cgroup: Account for memory_recursiveprot in test_memcg_low()
@ 2022-04-29  1:03       ` David Vernet
  0 siblings, 0 replies; 70+ messages in thread
From: David Vernet @ 2022-04-29  1:03 UTC (permalink / raw)
  To: Michal Koutný
  Cc: akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b, tj-DgEjT+Ai2ygdnm+yROfE0A,
	roman.gushchin-fxUVXftIFDnyG1zEObXtfA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	linux-mm-Bw31MaZKKs3YtjvyW6yDsg, cgroups-u79uwXL29TY76Z2rM5mHXA,
	hannes-druUgvl0LCNAfugRpC6u6w, mhocko-DgEjT+Ai2ygdnm+yROfE0A,
	shakeelb-hpIqsD4AKlfQT0dZR+AlfA, kernel-team-b10kYP2dOMg,
	Richard Palethorpe

Hi Michal,

On Wed, Apr 27, 2022 at 04:09:28PM +0200, Michal Koutný wrote:
> Hello David.
> 
> On Sat, Apr 23, 2022 at 08:56:19AM -0700, David Vernet <void-gq6j2QGBifHby3iVrkZq2A@public.gmane.org> wrote:
> > This unfortunately broke the memcg tests, which asserts that a sibling
> > that experienced reclaim but had a memory.low value of 0, would not
> > observe any memory.low events. This patch updates test_memcg_low() to
> > account for the new behavior introduced by memory_recursiveprot.
> 
> I think the test is correct, there should be no (not even recursive)
> protection in this particular case (when the remaining siblings consume
> all of parental protection).
> 
> This should be fixed in the kernel (see also [1], no updates from me yet
> :-/)
> 
> Michal
> 
> [1] https://lore.kernel.org/lkml/20220322182248.29121-1-mkoutny-IBi9RG/b67k@public.gmane.org/
> 

I see, thanks for sharing that context. I think I see your point about the
implementation of the reclaim mechanism potentially overcounting, but my
interpretation of the rest of that discussion with Roman is that we haven't
yet decided whether we don't want to propagate memory.low events from
children cgroups with memory.low == 0. Or at the very least, some more
justification was requested on why not counting such events was prudent.

Would you be ok with merging this patch so that the cgroup selftests can
pass again based on the current behavior of the kernel, and we can then
revert the changes to test_memcg_low() later on if and when we decide that
we don't want to propagate memory.low events for memory.low == 0 children?

Thanks,
David

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v2 2/5] cgroup: Account for memory_recursiveprot in test_memcg_low()
@ 2022-04-29  9:26         ` Michal Koutný
  0 siblings, 0 replies; 70+ messages in thread
From: Michal Koutný @ 2022-04-29  9:26 UTC (permalink / raw)
  To: David Vernet
  Cc: akpm, tj, roman.gushchin, linux-kernel, linux-mm, cgroups,
	hannes, mhocko, shakeelb, kernel-team, Richard Palethorpe

On Thu, Apr 28, 2022 at 06:03:33PM -0700, David Vernet <void@manifault.com> wrote:
> but my interpretation of the rest of that discussion with Roman is
> that we haven't yet decided whether we don't want to propagate
> memory.low events from children cgroups with memory.low == 0. Or at
> the very least, some more justification was requested on why not
> counting such events was prudent.

I'm not a fan of that original proposal of mine anymore (to be more
precise, of _only_ that patch, there's still the RFCness reason 1) to
consider).
As I shared with the last reply there, there's a problem in the behavior
which shouldn't be masked by filtering some events.

> Would you be ok with merging this patch so that the cgroup selftests can
> pass again based on the current behavior of the kernel, and we can then
> revert the changes to test_memcg_low() later on if and when we decide that
> we don't want to propagate memory.low events for memory.low == 0 children?

I still think that the behavior when there's no protection left for the
memory.low == 0 child, there should be no memory.low events (not just
uncounted but not happening) and test should not accept this (even
though it's the current behavior).

What might improve the test space would be to have two configs like

Original one (simplified here)
	parent		memory.low=50M	memory.current=100M
	` child1	memory.low=50M	memory.current=50M
	` child2	memory.low=0M	memory.current=50M

New one (checks events due to recursive protection)
	parent		memory.low=50M	memory.current=100M
	` child1	memory.low=40M	memory.current=50M
	` child2	memory.low=0M	memory.current=50M

The second config assigns recursive protection to child2 and should
therefore cause memory.low events in child2 (with memory_recursiveprot
enabled of course).

Or alternative new one (checks events due to recursive protection)
	parent		memory.low=50M	memory.current=100M
	` child1	memory.low=0M	memory.current=50M
	` child2	memory.low=0M	memory.current=50M

HTH,
Michal

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v2 2/5] cgroup: Account for memory_recursiveprot in test_memcg_low()
@ 2022-04-29  9:26         ` Michal Koutný
  0 siblings, 0 replies; 70+ messages in thread
From: Michal Koutný @ 2022-04-29  9:26 UTC (permalink / raw)
  To: David Vernet
  Cc: akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b, tj-DgEjT+Ai2ygdnm+yROfE0A,
	roman.gushchin-fxUVXftIFDnyG1zEObXtfA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	linux-mm-Bw31MaZKKs3YtjvyW6yDsg, cgroups-u79uwXL29TY76Z2rM5mHXA,
	hannes-druUgvl0LCNAfugRpC6u6w, mhocko-DgEjT+Ai2ygdnm+yROfE0A,
	shakeelb-hpIqsD4AKlfQT0dZR+AlfA, kernel-team-b10kYP2dOMg,
	Richard Palethorpe

On Thu, Apr 28, 2022 at 06:03:33PM -0700, David Vernet <void-gq6j2QGBifHby3iVrkZq2A@public.gmane.org> wrote:
> but my interpretation of the rest of that discussion with Roman is
> that we haven't yet decided whether we don't want to propagate
> memory.low events from children cgroups with memory.low == 0. Or at
> the very least, some more justification was requested on why not
> counting such events was prudent.

I'm not a fan of that original proposal of mine anymore (to be more
precise, of _only_ that patch, there's still the RFCness reason 1) to
consider).
As I shared with the last reply there, there's a problem in the behavior
which shouldn't be masked by filtering some events.

> Would you be ok with merging this patch so that the cgroup selftests can
> pass again based on the current behavior of the kernel, and we can then
> revert the changes to test_memcg_low() later on if and when we decide that
> we don't want to propagate memory.low events for memory.low == 0 children?

I still think that the behavior when there's no protection left for the
memory.low == 0 child, there should be no memory.low events (not just
uncounted but not happening) and test should not accept this (even
though it's the current behavior).

What might improve the test space would be to have two configs like

Original one (simplified here)
	parent		memory.low=50M	memory.current=100M
	` child1	memory.low=50M	memory.current=50M
	` child2	memory.low=0M	memory.current=50M

New one (checks events due to recursive protection)
	parent		memory.low=50M	memory.current=100M
	` child1	memory.low=40M	memory.current=50M
	` child2	memory.low=0M	memory.current=50M

The second config assigns recursive protection to child2 and should
therefore cause memory.low events in child2 (with memory_recursiveprot
enabled of course).

Or alternative new one (checks events due to recursive protection)
	parent		memory.low=50M	memory.current=100M
	` child1	memory.low=0M	memory.current=50M
	` child2	memory.low=0M	memory.current=50M

HTH,
Michal

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v2 2/5] cgroup: Account for memory_recursiveprot in test_memcg_low()
@ 2022-05-06 16:40           ` David Vernet
  0 siblings, 0 replies; 70+ messages in thread
From: David Vernet @ 2022-05-06 16:40 UTC (permalink / raw)
  To: Michal Koutný
  Cc: akpm, tj, roman.gushchin, linux-kernel, linux-mm, cgroups,
	hannes, mhocko, shakeelb, kernel-team, Richard Palethorpe

Sorry for the delayed reply, Michal. I've been at LSFMM this week.

On Fri, Apr 29, 2022 at 11:26:20AM +0200, Michal Koutný wrote:
> I still think that the behavior when there's no protection left for the
> memory.low == 0 child, there should be no memory.low events (not just
> uncounted but not happening) and test should not accept this (even
> though it's the current behavior).

That's fair. I think part of the problem here is that in general, the
memcontroller itself is quite heuristic, so it's tough to write tests that
provide useful coverage while also being sufficiently flexible to avoid
flakiness and over-prescribing expected behavior. In this case I think it's
probably correct that the memory.low == 0 child shouldn't inherit
protection from its parent under any circumstances due to its siblings
overcommitting the parent's protection, but I also wonder if it's really
necessary to enforce that. If you look at how much memory A/B/E gets at the
end of the reclaim, it's still far less than 1MB (though should it be 0?).
I'd be curious to hear what Johannes thinks.

> What might improve the test space would be to have two configs like
> 
> Original one (simplified here)
> 	parent		memory.low=50M	memory.current=100M
> 	` child1	memory.low=50M	memory.current=50M
> 	` child2	memory.low=0M	memory.current=50M
> 
> New one (checks events due to recursive protection)
> 	parent		memory.low=50M	memory.current=100M
> 	` child1	memory.low=40M	memory.current=50M
> 	` child2	memory.low=0M	memory.current=50M
> 
> The second config assigns recursive protection to child2 and should
> therefore cause memory.low events in child2 (with memory_recursiveprot
> enabled of course).

Something like this would work, though I think it's useful to specifically
validate the behavior of the memcontroller when the children overcommit the
parent's memory.low protection, which the current test does. So I'm
inclined to keep this testcase, and add your next suggestion:

> Or alternative new one (checks events due to recursive protection)
> 	parent		memory.low=50M	memory.current=100M
> 	` child1	memory.low=0M	memory.current=50M
> 	` child2	memory.low=0M	memory.current=50M

This definitely sounds to me like a useful testcase to add, and I'm happy
to do so in a follow-on patch. If we added this, do you think we need to
keep the check for memory.low events for the memory.low == 0 child in the
overcommit testcase? It arguably helped to catch the SWAP_CLUSTER_MAX
rounding issue you pointed out. Again, curious to hear what Johannes thinks
as well.

Thanks,
David

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v2 2/5] cgroup: Account for memory_recursiveprot in test_memcg_low()
@ 2022-05-06 16:40           ` David Vernet
  0 siblings, 0 replies; 70+ messages in thread
From: David Vernet @ 2022-05-06 16:40 UTC (permalink / raw)
  To: Michal Koutný
  Cc: akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b, tj-DgEjT+Ai2ygdnm+yROfE0A,
	roman.gushchin-fxUVXftIFDnyG1zEObXtfA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	linux-mm-Bw31MaZKKs3YtjvyW6yDsg, cgroups-u79uwXL29TY76Z2rM5mHXA,
	hannes-druUgvl0LCNAfugRpC6u6w, mhocko-DgEjT+Ai2ygdnm+yROfE0A,
	shakeelb-hpIqsD4AKlfQT0dZR+AlfA, kernel-team-b10kYP2dOMg,
	Richard Palethorpe

Sorry for the delayed reply, Michal. I've been at LSFMM this week.

On Fri, Apr 29, 2022 at 11:26:20AM +0200, Michal Koutný wrote:
> I still think that the behavior when there's no protection left for the
> memory.low == 0 child, there should be no memory.low events (not just
> uncounted but not happening) and test should not accept this (even
> though it's the current behavior).

That's fair. I think part of the problem here is that in general, the
memcontroller itself is quite heuristic, so it's tough to write tests that
provide useful coverage while also being sufficiently flexible to avoid
flakiness and over-prescribing expected behavior. In this case I think it's
probably correct that the memory.low == 0 child shouldn't inherit
protection from its parent under any circumstances due to its siblings
overcommitting the parent's protection, but I also wonder if it's really
necessary to enforce that. If you look at how much memory A/B/E gets at the
end of the reclaim, it's still far less than 1MB (though should it be 0?).
I'd be curious to hear what Johannes thinks.

> What might improve the test space would be to have two configs like
> 
> Original one (simplified here)
> 	parent		memory.low=50M	memory.current=100M
> 	` child1	memory.low=50M	memory.current=50M
> 	` child2	memory.low=0M	memory.current=50M
> 
> New one (checks events due to recursive protection)
> 	parent		memory.low=50M	memory.current=100M
> 	` child1	memory.low=40M	memory.current=50M
> 	` child2	memory.low=0M	memory.current=50M
> 
> The second config assigns recursive protection to child2 and should
> therefore cause memory.low events in child2 (with memory_recursiveprot
> enabled of course).

Something like this would work, though I think it's useful to specifically
validate the behavior of the memcontroller when the children overcommit the
parent's memory.low protection, which the current test does. So I'm
inclined to keep this testcase, and add your next suggestion:

> Or alternative new one (checks events due to recursive protection)
> 	parent		memory.low=50M	memory.current=100M
> 	` child1	memory.low=0M	memory.current=50M
> 	` child2	memory.low=0M	memory.current=50M

This definitely sounds to me like a useful testcase to add, and I'm happy
to do so in a follow-on patch. If we added this, do you think we need to
keep the check for memory.low events for the memory.low == 0 child in the
overcommit testcase? It arguably helped to catch the SWAP_CLUSTER_MAX
rounding issue you pointed out. Again, curious to hear what Johannes thinks
as well.

Thanks,
David

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v2 2/5] cgroup: Account for memory_recursiveprot in test_memcg_low()
@ 2022-05-09 15:09             ` Johannes Weiner
  0 siblings, 0 replies; 70+ messages in thread
From: Johannes Weiner @ 2022-05-09 15:09 UTC (permalink / raw)
  To: David Vernet
  Cc: Michal Koutný,
	akpm, tj, roman.gushchin, linux-kernel, linux-mm, cgroups,
	mhocko, shakeelb, kernel-team, Richard Palethorpe

On Fri, May 06, 2022 at 09:40:15AM -0700, David Vernet wrote:
> Sorry for the delayed reply, Michal. I've been at LSFMM this week.
> 
> On Fri, Apr 29, 2022 at 11:26:20AM +0200, Michal Koutný wrote:
> > I still think that the behavior when there's no protection left for the
> > memory.low == 0 child, there should be no memory.low events (not just
> > uncounted but not happening) and test should not accept this (even
> > though it's the current behavior).
>
> That's fair. I think part of the problem here is that in general, the
> memcontroller itself is quite heuristic, so it's tough to write tests that
> provide useful coverage while also being sufficiently flexible to avoid
> flakiness and over-prescribing expected behavior. In this case I think it's
> probably correct that the memory.low == 0 child shouldn't inherit
> protection from its parent under any circumstances due to its siblings
> overcommitting the parent's protection, but I also wonder if it's really
> necessary to enforce that. If you look at how much memory A/B/E gets at the
> end of the reclaim, it's still far less than 1MB (though should it be 0?).
> I'd be curious to hear what Johannes thinks.

We need to distinguish between what the siblings declare and what they
consume.

My understanding of the issue you're raising, Michal, is that
protected siblings start with current > low, then get reclaimed
slightly too much and end up with current < low. This results in a
tiny bit of float that then gets assigned to the low=0 sibling; when
that sibling gets reclaimed regardless, it sees a low event. Correct
me if I missed a detail or nuance here.

But unused float going to siblings is intentional. This is documented
in point 3 in the comment above effective_protection(): if you use
less than you're legitimately claiming, the float goes to your
siblings. So the problem doesn't seem to be with low accounting and
event generation, but rather it's simply overreclaim.

It's conceivable to make reclaim more precise and then tighten up the
test. But right now, David's patch looks correct to me.

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v2 2/5] cgroup: Account for memory_recursiveprot in test_memcg_low()
@ 2022-05-09 15:09             ` Johannes Weiner
  0 siblings, 0 replies; 70+ messages in thread
From: Johannes Weiner @ 2022-05-09 15:09 UTC (permalink / raw)
  To: David Vernet
  Cc: Michal Koutný,
	akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b, tj-DgEjT+Ai2ygdnm+yROfE0A,
	roman.gushchin-fxUVXftIFDnyG1zEObXtfA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	linux-mm-Bw31MaZKKs3YtjvyW6yDsg, cgroups-u79uwXL29TY76Z2rM5mHXA,
	mhocko-DgEjT+Ai2ygdnm+yROfE0A, shakeelb-hpIqsD4AKlfQT0dZR+AlfA,
	kernel-team-b10kYP2dOMg, Richard Palethorpe

On Fri, May 06, 2022 at 09:40:15AM -0700, David Vernet wrote:
> Sorry for the delayed reply, Michal. I've been at LSFMM this week.
> 
> On Fri, Apr 29, 2022 at 11:26:20AM +0200, Michal Koutný wrote:
> > I still think that the behavior when there's no protection left for the
> > memory.low == 0 child, there should be no memory.low events (not just
> > uncounted but not happening) and test should not accept this (even
> > though it's the current behavior).
>
> That's fair. I think part of the problem here is that in general, the
> memcontroller itself is quite heuristic, so it's tough to write tests that
> provide useful coverage while also being sufficiently flexible to avoid
> flakiness and over-prescribing expected behavior. In this case I think it's
> probably correct that the memory.low == 0 child shouldn't inherit
> protection from its parent under any circumstances due to its siblings
> overcommitting the parent's protection, but I also wonder if it's really
> necessary to enforce that. If you look at how much memory A/B/E gets at the
> end of the reclaim, it's still far less than 1MB (though should it be 0?).
> I'd be curious to hear what Johannes thinks.

We need to distinguish between what the siblings declare and what they
consume.

My understanding of the issue you're raising, Michal, is that
protected siblings start with current > low, then get reclaimed
slightly too much and end up with current < low. This results in a
tiny bit of float that then gets assigned to the low=0 sibling; when
that sibling gets reclaimed regardless, it sees a low event. Correct
me if I missed a detail or nuance here.

But unused float going to siblings is intentional. This is documented
in point 3 in the comment above effective_protection(): if you use
less than you're legitimately claiming, the float goes to your
siblings. So the problem doesn't seem to be with low accounting and
event generation, but rather it's simply overreclaim.

It's conceivable to make reclaim more precise and then tighten up the
test. But right now, David's patch looks correct to me.

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v2 2/5] cgroup: Account for memory_recursiveprot in test_memcg_low()
@ 2022-05-10  0:44               ` Andrew Morton
  0 siblings, 0 replies; 70+ messages in thread
From: Andrew Morton @ 2022-05-10  0:44 UTC (permalink / raw)
  To: Johannes Weiner
  Cc: David Vernet, Michal Koutný,
	tj, roman.gushchin, linux-kernel, linux-mm, cgroups, mhocko,
	shakeelb, kernel-team, Richard Palethorpe

On Mon, 9 May 2022 11:09:15 -0400 Johannes Weiner <hannes@cmpxchg.org> wrote:

> On Fri, May 06, 2022 at 09:40:15AM -0700, David Vernet wrote:
> > Sorry for the delayed reply, Michal. I've been at LSFMM this week.
> > 
> > On Fri, Apr 29, 2022 at 11:26:20AM +0200, Michal Koutný wrote:
> > > I still think that the behavior when there's no protection left for the
> > > memory.low == 0 child, there should be no memory.low events (not just
> > > uncounted but not happening) and test should not accept this (even
> > > though it's the current behavior).
> >
> > That's fair. I think part of the problem here is that in general, the
> > memcontroller itself is quite heuristic, so it's tough to write tests that
> > provide useful coverage while also being sufficiently flexible to avoid
> > flakiness and over-prescribing expected behavior. In this case I think it's
> > probably correct that the memory.low == 0 child shouldn't inherit
> > protection from its parent under any circumstances due to its siblings
> > overcommitting the parent's protection, but I also wonder if it's really
> > necessary to enforce that. If you look at how much memory A/B/E gets at the
> > end of the reclaim, it's still far less than 1MB (though should it be 0?).
> > I'd be curious to hear what Johannes thinks.
> 
> We need to distinguish between what the siblings declare and what they
> consume.
> 
> My understanding of the issue you're raising, Michal, is that
> protected siblings start with current > low, then get reclaimed
> slightly too much and end up with current < low. This results in a
> tiny bit of float that then gets assigned to the low=0 sibling; when
> that sibling gets reclaimed regardless, it sees a low event. Correct
> me if I missed a detail or nuance here.
> 
> But unused float going to siblings is intentional. This is documented
> in point 3 in the comment above effective_protection(): if you use
> less than you're legitimately claiming, the float goes to your
> siblings. So the problem doesn't seem to be with low accounting and
> event generation, but rather it's simply overreclaim.
> 
> It's conceivable to make reclaim more precise and then tighten up the
> test. But right now, David's patch looks correct to me.

So I think we're OK with [2/5] now.  Unless there be objections, I'll
be looking to get this series into mm-stable later this week.


^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v2 2/5] cgroup: Account for memory_recursiveprot in test_memcg_low()
@ 2022-05-10  0:44               ` Andrew Morton
  0 siblings, 0 replies; 70+ messages in thread
From: Andrew Morton @ 2022-05-10  0:44 UTC (permalink / raw)
  To: Johannes Weiner
  Cc: David Vernet, Michal Koutný,
	tj-DgEjT+Ai2ygdnm+yROfE0A, roman.gushchin-fxUVXftIFDnyG1zEObXtfA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	linux-mm-Bw31MaZKKs3YtjvyW6yDsg, cgroups-u79uwXL29TY76Z2rM5mHXA,
	mhocko-DgEjT+Ai2ygdnm+yROfE0A, shakeelb-hpIqsD4AKlfQT0dZR+AlfA,
	kernel-team-b10kYP2dOMg, Richard Palethorpe

On Mon, 9 May 2022 11:09:15 -0400 Johannes Weiner <hannes-druUgvl0LCNAfugRpC6u6w@public.gmane.org> wrote:

> On Fri, May 06, 2022 at 09:40:15AM -0700, David Vernet wrote:
> > Sorry for the delayed reply, Michal. I've been at LSFMM this week.
> > 
> > On Fri, Apr 29, 2022 at 11:26:20AM +0200, Michal Koutný wrote:
> > > I still think that the behavior when there's no protection left for the
> > > memory.low == 0 child, there should be no memory.low events (not just
> > > uncounted but not happening) and test should not accept this (even
> > > though it's the current behavior).
> >
> > That's fair. I think part of the problem here is that in general, the
> > memcontroller itself is quite heuristic, so it's tough to write tests that
> > provide useful coverage while also being sufficiently flexible to avoid
> > flakiness and over-prescribing expected behavior. In this case I think it's
> > probably correct that the memory.low == 0 child shouldn't inherit
> > protection from its parent under any circumstances due to its siblings
> > overcommitting the parent's protection, but I also wonder if it's really
> > necessary to enforce that. If you look at how much memory A/B/E gets at the
> > end of the reclaim, it's still far less than 1MB (though should it be 0?).
> > I'd be curious to hear what Johannes thinks.
> 
> We need to distinguish between what the siblings declare and what they
> consume.
> 
> My understanding of the issue you're raising, Michal, is that
> protected siblings start with current > low, then get reclaimed
> slightly too much and end up with current < low. This results in a
> tiny bit of float that then gets assigned to the low=0 sibling; when
> that sibling gets reclaimed regardless, it sees a low event. Correct
> me if I missed a detail or nuance here.
> 
> But unused float going to siblings is intentional. This is documented
> in point 3 in the comment above effective_protection(): if you use
> less than you're legitimately claiming, the float goes to your
> siblings. So the problem doesn't seem to be with low accounting and
> event generation, but rather it's simply overreclaim.
> 
> It's conceivable to make reclaim more precise and then tighten up the
> test. But right now, David's patch looks correct to me.

So I think we're OK with [2/5] now.  Unless there be objections, I'll
be looking to get this series into mm-stable later this week.


^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v2 2/5] cgroup: Account for memory_recursiveprot in test_memcg_low()
@ 2022-05-10 17:43                 ` Michal Koutný
  0 siblings, 0 replies; 70+ messages in thread
From: Michal Koutný @ 2022-05-10 17:43 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Johannes Weiner, David Vernet, tj, roman.gushchin, linux-kernel,
	linux-mm, cgroups, mhocko, shakeelb, kernel-team,
	Richard Palethorpe, Chris Down

Hello all.

On Mon, May 09, 2022 at 05:44:24PM -0700, Andrew Morton <akpm@linux-foundation.org> wrote:
> So I think we're OK with [2/5] now.  Unless there be objections, I'll
> be looking to get this series into mm-stable later this week.

I'm sorry, I think the current form of the test reveals an unexpected
behavior of reclaim and silencing the test is not the way to go.
Although, I may be convinced that my understanding is wrong.


On Mon, May 09, 2022 at 11:09:15AM -0400, Johannes Weiner <hannes@cmpxchg.org> wrote:
> My understanding of the issue you're raising, Michal, is that
> protected siblings start with current > low, then get reclaimed
> slightly too much and end up with current < low. This results in a
> tiny bit of float that then gets assigned to the low=0 sibling; 

Up until here, we're on the same page.

> when that sibling gets reclaimed regardless, it sees a low event.
> Correct me if I missed a detail or nuance here.

Here, I'd like to stress that the event itself is just a messenger (whom
my original RFC patch attempted to get rid of). The problem is that if
the sibling with recursive protection is active enough to claim it, it's
effectively stolen from the passive sibling. See the comparison of
'precious' vs 'victim' in [1].

> But unused float going to siblings is intentional. This is documented
> in point 3 in the comment above effective_protection(): if you use
> less than you're legitimately claiming, the float goes to your
> siblings.

The problem is how the unused protection came to be (voluntarily not
consumed vs reclaimed).

> So the problem doesn't seem to be with low accounting and
> event generation, but rather it's simply overreclaim.

Exactly.

> It's conceivable to make reclaim more precise and then tighten up the
> test. But right now, David's patch looks correct to me.

The obvious fix is at the end of this message, it resolves the case I
posted earlier (with memory_recursiveprot), however, it "breaks"
memory.events:low accounting inside recursive children, hence I'm not
considering it finished. (I may elaborate on the breaking case if
interested, I also need to look more into that myself).


On Fri, May 06, 2022 at 09:40:15AM -0700, David Vernet <void@manifault.com> wrote:
> If you look at how much memory A/B/E gets at the end of the reclaim,
> it's still far less than 1MB (though should it be 0?).

This selftest has two ±equal workloads in siblings, however, if their
activity varies, it can end up even opposite (the example [1]).

> This definitely sounds to me like a useful testcase to add, and I'm
> happy to do so in a follow-on patch. If we added this, do you think
> we need to keep the check for memory.low events for the memory.low ==
> 0 child in the overcommit testcase?

I think it's still useful, to check the behavior when inherited vs
explicit siblings coexist under protected parent.
Actually, the second case of all siblings having the inherited
(implicit) protection is also interesting (it seems that's that I'm
seeing in my tests with the attached patch).

+Cc: Chris, who reasoned about the SWAP_CLUSTER_MAX rounding vs too high
priority (too low numerically IIUC) [2].

Michal

[1] https://lore.kernel.org/r/20220325103118.GC2828@blackbody.suse.cz/
[2] https://lore.kernel.org/all/20190128214213.GB15349@chrisdown.name/

--- 8< ---
From e18caf7a5a1b0f39185fbdc11e4034def42cde88 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Michal=20Koutn=C3=BD?= <mkoutny@suse.com>
Date: Tue, 10 May 2022 18:48:31 +0200
Subject: [RFC PATCH] mm: memcg: Do not overreclaim SWAP_CLUSTER_MAX from
 protected memcg
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

This was observed with memcontrol selftest/new LTP test but can be also
reproduced in simplified setup of two siblings:

	`parent .low=50M
	  ` s1	.low=50M  .current=50M+ε
	  ` s2  .low=0M   .current=50M

The expectation is that s2/memory.events:low will be zero under outer
reclaimer since no protection should be given to cgroup s2 (even with
memory_recursiveprot).

However, this does not happen. The apparent reason is that when s1 is
considered for (proportional) reclaim the scanned proportion is rounded
up to SWAP_CLUSTER_MAX and slightly over-proportional amount is
reclaimed. Consequently, when the effective low value of s2 is
calculated, it observes unclaimed parent's protection from s1
(ε-SWAP_CLUSTER_MAX in theory) and effectively appropriates it.

What is worse, when the sibling s2 has more (memory) greedy workload, it
can repeatedly "steal" the protection from s1 and the distribution ends
up with s1 mostly reclaimed despite explicit prioritization over s2.

Simply fix it by _not_ rounding up to SWAP_CLUSTER_MAX. This would have
saved us ~5 levels of reclaim priority. I.e. we may be reclaiming from
protected memcgs at relatively low priority _without_ counting any
memory.events:low (due to overreclaim). Now, if the moderated scan is
not enough, we must bring priority to zero to open protected reserves.
And that's correct, we want to be explicit when reclaiming those.


Fixes: 8a931f801340 ("mm: memcontrol: recursive memory.low protection")
Fixes: 9783aa9917f8 ("mm, memcg: proportional memory.{low,min} reclaim")
Reported-by: Richard Palethorpe <rpalethorpe@suse.com>
Link: https://lore.kernel.org/all/20220321101429.3703-1-rpalethorpe@suse.com/
Signed-off-by: Michal Koutný <mkoutny@suse.com>
---
 mm/vmscan.c | 7 -------
 1 file changed, 7 deletions(-)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index 1678802e03e7..cd760842b9ad 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -2798,13 +2798,6 @@ static void get_scan_count(struct lruvec *lruvec, struct scan_control *sc,
 
 			scan = lruvec_size - lruvec_size * protection /
 				(cgroup_size + 1);
-
-			/*
-			 * Minimally target SWAP_CLUSTER_MAX pages to keep
-			 * reclaim moving forwards, avoiding decrementing
-			 * sc->priority further than desirable.
-			 */
-			scan = max(scan, SWAP_CLUSTER_MAX);
 		} else {
 			scan = lruvec_size;
 		}
-- 
2.35.3



^ permalink raw reply related	[flat|nested] 70+ messages in thread

* Re: [PATCH v2 2/5] cgroup: Account for memory_recursiveprot in test_memcg_low()
@ 2022-05-10 17:43                 ` Michal Koutný
  0 siblings, 0 replies; 70+ messages in thread
From: Michal Koutný @ 2022-05-10 17:43 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Johannes Weiner, David Vernet, tj-DgEjT+Ai2ygdnm+yROfE0A,
	roman.gushchin-fxUVXftIFDnyG1zEObXtfA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	linux-mm-Bw31MaZKKs3YtjvyW6yDsg, cgroups-u79uwXL29TY76Z2rM5mHXA,
	mhocko-DgEjT+Ai2ygdnm+yROfE0A, shakeelb-hpIqsD4AKlfQT0dZR+AlfA,
	kernel-team-b10kYP2dOMg, Richard Palethorpe, Chris Down

Hello all.

On Mon, May 09, 2022 at 05:44:24PM -0700, Andrew Morton <akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org> wrote:
> So I think we're OK with [2/5] now.  Unless there be objections, I'll
> be looking to get this series into mm-stable later this week.

I'm sorry, I think the current form of the test reveals an unexpected
behavior of reclaim and silencing the test is not the way to go.
Although, I may be convinced that my understanding is wrong.


On Mon, May 09, 2022 at 11:09:15AM -0400, Johannes Weiner <hannes-druUgvl0LCNAfugRpC6u6w@public.gmane.org> wrote:
> My understanding of the issue you're raising, Michal, is that
> protected siblings start with current > low, then get reclaimed
> slightly too much and end up with current < low. This results in a
> tiny bit of float that then gets assigned to the low=0 sibling; 

Up until here, we're on the same page.

> when that sibling gets reclaimed regardless, it sees a low event.
> Correct me if I missed a detail or nuance here.

Here, I'd like to stress that the event itself is just a messenger (whom
my original RFC patch attempted to get rid of). The problem is that if
the sibling with recursive protection is active enough to claim it, it's
effectively stolen from the passive sibling. See the comparison of
'precious' vs 'victim' in [1].

> But unused float going to siblings is intentional. This is documented
> in point 3 in the comment above effective_protection(): if you use
> less than you're legitimately claiming, the float goes to your
> siblings.

The problem is how the unused protection came to be (voluntarily not
consumed vs reclaimed).

> So the problem doesn't seem to be with low accounting and
> event generation, but rather it's simply overreclaim.

Exactly.

> It's conceivable to make reclaim more precise and then tighten up the
> test. But right now, David's patch looks correct to me.

The obvious fix is at the end of this message, it resolves the case I
posted earlier (with memory_recursiveprot), however, it "breaks"
memory.events:low accounting inside recursive children, hence I'm not
considering it finished. (I may elaborate on the breaking case if
interested, I also need to look more into that myself).


On Fri, May 06, 2022 at 09:40:15AM -0700, David Vernet <void-gq6j2QGBifHby3iVrkZq2A@public.gmane.org> wrote:
> If you look at how much memory A/B/E gets at the end of the reclaim,
> it's still far less than 1MB (though should it be 0?).

This selftest has two ±equal workloads in siblings, however, if their
activity varies, it can end up even opposite (the example [1]).

> This definitely sounds to me like a useful testcase to add, and I'm
> happy to do so in a follow-on patch. If we added this, do you think
> we need to keep the check for memory.low events for the memory.low ==
> 0 child in the overcommit testcase?

I think it's still useful, to check the behavior when inherited vs
explicit siblings coexist under protected parent.
Actually, the second case of all siblings having the inherited
(implicit) protection is also interesting (it seems that's that I'm
seeing in my tests with the attached patch).

+Cc: Chris, who reasoned about the SWAP_CLUSTER_MAX rounding vs too high
priority (too low numerically IIUC) [2].

Michal

[1] https://lore.kernel.org/r/20220325103118.GC2828-9OudH3eul5jcvrawFnH+a6VXKuFTiq87@public.gmane.org/
[2] https://lore.kernel.org/all/20190128214213.GB15349-6Bi1550iOqEnzZ6mRAm98g@public.gmane.org/

--- 8< ---
From e18caf7a5a1b0f39185fbdc11e4034def42cde88 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Michal=20Koutn=C3=BD?= <mkoutny-IBi9RG/b67k@public.gmane.org>
Date: Tue, 10 May 2022 18:48:31 +0200
Subject: [RFC PATCH] mm: memcg: Do not overreclaim SWAP_CLUSTER_MAX from
 protected memcg
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

This was observed with memcontrol selftest/new LTP test but can be also
reproduced in simplified setup of two siblings:

	`parent .low=50M
	  ` s1	.low=50M  .current=50M+ε
	  ` s2  .low=0M   .current=50M

The expectation is that s2/memory.events:low will be zero under outer
reclaimer since no protection should be given to cgroup s2 (even with
memory_recursiveprot).

However, this does not happen. The apparent reason is that when s1 is
considered for (proportional) reclaim the scanned proportion is rounded
up to SWAP_CLUSTER_MAX and slightly over-proportional amount is
reclaimed. Consequently, when the effective low value of s2 is
calculated, it observes unclaimed parent's protection from s1
(ε-SWAP_CLUSTER_MAX in theory) and effectively appropriates it.

What is worse, when the sibling s2 has more (memory) greedy workload, it
can repeatedly "steal" the protection from s1 and the distribution ends
up with s1 mostly reclaimed despite explicit prioritization over s2.

Simply fix it by _not_ rounding up to SWAP_CLUSTER_MAX. This would have
saved us ~5 levels of reclaim priority. I.e. we may be reclaiming from
protected memcgs at relatively low priority _without_ counting any
memory.events:low (due to overreclaim). Now, if the moderated scan is
not enough, we must bring priority to zero to open protected reserves.
And that's correct, we want to be explicit when reclaiming those.


Fixes: 8a931f801340 ("mm: memcontrol: recursive memory.low protection")
Fixes: 9783aa9917f8 ("mm, memcg: proportional memory.{low,min} reclaim")
Reported-by: Richard Palethorpe <rpalethorpe-IBi9RG/b67k@public.gmane.org>
Link: https://lore.kernel.org/all/20220321101429.3703-1-rpalethorpe-IBi9RG/b67k@public.gmane.org/
Signed-off-by: Michal Koutný <mkoutny-IBi9RG/b67k@public.gmane.org>
---
 mm/vmscan.c | 7 -------
 1 file changed, 7 deletions(-)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index 1678802e03e7..cd760842b9ad 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -2798,13 +2798,6 @@ static void get_scan_count(struct lruvec *lruvec, struct scan_control *sc,
 
 			scan = lruvec_size - lruvec_size * protection /
 				(cgroup_size + 1);
-
-			/*
-			 * Minimally target SWAP_CLUSTER_MAX pages to keep
-			 * reclaim moving forwards, avoiding decrementing
-			 * sc->priority further than desirable.
-			 */
-			scan = max(scan, SWAP_CLUSTER_MAX);
 		} else {
 			scan = lruvec_size;
 		}
-- 
2.35.3



^ permalink raw reply related	[flat|nested] 70+ messages in thread

* Re: [PATCH v2 2/5] cgroup: Account for memory_recursiveprot in test_memcg_low()
@ 2022-05-11 17:53                   ` Johannes Weiner
  0 siblings, 0 replies; 70+ messages in thread
From: Johannes Weiner @ 2022-05-11 17:53 UTC (permalink / raw)
  To: Michal Koutný
  Cc: Andrew Morton, David Vernet, tj, roman.gushchin, linux-kernel,
	linux-mm, cgroups, mhocko, shakeelb, kernel-team,
	Richard Palethorpe, Chris Down

Hi Michal,

On Tue, May 10, 2022 at 07:43:41PM +0200, Michal Koutný wrote:
> On Mon, May 09, 2022 at 05:44:24PM -0700, Andrew Morton <akpm@linux-foundation.org> wrote:
> > So I think we're OK with [2/5] now.  Unless there be objections, I'll
> > be looking to get this series into mm-stable later this week.
> 
> I'm sorry, I think the current form of the test reveals an unexpected
> behavior of reclaim and silencing the test is not the way to go.
> Although, I may be convinced that my understanding is wrong.

Looking through your demo results again, I agree with you. It's a tiny
error, but it compounds and systematically robs the protected group
over and over, to the point where its protection becomes worthless -
at least in idle groups, which isn't super common but does happen.

Let's keep the test as-is and fix reclaim to make it pass ;)

> The obvious fix is at the end of this message, it resolves the case I
> posted earlier (with memory_recursiveprot), however, it "breaks"
> memory.events:low accounting inside recursive children, hence I'm not
> considering it finished. (I may elaborate on the breaking case if
> interested, I also need to look more into that myself).

Can you indeed elaborate on the problem you see with low events?

> @@ -2798,13 +2798,6 @@ static void get_scan_count(struct lruvec *lruvec, struct scan_control *sc,
>  
>  			scan = lruvec_size - lruvec_size * protection /
>  				(cgroup_size + 1);
> -
> -			/*
> -			 * Minimally target SWAP_CLUSTER_MAX pages to keep
> -			 * reclaim moving forwards, avoiding decrementing
> -			 * sc->priority further than desirable.
> -			 */
> -			scan = max(scan, SWAP_CLUSTER_MAX);

IIRC this was added due to premature OOMs in synthetic testing (Chris
may remember more details).

However, in practice it wasn't enough anyway, and was followed up by
f56ce412a59d ("mm: memcontrol: fix occasional OOMs due to proportional
memory.low reclaim"). Now, reclaim retries the whole cycle if
proportional protection was in place and it didn't manage to make
progress. The rounding for progress doesn't seem to matter anymore.

So your proposed patch looks like the right thing to do to me. And I
would ack it, but please do explain your concerns around low event
reporting after it.

Thanks!

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v2 2/5] cgroup: Account for memory_recursiveprot in test_memcg_low()
@ 2022-05-11 17:53                   ` Johannes Weiner
  0 siblings, 0 replies; 70+ messages in thread
From: Johannes Weiner @ 2022-05-11 17:53 UTC (permalink / raw)
  To: Michal Koutný
  Cc: Andrew Morton, David Vernet, tj-DgEjT+Ai2ygdnm+yROfE0A,
	roman.gushchin-fxUVXftIFDnyG1zEObXtfA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	linux-mm-Bw31MaZKKs3YtjvyW6yDsg, cgroups-u79uwXL29TY76Z2rM5mHXA,
	mhocko-DgEjT+Ai2ygdnm+yROfE0A, shakeelb-hpIqsD4AKlfQT0dZR+AlfA,
	kernel-team-b10kYP2dOMg, Richard Palethorpe, Chris Down

Hi Michal,

On Tue, May 10, 2022 at 07:43:41PM +0200, Michal Koutný wrote:
> On Mon, May 09, 2022 at 05:44:24PM -0700, Andrew Morton <akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org> wrote:
> > So I think we're OK with [2/5] now.  Unless there be objections, I'll
> > be looking to get this series into mm-stable later this week.
> 
> I'm sorry, I think the current form of the test reveals an unexpected
> behavior of reclaim and silencing the test is not the way to go.
> Although, I may be convinced that my understanding is wrong.

Looking through your demo results again, I agree with you. It's a tiny
error, but it compounds and systematically robs the protected group
over and over, to the point where its protection becomes worthless -
at least in idle groups, which isn't super common but does happen.

Let's keep the test as-is and fix reclaim to make it pass ;)

> The obvious fix is at the end of this message, it resolves the case I
> posted earlier (with memory_recursiveprot), however, it "breaks"
> memory.events:low accounting inside recursive children, hence I'm not
> considering it finished. (I may elaborate on the breaking case if
> interested, I also need to look more into that myself).

Can you indeed elaborate on the problem you see with low events?

> @@ -2798,13 +2798,6 @@ static void get_scan_count(struct lruvec *lruvec, struct scan_control *sc,
>  
>  			scan = lruvec_size - lruvec_size * protection /
>  				(cgroup_size + 1);
> -
> -			/*
> -			 * Minimally target SWAP_CLUSTER_MAX pages to keep
> -			 * reclaim moving forwards, avoiding decrementing
> -			 * sc->priority further than desirable.
> -			 */
> -			scan = max(scan, SWAP_CLUSTER_MAX);

IIRC this was added due to premature OOMs in synthetic testing (Chris
may remember more details).

However, in practice it wasn't enough anyway, and was followed up by
f56ce412a59d ("mm: memcontrol: fix occasional OOMs due to proportional
memory.low reclaim"). Now, reclaim retries the whole cycle if
proportional protection was in place and it didn't manage to make
progress. The rounding for progress doesn't seem to matter anymore.

So your proposed patch looks like the right thing to do to me. And I
would ack it, but please do explain your concerns around low event
reporting after it.

Thanks!

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v2 0/5] Fix bugs in memcontroller cgroup tests
@ 2022-05-12 17:04   ` Michal Koutný
  0 siblings, 0 replies; 70+ messages in thread
From: Michal Koutný @ 2022-05-12 17:04 UTC (permalink / raw)
  To: David Vernet
  Cc: akpm, tj, roman.gushchin, linux-kernel, linux-mm, cgroups,
	hannes, mhocko, shakeelb, kernel-team

Hello.

On Sat, Apr 23, 2022 at 08:56:15AM -0700, David Vernet <void@manifault.com> wrote:
> tools/testing/selftests/cgroup/test_memcontrol.c contains a set of
> testcases which validate expected behavior of the cgroup memory controller.
> Roman Gushchin recently sent out a patchset that fixed a few issues in the
> test.

Link here
https://lore.kernel.org/r/20220415000133.3955987-1-roman.gushchin@linux.dev/.

> This patchset continues that effort by fixing a few more issues that
> were causing non-deterministic failures in the suite.

Are the Roman's patches merged anywhere? (I ran into some issues when I
was rebasing your (David's) series on top of master.) I'd like to put
all sensible patches in one series or stack on existing branch (if
there's any).

For possible v3 of this series, I did:
  - dropped the patch that allows non-zero memory.events:low for a sibling with
    memory.low=0 when mounted with memory_recursiveprot (the case needs more
    discussion),
  - added few more cleanups, convenience for debugging,
  - fixed comments in the first patch.

Pushed here [1] before properly sending the v3 for discussion.

Thanks,
Michal

[1] https://github.com/Werkov/linux/commits/cgroup-ml/memory.low-overreclaim-var2


^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v2 0/5] Fix bugs in memcontroller cgroup tests
@ 2022-05-12 17:04   ` Michal Koutný
  0 siblings, 0 replies; 70+ messages in thread
From: Michal Koutný @ 2022-05-12 17:04 UTC (permalink / raw)
  To: David Vernet
  Cc: akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b, tj-DgEjT+Ai2ygdnm+yROfE0A,
	roman.gushchin-fxUVXftIFDnyG1zEObXtfA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	linux-mm-Bw31MaZKKs3YtjvyW6yDsg, cgroups-u79uwXL29TY76Z2rM5mHXA,
	hannes-druUgvl0LCNAfugRpC6u6w, mhocko-DgEjT+Ai2ygdnm+yROfE0A,
	shakeelb-hpIqsD4AKlfQT0dZR+AlfA, kernel-team-b10kYP2dOMg

Hello.

On Sat, Apr 23, 2022 at 08:56:15AM -0700, David Vernet <void-gq6j2QGBifHby3iVrkZq2A@public.gmane.org> wrote:
> tools/testing/selftests/cgroup/test_memcontrol.c contains a set of
> testcases which validate expected behavior of the cgroup memory controller.
> Roman Gushchin recently sent out a patchset that fixed a few issues in the
> test.

Link here
https://lore.kernel.org/r/20220415000133.3955987-1-roman.gushchin-fxUVXftIFDnyG1zEObXtfA@public.gmane.org/.

> This patchset continues that effort by fixing a few more issues that
> were causing non-deterministic failures in the suite.

Are the Roman's patches merged anywhere? (I ran into some issues when I
was rebasing your (David's) series on top of master.) I'd like to put
all sensible patches in one series or stack on existing branch (if
there's any).

For possible v3 of this series, I did:
  - dropped the patch that allows non-zero memory.events:low for a sibling with
    memory.low=0 when mounted with memory_recursiveprot (the case needs more
    discussion),
  - added few more cleanups, convenience for debugging,
  - fixed comments in the first patch.

Pushed here [1] before properly sending the v3 for discussion.

Thanks,
Michal

[1] https://github.com/Werkov/linux/commits/cgroup-ml/memory.low-overreclaim-var2


^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v2 2/5] cgroup: Account for memory_recursiveprot in test_memcg_low()
@ 2022-05-12 17:27                     ` Michal Koutný
  0 siblings, 0 replies; 70+ messages in thread
From: Michal Koutný @ 2022-05-12 17:27 UTC (permalink / raw)
  To: Johannes Weiner
  Cc: Andrew Morton, David Vernet, tj, roman.gushchin, linux-kernel,
	linux-mm, cgroups, mhocko, shakeelb, kernel-team,
	Richard Palethorpe, Chris Down

On Wed, May 11, 2022 at 01:53:05PM -0400, Johannes Weiner <hannes@cmpxchg.org> wrote:
> Can you indeed elaborate on the problem you see with low events?

My mistake. I realized I was testing on a system without
memory_recursiveprot enabled. Therefore I saw no events in children with
memory.low=0.

However, it also means that my previous evaluation of the "simple" fix
(dropping the SWAP_CLUSTER_MAX rounding) was incorrect and it actually
doesn't resolve the problem of two differently active siblings I posted
earlier.

> So your proposed patch looks like the right thing to do to me. And I
> would ack it, but please do explain your concerns around low event
> reporting after it.

I retract it (at least for now), it doesn't really help. It can be seen
(after application) [1] that once (low) protected memory is opened for
reclaim, the sibling proportions change suddenly (neither sibling is
protected during sc->memcg_low_reclaim, however, the formerly protected
suddenly provides good supply of reclaimable pages).

OTOH, without memory_recursiveprot [2], the elow growth of the victim
sibling is absent and situation stabilizes with only partial reclaim
from the (explicitly) protected sibling. 

In both variants (recursive/non-recursive) the parent ends up with same
amount of unreclaimed memory, however, the gradual tranfer of elow with
the recursive protection is undesired. (I'm only thinking how to solve
it simply.)

Michal

[1] https://bugzilla.suse.com/attachment.cgi?id=858869
[2] https://bugzilla.suse.com/attachment.cgi?id=858870


^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v2 2/5] cgroup: Account for memory_recursiveprot in test_memcg_low()
@ 2022-05-12 17:27                     ` Michal Koutný
  0 siblings, 0 replies; 70+ messages in thread
From: Michal Koutný @ 2022-05-12 17:27 UTC (permalink / raw)
  To: Johannes Weiner
  Cc: Andrew Morton, David Vernet, tj-DgEjT+Ai2ygdnm+yROfE0A,
	roman.gushchin-fxUVXftIFDnyG1zEObXtfA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	linux-mm-Bw31MaZKKs3YtjvyW6yDsg, cgroups-u79uwXL29TY76Z2rM5mHXA,
	mhocko-DgEjT+Ai2ygdnm+yROfE0A, shakeelb-hpIqsD4AKlfQT0dZR+AlfA,
	kernel-team-b10kYP2dOMg, Richard Palethorpe, Chris Down

On Wed, May 11, 2022 at 01:53:05PM -0400, Johannes Weiner <hannes-druUgvl0LCNAfugRpC6u6w@public.gmane.org> wrote:
> Can you indeed elaborate on the problem you see with low events?

My mistake. I realized I was testing on a system without
memory_recursiveprot enabled. Therefore I saw no events in children with
memory.low=0.

However, it also means that my previous evaluation of the "simple" fix
(dropping the SWAP_CLUSTER_MAX rounding) was incorrect and it actually
doesn't resolve the problem of two differently active siblings I posted
earlier.

> So your proposed patch looks like the right thing to do to me. And I
> would ack it, but please do explain your concerns around low event
> reporting after it.

I retract it (at least for now), it doesn't really help. It can be seen
(after application) [1] that once (low) protected memory is opened for
reclaim, the sibling proportions change suddenly (neither sibling is
protected during sc->memcg_low_reclaim, however, the formerly protected
suddenly provides good supply of reclaimable pages).

OTOH, without memory_recursiveprot [2], the elow growth of the victim
sibling is absent and situation stabilizes with only partial reclaim
from the (explicitly) protected sibling. 

In both variants (recursive/non-recursive) the parent ends up with same
amount of unreclaimed memory, however, the gradual tranfer of elow with
the recursive protection is undesired. (I'm only thinking how to solve
it simply.)

Michal

[1] https://bugzilla.suse.com/attachment.cgi?id=858869
[2] https://bugzilla.suse.com/attachment.cgi?id=858870


^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v2 0/5] Fix bugs in memcontroller cgroup tests
@ 2022-05-12 17:30     ` David Vernet
  0 siblings, 0 replies; 70+ messages in thread
From: David Vernet @ 2022-05-12 17:30 UTC (permalink / raw)
  To: Michal Koutný
  Cc: akpm, tj, roman.gushchin, linux-kernel, linux-mm, cgroups,
	hannes, mhocko, shakeelb, kernel-team

Hi Michal,

On Thu, May 12, 2022 at 07:04:10PM +0200, Michal Koutný wrote:
> Are the Roman's patches merged anywhere? (I ran into some issues when I
> was rebasing your (David's) series on top of master.) I'd like to put
> all sensible patches in one series or stack on existing branch (if
> there's any).

Roman's patches are present on master on the linux-mm tree. See
b7dbfd6553d..a131b1ed12c6.

> For possible v3 of this series, I did:
>   - dropped the patch that allows non-zero memory.events:low for a sibling with
>     memory.low=0 when mounted with memory_recursiveprot (the case needs more
>     discussion),

Ack, and thanks for keeping us steered in the right direction here. I don't
see this in the patch set you linked, but I agree this commit should be
reverted and the reclaim logic instead fixed.

>   - added few more cleanups, convenience for debugging,

Are you referring to the FAIL() macro you added? I would love to Ack that,
but unfortunately checkpatch.pl will probably yell at you for having a goto
in that macro, per the point about avoiding macros that affect control flow
[0].

I tried to do the same thing when sending out my patch set and had to
revert it before sending it to upstream.

Thanks,
David

[0] https://github.com/Werkov/linux/commit/a076339cc4825af2f22f58c1347a572b104b8221

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v2 0/5] Fix bugs in memcontroller cgroup tests
@ 2022-05-12 17:30     ` David Vernet
  0 siblings, 0 replies; 70+ messages in thread
From: David Vernet @ 2022-05-12 17:30 UTC (permalink / raw)
  To: Michal Koutný
  Cc: akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b, tj-DgEjT+Ai2ygdnm+yROfE0A,
	roman.gushchin-fxUVXftIFDnyG1zEObXtfA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	linux-mm-Bw31MaZKKs3YtjvyW6yDsg, cgroups-u79uwXL29TY76Z2rM5mHXA,
	hannes-druUgvl0LCNAfugRpC6u6w, mhocko-DgEjT+Ai2ygdnm+yROfE0A,
	shakeelb-hpIqsD4AKlfQT0dZR+AlfA, kernel-team-b10kYP2dOMg

Hi Michal,

On Thu, May 12, 2022 at 07:04:10PM +0200, Michal Koutný wrote:
> Are the Roman's patches merged anywhere? (I ran into some issues when I
> was rebasing your (David's) series on top of master.) I'd like to put
> all sensible patches in one series or stack on existing branch (if
> there's any).

Roman's patches are present on master on the linux-mm tree. See
b7dbfd6553d..a131b1ed12c6.

> For possible v3 of this series, I did:
>   - dropped the patch that allows non-zero memory.events:low for a sibling with
>     memory.low=0 when mounted with memory_recursiveprot (the case needs more
>     discussion),

Ack, and thanks for keeping us steered in the right direction here. I don't
see this in the patch set you linked, but I agree this commit should be
reverted and the reclaim logic instead fixed.

>   - added few more cleanups, convenience for debugging,

Are you referring to the FAIL() macro you added? I would love to Ack that,
but unfortunately checkpatch.pl will probably yell at you for having a goto
in that macro, per the point about avoiding macros that affect control flow
[0].

I tried to do the same thing when sending out my patch set and had to
revert it before sending it to upstream.

Thanks,
David

[0] https://github.com/Werkov/linux/commit/a076339cc4825af2f22f58c1347a572b104b8221

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v2 0/5] Fix bugs in memcontroller cgroup tests
@ 2022-05-12 17:44       ` David Vernet
  0 siblings, 0 replies; 70+ messages in thread
From: David Vernet @ 2022-05-12 17:44 UTC (permalink / raw)
  To: Michal Koutný
  Cc: akpm, tj, roman.gushchin, linux-kernel, linux-mm, cgroups,
	hannes, mhocko, shakeelb, kernel-team

On Thu, May 12, 2022 at 10:30:18AM -0700, David Vernet wrote:
> Hi Michal,
> 
> On Thu, May 12, 2022 at 07:04:10PM +0200, Michal Koutný wrote:
> > Are the Roman's patches merged anywhere? (I ran into some issues when I
> > was rebasing your (David's) series on top of master.) I'd like to put
> > all sensible patches in one series or stack on existing branch (if
> > there's any).
> 
> Roman's patches are present on master on the linux-mm tree. See
> b7dbfd6553d..a131b1ed12c6.
> 
> > For possible v3 of this series, I did:
> >   - dropped the patch that allows non-zero memory.events:low for a sibling with
> >     memory.low=0 when mounted with memory_recursiveprot (the case needs more
> >     discussion),
> 
> Ack, and thanks for keeping us steered in the right direction here. I don't
> see this in the patch set you linked, but I agree this commit should be
> reverted and the reclaim logic instead fixed.
> 
> >   - added few more cleanups, convenience for debugging,
> 
> Are you referring to the FAIL() macro you added? I would love to Ack that,
> but unfortunately checkpatch.pl will probably yell at you for having a goto
> in that macro, per the point about avoiding macros that affect control flow
> [0].
> 
> I tried to do the same thing when sending out my patch set and had to
> revert it before sending it to upstream.
> 
> Thanks,
> David
> 
> [0] https://github.com/Werkov/linux/commit/a076339cc4825af2f22f58c1347a572b104b8221

Sorry, I meant to link this:

https://www.kernel.org/doc/html/latest/process/coding-style.html#macros-enums-and-rtl

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH v2 0/5] Fix bugs in memcontroller cgroup tests
@ 2022-05-12 17:44       ` David Vernet
  0 siblings, 0 replies; 70+ messages in thread
From: David Vernet @ 2022-05-12 17:44 UTC (permalink / raw)
  To: Michal Koutný
  Cc: akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b, tj-DgEjT+Ai2ygdnm+yROfE0A,
	roman.gushchin-fxUVXftIFDnyG1zEObXtfA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	linux-mm-Bw31MaZKKs3YtjvyW6yDsg, cgroups-u79uwXL29TY76Z2rM5mHXA,
	hannes-druUgvl0LCNAfugRpC6u6w, mhocko-DgEjT+Ai2ygdnm+yROfE0A,
	shakeelb-hpIqsD4AKlfQT0dZR+AlfA, kernel-team-b10kYP2dOMg

On Thu, May 12, 2022 at 10:30:18AM -0700, David Vernet wrote:
> Hi Michal,
> 
> On Thu, May 12, 2022 at 07:04:10PM +0200, Michal Koutný wrote:
> > Are the Roman's patches merged anywhere? (I ran into some issues when I
> > was rebasing your (David's) series on top of master.) I'd like to put
> > all sensible patches in one series or stack on existing branch (if
> > there's any).
> 
> Roman's patches are present on master on the linux-mm tree. See
> b7dbfd6553d..a131b1ed12c6.
> 
> > For possible v3 of this series, I did:
> >   - dropped the patch that allows non-zero memory.events:low for a sibling with
> >     memory.low=0 when mounted with memory_recursiveprot (the case needs more
> >     discussion),
> 
> Ack, and thanks for keeping us steered in the right direction here. I don't
> see this in the patch set you linked, but I agree this commit should be
> reverted and the reclaim logic instead fixed.
> 
> >   - added few more cleanups, convenience for debugging,
> 
> Are you referring to the FAIL() macro you added? I would love to Ack that,
> but unfortunately checkpatch.pl will probably yell at you for having a goto
> in that macro, per the point about avoiding macros that affect control flow
> [0].
> 
> I tried to do the same thing when sending out my patch set and had to
> revert it before sending it to upstream.
> 
> Thanks,
> David
> 
> [0] https://github.com/Werkov/linux/commit/a076339cc4825af2f22f58c1347a572b104b8221

Sorry, I meant to link this:

https://www.kernel.org/doc/html/latest/process/coding-style.html#macros-enums-and-rtl

^ permalink raw reply	[flat|nested] 70+ messages in thread

* [PATCH 0/4] memcontrol selftests fixups
@ 2022-05-13 17:18         ` Michal Koutný
  0 siblings, 0 replies; 70+ messages in thread
From: Michal Koutný @ 2022-05-13 17:18 UTC (permalink / raw)
  To: void
  Cc: akpm, cgroups, hannes, kernel-team, linux-kernel, linux-mm,
	mhocko, mkoutny, roman.gushchin, shakeelb, tj,
	Richard Palethorpe

Hello.

I'm just flushing the simple patches to make memcontrol selftests check the
events behavior we had consensus about (test_memcg_low fails). (I've dropped to
goto macros for now.)

(test_memcg_reclaim, test_memcg_swap_max fail for me now but it's present
even before the refactoring.)

The only bigger change is adjustment of the protected values to make tests
succeed with the given tolerance.

It's based on mm-stable [1] commit e240ac52f7da. AFAIC, the fixup and partial
reverts may be folded into respective commits.
Let me know if it should be (re)based on something else.

Thanks,
Michal

[1] https://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm.git/log/tools/testing/selftests/cgroup?h=mm-stable

Michal Koutný (4):
  selftests: memcg: Fix compilation
  selftests: memcg: Expect no low events in unprotected sibling
  selftests: memcg: Adjust expected reclaim values of protected cgroups
  selftests: memcg: Remove protection from top level memcg

 .../selftests/cgroup/test_memcontrol.c        | 59 +++++++++----------
 1 file changed, 29 insertions(+), 30 deletions(-)

-- 
2.35.3


^ permalink raw reply	[flat|nested] 70+ messages in thread

* [PATCH 0/4] memcontrol selftests fixups
@ 2022-05-13 17:18         ` Michal Koutný
  0 siblings, 0 replies; 70+ messages in thread
From: Michal Koutný @ 2022-05-13 17:18 UTC (permalink / raw)
  To: void-gq6j2QGBifHby3iVrkZq2A
  Cc: akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b,
	cgroups-u79uwXL29TY76Z2rM5mHXA, hannes-druUgvl0LCNAfugRpC6u6w,
	kernel-team-b10kYP2dOMg, linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	linux-mm-Bw31MaZKKs3YtjvyW6yDsg, mhocko-DgEjT+Ai2ygdnm+yROfE0A,
	mkoutny-IBi9RG/b67k, roman.gushchin-fxUVXftIFDnyG1zEObXtfA,
	shakeelb-hpIqsD4AKlfQT0dZR+AlfA, tj-DgEjT+Ai2ygdnm+yROfE0A,
	Richard Palethorpe

Hello.

I'm just flushing the simple patches to make memcontrol selftests check the
events behavior we had consensus about (test_memcg_low fails). (I've dropped to
goto macros for now.)

(test_memcg_reclaim, test_memcg_swap_max fail for me now but it's present
even before the refactoring.)

The only bigger change is adjustment of the protected values to make tests
succeed with the given tolerance.

It's based on mm-stable [1] commit e240ac52f7da. AFAIC, the fixup and partial
reverts may be folded into respective commits.
Let me know if it should be (re)based on something else.

Thanks,
Michal

[1] https://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm.git/log/tools/testing/selftests/cgroup?h=mm-stable

Michal Koutn√Ω (4):
  selftests: memcg: Fix compilation
  selftests: memcg: Expect no low events in unprotected sibling
  selftests: memcg: Adjust expected reclaim values of protected cgroups
  selftests: memcg: Remove protection from top level memcg

 .../selftests/cgroup/test_memcontrol.c        | 59 +++++++++----------
 1 file changed, 29 insertions(+), 30 deletions(-)

-- 
2.35.3


^ permalink raw reply	[flat|nested] 70+ messages in thread

* [PATCH 1/4] selftests: memcg: Fix compilation
@ 2022-05-13 17:18           ` Michal Koutný
  0 siblings, 0 replies; 70+ messages in thread
From: Michal Koutný @ 2022-05-13 17:18 UTC (permalink / raw)
  To: void
  Cc: akpm, cgroups, hannes, kernel-team, linux-kernel, linux-mm,
	mhocko, mkoutny, roman.gushchin, shakeelb, tj,
	Richard Palethorpe

This fixes mis-applied changes from commit 72b1e03aa725 ("cgroup:
account for memory_localevents in test_memcg_oom_group_leaf_events()").

Signed-off-by: Michal Koutný <mkoutny@suse.com>
---
 .../selftests/cgroup/test_memcontrol.c        | 25 +++++++++++--------
 1 file changed, 14 insertions(+), 11 deletions(-)

diff --git a/tools/testing/selftests/cgroup/test_memcontrol.c b/tools/testing/selftests/cgroup/test_memcontrol.c
index 6ab94317c87b..4958b42201a9 100644
--- a/tools/testing/selftests/cgroup/test_memcontrol.c
+++ b/tools/testing/selftests/cgroup/test_memcontrol.c
@@ -1241,7 +1241,16 @@ static int test_memcg_oom_group_leaf_events(const char *root)
 	if (cg_read_key_long(child, "memory.events", "oom_kill ") <= 0)
 		goto cleanup;
 
-	if (cg_read_key_long(parent, "memory.events", "oom_kill ") <= 0)
+	parent_oom_events = cg_read_key_long(
+			parent, "memory.events", "oom_kill ");
+	/*
+	 * If memory_localevents is not enabled (the default), the parent should
+	 * count OOM events in its children groups. Otherwise, it should not
+	 * have observed any events.
+	 */
+	if (has_localevents && parent_oom_events != 0)
+		goto cleanup;
+	else if (!has_localevents && parent_oom_events <= 0)
 		goto cleanup;
 
 	ret = KSFT_PASS;
@@ -1349,20 +1358,14 @@ static int test_memcg_oom_group_score_events(const char *root)
 	if (!cg_run(memcg, alloc_anon, (void *)MB(100)))
 		goto cleanup;
 
-	parent_oom_events = cg_read_key_long(
-			parent, "memory.events", "oom_kill ");
-	/*
-	 * If memory_localevents is not enabled (the default), the parent should
-	 * count OOM events in its children groups. Otherwise, it should not
-	 * have observed any events.
-	 */
-	if ((has_localevents && parent_oom_events == 0) ||
-	     parent_oom_events > 0)
-		ret = KSFT_PASS;
+	if (cg_read_key_long(memcg, "memory.events", "oom_kill ") != 3)
+		FAIL(cleanup);
 
 	if (kill(safe_pid, SIGKILL))
 		goto cleanup;
 
+	ret = KSFT_PASS;
+
 cleanup:
 	if (memcg)
 		cg_destroy(memcg);
-- 
2.35.3


^ permalink raw reply related	[flat|nested] 70+ messages in thread

* [PATCH 1/4] selftests: memcg: Fix compilation
@ 2022-05-13 17:18           ` Michal Koutný
  0 siblings, 0 replies; 70+ messages in thread
From: Michal Koutný @ 2022-05-13 17:18 UTC (permalink / raw)
  To: void-gq6j2QGBifHby3iVrkZq2A
  Cc: akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b,
	cgroups-u79uwXL29TY76Z2rM5mHXA, hannes-druUgvl0LCNAfugRpC6u6w,
	kernel-team-b10kYP2dOMg, linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	linux-mm-Bw31MaZKKs3YtjvyW6yDsg, mhocko-DgEjT+Ai2ygdnm+yROfE0A,
	mkoutny-IBi9RG/b67k, roman.gushchin-fxUVXftIFDnyG1zEObXtfA,
	shakeelb-hpIqsD4AKlfQT0dZR+AlfA, tj-DgEjT+Ai2ygdnm+yROfE0A,
	Richard Palethorpe

This fixes mis-applied changes from commit 72b1e03aa725 ("cgroup:
account for memory_localevents in test_memcg_oom_group_leaf_events()").

Signed-off-by: Michal Koutn√Ω <mkoutny-IBi9RG/b67k@public.gmane.org>
---
 .../selftests/cgroup/test_memcontrol.c        | 25 +++++++++++--------
 1 file changed, 14 insertions(+), 11 deletions(-)

diff --git a/tools/testing/selftests/cgroup/test_memcontrol.c b/tools/testing/selftests/cgroup/test_memcontrol.c
index 6ab94317c87b..4958b42201a9 100644
--- a/tools/testing/selftests/cgroup/test_memcontrol.c
+++ b/tools/testing/selftests/cgroup/test_memcontrol.c
@@ -1241,7 +1241,16 @@ static int test_memcg_oom_group_leaf_events(const char *root)
 	if (cg_read_key_long(child, "memory.events", "oom_kill ") <= 0)
 		goto cleanup;
 
-	if (cg_read_key_long(parent, "memory.events", "oom_kill ") <= 0)
+	parent_oom_events = cg_read_key_long(
+			parent, "memory.events", "oom_kill ");
+	/*
+	 * If memory_localevents is not enabled (the default), the parent should
+	 * count OOM events in its children groups. Otherwise, it should not
+	 * have observed any events.
+	 */
+	if (has_localevents && parent_oom_events != 0)
+		goto cleanup;
+	else if (!has_localevents && parent_oom_events <= 0)
 		goto cleanup;
 
 	ret = KSFT_PASS;
@@ -1349,20 +1358,14 @@ static int test_memcg_oom_group_score_events(const char *root)
 	if (!cg_run(memcg, alloc_anon, (void *)MB(100)))
 		goto cleanup;
 
-	parent_oom_events = cg_read_key_long(
-			parent, "memory.events", "oom_kill ");
-	/*
-	 * If memory_localevents is not enabled (the default), the parent should
-	 * count OOM events in its children groups. Otherwise, it should not
-	 * have observed any events.
-	 */
-	if ((has_localevents && parent_oom_events == 0) ||
-	     parent_oom_events > 0)
-		ret = KSFT_PASS;
+	if (cg_read_key_long(memcg, "memory.events", "oom_kill ") != 3)
+		FAIL(cleanup);
 
 	if (kill(safe_pid, SIGKILL))
 		goto cleanup;
 
+	ret = KSFT_PASS;
+
 cleanup:
 	if (memcg)
 		cg_destroy(memcg);
-- 
2.35.3


^ permalink raw reply related	[flat|nested] 70+ messages in thread

* [PATCH 2/4] selftests: memcg: Expect no low events in unprotected sibling
@ 2022-05-13 17:18           ` Michal Koutný
  0 siblings, 0 replies; 70+ messages in thread
From: Michal Koutný @ 2022-05-13 17:18 UTC (permalink / raw)
  To: void
  Cc: akpm, cgroups, hannes, kernel-team, linux-kernel, linux-mm,
	mhocko, mkoutny, roman.gushchin, shakeelb, tj,
	Richard Palethorpe

This is effectively a revert of commit cdc69458a5f3 ("cgroup: account
for memory_recursiveprot in test_memcg_low()"). The case test_memcg_low
will fail with memory_recursiveprot until resolved in reclaim
code.
However, this patch preserves the existing helpers and variables for
later uses.

Signed-off-by: Michal Koutný <mkoutny@suse.com>
---
 tools/testing/selftests/cgroup/test_memcontrol.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/testing/selftests/cgroup/test_memcontrol.c b/tools/testing/selftests/cgroup/test_memcontrol.c
index 4958b42201a9..eba252fa64ac 100644
--- a/tools/testing/selftests/cgroup/test_memcontrol.c
+++ b/tools/testing/selftests/cgroup/test_memcontrol.c
@@ -528,7 +528,7 @@ static int test_memcg_low(const char *root)
 	}
 
 	for (i = 0; i < ARRAY_SIZE(children); i++) {
-		int no_low_events_index = has_recursiveprot ? 2 : 1;
+		int no_low_events_index = 1;
 
 		oom = cg_read_key_long(children[i], "memory.events", "oom ");
 		low = cg_read_key_long(children[i], "memory.events", "low ");
-- 
2.35.3


^ permalink raw reply related	[flat|nested] 70+ messages in thread

* [PATCH 2/4] selftests: memcg: Expect no low events in unprotected sibling
@ 2022-05-13 17:18           ` Michal Koutný
  0 siblings, 0 replies; 70+ messages in thread
From: Michal Koutný @ 2022-05-13 17:18 UTC (permalink / raw)
  To: void-gq6j2QGBifHby3iVrkZq2A
  Cc: akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b,
	cgroups-u79uwXL29TY76Z2rM5mHXA, hannes-druUgvl0LCNAfugRpC6u6w,
	kernel-team-b10kYP2dOMg, linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	linux-mm-Bw31MaZKKs3YtjvyW6yDsg, mhocko-DgEjT+Ai2ygdnm+yROfE0A,
	mkoutny-IBi9RG/b67k, roman.gushchin-fxUVXftIFDnyG1zEObXtfA,
	shakeelb-hpIqsD4AKlfQT0dZR+AlfA, tj-DgEjT+Ai2ygdnm+yROfE0A,
	Richard Palethorpe

This is effectively a revert of commit cdc69458a5f3 ("cgroup: account
for memory_recursiveprot in test_memcg_low()"). The case test_memcg_low
will fail with memory_recursiveprot until resolved in reclaim
code.
However, this patch preserves the existing helpers and variables for
later uses.

Signed-off-by: Michal Koutn√Ω <mkoutny-IBi9RG/b67k@public.gmane.org>
---
 tools/testing/selftests/cgroup/test_memcontrol.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/testing/selftests/cgroup/test_memcontrol.c b/tools/testing/selftests/cgroup/test_memcontrol.c
index 4958b42201a9..eba252fa64ac 100644
--- a/tools/testing/selftests/cgroup/test_memcontrol.c
+++ b/tools/testing/selftests/cgroup/test_memcontrol.c
@@ -528,7 +528,7 @@ static int test_memcg_low(const char *root)
 	}
 
 	for (i = 0; i < ARRAY_SIZE(children); i++) {
-		int no_low_events_index = has_recursiveprot ? 2 : 1;
+		int no_low_events_index = 1;
 
 		oom = cg_read_key_long(children[i], "memory.events", "oom ");
 		low = cg_read_key_long(children[i], "memory.events", "low ");
-- 
2.35.3


^ permalink raw reply related	[flat|nested] 70+ messages in thread

* [PATCH 3/4] selftests: memcg: Adjust expected reclaim values of protected cgroups
@ 2022-05-13 17:18           ` Michal Koutný
  0 siblings, 0 replies; 70+ messages in thread
From: Michal Koutný @ 2022-05-13 17:18 UTC (permalink / raw)
  To: void
  Cc: akpm, cgroups, hannes, kernel-team, linux-kernel, linux-mm,
	mhocko, mkoutny, roman.gushchin, shakeelb, tj,
	Richard Palethorpe

The numbers are not easy to derive in a closed form (certainly mere
protections ratios do not apply), therefore use a simulation to obtain
expected numbers.

The new values make the protection tests succeed more precisely.

	% run as: octave-cli script
	%
	% Input configurations
	% -------------------
	% E parent effective protection
	% n nominal protection of siblings set at the givel level
	% c current consumption -,,-

	% example from testcase (values in GB)
	E = 50 / 1024;
	n = [75 25 0 500 ] / 1024;
	c = [50 50 50 0] / 1024;

	% Reclaim parameters
	% ------------------

	% Minimal reclaim amount (GB)
	cluster = 32*4 / 2**20;

	% Reclaim coefficient (think as 0.5^sc->priority)
	alpha = .1

	% Simulation parameters
	% ---------------------
	epsilon = 1e-7;
	timeout = 1000;

	% Simulation loop
	% ---------------------
	% Simulation assumes siblings consumed the initial amount of memory (w/out
	% reclaim) and then the reclaim starts, all memory is reclaimable, i.e. treated
	% same. It simulates only non-low reclaim and assumes all memory.min = 0.

	ch = [];
	eh = [];
	rh = [];

	for t = 1:timeout
		% low_usage
		u = min(c, n);
		siblings = sum(u);

		% effective_protection()
		protected = min(n, c);                % start with nominal
		e = protected * min(1, E / siblings); % normalize overcommit

		% recursive protection
		unclaimed = max(0, E - siblings);
		parent_overuse = sum(c) - siblings;
		if (unclaimed > 0 && parent_overuse > 0)
			overuse = max(0, c - protected);
			e += unclaimed * (overuse / parent_overuse);
		endif

		% get_scan_count()
		r = alpha * c;             % assume all memory is in a single LRU list

		% commit 1bc63fb1272b ("mm, memcg: make scan aggression always exclude protection")
		sz = max(e, c);
		r .*= (1 - (e+epsilon) ./ (sz+epsilon));

		% uncomment to debug prints
		% e, c, r

		% nothing to reclaim, reached equilibrium
		if max(r) < epsilon
			break;
		endif

		% SWAP_CLUSTER_MAX
		r = max(r, (r > epsilon) .* cluster);
		% XXX here I do parallel reclaim of all siblings
		% in reality reclaim is serialized and each sibling recalculates own residual
		c = max(c - r, 0);

		ch = [ch ; c];
		eh = [eh ; e];
		rh = [rh ; r];
	endfor

	t
	c, e

Signed-off-by: Michal Koutný <mkoutny@suse.com>
---
 .../selftests/cgroup/test_memcontrol.c        | 20 +++++++++----------
 1 file changed, 10 insertions(+), 10 deletions(-)

diff --git a/tools/testing/selftests/cgroup/test_memcontrol.c b/tools/testing/selftests/cgroup/test_memcontrol.c
index eba252fa64ac..9ffacf024bbd 100644
--- a/tools/testing/selftests/cgroup/test_memcontrol.c
+++ b/tools/testing/selftests/cgroup/test_memcontrol.c
@@ -260,9 +260,9 @@ static int cg_test_proc_killed(const char *cgroup)
  * memory pressure in it.
  *
  * A/B    memory.current ~= 50M
- * A/B/C  memory.current ~= 33M
- * A/B/D  memory.current ~= 17M
- * A/B/F  memory.current ~= 0
+ * A/B/C  memory.current ~= 29M
+ * A/B/D  memory.current ~= 21M
+ * A/B/E  memory.current ~= 0
  *
  * After that it tries to allocate more than there is
  * unprotected memory in A available, and checks
@@ -365,10 +365,10 @@ static int test_memcg_min(const char *root)
 	for (i = 0; i < ARRAY_SIZE(children); i++)
 		c[i] = cg_read_long(children[i], "memory.current");
 
-	if (!values_close(c[0], MB(33), 10))
+	if (!values_close(c[0], MB(29), 10))
 		goto cleanup;
 
-	if (!values_close(c[1], MB(17), 10))
+	if (!values_close(c[1], MB(21), 10))
 		goto cleanup;
 
 	if (c[3] != 0)
@@ -417,9 +417,9 @@ static int test_memcg_min(const char *root)
  *
  * Then it checks actual memory usages and expects that:
  * A/B    memory.current ~= 50M
- * A/B/   memory.current ~= 33M
- * A/B/D  memory.current ~= 17M
- * A/B/F  memory.current ~= 0
+ * A/B/   memory.current ~= 29M
+ * A/B/D  memory.current ~= 21M
+ * A/B/E  memory.current ~= 0
  *
  * After that it tries to allocate more than there is
  * unprotected memory in A available,
@@ -512,10 +512,10 @@ static int test_memcg_low(const char *root)
 	for (i = 0; i < ARRAY_SIZE(children); i++)
 		c[i] = cg_read_long(children[i], "memory.current");
 
-	if (!values_close(c[0], MB(33), 10))
+	if (!values_close(c[0], MB(29), 10))
 		goto cleanup;
 
-	if (!values_close(c[1], MB(17), 10))
+	if (!values_close(c[1], MB(21), 10))
 		goto cleanup;
 
 	if (c[3] != 0)
-- 
2.35.3


^ permalink raw reply related	[flat|nested] 70+ messages in thread

* [PATCH 3/4] selftests: memcg: Adjust expected reclaim values of protected cgroups
@ 2022-05-13 17:18           ` Michal Koutný
  0 siblings, 0 replies; 70+ messages in thread
From: Michal Koutný @ 2022-05-13 17:18 UTC (permalink / raw)
  To: void-gq6j2QGBifHby3iVrkZq2A
  Cc: akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b,
	cgroups-u79uwXL29TY76Z2rM5mHXA, hannes-druUgvl0LCNAfugRpC6u6w,
	kernel-team-b10kYP2dOMg, linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	linux-mm-Bw31MaZKKs3YtjvyW6yDsg, mhocko-DgEjT+Ai2ygdnm+yROfE0A,
	mkoutny-IBi9RG/b67k, roman.gushchin-fxUVXftIFDnyG1zEObXtfA,
	shakeelb-hpIqsD4AKlfQT0dZR+AlfA, tj-DgEjT+Ai2ygdnm+yROfE0A,
	Richard Palethorpe

The numbers are not easy to derive in a closed form (certainly mere
protections ratios do not apply), therefore use a simulation to obtain
expected numbers.

The new values make the protection tests succeed more precisely.

	% run as: octave-cli script
	%
	% Input configurations
	% -------------------
	% E parent effective protection
	% n nominal protection of siblings set at the givel level
	% c current consumption -,,-

	% example from testcase (values in GB)
	E = 50 / 1024;
	n = [75 25 0 500 ] / 1024;
	c = [50 50 50 0] / 1024;

	% Reclaim parameters
	% ------------------

	% Minimal reclaim amount (GB)
	cluster = 32*4 / 2**20;

	% Reclaim coefficient (think as 0.5^sc->priority)
	alpha = .1

	% Simulation parameters
	% ---------------------
	epsilon = 1e-7;
	timeout = 1000;

	% Simulation loop
	% ---------------------
	% Simulation assumes siblings consumed the initial amount of memory (w/out
	% reclaim) and then the reclaim starts, all memory is reclaimable, i.e. treated
	% same. It simulates only non-low reclaim and assumes all memory.min = 0.

	ch = [];
	eh = [];
	rh = [];

	for t = 1:timeout
		% low_usage
		u = min(c, n);
		siblings = sum(u);

		% effective_protection()
		protected = min(n, c);                % start with nominal
		e = protected * min(1, E / siblings); % normalize overcommit

		% recursive protection
		unclaimed = max(0, E - siblings);
		parent_overuse = sum(c) - siblings;
		if (unclaimed > 0 && parent_overuse > 0)
			overuse = max(0, c - protected);
			e += unclaimed * (overuse / parent_overuse);
		endif

		% get_scan_count()
		r = alpha * c;             % assume all memory is in a single LRU list

		% commit 1bc63fb1272b ("mm, memcg: make scan aggression always exclude protection")
		sz = max(e, c);
		r .*= (1 - (e+epsilon) ./ (sz+epsilon));

		% uncomment to debug prints
		% e, c, r

		% nothing to reclaim, reached equilibrium
		if max(r) < epsilon
			break;
		endif

		% SWAP_CLUSTER_MAX
		r = max(r, (r > epsilon) .* cluster);
		% XXX here I do parallel reclaim of all siblings
		% in reality reclaim is serialized and each sibling recalculates own residual
		c = max(c - r, 0);

		ch = [ch ; c];
		eh = [eh ; e];
		rh = [rh ; r];
	endfor

	t
	c, e

Signed-off-by: Michal Koutn√Ω <mkoutny-IBi9RG/b67k@public.gmane.org>
---
 .../selftests/cgroup/test_memcontrol.c        | 20 +++++++++----------
 1 file changed, 10 insertions(+), 10 deletions(-)

diff --git a/tools/testing/selftests/cgroup/test_memcontrol.c b/tools/testing/selftests/cgroup/test_memcontrol.c
index eba252fa64ac..9ffacf024bbd 100644
--- a/tools/testing/selftests/cgroup/test_memcontrol.c
+++ b/tools/testing/selftests/cgroup/test_memcontrol.c
@@ -260,9 +260,9 @@ static int cg_test_proc_killed(const char *cgroup)
  * memory pressure in it.
  *
  * A/B    memory.current ~= 50M
- * A/B/C  memory.current ~= 33M
- * A/B/D  memory.current ~= 17M
- * A/B/F  memory.current ~= 0
+ * A/B/C  memory.current ~= 29M
+ * A/B/D  memory.current ~= 21M
+ * A/B/E  memory.current ~= 0
  *
  * After that it tries to allocate more than there is
  * unprotected memory in A available, and checks
@@ -365,10 +365,10 @@ static int test_memcg_min(const char *root)
 	for (i = 0; i < ARRAY_SIZE(children); i++)
 		c[i] = cg_read_long(children[i], "memory.current");
 
-	if (!values_close(c[0], MB(33), 10))
+	if (!values_close(c[0], MB(29), 10))
 		goto cleanup;
 
-	if (!values_close(c[1], MB(17), 10))
+	if (!values_close(c[1], MB(21), 10))
 		goto cleanup;
 
 	if (c[3] != 0)
@@ -417,9 +417,9 @@ static int test_memcg_min(const char *root)
  *
  * Then it checks actual memory usages and expects that:
  * A/B    memory.current ~= 50M
- * A/B/   memory.current ~= 33M
- * A/B/D  memory.current ~= 17M
- * A/B/F  memory.current ~= 0
+ * A/B/   memory.current ~= 29M
+ * A/B/D  memory.current ~= 21M
+ * A/B/E  memory.current ~= 0
  *
  * After that it tries to allocate more than there is
  * unprotected memory in A available,
@@ -512,10 +512,10 @@ static int test_memcg_low(const char *root)
 	for (i = 0; i < ARRAY_SIZE(children); i++)
 		c[i] = cg_read_long(children[i], "memory.current");
 
-	if (!values_close(c[0], MB(33), 10))
+	if (!values_close(c[0], MB(29), 10))
 		goto cleanup;
 
-	if (!values_close(c[1], MB(17), 10))
+	if (!values_close(c[1], MB(21), 10))
 		goto cleanup;
 
 	if (c[3] != 0)
-- 
2.35.3


^ permalink raw reply related	[flat|nested] 70+ messages in thread

* [PATCH 4/4] selftests: memcg: Remove protection from top level memcg
@ 2022-05-13 17:18           ` Michal Koutný
  0 siblings, 0 replies; 70+ messages in thread
From: Michal Koutný @ 2022-05-13 17:18 UTC (permalink / raw)
  To: void
  Cc: akpm, cgroups, hannes, kernel-team, linux-kernel, linux-mm,
	mhocko, mkoutny, roman.gushchin, shakeelb, tj,
	Richard Palethorpe

The reclaim is triggered by memory limit in a subtree, therefore the
testcase does not need configured protection against external reclaim.

Also, correct/deduplicate respective comments

Signed-off-by: Michal Koutný <mkoutny@suse.com>
---
 tools/testing/selftests/cgroup/test_memcontrol.c | 12 ++++--------
 1 file changed, 4 insertions(+), 8 deletions(-)

diff --git a/tools/testing/selftests/cgroup/test_memcontrol.c b/tools/testing/selftests/cgroup/test_memcontrol.c
index 9ffacf024bbd..9d370aafd799 100644
--- a/tools/testing/selftests/cgroup/test_memcontrol.c
+++ b/tools/testing/selftests/cgroup/test_memcontrol.c
@@ -247,7 +247,7 @@ static int cg_test_proc_killed(const char *cgroup)
 
 /*
  * First, this test creates the following hierarchy:
- * A       memory.min = 50M,  memory.max = 200M
+ * A       memory.min = 0,    memory.max = 200M
  * A/B     memory.min = 50M,  memory.current = 50M
  * A/B/C   memory.min = 75M,  memory.current = 50M
  * A/B/D   memory.min = 25M,  memory.current = 50M
@@ -257,7 +257,7 @@ static int cg_test_proc_killed(const char *cgroup)
  * Usages are pagecache, but the test keeps a running
  * process in every leaf cgroup.
  * Then it creates A/G and creates a significant
- * memory pressure in it.
+ * memory pressure in A.
  *
  * A/B    memory.current ~= 50M
  * A/B/C  memory.current ~= 29M
@@ -335,8 +335,6 @@ static int test_memcg_min(const char *root)
 			      (void *)(long)fd);
 	}
 
-	if (cg_write(parent[0], "memory.min", "50M"))
-		goto cleanup;
 	if (cg_write(parent[1], "memory.min", "50M"))
 		goto cleanup;
 	if (cg_write(children[0], "memory.min", "75M"))
@@ -404,8 +402,8 @@ static int test_memcg_min(const char *root)
 
 /*
  * First, this test creates the following hierarchy:
- * A       memory.low = 50M,  memory.max = 200M
- * A/B     memory.low = 50M,  memory.current = 50M
+ * A       memory.low = 0,    memory.max = 200M
+ * A/B     memory.low = 50M,  memory.current = ...
  * A/B/C   memory.low = 75M,  memory.current = 50M
  * A/B/D   memory.low = 25M,  memory.current = 50M
  * A/B/E   memory.low = 0,    memory.current = 50M
@@ -490,8 +488,6 @@ static int test_memcg_low(const char *root)
 			goto cleanup;
 	}
 
-	if (cg_write(parent[0], "memory.low", "50M"))
-		goto cleanup;
 	if (cg_write(parent[1], "memory.low", "50M"))
 		goto cleanup;
 	if (cg_write(children[0], "memory.low", "75M"))
-- 
2.35.3


^ permalink raw reply related	[flat|nested] 70+ messages in thread

* [PATCH 4/4] selftests: memcg: Remove protection from top level memcg
@ 2022-05-13 17:18           ` Michal Koutný
  0 siblings, 0 replies; 70+ messages in thread
From: Michal Koutný @ 2022-05-13 17:18 UTC (permalink / raw)
  To: void-gq6j2QGBifHby3iVrkZq2A
  Cc: akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b,
	cgroups-u79uwXL29TY76Z2rM5mHXA, hannes-druUgvl0LCNAfugRpC6u6w,
	kernel-team-b10kYP2dOMg, linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	linux-mm-Bw31MaZKKs3YtjvyW6yDsg, mhocko-DgEjT+Ai2ygdnm+yROfE0A,
	mkoutny-IBi9RG/b67k, roman.gushchin-fxUVXftIFDnyG1zEObXtfA,
	shakeelb-hpIqsD4AKlfQT0dZR+AlfA, tj-DgEjT+Ai2ygdnm+yROfE0A,
	Richard Palethorpe

The reclaim is triggered by memory limit in a subtree, therefore the
testcase does not need configured protection against external reclaim.

Also, correct/deduplicate respective comments

Signed-off-by: Michal Koutn√Ω <mkoutny-IBi9RG/b67k@public.gmane.org>
---
 tools/testing/selftests/cgroup/test_memcontrol.c | 12 ++++--------
 1 file changed, 4 insertions(+), 8 deletions(-)

diff --git a/tools/testing/selftests/cgroup/test_memcontrol.c b/tools/testing/selftests/cgroup/test_memcontrol.c
index 9ffacf024bbd..9d370aafd799 100644
--- a/tools/testing/selftests/cgroup/test_memcontrol.c
+++ b/tools/testing/selftests/cgroup/test_memcontrol.c
@@ -247,7 +247,7 @@ static int cg_test_proc_killed(const char *cgroup)
 
 /*
  * First, this test creates the following hierarchy:
- * A       memory.min = 50M,  memory.max = 200M
+ * A       memory.min = 0,    memory.max = 200M
  * A/B     memory.min = 50M,  memory.current = 50M
  * A/B/C   memory.min = 75M,  memory.current = 50M
  * A/B/D   memory.min = 25M,  memory.current = 50M
@@ -257,7 +257,7 @@ static int cg_test_proc_killed(const char *cgroup)
  * Usages are pagecache, but the test keeps a running
  * process in every leaf cgroup.
  * Then it creates A/G and creates a significant
- * memory pressure in it.
+ * memory pressure in A.
  *
  * A/B    memory.current ~= 50M
  * A/B/C  memory.current ~= 29M
@@ -335,8 +335,6 @@ static int test_memcg_min(const char *root)
 			      (void *)(long)fd);
 	}
 
-	if (cg_write(parent[0], "memory.min", "50M"))
-		goto cleanup;
 	if (cg_write(parent[1], "memory.min", "50M"))
 		goto cleanup;
 	if (cg_write(children[0], "memory.min", "75M"))
@@ -404,8 +402,8 @@ static int test_memcg_min(const char *root)
 
 /*
  * First, this test creates the following hierarchy:
- * A       memory.low = 50M,  memory.max = 200M
- * A/B     memory.low = 50M,  memory.current = 50M
+ * A       memory.low = 0,    memory.max = 200M
+ * A/B     memory.low = 50M,  memory.current = ...
  * A/B/C   memory.low = 75M,  memory.current = 50M
  * A/B/D   memory.low = 25M,  memory.current = 50M
  * A/B/E   memory.low = 0,    memory.current = 50M
@@ -490,8 +488,6 @@ static int test_memcg_low(const char *root)
 			goto cleanup;
 	}
 
-	if (cg_write(parent[0], "memory.low", "50M"))
-		goto cleanup;
 	if (cg_write(parent[1], "memory.low", "50M"))
 		goto cleanup;
 	if (cg_write(children[0], "memory.low", "75M"))
-- 
2.35.3


^ permalink raw reply related	[flat|nested] 70+ messages in thread

* Re: [PATCH 1/4] selftests: memcg: Fix compilation
@ 2022-05-13 17:40             ` David Vernet
  0 siblings, 0 replies; 70+ messages in thread
From: David Vernet @ 2022-05-13 17:40 UTC (permalink / raw)
  To: Michal Koutný
  Cc: akpm, cgroups, hannes, kernel-team, linux-kernel, linux-mm,
	mhocko, roman.gushchin, shakeelb, tj, Richard Palethorpe

On Fri, May 13, 2022 at 07:18:08PM +0200, Michal Koutný wrote:
> This fixes mis-applied changes from commit 72b1e03aa725 ("cgroup:
> account for memory_localevents in test_memcg_oom_group_leaf_events()").
> 
> Signed-off-by: Michal Koutný <mkoutny@suse.com>
> ---
>  .../selftests/cgroup/test_memcontrol.c        | 25 +++++++++++--------
>  1 file changed, 14 insertions(+), 11 deletions(-)
> 
> diff --git a/tools/testing/selftests/cgroup/test_memcontrol.c b/tools/testing/selftests/cgroup/test_memcontrol.c
> index 6ab94317c87b..4958b42201a9 100644
> --- a/tools/testing/selftests/cgroup/test_memcontrol.c
> +++ b/tools/testing/selftests/cgroup/test_memcontrol.c
> @@ -1241,7 +1241,16 @@ static int test_memcg_oom_group_leaf_events(const char *root)
>  	if (cg_read_key_long(child, "memory.events", "oom_kill ") <= 0)
>  		goto cleanup;
>  
> -	if (cg_read_key_long(parent, "memory.events", "oom_kill ") <= 0)
> +	parent_oom_events = cg_read_key_long(
> +			parent, "memory.events", "oom_kill ");
> +	/*
> +	 * If memory_localevents is not enabled (the default), the parent should
> +	 * count OOM events in its children groups. Otherwise, it should not
> +	 * have observed any events.
> +	 */
> +	if (has_localevents && parent_oom_events != 0)
> +		goto cleanup;
> +	else if (!has_localevents && parent_oom_events <= 0)
>  		goto cleanup;
>  
>  	ret = KSFT_PASS;
> @@ -1349,20 +1358,14 @@ static int test_memcg_oom_group_score_events(const char *root)
>  	if (!cg_run(memcg, alloc_anon, (void *)MB(100)))
>  		goto cleanup;
>  
> -	parent_oom_events = cg_read_key_long(
> -			parent, "memory.events", "oom_kill ");
> -	/*
> -	 * If memory_localevents is not enabled (the default), the parent should
> -	 * count OOM events in its children groups. Otherwise, it should not
> -	 * have observed any events.
> -	 */
> -	if ((has_localevents && parent_oom_events == 0) ||
> -	     parent_oom_events > 0)
> -		ret = KSFT_PASS;
> +	if (cg_read_key_long(memcg, "memory.events", "oom_kill ") != 3)
> +		FAIL(cleanup);
>  
>  	if (kill(safe_pid, SIGKILL))
>  		goto cleanup;
>  
> +	ret = KSFT_PASS;
> +
>  cleanup:
>  	if (memcg)
>  		cg_destroy(memcg);
> -- 
> 2.35.3
> 

Thanks for the fix, Michal.

Reviewed-by: David Vernet <void@manifault.com>

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH 1/4] selftests: memcg: Fix compilation
@ 2022-05-13 17:40             ` David Vernet
  0 siblings, 0 replies; 70+ messages in thread
From: David Vernet @ 2022-05-13 17:40 UTC (permalink / raw)
  To: Michal Koutný
  Cc: akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b,
	cgroups-u79uwXL29TY76Z2rM5mHXA, hannes-druUgvl0LCNAfugRpC6u6w,
	kernel-team-b10kYP2dOMg, linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	linux-mm-Bw31MaZKKs3YtjvyW6yDsg, mhocko-DgEjT+Ai2ygdnm+yROfE0A,
	roman.gushchin-fxUVXftIFDnyG1zEObXtfA,
	shakeelb-hpIqsD4AKlfQT0dZR+AlfA, tj-DgEjT+Ai2ygdnm+yROfE0A,
	Richard Palethorpe

On Fri, May 13, 2022 at 07:18:08PM +0200, Michal Koutný wrote:
> This fixes mis-applied changes from commit 72b1e03aa725 ("cgroup:
> account for memory_localevents in test_memcg_oom_group_leaf_events()").
> 
> Signed-off-by: Michal Koutný <mkoutny-IBi9RG/b67k@public.gmane.org>
> ---
>  .../selftests/cgroup/test_memcontrol.c        | 25 +++++++++++--------
>  1 file changed, 14 insertions(+), 11 deletions(-)
> 
> diff --git a/tools/testing/selftests/cgroup/test_memcontrol.c b/tools/testing/selftests/cgroup/test_memcontrol.c
> index 6ab94317c87b..4958b42201a9 100644
> --- a/tools/testing/selftests/cgroup/test_memcontrol.c
> +++ b/tools/testing/selftests/cgroup/test_memcontrol.c
> @@ -1241,7 +1241,16 @@ static int test_memcg_oom_group_leaf_events(const char *root)
>  	if (cg_read_key_long(child, "memory.events", "oom_kill ") <= 0)
>  		goto cleanup;
>  
> -	if (cg_read_key_long(parent, "memory.events", "oom_kill ") <= 0)
> +	parent_oom_events = cg_read_key_long(
> +			parent, "memory.events", "oom_kill ");
> +	/*
> +	 * If memory_localevents is not enabled (the default), the parent should
> +	 * count OOM events in its children groups. Otherwise, it should not
> +	 * have observed any events.
> +	 */
> +	if (has_localevents && parent_oom_events != 0)
> +		goto cleanup;
> +	else if (!has_localevents && parent_oom_events <= 0)
>  		goto cleanup;
>  
>  	ret = KSFT_PASS;
> @@ -1349,20 +1358,14 @@ static int test_memcg_oom_group_score_events(const char *root)
>  	if (!cg_run(memcg, alloc_anon, (void *)MB(100)))
>  		goto cleanup;
>  
> -	parent_oom_events = cg_read_key_long(
> -			parent, "memory.events", "oom_kill ");
> -	/*
> -	 * If memory_localevents is not enabled (the default), the parent should
> -	 * count OOM events in its children groups. Otherwise, it should not
> -	 * have observed any events.
> -	 */
> -	if ((has_localevents && parent_oom_events == 0) ||
> -	     parent_oom_events > 0)
> -		ret = KSFT_PASS;
> +	if (cg_read_key_long(memcg, "memory.events", "oom_kill ") != 3)
> +		FAIL(cleanup);
>  
>  	if (kill(safe_pid, SIGKILL))
>  		goto cleanup;
>  
> +	ret = KSFT_PASS;
> +
>  cleanup:
>  	if (memcg)
>  		cg_destroy(memcg);
> -- 
> 2.35.3
> 

Thanks for the fix, Michal.

Reviewed-by: David Vernet <void-gq6j2QGBifHby3iVrkZq2A@public.gmane.org>

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH 2/4] selftests: memcg: Expect no low events in unprotected sibling
@ 2022-05-13 17:42             ` David Vernet
  0 siblings, 0 replies; 70+ messages in thread
From: David Vernet @ 2022-05-13 17:42 UTC (permalink / raw)
  To: Michal Koutný
  Cc: akpm, cgroups, hannes, kernel-team, linux-kernel, linux-mm,
	mhocko, roman.gushchin, shakeelb, tj, Richard Palethorpe

On Fri, May 13, 2022 at 07:18:09PM +0200, Michal Koutný wrote:
> This is effectively a revert of commit cdc69458a5f3 ("cgroup: account
> for memory_recursiveprot in test_memcg_low()"). The case test_memcg_low
> will fail with memory_recursiveprot until resolved in reclaim
> code.
> However, this patch preserves the existing helpers and variables for
> later uses.
> 
> Signed-off-by: Michal Koutný <mkoutny@suse.com>
> ---
>  tools/testing/selftests/cgroup/test_memcontrol.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/tools/testing/selftests/cgroup/test_memcontrol.c b/tools/testing/selftests/cgroup/test_memcontrol.c
> index 4958b42201a9..eba252fa64ac 100644
> --- a/tools/testing/selftests/cgroup/test_memcontrol.c
> +++ b/tools/testing/selftests/cgroup/test_memcontrol.c
> @@ -528,7 +528,7 @@ static int test_memcg_low(const char *root)
>  	}
>  
>  	for (i = 0; i < ARRAY_SIZE(children); i++) {
> -		int no_low_events_index = has_recursiveprot ? 2 : 1;
> +		int no_low_events_index = 1;
>  
>  		oom = cg_read_key_long(children[i], "memory.events", "oom ");
>  		low = cg_read_key_long(children[i], "memory.events", "low ");
> -- 
> 2.35.3
> 

Reviewed-by: David Vernet <void@manifault.com>

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH 2/4] selftests: memcg: Expect no low events in unprotected sibling
@ 2022-05-13 17:42             ` David Vernet
  0 siblings, 0 replies; 70+ messages in thread
From: David Vernet @ 2022-05-13 17:42 UTC (permalink / raw)
  To: Michal Koutný
  Cc: akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b,
	cgroups-u79uwXL29TY76Z2rM5mHXA, hannes-druUgvl0LCNAfugRpC6u6w,
	kernel-team-b10kYP2dOMg, linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	linux-mm-Bw31MaZKKs3YtjvyW6yDsg, mhocko-DgEjT+Ai2ygdnm+yROfE0A,
	roman.gushchin-fxUVXftIFDnyG1zEObXtfA,
	shakeelb-hpIqsD4AKlfQT0dZR+AlfA, tj-DgEjT+Ai2ygdnm+yROfE0A,
	Richard Palethorpe

On Fri, May 13, 2022 at 07:18:09PM +0200, Michal Koutný wrote:
> This is effectively a revert of commit cdc69458a5f3 ("cgroup: account
> for memory_recursiveprot in test_memcg_low()"). The case test_memcg_low
> will fail with memory_recursiveprot until resolved in reclaim
> code.
> However, this patch preserves the existing helpers and variables for
> later uses.
> 
> Signed-off-by: Michal Koutný <mkoutny-IBi9RG/b67k@public.gmane.org>
> ---
>  tools/testing/selftests/cgroup/test_memcontrol.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/tools/testing/selftests/cgroup/test_memcontrol.c b/tools/testing/selftests/cgroup/test_memcontrol.c
> index 4958b42201a9..eba252fa64ac 100644
> --- a/tools/testing/selftests/cgroup/test_memcontrol.c
> +++ b/tools/testing/selftests/cgroup/test_memcontrol.c
> @@ -528,7 +528,7 @@ static int test_memcg_low(const char *root)
>  	}
>  
>  	for (i = 0; i < ARRAY_SIZE(children); i++) {
> -		int no_low_events_index = has_recursiveprot ? 2 : 1;
> +		int no_low_events_index = 1;
>  
>  		oom = cg_read_key_long(children[i], "memory.events", "oom ");
>  		low = cg_read_key_long(children[i], "memory.events", "low ");
> -- 
> 2.35.3
> 

Reviewed-by: David Vernet <void-gq6j2QGBifHby3iVrkZq2A@public.gmane.org>

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH 3/4] selftests: memcg: Adjust expected reclaim values of protected cgroups
@ 2022-05-13 18:52             ` Roman Gushchin
  0 siblings, 0 replies; 70+ messages in thread
From: Roman Gushchin @ 2022-05-13 18:52 UTC (permalink / raw)
  To: Michal Koutný
  Cc: void, akpm, cgroups, hannes, kernel-team, linux-kernel, linux-mm,
	mhocko, shakeelb, tj, Richard Palethorpe

On Fri, May 13, 2022 at 07:18:10PM +0200, Michal Koutny wrote:
> The numbers are not easy to derive in a closed form (certainly mere
> protections ratios do not apply), therefore use a simulation to obtain
> expected numbers.
> 
> The new values make the protection tests succeed more precisely.
> 
> 	% run as: octave-cli script
> 	%
> 	% Input configurations
> 	% -------------------
> 	% E parent effective protection
> 	% n nominal protection of siblings set at the givel level
> 	% c current consumption -,,-
> 
> 	% example from testcase (values in GB)
> 	E = 50 / 1024;
> 	n = [75 25 0 500 ] / 1024;
> 	c = [50 50 50 0] / 1024;
> 
> 	% Reclaim parameters
> 	% ------------------
> 
> 	% Minimal reclaim amount (GB)
> 	cluster = 32*4 / 2**20;
> 
> 	% Reclaim coefficient (think as 0.5^sc->priority)
> 	alpha = .1
> 
> 	% Simulation parameters
> 	% ---------------------
> 	epsilon = 1e-7;
> 	timeout = 1000;
> 
> 	% Simulation loop
> 	% ---------------------
> 	% Simulation assumes siblings consumed the initial amount of memory (w/out
> 	% reclaim) and then the reclaim starts, all memory is reclaimable, i.e. treated
> 	% same. It simulates only non-low reclaim and assumes all memory.min = 0.
> 
> 	ch = [];
> 	eh = [];
> 	rh = [];
> 
> 	for t = 1:timeout
> 		% low_usage
> 		u = min(c, n);
> 		siblings = sum(u);
> 
> 		% effective_protection()
> 		protected = min(n, c);                % start with nominal
> 		e = protected * min(1, E / siblings); % normalize overcommit
> 
> 		% recursive protection
> 		unclaimed = max(0, E - siblings);
> 		parent_overuse = sum(c) - siblings;
> 		if (unclaimed > 0 && parent_overuse > 0)
> 			overuse = max(0, c - protected);
> 			e += unclaimed * (overuse / parent_overuse);
> 		endif
> 
> 		% get_scan_count()
> 		r = alpha * c;             % assume all memory is in a single LRU list
> 
> 		% commit 1bc63fb1272b ("mm, memcg: make scan aggression always exclude protection")
> 		sz = max(e, c);
> 		r .*= (1 - (e+epsilon) ./ (sz+epsilon));
> 
> 		% uncomment to debug prints
> 		% e, c, r
> 
> 		% nothing to reclaim, reached equilibrium
> 		if max(r) < epsilon
> 			break;
> 		endif
> 
> 		% SWAP_CLUSTER_MAX
> 		r = max(r, (r > epsilon) .* cluster);
> 		% XXX here I do parallel reclaim of all siblings
> 		% in reality reclaim is serialized and each sibling recalculates own residual
> 		c = max(c - r, 0);
> 
> 		ch = [ch ; c];
> 		eh = [eh ; e];
> 		rh = [rh ; r];
> 	endfor
> 
> 	t
> 	c, e

This is a cool stuff!

How about to place it into a separate file and add a comment into the code
with a reference?

Thanks!

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH 3/4] selftests: memcg: Adjust expected reclaim values of protected cgroups
@ 2022-05-13 18:52             ` Roman Gushchin
  0 siblings, 0 replies; 70+ messages in thread
From: Roman Gushchin @ 2022-05-13 18:52 UTC (permalink / raw)
  To: Michal Koutný
  Cc: void-gq6j2QGBifHby3iVrkZq2A,
	akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b,
	cgroups-u79uwXL29TY76Z2rM5mHXA, hannes-druUgvl0LCNAfugRpC6u6w,
	kernel-team-b10kYP2dOMg, linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	linux-mm-Bw31MaZKKs3YtjvyW6yDsg, mhocko-DgEjT+Ai2ygdnm+yROfE0A,
	shakeelb-hpIqsD4AKlfQT0dZR+AlfA, tj-DgEjT+Ai2ygdnm+yROfE0A,
	Richard Palethorpe

On Fri, May 13, 2022 at 07:18:10PM +0200, Michal Koutny wrote:
> The numbers are not easy to derive in a closed form (certainly mere
> protections ratios do not apply), therefore use a simulation to obtain
> expected numbers.
> 
> The new values make the protection tests succeed more precisely.
> 
> 	% run as: octave-cli script
> 	%
> 	% Input configurations
> 	% -------------------
> 	% E parent effective protection
> 	% n nominal protection of siblings set at the givel level
> 	% c current consumption -,,-
> 
> 	% example from testcase (values in GB)
> 	E = 50 / 1024;
> 	n = [75 25 0 500 ] / 1024;
> 	c = [50 50 50 0] / 1024;
> 
> 	% Reclaim parameters
> 	% ------------------
> 
> 	% Minimal reclaim amount (GB)
> 	cluster = 32*4 / 2**20;
> 
> 	% Reclaim coefficient (think as 0.5^sc->priority)
> 	alpha = .1
> 
> 	% Simulation parameters
> 	% ---------------------
> 	epsilon = 1e-7;
> 	timeout = 1000;
> 
> 	% Simulation loop
> 	% ---------------------
> 	% Simulation assumes siblings consumed the initial amount of memory (w/out
> 	% reclaim) and then the reclaim starts, all memory is reclaimable, i.e. treated
> 	% same. It simulates only non-low reclaim and assumes all memory.min = 0.
> 
> 	ch = [];
> 	eh = [];
> 	rh = [];
> 
> 	for t = 1:timeout
> 		% low_usage
> 		u = min(c, n);
> 		siblings = sum(u);
> 
> 		% effective_protection()
> 		protected = min(n, c);                % start with nominal
> 		e = protected * min(1, E / siblings); % normalize overcommit
> 
> 		% recursive protection
> 		unclaimed = max(0, E - siblings);
> 		parent_overuse = sum(c) - siblings;
> 		if (unclaimed > 0 && parent_overuse > 0)
> 			overuse = max(0, c - protected);
> 			e += unclaimed * (overuse / parent_overuse);
> 		endif
> 
> 		% get_scan_count()
> 		r = alpha * c;             % assume all memory is in a single LRU list
> 
> 		% commit 1bc63fb1272b ("mm, memcg: make scan aggression always exclude protection")
> 		sz = max(e, c);
> 		r .*= (1 - (e+epsilon) ./ (sz+epsilon));
> 
> 		% uncomment to debug prints
> 		% e, c, r
> 
> 		% nothing to reclaim, reached equilibrium
> 		if max(r) < epsilon
> 			break;
> 		endif
> 
> 		% SWAP_CLUSTER_MAX
> 		r = max(r, (r > epsilon) .* cluster);
> 		% XXX here I do parallel reclaim of all siblings
> 		% in reality reclaim is serialized and each sibling recalculates own residual
> 		c = max(c - r, 0);
> 
> 		ch = [ch ; c];
> 		eh = [eh ; e];
> 		rh = [rh ; r];
> 	endfor
> 
> 	t
> 	c, e

This is a cool stuff!

How about to place it into a separate file and add a comment into the code
with a reference?

Thanks!

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH 1/4] selftests: memcg: Fix compilation
@ 2022-05-13 18:53             ` Roman Gushchin
  0 siblings, 0 replies; 70+ messages in thread
From: Roman Gushchin @ 2022-05-13 18:53 UTC (permalink / raw)
  To: Michal Koutný
  Cc: void, akpm, cgroups, hannes, kernel-team, linux-kernel, linux-mm,
	mhocko, shakeelb, tj, Richard Palethorpe

On Fri, May 13, 2022 at 07:18:08PM +0200, Michal Koutny wrote:
> This fixes mis-applied changes from commit 72b1e03aa725 ("cgroup:
> account for memory_localevents in test_memcg_oom_group_leaf_events()").
> 
> Signed-off-by: Michal Koutný <mkoutny@suse.com>
> ---
>  .../selftests/cgroup/test_memcontrol.c        | 25 +++++++++++--------
>  1 file changed, 14 insertions(+), 11 deletions(-)
> 
> diff --git a/tools/testing/selftests/cgroup/test_memcontrol.c b/tools/testing/selftests/cgroup/test_memcontrol.c
> index 6ab94317c87b..4958b42201a9 100644
> --- a/tools/testing/selftests/cgroup/test_memcontrol.c
> +++ b/tools/testing/selftests/cgroup/test_memcontrol.c
> @@ -1241,7 +1241,16 @@ static int test_memcg_oom_group_leaf_events(const char *root)
>  	if (cg_read_key_long(child, "memory.events", "oom_kill ") <= 0)
>  		goto cleanup;
>  
> -	if (cg_read_key_long(parent, "memory.events", "oom_kill ") <= 0)
> +	parent_oom_events = cg_read_key_long(
> +			parent, "memory.events", "oom_kill ");
> +	/*
> +	 * If memory_localevents is not enabled (the default), the parent should
> +	 * count OOM events in its children groups. Otherwise, it should not
> +	 * have observed any events.
> +	 */
> +	if (has_localevents && parent_oom_events != 0)
> +		goto cleanup;
> +	else if (!has_localevents && parent_oom_events <= 0)
>  		goto cleanup;
>  
>  	ret = KSFT_PASS;
> @@ -1349,20 +1358,14 @@ static int test_memcg_oom_group_score_events(const char *root)
>  	if (!cg_run(memcg, alloc_anon, (void *)MB(100)))
>  		goto cleanup;
>  
> -	parent_oom_events = cg_read_key_long(
> -			parent, "memory.events", "oom_kill ");
> -	/*
> -	 * If memory_localevents is not enabled (the default), the parent should
> -	 * count OOM events in its children groups. Otherwise, it should not
> -	 * have observed any events.
> -	 */
> -	if ((has_localevents && parent_oom_events == 0) ||
> -	     parent_oom_events > 0)
> -		ret = KSFT_PASS;
> +	if (cg_read_key_long(memcg, "memory.events", "oom_kill ") != 3)
> +		FAIL(cleanup);
>  
>  	if (kill(safe_pid, SIGKILL))
>  		goto cleanup;
>  
> +	ret = KSFT_PASS;
> +
>  cleanup:
>  	if (memcg)
>  		cg_destroy(memcg);
> -- 
> 2.35.3
> 

My bad.

Acked-by: Roman Gushchin <roman.gushchin@linux.dev>

Thanks!

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH 1/4] selftests: memcg: Fix compilation
@ 2022-05-13 18:53             ` Roman Gushchin
  0 siblings, 0 replies; 70+ messages in thread
From: Roman Gushchin @ 2022-05-13 18:53 UTC (permalink / raw)
  To: Michal Koutný
  Cc: void-gq6j2QGBifHby3iVrkZq2A,
	akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b,
	cgroups-u79uwXL29TY76Z2rM5mHXA, hannes-druUgvl0LCNAfugRpC6u6w,
	kernel-team-b10kYP2dOMg, linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	linux-mm-Bw31MaZKKs3YtjvyW6yDsg, mhocko-DgEjT+Ai2ygdnm+yROfE0A,
	shakeelb-hpIqsD4AKlfQT0dZR+AlfA, tj-DgEjT+Ai2ygdnm+yROfE0A,
	Richard Palethorpe

On Fri, May 13, 2022 at 07:18:08PM +0200, Michal Koutny wrote:
> This fixes mis-applied changes from commit 72b1e03aa725 ("cgroup:
> account for memory_localevents in test_memcg_oom_group_leaf_events()").
> 
> Signed-off-by: Michal Koutný <mkoutny-IBi9RG/b67k@public.gmane.org>
> ---
>  .../selftests/cgroup/test_memcontrol.c        | 25 +++++++++++--------
>  1 file changed, 14 insertions(+), 11 deletions(-)
> 
> diff --git a/tools/testing/selftests/cgroup/test_memcontrol.c b/tools/testing/selftests/cgroup/test_memcontrol.c
> index 6ab94317c87b..4958b42201a9 100644
> --- a/tools/testing/selftests/cgroup/test_memcontrol.c
> +++ b/tools/testing/selftests/cgroup/test_memcontrol.c
> @@ -1241,7 +1241,16 @@ static int test_memcg_oom_group_leaf_events(const char *root)
>  	if (cg_read_key_long(child, "memory.events", "oom_kill ") <= 0)
>  		goto cleanup;
>  
> -	if (cg_read_key_long(parent, "memory.events", "oom_kill ") <= 0)
> +	parent_oom_events = cg_read_key_long(
> +			parent, "memory.events", "oom_kill ");
> +	/*
> +	 * If memory_localevents is not enabled (the default), the parent should
> +	 * count OOM events in its children groups. Otherwise, it should not
> +	 * have observed any events.
> +	 */
> +	if (has_localevents && parent_oom_events != 0)
> +		goto cleanup;
> +	else if (!has_localevents && parent_oom_events <= 0)
>  		goto cleanup;
>  
>  	ret = KSFT_PASS;
> @@ -1349,20 +1358,14 @@ static int test_memcg_oom_group_score_events(const char *root)
>  	if (!cg_run(memcg, alloc_anon, (void *)MB(100)))
>  		goto cleanup;
>  
> -	parent_oom_events = cg_read_key_long(
> -			parent, "memory.events", "oom_kill ");
> -	/*
> -	 * If memory_localevents is not enabled (the default), the parent should
> -	 * count OOM events in its children groups. Otherwise, it should not
> -	 * have observed any events.
> -	 */
> -	if ((has_localevents && parent_oom_events == 0) ||
> -	     parent_oom_events > 0)
> -		ret = KSFT_PASS;
> +	if (cg_read_key_long(memcg, "memory.events", "oom_kill ") != 3)
> +		FAIL(cleanup);
>  
>  	if (kill(safe_pid, SIGKILL))
>  		goto cleanup;
>  
> +	ret = KSFT_PASS;
> +
>  cleanup:
>  	if (memcg)
>  		cg_destroy(memcg);
> -- 
> 2.35.3
> 

My bad.

Acked-by: Roman Gushchin <roman.gushchin-fxUVXftIFDnyG1zEObXtfA@public.gmane.org>

Thanks!

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH 2/4] selftests: memcg: Expect no low events in unprotected sibling
  2022-05-13 17:18           ` Michal Koutný
  (?)
  (?)
@ 2022-05-13 18:54           ` Roman Gushchin
  2022-05-18 15:54               ` Michal Koutný
  -1 siblings, 1 reply; 70+ messages in thread
From: Roman Gushchin @ 2022-05-13 18:54 UTC (permalink / raw)
  To: Michal Koutný
  Cc: void, akpm, cgroups, hannes, kernel-team, linux-kernel, linux-mm,
	mhocko, shakeelb, tj, Richard Palethorpe

On Fri, May 13, 2022 at 07:18:09PM +0200, Michal Koutny wrote:
> This is effectively a revert of commit cdc69458a5f3 ("cgroup: account
> for memory_recursiveprot in test_memcg_low()"). The case test_memcg_low
> will fail with memory_recursiveprot until resolved in reclaim
> code.

Hm, what are our plans here? Are we going to fix it soon-ish, or there
is still no agreement on how to proceed?

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH 4/4] selftests: memcg: Remove protection from top level memcg
@ 2022-05-13 18:59             ` Roman Gushchin
  0 siblings, 0 replies; 70+ messages in thread
From: Roman Gushchin @ 2022-05-13 18:59 UTC (permalink / raw)
  To: Michal Koutný
  Cc: void, akpm, cgroups, hannes, kernel-team, linux-kernel, linux-mm,
	mhocko, shakeelb, tj, Richard Palethorpe

On Fri, May 13, 2022 at 07:18:11PM +0200, Michal Koutny wrote:
> The reclaim is triggered by memory limit in a subtree, therefore the
> testcase does not need configured protection against external reclaim.
> 
> Also, correct/deduplicate respective comments
> 
> Signed-off-by: Michal Koutný <mkoutny@suse.com>
> ---
>  tools/testing/selftests/cgroup/test_memcontrol.c | 12 ++++--------
>  1 file changed, 4 insertions(+), 8 deletions(-)
> 
> diff --git a/tools/testing/selftests/cgroup/test_memcontrol.c b/tools/testing/selftests/cgroup/test_memcontrol.c
> index 9ffacf024bbd..9d370aafd799 100644
> --- a/tools/testing/selftests/cgroup/test_memcontrol.c
> +++ b/tools/testing/selftests/cgroup/test_memcontrol.c
> @@ -247,7 +247,7 @@ static int cg_test_proc_killed(const char *cgroup)
>  
>  /*
>   * First, this test creates the following hierarchy:
> - * A       memory.min = 50M,  memory.max = 200M
> + * A       memory.min = 0,    memory.max = 200M
>   * A/B     memory.min = 50M,  memory.current = 50M
>   * A/B/C   memory.min = 75M,  memory.current = 50M
>   * A/B/D   memory.min = 25M,  memory.current = 50M
> @@ -257,7 +257,7 @@ static int cg_test_proc_killed(const char *cgroup)
>   * Usages are pagecache, but the test keeps a running
>   * process in every leaf cgroup.
>   * Then it creates A/G and creates a significant
> - * memory pressure in it.
> + * memory pressure in A.
>   *
>   * A/B    memory.current ~= 50M
>   * A/B/C  memory.current ~= 29M
> @@ -335,8 +335,6 @@ static int test_memcg_min(const char *root)
>  			      (void *)(long)fd);
>  	}
>  
> -	if (cg_write(parent[0], "memory.min", "50M"))
> -		goto cleanup;
>  	if (cg_write(parent[1], "memory.min", "50M"))
>  		goto cleanup;
>  	if (cg_write(children[0], "memory.min", "75M"))
> @@ -404,8 +402,8 @@ static int test_memcg_min(const char *root)
>  
>  /*
>   * First, this test creates the following hierarchy:
> - * A       memory.low = 50M,  memory.max = 200M
> - * A/B     memory.low = 50M,  memory.current = 50M
> + * A       memory.low = 0,    memory.max = 200M
> + * A/B     memory.low = 50M,  memory.current = ...

Can you, please, just remove "memory.current = ...", it's not
because obvious what "..." means here.

Other than that: Acked-by: Roman Gushchin <roman.gushchin@linux.dev>

Thank you!

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH 4/4] selftests: memcg: Remove protection from top level memcg
@ 2022-05-13 18:59             ` Roman Gushchin
  0 siblings, 0 replies; 70+ messages in thread
From: Roman Gushchin @ 2022-05-13 18:59 UTC (permalink / raw)
  To: Michal Koutný
  Cc: void-gq6j2QGBifHby3iVrkZq2A,
	akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b,
	cgroups-u79uwXL29TY76Z2rM5mHXA, hannes-druUgvl0LCNAfugRpC6u6w,
	kernel-team-b10kYP2dOMg, linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	linux-mm-Bw31MaZKKs3YtjvyW6yDsg, mhocko-DgEjT+Ai2ygdnm+yROfE0A,
	shakeelb-hpIqsD4AKlfQT0dZR+AlfA, tj-DgEjT+Ai2ygdnm+yROfE0A,
	Richard Palethorpe

On Fri, May 13, 2022 at 07:18:11PM +0200, Michal Koutny wrote:
> The reclaim is triggered by memory limit in a subtree, therefore the
> testcase does not need configured protection against external reclaim.
> 
> Also, correct/deduplicate respective comments
> 
> Signed-off-by: Michal Koutný <mkoutny-IBi9RG/b67k@public.gmane.org>
> ---
>  tools/testing/selftests/cgroup/test_memcontrol.c | 12 ++++--------
>  1 file changed, 4 insertions(+), 8 deletions(-)
> 
> diff --git a/tools/testing/selftests/cgroup/test_memcontrol.c b/tools/testing/selftests/cgroup/test_memcontrol.c
> index 9ffacf024bbd..9d370aafd799 100644
> --- a/tools/testing/selftests/cgroup/test_memcontrol.c
> +++ b/tools/testing/selftests/cgroup/test_memcontrol.c
> @@ -247,7 +247,7 @@ static int cg_test_proc_killed(const char *cgroup)
>  
>  /*
>   * First, this test creates the following hierarchy:
> - * A       memory.min = 50M,  memory.max = 200M
> + * A       memory.min = 0,    memory.max = 200M
>   * A/B     memory.min = 50M,  memory.current = 50M
>   * A/B/C   memory.min = 75M,  memory.current = 50M
>   * A/B/D   memory.min = 25M,  memory.current = 50M
> @@ -257,7 +257,7 @@ static int cg_test_proc_killed(const char *cgroup)
>   * Usages are pagecache, but the test keeps a running
>   * process in every leaf cgroup.
>   * Then it creates A/G and creates a significant
> - * memory pressure in it.
> + * memory pressure in A.
>   *
>   * A/B    memory.current ~= 50M
>   * A/B/C  memory.current ~= 29M
> @@ -335,8 +335,6 @@ static int test_memcg_min(const char *root)
>  			      (void *)(long)fd);
>  	}
>  
> -	if (cg_write(parent[0], "memory.min", "50M"))
> -		goto cleanup;
>  	if (cg_write(parent[1], "memory.min", "50M"))
>  		goto cleanup;
>  	if (cg_write(children[0], "memory.min", "75M"))
> @@ -404,8 +402,8 @@ static int test_memcg_min(const char *root)
>  
>  /*
>   * First, this test creates the following hierarchy:
> - * A       memory.low = 50M,  memory.max = 200M
> - * A/B     memory.low = 50M,  memory.current = 50M
> + * A       memory.low = 0,    memory.max = 200M
> + * A/B     memory.low = 50M,  memory.current = ...

Can you, please, just remove "memory.current = ...", it's not
because obvious what "..." means here.

Other than that: Acked-by: Roman Gushchin <roman.gushchin-fxUVXftIFDnyG1zEObXtfA@public.gmane.org>

Thank you!

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH 1/4] selftests: memcg: Fix compilation
  2022-05-13 18:53             ` Roman Gushchin
@ 2022-05-13 19:09               ` Roman Gushchin
  -1 siblings, 0 replies; 70+ messages in thread
From: Roman Gushchin @ 2022-05-13 19:09 UTC (permalink / raw)
  To: Michal Koutný
  Cc: void, akpm, cgroups, hannes, kernel-team, linux-kernel, linux-mm,
	mhocko, shakeelb, tj, Richard Palethorpe

On Fri, May 13, 2022 at 11:53:26AM -0700, Roman Gushchin wrote:
> On Fri, May 13, 2022 at 07:18:08PM +0200, Michal Koutny wrote:
> > This fixes mis-applied changes from commit 72b1e03aa725 ("cgroup:
> > account for memory_localevents in test_memcg_oom_group_leaf_events()").
> > 
> > Signed-off-by: Michal Koutný <mkoutny@suse.com>
> > ---
> >  .../selftests/cgroup/test_memcontrol.c        | 25 +++++++++++--------
> >  1 file changed, 14 insertions(+), 11 deletions(-)
> > 
> > diff --git a/tools/testing/selftests/cgroup/test_memcontrol.c b/tools/testing/selftests/cgroup/test_memcontrol.c
> > index 6ab94317c87b..4958b42201a9 100644
> > --- a/tools/testing/selftests/cgroup/test_memcontrol.c
> > +++ b/tools/testing/selftests/cgroup/test_memcontrol.c
> > @@ -1241,7 +1241,16 @@ static int test_memcg_oom_group_leaf_events(const char *root)
> >  	if (cg_read_key_long(child, "memory.events", "oom_kill ") <= 0)
> >  		goto cleanup;
> >  
> > -	if (cg_read_key_long(parent, "memory.events", "oom_kill ") <= 0)
> > +	parent_oom_events = cg_read_key_long(
> > +			parent, "memory.events", "oom_kill ");
> > +	/*
> > +	 * If memory_localevents is not enabled (the default), the parent should
> > +	 * count OOM events in its children groups. Otherwise, it should not
> > +	 * have observed any events.
> > +	 */
> > +	if (has_localevents && parent_oom_events != 0)
> > +		goto cleanup;
> > +	else if (!has_localevents && parent_oom_events <= 0)
> >  		goto cleanup;
> >  
> >  	ret = KSFT_PASS;
> > @@ -1349,20 +1358,14 @@ static int test_memcg_oom_group_score_events(const char *root)
> >  	if (!cg_run(memcg, alloc_anon, (void *)MB(100)))
> >  		goto cleanup;
> >  
> > -	parent_oom_events = cg_read_key_long(
> > -			parent, "memory.events", "oom_kill ");
> > -	/*
> > -	 * If memory_localevents is not enabled (the default), the parent should
> > -	 * count OOM events in its children groups. Otherwise, it should not
> > -	 * have observed any events.
> > -	 */
> > -	if ((has_localevents && parent_oom_events == 0) ||
> > -	     parent_oom_events > 0)
> > -		ret = KSFT_PASS;
> > +	if (cg_read_key_long(memcg, "memory.events", "oom_kill ") != 3)
> > +		FAIL(cleanup);
> >  
> >  	if (kill(safe_pid, SIGKILL))
> >  		goto cleanup;
> >  
> > +	ret = KSFT_PASS;
> > +
> >  cleanup:
> >  	if (memcg)
> >  		cg_destroy(memcg);
> > -- 
> > 2.35.3
> > 
> 
> My bad.

Actually not, as pointed by David.
Seems like it's git fault. The original patch looks correct.

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH 1/4] selftests: memcg: Fix compilation
@ 2022-05-13 19:09               ` Roman Gushchin
  0 siblings, 0 replies; 70+ messages in thread
From: Roman Gushchin @ 2022-05-13 19:09 UTC (permalink / raw)
  To: Michal Koutný
  Cc: void-gq6j2QGBifHby3iVrkZq2A,
	akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b,
	cgroups-u79uwXL29TY76Z2rM5mHXA, hannes-druUgvl0LCNAfugRpC6u6w,
	kernel-team-b10kYP2dOMg, linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	linux-mm-Bw31MaZKKs3YtjvyW6yDsg, mhocko-DgEjT+Ai2ygdnm+yROfE0A,
	shakeelb-hpIqsD4AKlfQT0dZR+AlfA, tj-DgEjT+Ai2ygdnm+yROfE0A,
	Richard Palethorpe

On Fri, May 13, 2022 at 11:53:26AM -0700, Roman Gushchin wrote:
> On Fri, May 13, 2022 at 07:18:08PM +0200, Michal Koutny wrote:
> > This fixes mis-applied changes from commit 72b1e03aa725 ("cgroup:
> > account for memory_localevents in test_memcg_oom_group_leaf_events()").
> > 
> > Signed-off-by: Michal Koutný <mkoutny-IBi9RG/b67k@public.gmane.org>
> > ---
> >  .../selftests/cgroup/test_memcontrol.c        | 25 +++++++++++--------
> >  1 file changed, 14 insertions(+), 11 deletions(-)
> > 
> > diff --git a/tools/testing/selftests/cgroup/test_memcontrol.c b/tools/testing/selftests/cgroup/test_memcontrol.c
> > index 6ab94317c87b..4958b42201a9 100644
> > --- a/tools/testing/selftests/cgroup/test_memcontrol.c
> > +++ b/tools/testing/selftests/cgroup/test_memcontrol.c
> > @@ -1241,7 +1241,16 @@ static int test_memcg_oom_group_leaf_events(const char *root)
> >  	if (cg_read_key_long(child, "memory.events", "oom_kill ") <= 0)
> >  		goto cleanup;
> >  
> > -	if (cg_read_key_long(parent, "memory.events", "oom_kill ") <= 0)
> > +	parent_oom_events = cg_read_key_long(
> > +			parent, "memory.events", "oom_kill ");
> > +	/*
> > +	 * If memory_localevents is not enabled (the default), the parent should
> > +	 * count OOM events in its children groups. Otherwise, it should not
> > +	 * have observed any events.
> > +	 */
> > +	if (has_localevents && parent_oom_events != 0)
> > +		goto cleanup;
> > +	else if (!has_localevents && parent_oom_events <= 0)
> >  		goto cleanup;
> >  
> >  	ret = KSFT_PASS;
> > @@ -1349,20 +1358,14 @@ static int test_memcg_oom_group_score_events(const char *root)
> >  	if (!cg_run(memcg, alloc_anon, (void *)MB(100)))
> >  		goto cleanup;
> >  
> > -	parent_oom_events = cg_read_key_long(
> > -			parent, "memory.events", "oom_kill ");
> > -	/*
> > -	 * If memory_localevents is not enabled (the default), the parent should
> > -	 * count OOM events in its children groups. Otherwise, it should not
> > -	 * have observed any events.
> > -	 */
> > -	if ((has_localevents && parent_oom_events == 0) ||
> > -	     parent_oom_events > 0)
> > -		ret = KSFT_PASS;
> > +	if (cg_read_key_long(memcg, "memory.events", "oom_kill ") != 3)
> > +		FAIL(cleanup);
> >  
> >  	if (kill(safe_pid, SIGKILL))
> >  		goto cleanup;
> >  
> > +	ret = KSFT_PASS;
> > +
> >  cleanup:
> >  	if (memcg)
> >  		cg_destroy(memcg);
> > -- 
> > 2.35.3
> > 
> 
> My bad.

Actually not, as pointed by David.
Seems like it's git fault. The original patch looks correct.

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH 4/4] selftests: memcg: Remove protection from top level memcg
@ 2022-05-13 19:14             ` David Vernet
  0 siblings, 0 replies; 70+ messages in thread
From: David Vernet @ 2022-05-13 19:14 UTC (permalink / raw)
  To: Michal Koutný
  Cc: akpm, cgroups, hannes, kernel-team, linux-kernel, linux-mm,
	mhocko, roman.gushchin, shakeelb, tj, Richard Palethorpe

On Fri, May 13, 2022 at 07:18:11PM +0200, Michal Koutný wrote:
> The reclaim is triggered by memory limit in a subtree, therefore the
> testcase does not need configured protection against external reclaim.
> 
> Also, correct/deduplicate respective comments
> 
> Signed-off-by: Michal Koutný <mkoutny@suse.com>
> ---
>  tools/testing/selftests/cgroup/test_memcontrol.c | 12 ++++--------
>  1 file changed, 4 insertions(+), 8 deletions(-)
> 
> diff --git a/tools/testing/selftests/cgroup/test_memcontrol.c b/tools/testing/selftests/cgroup/test_memcontrol.c
> index 9ffacf024bbd..9d370aafd799 100644
> --- a/tools/testing/selftests/cgroup/test_memcontrol.c
> +++ b/tools/testing/selftests/cgroup/test_memcontrol.c
> @@ -247,7 +247,7 @@ static int cg_test_proc_killed(const char *cgroup)
>  
>  /*
>   * First, this test creates the following hierarchy:
> - * A       memory.min = 50M,  memory.max = 200M
> + * A       memory.min = 0,    memory.max = 200M
>   * A/B     memory.min = 50M,  memory.current = 50M
>   * A/B/C   memory.min = 75M,  memory.current = 50M
>   * A/B/D   memory.min = 25M,  memory.current = 50M
> @@ -257,7 +257,7 @@ static int cg_test_proc_killed(const char *cgroup)
>   * Usages are pagecache, but the test keeps a running
>   * process in every leaf cgroup.
>   * Then it creates A/G and creates a significant
> - * memory pressure in it.
> + * memory pressure in A.
>   *
>   * A/B    memory.current ~= 50M
>   * A/B/C  memory.current ~= 29M
> @@ -335,8 +335,6 @@ static int test_memcg_min(const char *root)
>  			      (void *)(long)fd);
>  	}
>  
> -	if (cg_write(parent[0], "memory.min", "50M"))
> -		goto cleanup;
>  	if (cg_write(parent[1], "memory.min", "50M"))
>  		goto cleanup;
>  	if (cg_write(children[0], "memory.min", "75M"))
> @@ -404,8 +402,8 @@ static int test_memcg_min(const char *root)
>  
>  /*
>   * First, this test creates the following hierarchy:
> - * A       memory.low = 50M,  memory.max = 200M
> - * A/B     memory.low = 50M,  memory.current = 50M
> + * A       memory.low = 0,    memory.max = 200M
> + * A/B     memory.low = 50M,  memory.current = ...

Is there a reason that we would adjust this comment but not the A/B comment
from above in from test_memcg_low()? In both cases, I would just remove the
memory.current = ... part altogether, as Roman suggested.

>   * A/B/C   memory.low = 75M,  memory.current = 50M
>   * A/B/D   memory.low = 25M,  memory.current = 50M
>   * A/B/E   memory.low = 0,    memory.current = 50M
> @@ -490,8 +488,6 @@ static int test_memcg_low(const char *root)
>  			goto cleanup;
>  	}
>  
> -	if (cg_write(parent[0], "memory.low", "50M"))
> -		goto cleanup;
>  	if (cg_write(parent[1], "memory.low", "50M"))
>  		goto cleanup;
>  	if (cg_write(children[0], "memory.low", "75M"))
> -- 
> 2.35.3
> 

Looks good otherwise.

Reviewed-by: David Vernet <void@manifault.com>

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH 4/4] selftests: memcg: Remove protection from top level memcg
@ 2022-05-13 19:14             ` David Vernet
  0 siblings, 0 replies; 70+ messages in thread
From: David Vernet @ 2022-05-13 19:14 UTC (permalink / raw)
  To: Michal Koutný
  Cc: akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b,
	cgroups-u79uwXL29TY76Z2rM5mHXA, hannes-druUgvl0LCNAfugRpC6u6w,
	kernel-team-b10kYP2dOMg, linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	linux-mm-Bw31MaZKKs3YtjvyW6yDsg, mhocko-DgEjT+Ai2ygdnm+yROfE0A,
	roman.gushchin-fxUVXftIFDnyG1zEObXtfA,
	shakeelb-hpIqsD4AKlfQT0dZR+AlfA, tj-DgEjT+Ai2ygdnm+yROfE0A,
	Richard Palethorpe

On Fri, May 13, 2022 at 07:18:11PM +0200, Michal Koutný wrote:
> The reclaim is triggered by memory limit in a subtree, therefore the
> testcase does not need configured protection against external reclaim.
> 
> Also, correct/deduplicate respective comments
> 
> Signed-off-by: Michal Koutný <mkoutny-IBi9RG/b67k@public.gmane.org>
> ---
>  tools/testing/selftests/cgroup/test_memcontrol.c | 12 ++++--------
>  1 file changed, 4 insertions(+), 8 deletions(-)
> 
> diff --git a/tools/testing/selftests/cgroup/test_memcontrol.c b/tools/testing/selftests/cgroup/test_memcontrol.c
> index 9ffacf024bbd..9d370aafd799 100644
> --- a/tools/testing/selftests/cgroup/test_memcontrol.c
> +++ b/tools/testing/selftests/cgroup/test_memcontrol.c
> @@ -247,7 +247,7 @@ static int cg_test_proc_killed(const char *cgroup)
>  
>  /*
>   * First, this test creates the following hierarchy:
> - * A       memory.min = 50M,  memory.max = 200M
> + * A       memory.min = 0,    memory.max = 200M
>   * A/B     memory.min = 50M,  memory.current = 50M
>   * A/B/C   memory.min = 75M,  memory.current = 50M
>   * A/B/D   memory.min = 25M,  memory.current = 50M
> @@ -257,7 +257,7 @@ static int cg_test_proc_killed(const char *cgroup)
>   * Usages are pagecache, but the test keeps a running
>   * process in every leaf cgroup.
>   * Then it creates A/G and creates a significant
> - * memory pressure in it.
> + * memory pressure in A.
>   *
>   * A/B    memory.current ~= 50M
>   * A/B/C  memory.current ~= 29M
> @@ -335,8 +335,6 @@ static int test_memcg_min(const char *root)
>  			      (void *)(long)fd);
>  	}
>  
> -	if (cg_write(parent[0], "memory.min", "50M"))
> -		goto cleanup;
>  	if (cg_write(parent[1], "memory.min", "50M"))
>  		goto cleanup;
>  	if (cg_write(children[0], "memory.min", "75M"))
> @@ -404,8 +402,8 @@ static int test_memcg_min(const char *root)
>  
>  /*
>   * First, this test creates the following hierarchy:
> - * A       memory.low = 50M,  memory.max = 200M
> - * A/B     memory.low = 50M,  memory.current = 50M
> + * A       memory.low = 0,    memory.max = 200M
> + * A/B     memory.low = 50M,  memory.current = ...

Is there a reason that we would adjust this comment but not the A/B comment
from above in from test_memcg_low()? In both cases, I would just remove the
memory.current = ... part altogether, as Roman suggested.

>   * A/B/C   memory.low = 75M,  memory.current = 50M
>   * A/B/D   memory.low = 25M,  memory.current = 50M
>   * A/B/E   memory.low = 0,    memory.current = 50M
> @@ -490,8 +488,6 @@ static int test_memcg_low(const char *root)
>  			goto cleanup;
>  	}
>  
> -	if (cg_write(parent[0], "memory.low", "50M"))
> -		goto cleanup;
>  	if (cg_write(parent[1], "memory.low", "50M"))
>  		goto cleanup;
>  	if (cg_write(children[0], "memory.low", "75M"))
> -- 
> 2.35.3
> 

Looks good otherwise.

Reviewed-by: David Vernet <void-gq6j2QGBifHby3iVrkZq2A@public.gmane.org>

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH 4/4] selftests: memcg: Remove protection from top level memcg
  2022-05-13 18:59             ` Roman Gushchin
@ 2022-05-18  0:24               ` Andrew Morton
  -1 siblings, 0 replies; 70+ messages in thread
From: Andrew Morton @ 2022-05-18  0:24 UTC (permalink / raw)
  To: Roman Gushchin
  Cc: Michal Koutný,
	void, cgroups, hannes, kernel-team, linux-kernel, linux-mm,
	mhocko, shakeelb, tj, Richard Palethorpe

On Fri, 13 May 2022 11:59:56 -0700 Roman Gushchin <roman.gushchin@linux.dev> wrote:

> On Fri, May 13, 2022 at 07:18:11PM +0200, Michal Koutny wrote:
> > The reclaim is triggered by memory limit in a subtree, therefore the
> > testcase does not need configured protection against external reclaim.
> > 
> > Also, correct/deduplicate respective comments
> > 
> > Signed-off-by: Michal Koutný <mkoutny@suse.com>
> > ---
> >  tools/testing/selftests/cgroup/test_memcontrol.c | 12 ++++--------
> >  1 file changed, 4 insertions(+), 8 deletions(-)
> > 
> > diff --git a/tools/testing/selftests/cgroup/test_memcontrol.c b/tools/testing/selftests/cgroup/test_memcontrol.c
> > index 9ffacf024bbd..9d370aafd799 100644
> > --- a/tools/testing/selftests/cgroup/test_memcontrol.c
> > +++ b/tools/testing/selftests/cgroup/test_memcontrol.c
> > @@ -247,7 +247,7 @@ static int cg_test_proc_killed(const char *cgroup)
> >  
> >  /*
> >   * First, this test creates the following hierarchy:
> > - * A       memory.min = 50M,  memory.max = 200M
> > + * A       memory.min = 0,    memory.max = 200M
> >   * A/B     memory.min = 50M,  memory.current = 50M
> >   * A/B/C   memory.min = 75M,  memory.current = 50M
> >   * A/B/D   memory.min = 25M,  memory.current = 50M
> > @@ -257,7 +257,7 @@ static int cg_test_proc_killed(const char *cgroup)
> >   * Usages are pagecache, but the test keeps a running
> >   * process in every leaf cgroup.
> >   * Then it creates A/G and creates a significant
> > - * memory pressure in it.
> > + * memory pressure in A.
> >   *
> >   * A/B    memory.current ~= 50M
> >   * A/B/C  memory.current ~= 29M
> > @@ -335,8 +335,6 @@ static int test_memcg_min(const char *root)
> >  			      (void *)(long)fd);
> >  	}
> >  
> > -	if (cg_write(parent[0], "memory.min", "50M"))
> > -		goto cleanup;
> >  	if (cg_write(parent[1], "memory.min", "50M"))
> >  		goto cleanup;
> >  	if (cg_write(children[0], "memory.min", "75M"))
> > @@ -404,8 +402,8 @@ static int test_memcg_min(const char *root)
> >  
> >  /*
> >   * First, this test creates the following hierarchy:
> > - * A       memory.low = 50M,  memory.max = 200M
> > - * A/B     memory.low = 50M,  memory.current = 50M
> > + * A       memory.low = 0,    memory.max = 200M
> > + * A/B     memory.low = 50M,  memory.current = ...
> 
> Can you, please, just remove "memory.current = ...", it's not
> because obvious what "..." means here.
> 

You mean this?

--- a/tools/testing/selftests/cgroup/test_memcontrol.c~selftests-memcg-remove-protection-from-top-level-memcg-fix
+++ a/tools/testing/selftests/cgroup/test_memcontrol.c
@@ -403,15 +403,14 @@ cleanup:
 /*
  * First, this test creates the following hierarchy:
  * A       memory.low = 0,    memory.max = 200M
- * A/B     memory.low = 50M,  memory.current = ...
+ * A/B     memory.low = 50M
  * A/B/C   memory.low = 75M,  memory.current = 50M
  * A/B/D   memory.low = 25M,  memory.current = 50M
  * A/B/E   memory.low = 0,    memory.current = 50M
  * A/B/F   memory.low = 500M, memory.current = 0
  *
  * Usages are pagecache.
- * Then it creates A/G an creates a significant
- * memory pressure in it.
+ * Then it creates A/G and creates significant memory pressure in it.
  *
  * Then it checks actual memory usages and expects that:
  * A/B    memory.current ~= 50M
_

(includes gratuitous comment cleanup)

I assume your comment in
https://lkml.kernel.org/r/Yn6pBPq+lAXm9NG8@carbon can be addressed in a
later patch.

I'm not sure what to amke of https://lkml.kernel.org/r/Yn6pWPodGPlz+D8G@carbon

Do we feel this series needs more work before merging it up?


^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH 4/4] selftests: memcg: Remove protection from top level memcg
@ 2022-05-18  0:24               ` Andrew Morton
  0 siblings, 0 replies; 70+ messages in thread
From: Andrew Morton @ 2022-05-18  0:24 UTC (permalink / raw)
  To: Roman Gushchin
  Cc: Michal Koutný,
	void-gq6j2QGBifHby3iVrkZq2A, cgroups-u79uwXL29TY76Z2rM5mHXA,
	hannes-druUgvl0LCNAfugRpC6u6w, kernel-team-b10kYP2dOMg,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	linux-mm-Bw31MaZKKs3YtjvyW6yDsg, mhocko-DgEjT+Ai2ygdnm+yROfE0A,
	shakeelb-hpIqsD4AKlfQT0dZR+AlfA, tj-DgEjT+Ai2ygdnm+yROfE0A,
	Richard Palethorpe

On Fri, 13 May 2022 11:59:56 -0700 Roman Gushchin <roman.gushchin-fxUVXftIFDnyG1zEObXtfA@public.gmane.org> wrote:

> On Fri, May 13, 2022 at 07:18:11PM +0200, Michal Koutny wrote:
> > The reclaim is triggered by memory limit in a subtree, therefore the
> > testcase does not need configured protection against external reclaim.
> > 
> > Also, correct/deduplicate respective comments
> > 
> > Signed-off-by: Michal Koutný <mkoutny-IBi9RG/b67k@public.gmane.org>
> > ---
> >  tools/testing/selftests/cgroup/test_memcontrol.c | 12 ++++--------
> >  1 file changed, 4 insertions(+), 8 deletions(-)
> > 
> > diff --git a/tools/testing/selftests/cgroup/test_memcontrol.c b/tools/testing/selftests/cgroup/test_memcontrol.c
> > index 9ffacf024bbd..9d370aafd799 100644
> > --- a/tools/testing/selftests/cgroup/test_memcontrol.c
> > +++ b/tools/testing/selftests/cgroup/test_memcontrol.c
> > @@ -247,7 +247,7 @@ static int cg_test_proc_killed(const char *cgroup)
> >  
> >  /*
> >   * First, this test creates the following hierarchy:
> > - * A       memory.min = 50M,  memory.max = 200M
> > + * A       memory.min = 0,    memory.max = 200M
> >   * A/B     memory.min = 50M,  memory.current = 50M
> >   * A/B/C   memory.min = 75M,  memory.current = 50M
> >   * A/B/D   memory.min = 25M,  memory.current = 50M
> > @@ -257,7 +257,7 @@ static int cg_test_proc_killed(const char *cgroup)
> >   * Usages are pagecache, but the test keeps a running
> >   * process in every leaf cgroup.
> >   * Then it creates A/G and creates a significant
> > - * memory pressure in it.
> > + * memory pressure in A.
> >   *
> >   * A/B    memory.current ~= 50M
> >   * A/B/C  memory.current ~= 29M
> > @@ -335,8 +335,6 @@ static int test_memcg_min(const char *root)
> >  			      (void *)(long)fd);
> >  	}
> >  
> > -	if (cg_write(parent[0], "memory.min", "50M"))
> > -		goto cleanup;
> >  	if (cg_write(parent[1], "memory.min", "50M"))
> >  		goto cleanup;
> >  	if (cg_write(children[0], "memory.min", "75M"))
> > @@ -404,8 +402,8 @@ static int test_memcg_min(const char *root)
> >  
> >  /*
> >   * First, this test creates the following hierarchy:
> > - * A       memory.low = 50M,  memory.max = 200M
> > - * A/B     memory.low = 50M,  memory.current = 50M
> > + * A       memory.low = 0,    memory.max = 200M
> > + * A/B     memory.low = 50M,  memory.current = ...
> 
> Can you, please, just remove "memory.current = ...", it's not
> because obvious what "..." means here.
> 

You mean this?

--- a/tools/testing/selftests/cgroup/test_memcontrol.c~selftests-memcg-remove-protection-from-top-level-memcg-fix
+++ a/tools/testing/selftests/cgroup/test_memcontrol.c
@@ -403,15 +403,14 @@ cleanup:
 /*
  * First, this test creates the following hierarchy:
  * A       memory.low = 0,    memory.max = 200M
- * A/B     memory.low = 50M,  memory.current = ...
+ * A/B     memory.low = 50M
  * A/B/C   memory.low = 75M,  memory.current = 50M
  * A/B/D   memory.low = 25M,  memory.current = 50M
  * A/B/E   memory.low = 0,    memory.current = 50M
  * A/B/F   memory.low = 500M, memory.current = 0
  *
  * Usages are pagecache.
- * Then it creates A/G an creates a significant
- * memory pressure in it.
+ * Then it creates A/G and creates significant memory pressure in it.
  *
  * Then it checks actual memory usages and expects that:
  * A/B    memory.current ~= 50M
_

(includes gratuitous comment cleanup)

I assume your comment in
https://lkml.kernel.org/r/Yn6pBPq+lAXm9NG8@carbon can be addressed in a
later patch.

I'm not sure what to amke of https://lkml.kernel.org/r/Yn6pWPodGPlz+D8G@carbon

Do we feel this series needs more work before merging it up?


^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH 4/4] selftests: memcg: Remove protection from top level memcg
@ 2022-05-18  0:52                 ` Roman Gushchin
  0 siblings, 0 replies; 70+ messages in thread
From: Roman Gushchin @ 2022-05-18  0:52 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Michal Koutný,
	void, cgroups, hannes, kernel-team, linux-kernel, linux-mm,
	mhocko, shakeelb, tj, Richard Palethorpe

On Tue, May 17, 2022 at 05:24:43PM -0700, Andrew Morton wrote:
> On Fri, 13 May 2022 11:59:56 -0700 Roman Gushchin <roman.gushchin@linux.dev> wrote:
> 
> > On Fri, May 13, 2022 at 07:18:11PM +0200, Michal Koutny wrote:
> > > The reclaim is triggered by memory limit in a subtree, therefore the
> > > testcase does not need configured protection against external reclaim.
> > > 
> > > Also, correct/deduplicate respective comments
> > > 
> > > Signed-off-by: Michal Koutný <mkoutny@suse.com>
> > > ---
> > >  tools/testing/selftests/cgroup/test_memcontrol.c | 12 ++++--------
> > >  1 file changed, 4 insertions(+), 8 deletions(-)
> > > 
> > > diff --git a/tools/testing/selftests/cgroup/test_memcontrol.c b/tools/testing/selftests/cgroup/test_memcontrol.c
> > > index 9ffacf024bbd..9d370aafd799 100644
> > > --- a/tools/testing/selftests/cgroup/test_memcontrol.c
> > > +++ b/tools/testing/selftests/cgroup/test_memcontrol.c
> > > @@ -247,7 +247,7 @@ static int cg_test_proc_killed(const char *cgroup)
> > >  
> > >  /*
> > >   * First, this test creates the following hierarchy:
> > > - * A       memory.min = 50M,  memory.max = 200M
> > > + * A       memory.min = 0,    memory.max = 200M
> > >   * A/B     memory.min = 50M,  memory.current = 50M
> > >   * A/B/C   memory.min = 75M,  memory.current = 50M
> > >   * A/B/D   memory.min = 25M,  memory.current = 50M
> > > @@ -257,7 +257,7 @@ static int cg_test_proc_killed(const char *cgroup)
> > >   * Usages are pagecache, but the test keeps a running
> > >   * process in every leaf cgroup.
> > >   * Then it creates A/G and creates a significant
> > > - * memory pressure in it.
> > > + * memory pressure in A.
> > >   *
> > >   * A/B    memory.current ~= 50M
> > >   * A/B/C  memory.current ~= 29M
> > > @@ -335,8 +335,6 @@ static int test_memcg_min(const char *root)
> > >  			      (void *)(long)fd);
> > >  	}
> > >  
> > > -	if (cg_write(parent[0], "memory.min", "50M"))
> > > -		goto cleanup;
> > >  	if (cg_write(parent[1], "memory.min", "50M"))
> > >  		goto cleanup;
> > >  	if (cg_write(children[0], "memory.min", "75M"))
> > > @@ -404,8 +402,8 @@ static int test_memcg_min(const char *root)
> > >  
> > >  /*
> > >   * First, this test creates the following hierarchy:
> > > - * A       memory.low = 50M,  memory.max = 200M
> > > - * A/B     memory.low = 50M,  memory.current = 50M
> > > + * A       memory.low = 0,    memory.max = 200M
> > > + * A/B     memory.low = 50M,  memory.current = ...
> > 
> > Can you, please, just remove "memory.current = ...", it's not
> > because obvious what "..." means here.
> > 
> 
> You mean this?
> 
> --- a/tools/testing/selftests/cgroup/test_memcontrol.c~selftests-memcg-remove-protection-from-top-level-memcg-fix
> +++ a/tools/testing/selftests/cgroup/test_memcontrol.c
> @@ -403,15 +403,14 @@ cleanup:
>  /*
>   * First, this test creates the following hierarchy:
>   * A       memory.low = 0,    memory.max = 200M
> - * A/B     memory.low = 50M,  memory.current = ...
> + * A/B     memory.low = 50M
>   * A/B/C   memory.low = 75M,  memory.current = 50M
>   * A/B/D   memory.low = 25M,  memory.current = 50M
>   * A/B/E   memory.low = 0,    memory.current = 50M
>   * A/B/F   memory.low = 500M, memory.current = 0
>   *
>   * Usages are pagecache.
> - * Then it creates A/G an creates a significant
> - * memory pressure in it.
> + * Then it creates A/G and creates significant memory pressure in it.
>   *
>   * Then it checks actual memory usages and expects that:
>   * A/B    memory.current ~= 50M
> _
> 
> (includes gratuitous comment cleanup)

Yes, thank you!

> 
> I assume your comment in
> https://lkml.kernel.org/r/Yn6pBPq+lAXm9NG8@carbon can be addressed in a
> later patch.
> 
> I'm not sure what to amke of https://lkml.kernel.org/r/Yn6pWPodGPlz+D8G@carbon
> 
> Do we feel this series needs more work before merging it up?
> 

Please, go ahead with it. If anything comes up, it can be addressed later.

Thanks!

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH 4/4] selftests: memcg: Remove protection from top level memcg
@ 2022-05-18  0:52                 ` Roman Gushchin
  0 siblings, 0 replies; 70+ messages in thread
From: Roman Gushchin @ 2022-05-18  0:52 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Michal Koutný,
	void-gq6j2QGBifHby3iVrkZq2A, cgroups-u79uwXL29TY76Z2rM5mHXA,
	hannes-druUgvl0LCNAfugRpC6u6w, kernel-team-b10kYP2dOMg,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	linux-mm-Bw31MaZKKs3YtjvyW6yDsg, mhocko-DgEjT+Ai2ygdnm+yROfE0A,
	shakeelb-hpIqsD4AKlfQT0dZR+AlfA, tj-DgEjT+Ai2ygdnm+yROfE0A,
	Richard Palethorpe

On Tue, May 17, 2022 at 05:24:43PM -0700, Andrew Morton wrote:
> On Fri, 13 May 2022 11:59:56 -0700 Roman Gushchin <roman.gushchin-fxUVXftIFDnyG1zEObXtfA@public.gmane.org> wrote:
> 
> > On Fri, May 13, 2022 at 07:18:11PM +0200, Michal Koutny wrote:
> > > The reclaim is triggered by memory limit in a subtree, therefore the
> > > testcase does not need configured protection against external reclaim.
> > > 
> > > Also, correct/deduplicate respective comments
> > > 
> > > Signed-off-by: Michal Koutný <mkoutny-IBi9RG/b67k@public.gmane.org>
> > > ---
> > >  tools/testing/selftests/cgroup/test_memcontrol.c | 12 ++++--------
> > >  1 file changed, 4 insertions(+), 8 deletions(-)
> > > 
> > > diff --git a/tools/testing/selftests/cgroup/test_memcontrol.c b/tools/testing/selftests/cgroup/test_memcontrol.c
> > > index 9ffacf024bbd..9d370aafd799 100644
> > > --- a/tools/testing/selftests/cgroup/test_memcontrol.c
> > > +++ b/tools/testing/selftests/cgroup/test_memcontrol.c
> > > @@ -247,7 +247,7 @@ static int cg_test_proc_killed(const char *cgroup)
> > >  
> > >  /*
> > >   * First, this test creates the following hierarchy:
> > > - * A       memory.min = 50M,  memory.max = 200M
> > > + * A       memory.min = 0,    memory.max = 200M
> > >   * A/B     memory.min = 50M,  memory.current = 50M
> > >   * A/B/C   memory.min = 75M,  memory.current = 50M
> > >   * A/B/D   memory.min = 25M,  memory.current = 50M
> > > @@ -257,7 +257,7 @@ static int cg_test_proc_killed(const char *cgroup)
> > >   * Usages are pagecache, but the test keeps a running
> > >   * process in every leaf cgroup.
> > >   * Then it creates A/G and creates a significant
> > > - * memory pressure in it.
> > > + * memory pressure in A.
> > >   *
> > >   * A/B    memory.current ~= 50M
> > >   * A/B/C  memory.current ~= 29M
> > > @@ -335,8 +335,6 @@ static int test_memcg_min(const char *root)
> > >  			      (void *)(long)fd);
> > >  	}
> > >  
> > > -	if (cg_write(parent[0], "memory.min", "50M"))
> > > -		goto cleanup;
> > >  	if (cg_write(parent[1], "memory.min", "50M"))
> > >  		goto cleanup;
> > >  	if (cg_write(children[0], "memory.min", "75M"))
> > > @@ -404,8 +402,8 @@ static int test_memcg_min(const char *root)
> > >  
> > >  /*
> > >   * First, this test creates the following hierarchy:
> > > - * A       memory.low = 50M,  memory.max = 200M
> > > - * A/B     memory.low = 50M,  memory.current = 50M
> > > + * A       memory.low = 0,    memory.max = 200M
> > > + * A/B     memory.low = 50M,  memory.current = ...
> > 
> > Can you, please, just remove "memory.current = ...", it's not
> > because obvious what "..." means here.
> > 
> 
> You mean this?
> 
> --- a/tools/testing/selftests/cgroup/test_memcontrol.c~selftests-memcg-remove-protection-from-top-level-memcg-fix
> +++ a/tools/testing/selftests/cgroup/test_memcontrol.c
> @@ -403,15 +403,14 @@ cleanup:
>  /*
>   * First, this test creates the following hierarchy:
>   * A       memory.low = 0,    memory.max = 200M
> - * A/B     memory.low = 50M,  memory.current = ...
> + * A/B     memory.low = 50M
>   * A/B/C   memory.low = 75M,  memory.current = 50M
>   * A/B/D   memory.low = 25M,  memory.current = 50M
>   * A/B/E   memory.low = 0,    memory.current = 50M
>   * A/B/F   memory.low = 500M, memory.current = 0
>   *
>   * Usages are pagecache.
> - * Then it creates A/G an creates a significant
> - * memory pressure in it.
> + * Then it creates A/G and creates significant memory pressure in it.
>   *
>   * Then it checks actual memory usages and expects that:
>   * A/B    memory.current ~= 50M
> _
> 
> (includes gratuitous comment cleanup)

Yes, thank you!

> 
> I assume your comment in
> https://lkml.kernel.org/r/Yn6pBPq+lAXm9NG8@carbon can be addressed in a
> later patch.
> 
> I'm not sure what to amke of https://lkml.kernel.org/r/Yn6pWPodGPlz+D8G@carbon
> 
> Do we feel this series needs more work before merging it up?
> 

Please, go ahead with it. If anything comes up, it can be addressed later.

Thanks!

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH 4/4] selftests: memcg: Remove protection from top level memcg
  2022-05-18  0:52                 ` Roman Gushchin
@ 2022-05-18 15:44                   ` Michal Koutný
  -1 siblings, 0 replies; 70+ messages in thread
From: Michal Koutný @ 2022-05-18 15:44 UTC (permalink / raw)
  To: Roman Gushchin
  Cc: Andrew Morton, void, cgroups, hannes, kernel-team, linux-kernel,
	linux-mm, mhocko, shakeelb, tj, Richard Palethorpe

On Tue, May 17, 2022 at 05:52:25PM -0700, Roman Gushchin <roman.gushchin@linux.dev> wrote:
> Please, go ahead with it. If anything comes up, it can be addressed later.

I hope I can still post v2 of the tests cleanups (applying feedback from
here), it's in [1] with more info there. (I sent it to a new thread
based on get_maintainers.pl).

Thanks,
Michal

[1] https://lore.kernel.org/r/20220518154037.18819-1-mkoutny@suse.com

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH 4/4] selftests: memcg: Remove protection from top level memcg
@ 2022-05-18 15:44                   ` Michal Koutný
  0 siblings, 0 replies; 70+ messages in thread
From: Michal Koutný @ 2022-05-18 15:44 UTC (permalink / raw)
  To: Roman Gushchin
  Cc: Andrew Morton, void-gq6j2QGBifHby3iVrkZq2A,
	cgroups-u79uwXL29TY76Z2rM5mHXA, hannes-druUgvl0LCNAfugRpC6u6w,
	kernel-team-b10kYP2dOMg, linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	linux-mm-Bw31MaZKKs3YtjvyW6yDsg, mhocko-DgEjT+Ai2ygdnm+yROfE0A,
	shakeelb-hpIqsD4AKlfQT0dZR+AlfA, tj-DgEjT+Ai2ygdnm+yROfE0A,
	Richard Palethorpe

On Tue, May 17, 2022 at 05:52:25PM -0700, Roman Gushchin <roman.gushchin-fxUVXftIFDnyG1zEObXtfA@public.gmane.org> wrote:
> Please, go ahead with it. If anything comes up, it can be addressed later.

I hope I can still post v2 of the tests cleanups (applying feedback from
here), it's in [1] with more info there. (I sent it to a new thread
based on get_maintainers.pl).

Thanks,
Michal

[1] https://lore.kernel.org/r/20220518154037.18819-1-mkoutny-IBi9RG/b67k@public.gmane.org

^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH 2/4] selftests: memcg: Expect no low events in unprotected sibling
  2022-05-13 18:54           ` Roman Gushchin
@ 2022-05-18 15:54               ` Michal Koutný
  0 siblings, 0 replies; 70+ messages in thread
From: Michal Koutný @ 2022-05-18 15:54 UTC (permalink / raw)
  To: Roman Gushchin
  Cc: void, akpm, cgroups, hannes, kernel-team, linux-kernel, linux-mm,
	mhocko, shakeelb, tj, Richard Palethorpe

Hi.

On Fri, May 13, 2022 at 11:54:16AM -0700, Roman Gushchin <roman.gushchin@linux.dev> wrote:
> Hm, what are our plans here? Are we going to fix it soon-ish, or there
> is still no agreement on how to proceed?

Here are some of my ideas in random order so far and comments:

0) mask memory.events:low
-> not a real fix

1) don't do SWAP_CLUSTER_MAX roundup
-> won't solve sudden lift of protection

2) instead of SWAP_CLUSTER_MAX over-reclaim, do same under-reclaim
-> same problem as 1)

3) update children_low_usage transactionally (after reclaim round is done)
- ???

4) don't recursively distribute residual protection in the same reclaim round
- ???

5) iterate siblings from highest to lowest protection 
- not a solution

6) assign only >SWAP_CLUSTER_MAX of residuum
- need more info

I'm discouraged by possible complexity of 3) or 4), while 6) is what I'd
like to look more into.

HTH,
Michal


^ permalink raw reply	[flat|nested] 70+ messages in thread

* Re: [PATCH 2/4] selftests: memcg: Expect no low events in unprotected sibling
@ 2022-05-18 15:54               ` Michal Koutný
  0 siblings, 0 replies; 70+ messages in thread
From: Michal Koutný @ 2022-05-18 15:54 UTC (permalink / raw)
  To: Roman Gushchin
  Cc: void-gq6j2QGBifHby3iVrkZq2A,
	akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b,
	cgroups-u79uwXL29TY76Z2rM5mHXA, hannes-druUgvl0LCNAfugRpC6u6w,
	kernel-team-b10kYP2dOMg, linux-kernel-u79uwXL29TY76Z2rM5mHXA,
	linux-mm-Bw31MaZKKs3YtjvyW6yDsg, mhocko-DgEjT+Ai2ygdnm+yROfE0A,
	shakeelb-hpIqsD4AKlfQT0dZR+AlfA, tj-DgEjT+Ai2ygdnm+yROfE0A,
	Richard Palethorpe

Hi.

On Fri, May 13, 2022 at 11:54:16AM -0700, Roman Gushchin <roman.gushchin-fxUVXftIFDnyG1zEObXtfA@public.gmane.org> wrote:
> Hm, what are our plans here? Are we going to fix it soon-ish, or there
> is still no agreement on how to proceed?

Here are some of my ideas in random order so far and comments:

0) mask memory.events:low
-> not a real fix

1) don't do SWAP_CLUSTER_MAX roundup
-> won't solve sudden lift of protection

2) instead of SWAP_CLUSTER_MAX over-reclaim, do same under-reclaim
-> same problem as 1)

3) update children_low_usage transactionally (after reclaim round is done)
- ???

4) don't recursively distribute residual protection in the same reclaim round
- ???

5) iterate siblings from highest to lowest protection 
- not a solution

6) assign only >SWAP_CLUSTER_MAX of residuum
- need more info

I'm discouraged by possible complexity of 3) or 4), while 6) is what I'd
like to look more into.

HTH,
Michal


^ permalink raw reply	[flat|nested] 70+ messages in thread

end of thread, other threads:[~2022-05-18 15:55 UTC | newest]

Thread overview: 70+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-04-23 15:56 [PATCH v2 0/5] Fix bugs in memcontroller cgroup tests David Vernet
2022-04-23 15:56 ` David Vernet
2022-04-23 15:56 ` [PATCH v2 1/5] cgroups: Refactor children cgroups in memcg tests David Vernet
2022-04-26  1:56   ` Roman Gushchin
2022-04-26  1:56     ` Roman Gushchin
2022-04-23 15:56 ` [PATCH v2 2/5] cgroup: Account for memory_recursiveprot in test_memcg_low() David Vernet
2022-04-23 15:56   ` David Vernet
2022-04-27 14:09   ` Michal Koutný
2022-04-27 14:09     ` Michal Koutný
2022-04-29  1:03     ` David Vernet
2022-04-29  1:03       ` David Vernet
2022-04-29  9:26       ` Michal Koutný
2022-04-29  9:26         ` Michal Koutný
2022-05-06 16:40         ` David Vernet
2022-05-06 16:40           ` David Vernet
2022-05-09 15:09           ` Johannes Weiner
2022-05-09 15:09             ` Johannes Weiner
2022-05-10  0:44             ` Andrew Morton
2022-05-10  0:44               ` Andrew Morton
2022-05-10 17:43               ` Michal Koutný
2022-05-10 17:43                 ` Michal Koutný
2022-05-11 17:53                 ` Johannes Weiner
2022-05-11 17:53                   ` Johannes Weiner
2022-05-12 17:27                   ` Michal Koutný
2022-05-12 17:27                     ` Michal Koutný
2022-04-23 15:56 ` [PATCH v2 3/5] cgroup: Account for memory_localevents in test_memcg_oom_group_leaf_events() David Vernet
2022-04-23 15:56   ` David Vernet
2022-04-23 15:56 ` [PATCH v2 4/5] cgroup: Removing racy check in test_memcg_sock() David Vernet
2022-04-23 15:56   ` David Vernet
2022-04-23 15:56 ` [PATCH v2 5/5] cgroup: Fix racy check in alloc_pagecache_max_30M() helper function David Vernet
2022-04-23 15:56   ` David Vernet
2022-05-12 17:04 ` [PATCH v2 0/5] Fix bugs in memcontroller cgroup tests Michal Koutný
2022-05-12 17:04   ` Michal Koutný
2022-05-12 17:30   ` David Vernet
2022-05-12 17:30     ` David Vernet
2022-05-12 17:44     ` David Vernet
2022-05-12 17:44       ` David Vernet
2022-05-13 17:18       ` [PATCH 0/4] memcontrol selftests fixups Michal Koutný
2022-05-13 17:18         ` Michal Koutný
2022-05-13 17:18         ` [PATCH 1/4] selftests: memcg: Fix compilation Michal Koutný
2022-05-13 17:18           ` Michal Koutný
2022-05-13 17:40           ` David Vernet
2022-05-13 17:40             ` David Vernet
2022-05-13 18:53           ` Roman Gushchin
2022-05-13 18:53             ` Roman Gushchin
2022-05-13 19:09             ` Roman Gushchin
2022-05-13 19:09               ` Roman Gushchin
2022-05-13 17:18         ` [PATCH 2/4] selftests: memcg: Expect no low events in unprotected sibling Michal Koutný
2022-05-13 17:18           ` Michal Koutný
2022-05-13 17:42           ` David Vernet
2022-05-13 17:42             ` David Vernet
2022-05-13 18:54           ` Roman Gushchin
2022-05-18 15:54             ` Michal Koutný
2022-05-18 15:54               ` Michal Koutný
2022-05-13 17:18         ` [PATCH 3/4] selftests: memcg: Adjust expected reclaim values of protected cgroups Michal Koutný
2022-05-13 17:18           ` Michal Koutný
2022-05-13 18:52           ` Roman Gushchin
2022-05-13 18:52             ` Roman Gushchin
2022-05-13 17:18         ` [PATCH 4/4] selftests: memcg: Remove protection from top level memcg Michal Koutný
2022-05-13 17:18           ` Michal Koutný
2022-05-13 18:59           ` Roman Gushchin
2022-05-13 18:59             ` Roman Gushchin
2022-05-18  0:24             ` Andrew Morton
2022-05-18  0:24               ` Andrew Morton
2022-05-18  0:52               ` Roman Gushchin
2022-05-18  0:52                 ` Roman Gushchin
2022-05-18 15:44                 ` Michal Koutný
2022-05-18 15:44                   ` Michal Koutný
2022-05-13 19:14           ` David Vernet
2022-05-13 19:14             ` David Vernet

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.