* [LTP] [RESEND PATCH 0/4] controllers/memcg: fixes for CPU nodes on newer kernels
@ 2021-08-11 10:10 Krzysztof Kozlowski
2021-08-11 10:10 ` [LTP] [RESEND PATCH 1/4] controllers/memcg: account per-node kernel memory Krzysztof Kozlowski
` (3 more replies)
0 siblings, 4 replies; 9+ messages in thread
From: Krzysztof Kozlowski @ 2021-08-11 10:10 UTC (permalink / raw)
To: ltp
Hi,
This is continuation of my work fixing controllers/memcg.
Rebased on top of latest master - no other dependencies.
The problem is that since v5.9 kernel, not only per-CPU but also
per-node kernel memory is accounted to group. Creating a subgroup on
v5.11 kernel on a two-node machine eats ~440 kB by itself in kernel
memory.
The patchset tries to take it into account and adjust the memory limits.
Have in mind however that kernel might change such accounting in the
future.
Best regards,
Krzysztof
Krzysztof Kozlowski (4):
controllers/memcg: account per-node kernel memory
controllers/memcg: fix memcg_stat_test on two node machines
controllers/memcg: fail early to avoid possible false-positives
controllers/memcg: check status of commands using interface
.../memcg/functional/memcg_failcnt.sh | 4 +-
.../controllers/memcg/functional/memcg_lib.sh | 46 ++++++++++++++++++-
.../memcg_max_usage_in_bytes_test.sh | 10 ++--
.../memcg_memsw_limit_in_bytes_test.sh | 8 ++--
.../memcg_move_charge_at_immigrate_test.sh | 4 +-
.../memcg/functional/memcg_stat_test.sh | 39 ++++++++--------
.../memcg/functional/memcg_subgroup_charge.sh | 14 ++----
.../functional/memcg_usage_in_bytes_test.sh | 4 +-
.../functional/memcg_use_hierarchy_test.sh | 16 ++++---
9 files changed, 96 insertions(+), 49 deletions(-)
--
2.30.2
^ permalink raw reply [flat|nested] 9+ messages in thread
* [LTP] [RESEND PATCH 1/4] controllers/memcg: account per-node kernel memory
2021-08-11 10:10 [LTP] [RESEND PATCH 0/4] controllers/memcg: fixes for CPU nodes on newer kernels Krzysztof Kozlowski
@ 2021-08-11 10:10 ` Krzysztof Kozlowski
2021-08-11 14:42 ` Richard Palethorpe
2021-08-11 10:10 ` [LTP] [RESEND PATCH 2/4] controllers/memcg: fix memcg_stat_test on two node machines Krzysztof Kozlowski
` (2 subsequent siblings)
3 siblings, 1 reply; 9+ messages in thread
From: Krzysztof Kozlowski @ 2021-08-11 10:10 UTC (permalink / raw)
To: ltp
Recent Linux kernels () charge groups also with kernel memory. This is
not limited only to process-allocated memory but also cgroup-handling
code memory as well.
For example since kernel v5.9 with commit 3e38e0aaca9e ("mm: memcg:
charge memcg percpu memory to the parent cgroup") creating a subgroup
causes several kernel allocations towards this group.
These additional kernel memory allocations are proportional to number of
CPUs and number of nodes.
On c4.8xlarge AWS instance with 36 cores in two nodes with v5.11 Linux
kernel the memcg_subgroup_charge and memcg_use_hierarchy_test tests were
failing:
memcg_use_hierarchy_test 1 TINFO: timeout per run is 0h 5m 0s
memcg_use_hierarchy_test 1 TINFO: set /dev/memcg/memory.use_hierarchy to 0 failed
memcg_use_hierarchy_test 1 TINFO: test if one of the ancestors goes over its limit, the proces will be killed
mkdir: cannot create directory ?subgroup?: Cannot allocate memory
/home/ubuntu/ltp-install/testcases/bin/memcg_use_hierarchy_test.sh: 26: cd: can't cd to subgroup
memcg_use_hierarchy_test 1 TINFO: Running memcg_process --mmap-lock1 -s 8192
memcg_use_hierarchy_test 1 TFAIL: process is not killed
rmdir: failed to remove 'subgroup': No such file or directory
The kernel was unable to create the subgroup (mkdir returned -ENOMEM)
due to this additional per-node kernel memory allocations.
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com>
---
.../controllers/memcg/functional/memcg_lib.sh | 44 +++++++++++++++++++
.../memcg/functional/memcg_subgroup_charge.sh | 8 +---
.../functional/memcg_use_hierarchy_test.sh | 8 +++-
3 files changed, 52 insertions(+), 8 deletions(-)
diff --git a/testcases/kernel/controllers/memcg/functional/memcg_lib.sh b/testcases/kernel/controllers/memcg/functional/memcg_lib.sh
index dad66c798e19..700e9e367bff 100755
--- a/testcases/kernel/controllers/memcg/functional/memcg_lib.sh
+++ b/testcases/kernel/controllers/memcg/functional/memcg_lib.sh
@@ -63,6 +63,50 @@ memcg_require_hierarchy_disabled()
fi
}
+# Kernel memory allocated for the process is also charged. It might depend on
+# the number of CPUs and number of nodes. For example on kernel v5.11
+# additionally total_cpus (plus 1 or 2) pages are charged to the group via
+# kernel memory. For a two-node machine, additional 108 pages kernel memory
+# are charged to the group.
+#
+# Adjust the limit to account such per-CPU and per-node kernel memory.
+# $1 - variable name with limit to adjust
+memcg_adjust_limit_for_kmem()
+{
+ [ $# -ne 1 ] && tst_brk TBROK "memcg_adjust_limit_for_kmem expects 1 parameter"
+ eval "local _limit=\$$1"
+
+ # Total number of CPUs
+ local total_cpus=`tst_ncpus`
+
+ # Get the number of NODES
+ if [ -f "/sys/devices/system/node/has_high_memory" ]; then
+ local mem_string="`cat /sys/devices/system/node/has_high_memory`"
+ else
+ local mem_string="`cat /sys/devices/system/node/has_normal_memory`"
+ fi
+
+ local total_nodes="`echo $mem_string | tr ',' ' '`"
+ local count=0
+ for item in $total_nodes; do
+ local delta=1
+ if [ "${item#*-*}" != "$item" ]; then
+ delta=$((${item#*-*} - ${item%*-*} + 1))
+ fi
+ count=$((count + $delta))
+ done
+ total_nodes=$count
+ # Additional nodes impose charging the kmem, not having regular one node
+ local node_mem=0
+ if [ $total_nodes -gt 1 ]; then
+ node_mem=$((total_nodes - 1))
+ node_mem=$((node_mem * PAGESIZE * 128))
+ fi
+
+ eval "$1='$((_limit + 4 * PAGESIZE + total_cpus * PAGESIZE + node_mem))'"
+ return 0
+}
+
memcg_setup()
{
if ! is_cgroup_subsystem_available_and_enabled "memory"; then
diff --git a/testcases/kernel/controllers/memcg/functional/memcg_subgroup_charge.sh b/testcases/kernel/controllers/memcg/functional/memcg_subgroup_charge.sh
index 0d2b235aff7c..7650128e3605 100755
--- a/testcases/kernel/controllers/memcg/functional/memcg_subgroup_charge.sh
+++ b/testcases/kernel/controllers/memcg/functional/memcg_subgroup_charge.sh
@@ -24,16 +24,12 @@ test_subgroup()
{
local limit_parent=$1
local limit_subgroup=$2
- local total_cpus=`tst_ncpus`
- # Kernel memory allocated for the process is also charged.
- # It might depend on the number of CPUs. For example on kernel v5.11
- # additionally total_cpus plus 1-2 pages are charged to the group.
if [ $limit_parent -ne 0 ]; then
- limit_parent=$((limit_parent + 4 * PAGESIZE + total_cpus * PAGESIZE))
+ memcg_adjust_limit_for_kmem limit_parent
fi
if [ $limit_subgroup -ne 0 ]; then
- limit_subgroup=$((limit_subgroup + 4 * PAGESIZE + total_cpus * PAGESIZE))
+ memcg_adjust_limit_for_kmem limit_subgroup
fi
mkdir subgroup
diff --git a/testcases/kernel/controllers/memcg/functional/memcg_use_hierarchy_test.sh b/testcases/kernel/controllers/memcg/functional/memcg_use_hierarchy_test.sh
index 8be342499ece..b645f9b44a86 100755
--- a/testcases/kernel/controllers/memcg/functional/memcg_use_hierarchy_test.sh
+++ b/testcases/kernel/controllers/memcg/functional/memcg_use_hierarchy_test.sh
@@ -14,13 +14,17 @@ TST_CNT=3
test1()
{
tst_res TINFO "test if one of the ancestors goes over its limit, the proces will be killed"
+ local total_cpus=`tst_ncpus`
+
+ local limit=$PAGESIZE
+ memcg_adjust_limit_for_kmem limit
echo 1 > memory.use_hierarchy
- echo $PAGESIZE > memory.limit_in_bytes
+ echo $limit > memory.limit_in_bytes
mkdir subgroup
cd subgroup
- test_proc_kill $((PAGESIZE * 3)) "--mmap-lock1" $((PAGESIZE * 2)) 0
+ test_proc_kill $((limit + PAGESIZE * 3)) "--mmap-lock1" $((limit + PAGESIZE * 2)) 0
cd ..
rmdir subgroup
--
2.30.2
^ permalink raw reply related [flat|nested] 9+ messages in thread
* [LTP] [RESEND PATCH 2/4] controllers/memcg: fix memcg_stat_test on two node machines
2021-08-11 10:10 [LTP] [RESEND PATCH 0/4] controllers/memcg: fixes for CPU nodes on newer kernels Krzysztof Kozlowski
2021-08-11 10:10 ` [LTP] [RESEND PATCH 1/4] controllers/memcg: account per-node kernel memory Krzysztof Kozlowski
@ 2021-08-11 10:10 ` Krzysztof Kozlowski
2021-08-11 10:10 ` [LTP] [RESEND PATCH 3/4] controllers/memcg: fail early to avoid possible false-positives Krzysztof Kozlowski
2021-08-11 10:10 ` [LTP] [RESEND PATCH 4/4] controllers/memcg: check status of commands using interface Krzysztof Kozlowski
3 siblings, 0 replies; 9+ messages in thread
From: Krzysztof Kozlowski @ 2021-08-11 10:10 UTC (permalink / raw)
To: ltp
Recent Linux kernels charge groups also with kernel memory. This is not
limited only to process-allocated memory but also cgroup-handling code
memory as well.
For example since kernel v5.9 with commit 3e38e0aaca9e ("mm: memcg:
charge memcg percpu memory to the parent cgroup") creating a subgroup
causes several kernel allocations towards this group.
These additional kernel memory allocations are proportional to number of
CPUs and number of nodes.
On c4.8xlarge AWS instance with 36 cores in two nodes with v5.11 Linux
kernel the memcg_stat_test was failing because of unable to set memory
limit lower than current kernel memory accounted to the group:
memcg_stat_test 5 TINFO: Test hierarchical_memory_limit with enabling hierarchical accounting
shell code from test:
> mkdir subgroup
> cat memory.kmem.usage_in_bytes
> 442368
> echo $PAGESIZE > memory.limit_in_bytes
/home/ubuntu/ltp-install/testcases/bin/memcg_stat_test.sh: 47: echo: echo: I/O error
memcg_stat_test 5 TFAIL: hierarchical_memory_limit is 270336, 135168 expected
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com>
---
.../controllers/memcg/functional/memcg_stat_test.sh | 9 ++++++---
1 file changed, 6 insertions(+), 3 deletions(-)
diff --git a/testcases/kernel/controllers/memcg/functional/memcg_stat_test.sh b/testcases/kernel/controllers/memcg/functional/memcg_stat_test.sh
index e5eb7e5d0001..f6b5aa84e31e 100755
--- a/testcases/kernel/controllers/memcg/functional/memcg_stat_test.sh
+++ b/testcases/kernel/controllers/memcg/functional/memcg_stat_test.sh
@@ -43,12 +43,15 @@ test5()
tst_res TINFO "Test hierarchical_memory_limit with enabling hierarchical accounting"
echo 1 > memory.use_hierarchy
+ local limit=$PAGESIZES
+ memcg_adjust_limit_for_kmem limit
+
mkdir subgroup
- echo $PAGESIZES > memory.limit_in_bytes
- echo $((PAGESIZES * 2)) > subgroup/memory.limit_in_bytes
+ echo $limit > memory.limit_in_bytes
+ echo $((limit + PAGESIZES * 2)) > subgroup/memory.limit_in_bytes
cd subgroup
- check_mem_stat "hierarchical_memory_limit" $PAGESIZES
+ check_mem_stat "hierarchical_memory_limit" $limit
cd ..
rmdir subgroup
--
2.30.2
^ permalink raw reply related [flat|nested] 9+ messages in thread
* [LTP] [RESEND PATCH 3/4] controllers/memcg: fail early to avoid possible false-positives
2021-08-11 10:10 [LTP] [RESEND PATCH 0/4] controllers/memcg: fixes for CPU nodes on newer kernels Krzysztof Kozlowski
2021-08-11 10:10 ` [LTP] [RESEND PATCH 1/4] controllers/memcg: account per-node kernel memory Krzysztof Kozlowski
2021-08-11 10:10 ` [LTP] [RESEND PATCH 2/4] controllers/memcg: fix memcg_stat_test on two node machines Krzysztof Kozlowski
@ 2021-08-11 10:10 ` Krzysztof Kozlowski
2021-08-11 10:10 ` [LTP] [RESEND PATCH 4/4] controllers/memcg: check status of commands using interface Krzysztof Kozlowski
3 siblings, 0 replies; 9+ messages in thread
From: Krzysztof Kozlowski @ 2021-08-11 10:10 UTC (permalink / raw)
To: ltp
Tests require certain prerequisites. If these fail, there is no point to
continue with the test. Trying to continue might lead to false
positives, as seen on memcg_subgroup_charge which failed to set proper
limit and did not catch actual failure:
memcg_subgroup_charge 1 TINFO: set /dev/memcg/memory.use_hierarchy to 0 failed
memcg_subgroup_charge 1 TINFO: Test that group and subgroup have no relationship
/home/ubuntu/ltp-install/testcases/bin/memcg_subgroup_charge.sh: 36: echo: echo: I/O error
memcg_subgroup_charge 1 TINFO: Running memcg_process --mmap-anon -s 270336
memcg_subgroup_charge 1 TINFO: Warming up pid: 13496
memcg_subgroup_charge 1 TINFO: Process is still here after warm up: 13496
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com>
---
.../controllers/memcg/functional/memcg_failcnt.sh | 4 ++--
.../memcg/functional/memcg_max_usage_in_bytes_test.sh | 2 +-
.../functional/memcg_memsw_limit_in_bytes_test.sh | 8 ++++----
.../functional/memcg_move_charge_at_immigrate_test.sh | 4 ++--
.../controllers/memcg/functional/memcg_stat_test.sh | 10 +++++-----
.../memcg/functional/memcg_subgroup_charge.sh | 2 +-
.../memcg/functional/memcg_use_hierarchy_test.sh | 10 +++++-----
7 files changed, 20 insertions(+), 20 deletions(-)
diff --git a/testcases/kernel/controllers/memcg/functional/memcg_failcnt.sh b/testcases/kernel/controllers/memcg/functional/memcg_failcnt.sh
index ce0885b73fe7..65ad82e0a2a0 100755
--- a/testcases/kernel/controllers/memcg/functional/memcg_failcnt.sh
+++ b/testcases/kernel/controllers/memcg/functional/memcg_failcnt.sh
@@ -18,10 +18,10 @@ MEMORY_TO_ALLOCATE=$((MEMORY_LIMIT * 2))
test()
{
- echo $MEMORY_LIMIT > memory.limit_in_bytes
+ ROD echo $MEMORY_LIMIT \> memory.limit_in_bytes
start_memcg_process $2 -s ${MEMORY_TO_ALLOCATE}
- echo $MEMCG_PROCESS_PID > tasks
+ ROD echo $MEMCG_PROCESS_PID \> tasks
signal_memcg_process ${MEMORY_TO_ALLOCATE}
signal_memcg_process ${MEMORY_TO_ALLOCATE}
diff --git a/testcases/kernel/controllers/memcg/functional/memcg_max_usage_in_bytes_test.sh b/testcases/kernel/controllers/memcg/functional/memcg_max_usage_in_bytes_test.sh
index 24e9d115c19e..a0c4dd17f0f3 100755
--- a/testcases/kernel/controllers/memcg/functional/memcg_max_usage_in_bytes_test.sh
+++ b/testcases/kernel/controllers/memcg/functional/memcg_max_usage_in_bytes_test.sh
@@ -33,7 +33,7 @@ test_max_usage_in_bytes()
return
fi
- echo $MEMCG_PROCESS_PID > tasks
+ ROD echo $MEMCG_PROCESS_PID \> tasks
signal_memcg_process $MEM_TO_ALLOC
signal_memcg_process $MEM_TO_ALLOC
diff --git a/testcases/kernel/controllers/memcg/functional/memcg_memsw_limit_in_bytes_test.sh b/testcases/kernel/controllers/memcg/functional/memcg_memsw_limit_in_bytes_test.sh
index e9950a0df322..ab26cb3d1768 100755
--- a/testcases/kernel/controllers/memcg/functional/memcg_memsw_limit_in_bytes_test.sh
+++ b/testcases/kernel/controllers/memcg/functional/memcg_memsw_limit_in_bytes_test.sh
@@ -55,7 +55,7 @@ test9()
{
memcg_require_memsw
- echo 10M > memory.limit_in_bytes
+ ROD echo 10M \> memory.limit_in_bytes
if tst_kvcmp -lt "2.6.31"; then
EXPECT_FAIL echo -1 \> memory.memsw.limit_in_bytes
@@ -68,7 +68,7 @@ test10()
{
memcg_require_memsw
- echo 10M > memory.limit_in_bytes
+ ROD echo 10M \> memory.limit_in_bytes
EXPECT_FAIL echo 1.0 \> memory.memsw.limit_in_bytes
}
@@ -76,7 +76,7 @@ test11()
{
memcg_require_memsw
- echo 10M > memory.limit_in_bytes
+ ROD echo 10M \> memory.limit_in_bytes
EXPECT_FAIL echo 1xx \> memory.memsw.limit_in_bytes
}
@@ -84,7 +84,7 @@ test12()
{
memcg_require_memsw
- echo 10M > memory.limit_in_bytes
+ ROD echo 10M \> memory.limit_in_bytes
EXPECT_FAIL echo xx \> memory.memsw.limit_in_bytes
}
diff --git a/testcases/kernel/controllers/memcg/functional/memcg_move_charge_at_immigrate_test.sh b/testcases/kernel/controllers/memcg/functional/memcg_move_charge_at_immigrate_test.sh
index 272d7779770d..3c1b3394bd71 100755
--- a/testcases/kernel/controllers/memcg/functional/memcg_move_charge_at_immigrate_test.sh
+++ b/testcases/kernel/controllers/memcg/functional/memcg_move_charge_at_immigrate_test.sh
@@ -34,10 +34,10 @@ test_move_charge()
return
fi
- echo $MEMCG_PROCESS_PID > subgroup_a/tasks
+ ROD echo $MEMCG_PROCESS_PID \> subgroup_a/tasks
signal_memcg_process $total_size "subgroup_a/"
- mkdir subgroup_b
+ ROD mkdir subgroup_b
echo $move_charge_mask > subgroup_b/memory.move_charge_at_immigrate
echo $MEMCG_PROCESS_PID > subgroup_b/tasks
diff --git a/testcases/kernel/controllers/memcg/functional/memcg_stat_test.sh b/testcases/kernel/controllers/memcg/functional/memcg_stat_test.sh
index f6b5aa84e31e..b86331f0fc26 100755
--- a/testcases/kernel/controllers/memcg/functional/memcg_stat_test.sh
+++ b/testcases/kernel/controllers/memcg/functional/memcg_stat_test.sh
@@ -46,7 +46,7 @@ test5()
local limit=$PAGESIZES
memcg_adjust_limit_for_kmem limit
- mkdir subgroup
+ ROD mkdir subgroup
echo $limit > memory.limit_in_bytes
echo $((limit + PAGESIZES * 2)) > subgroup/memory.limit_in_bytes
@@ -62,9 +62,9 @@ test6()
tst_res TINFO "Test hierarchical_memory_limit with disabling hierarchical accounting"
memcg_require_hierarchy_disabled
- echo 0 > memory.use_hierarchy
+ ROD echo 0 \> memory.use_hierarchy
- mkdir subgroup
+ ROD mkdir subgroup
echo $PAGESIZES > memory.limit_in_bytes
echo $((PAGESIZES * 2)) > subgroup/memory.limit_in_bytes
@@ -82,7 +82,7 @@ test7()
ROD echo 1 \> memory.use_hierarchy
- mkdir subgroup
+ ROD mkdir subgroup
echo $PAGESIZES > memory.limit_in_bytes
echo $PAGESIZES > memory.memsw.limit_in_bytes
echo $((PAGESIZES * 2)) > subgroup/memory.limit_in_bytes
@@ -103,7 +103,7 @@ test8()
ROD echo 0 \> memory.use_hierarchy
- mkdir subgroup
+ ROD mkdir subgroup
echo $PAGESIZES > memory.limit_in_bytes
echo $PAGESIZES > memory.memsw.limit_in_bytes
echo $((PAGESIZES * 2)) > subgroup/memory.limit_in_bytes
diff --git a/testcases/kernel/controllers/memcg/functional/memcg_subgroup_charge.sh b/testcases/kernel/controllers/memcg/functional/memcg_subgroup_charge.sh
index 7650128e3605..e831097dab3e 100755
--- a/testcases/kernel/controllers/memcg/functional/memcg_subgroup_charge.sh
+++ b/testcases/kernel/controllers/memcg/functional/memcg_subgroup_charge.sh
@@ -32,7 +32,7 @@ test_subgroup()
memcg_adjust_limit_for_kmem limit_subgroup
fi
- mkdir subgroup
+ ROD mkdir subgroup
echo $limit_parent > memory.limit_in_bytes
echo $limit_subgroup > subgroup/memory.limit_in_bytes
diff --git a/testcases/kernel/controllers/memcg/functional/memcg_use_hierarchy_test.sh b/testcases/kernel/controllers/memcg/functional/memcg_use_hierarchy_test.sh
index b645f9b44a86..4b19480b2ff9 100755
--- a/testcases/kernel/controllers/memcg/functional/memcg_use_hierarchy_test.sh
+++ b/testcases/kernel/controllers/memcg/functional/memcg_use_hierarchy_test.sh
@@ -19,10 +19,10 @@ test1()
local limit=$PAGESIZE
memcg_adjust_limit_for_kmem limit
- echo 1 > memory.use_hierarchy
- echo $limit > memory.limit_in_bytes
+ ROD echo 1 \> memory.use_hierarchy
+ ROD echo $limit \> memory.limit_in_bytes
- mkdir subgroup
+ ROD mkdir subgroup
cd subgroup
test_proc_kill $((limit + PAGESIZE * 3)) "--mmap-lock1" $((limit + PAGESIZE * 2)) 0
@@ -36,7 +36,7 @@ test2()
memcg_require_hierarchy_disabled
- mkdir subgroup
+ ROD mkdir subgroup
EXPECT_FAIL echo 1 \> memory.use_hierarchy
rmdir subgroup
@@ -48,7 +48,7 @@ test3()
memcg_require_hierarchy_disabled
- echo 1 > memory.use_hierarchy
+ ROD echo 1 > memory.use_hierarchy
mkdir subgroup
EXPECT_FAIL echo 0 \> subgroup/memory.use_hierarchy
--
2.30.2
^ permalink raw reply related [flat|nested] 9+ messages in thread
* [LTP] [RESEND PATCH 4/4] controllers/memcg: check status of commands using interface
2021-08-11 10:10 [LTP] [RESEND PATCH 0/4] controllers/memcg: fixes for CPU nodes on newer kernels Krzysztof Kozlowski
` (2 preceding siblings ...)
2021-08-11 10:10 ` [LTP] [RESEND PATCH 3/4] controllers/memcg: fail early to avoid possible false-positives Krzysztof Kozlowski
@ 2021-08-11 10:10 ` Krzysztof Kozlowski
3 siblings, 0 replies; 9+ messages in thread
From: Krzysztof Kozlowski @ 2021-08-11 10:10 UTC (permalink / raw)
To: ltp
Most of the cgroups interface returns some valid and meaningful error
code (e.g. ENOMEM if creating subgroup failed). Check these to be sure
that all errors are caught, like in this example:
memcg_subgroup_charge 1 TINFO: set /dev/memcg/memory.use_hierarchy to 0 failed
memcg_subgroup_charge 1 TINFO: Test that group and subgroup have no relationship
/home/ubuntu/ltp-install/testcases/bin/memcg_subgroup_charge.sh: 36: echo: echo: I/O error
memcg_subgroup_charge 1 TINFO: Running memcg_process --mmap-anon -s 270336
memcg_subgroup_charge 1 TINFO: Warming up pid: 13496
memcg_subgroup_charge 1 TINFO: Process is still here after warm up: 13496
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com>
---
.../controllers/memcg/functional/memcg_lib.sh | 2 +-
.../memcg_max_usage_in_bytes_test.sh | 8 +++----
.../memcg/functional/memcg_stat_test.sh | 24 +++++++++----------
.../memcg/functional/memcg_subgroup_charge.sh | 4 ++--
.../functional/memcg_usage_in_bytes_test.sh | 4 ++--
5 files changed, 21 insertions(+), 21 deletions(-)
diff --git a/testcases/kernel/controllers/memcg/functional/memcg_lib.sh b/testcases/kernel/controllers/memcg/functional/memcg_lib.sh
index 700e9e367bff..3d04c3a60700 100755
--- a/testcases/kernel/controllers/memcg/functional/memcg_lib.sh
+++ b/testcases/kernel/controllers/memcg/functional/memcg_lib.sh
@@ -389,7 +389,7 @@ test_limit_in_bytes()
local use_memsw=$2
local elimit
- echo $limit > memory.limit_in_bytes
+ EXPECT_PASS echo $limit \> memory.limit_in_bytes
if [ $use_memsw -eq 1 ]; then
memcg_require_memsw
echo $limit > memory.memsw.limit_in_bytes
diff --git a/testcases/kernel/controllers/memcg/functional/memcg_max_usage_in_bytes_test.sh b/testcases/kernel/controllers/memcg/functional/memcg_max_usage_in_bytes_test.sh
index a0c4dd17f0f3..6a2607c4333b 100755
--- a/testcases/kernel/controllers/memcg/functional/memcg_max_usage_in_bytes_test.sh
+++ b/testcases/kernel/controllers/memcg/functional/memcg_max_usage_in_bytes_test.sh
@@ -67,8 +67,8 @@ test2()
tst_res TINFO "Test memory.memsw.max_usage_in_bytes"
memcg_require_memsw
- echo $MEM_LIMIT > memory.limit_in_bytes
- echo $MEM_LIMIT > memory.memsw.limit_in_bytes
+ EXPECT_PASS echo $MEM_LIMIT \> memory.limit_in_bytes
+ EXPECT_PASS echo $MEM_LIMIT \> memory.memsw.limit_in_bytes
test_max_usage_in_bytes 1 0
}
@@ -83,8 +83,8 @@ test4()
tst_res TINFO "Test reset memory.memsw.max_usage_in_bytes"
memcg_require_memsw
- echo $MEM_LIMIT > memory.limit_in_bytes
- echo $MEM_LIMIT > memory.memsw.limit_in_bytes
+ EXPECT_PASS echo $MEM_LIMIT \> memory.limit_in_bytes
+ EXPECT_PASS echo $MEM_LIMIT \> memory.memsw.limit_in_bytes
test_max_usage_in_bytes 1 1
}
diff --git a/testcases/kernel/controllers/memcg/functional/memcg_stat_test.sh b/testcases/kernel/controllers/memcg/functional/memcg_stat_test.sh
index b86331f0fc26..5fb52bb1d24c 100755
--- a/testcases/kernel/controllers/memcg/functional/memcg_stat_test.sh
+++ b/testcases/kernel/controllers/memcg/functional/memcg_stat_test.sh
@@ -47,8 +47,8 @@ test5()
memcg_adjust_limit_for_kmem limit
ROD mkdir subgroup
- echo $limit > memory.limit_in_bytes
- echo $((limit + PAGESIZES * 2)) > subgroup/memory.limit_in_bytes
+ EXPECT_PASS echo $limit \> memory.limit_in_bytes
+ EXPECT_PASS echo $((limit + PAGESIZES * 2)) \> subgroup/memory.limit_in_bytes
cd subgroup
check_mem_stat "hierarchical_memory_limit" $limit
@@ -65,8 +65,8 @@ test6()
ROD echo 0 \> memory.use_hierarchy
ROD mkdir subgroup
- echo $PAGESIZES > memory.limit_in_bytes
- echo $((PAGESIZES * 2)) > subgroup/memory.limit_in_bytes
+ EXPECT_PASS echo $PAGESIZES \> memory.limit_in_bytes
+ EXPECT_PASS echo $((PAGESIZES * 2)) \> subgroup/memory.limit_in_bytes
cd subgroup
check_mem_stat "hierarchical_memory_limit" $((PAGESIZES * 2))
@@ -83,10 +83,10 @@ test7()
ROD echo 1 \> memory.use_hierarchy
ROD mkdir subgroup
- echo $PAGESIZES > memory.limit_in_bytes
- echo $PAGESIZES > memory.memsw.limit_in_bytes
- echo $((PAGESIZES * 2)) > subgroup/memory.limit_in_bytes
- echo $((PAGESIZES * 2)) > subgroup/memory.memsw.limit_in_bytes
+ EXPECT_PASS echo $PAGESIZES \> memory.limit_in_bytes
+ EXPECT_PASS echo $PAGESIZES \> memory.memsw.limit_in_bytes
+ EXPECT_PASS echo $((PAGESIZES * 2)) \> subgroup/memory.limit_in_bytes
+ EXPECT_PASS echo $((PAGESIZES * 2)) \> subgroup/memory.memsw.limit_in_bytes
cd subgroup
check_mem_stat "hierarchical_memsw_limit" $PAGESIZES
@@ -104,10 +104,10 @@ test8()
ROD echo 0 \> memory.use_hierarchy
ROD mkdir subgroup
- echo $PAGESIZES > memory.limit_in_bytes
- echo $PAGESIZES > memory.memsw.limit_in_bytes
- echo $((PAGESIZES * 2)) > subgroup/memory.limit_in_bytes
- echo $((PAGESIZES * 2)) > subgroup/memory.memsw.limit_in_bytes
+ EXPECT_PASS echo $PAGESIZES \> memory.limit_in_bytes
+ EXPECT_PASS echo $PAGESIZES \> memory.memsw.limit_in_bytes
+ EXPECT_PASS echo $((PAGESIZES * 2)) \> subgroup/memory.limit_in_bytes
+ EXPECT_PASS echo $((PAGESIZES * 2)) \> subgroup/memory.memsw.limit_in_bytes
cd subgroup
check_mem_stat "hierarchical_memsw_limit" $((PAGESIZES * 2))
diff --git a/testcases/kernel/controllers/memcg/functional/memcg_subgroup_charge.sh b/testcases/kernel/controllers/memcg/functional/memcg_subgroup_charge.sh
index e831097dab3e..4434168a7a6c 100755
--- a/testcases/kernel/controllers/memcg/functional/memcg_subgroup_charge.sh
+++ b/testcases/kernel/controllers/memcg/functional/memcg_subgroup_charge.sh
@@ -33,8 +33,8 @@ test_subgroup()
fi
ROD mkdir subgroup
- echo $limit_parent > memory.limit_in_bytes
- echo $limit_subgroup > subgroup/memory.limit_in_bytes
+ EXPECT_PASS echo $limit_parent \> memory.limit_in_bytes
+ EXPECT_PASS echo $limit_subgroup \> subgroup/memory.limit_in_bytes
start_memcg_process --mmap-anon -s $MEM_TO_ALLOC
diff --git a/testcases/kernel/controllers/memcg/functional/memcg_usage_in_bytes_test.sh b/testcases/kernel/controllers/memcg/functional/memcg_usage_in_bytes_test.sh
index 9140fd9d1fd7..6c1b365057f8 100755
--- a/testcases/kernel/controllers/memcg/functional/memcg_usage_in_bytes_test.sh
+++ b/testcases/kernel/controllers/memcg/functional/memcg_usage_in_bytes_test.sh
@@ -28,8 +28,8 @@ test2()
tst_res TINFO "Test memory.memsw.usage_in_bytes"
memcg_require_memsw
- echo $MEM_LIMIT > memory.limit_in_bytes
- echo $MEM_LIMIT > memory.memsw.limit_in_bytes
+ EXPECT_PASS echo $MEM_LIMIT \> memory.limit_in_bytes
+ EXPECT_PASS echo $MEM_LIMIT \> memory.memsw.limit_in_bytes
test_mem_stat "--mmap-anon" $MEM_TO_ALLOC $MEM_TO_ALLOC \
"memory.memsw.usage_in_bytes" $MEM_TO_ALLOC \
$MEM_EXPECTED_UPPER false
--
2.30.2
^ permalink raw reply related [flat|nested] 9+ messages in thread
* [LTP] [RESEND PATCH 1/4] controllers/memcg: account per-node kernel memory
2021-08-11 10:10 ` [LTP] [RESEND PATCH 1/4] controllers/memcg: account per-node kernel memory Krzysztof Kozlowski
@ 2021-08-11 14:42 ` Richard Palethorpe
2021-08-12 6:45 ` Krzysztof Kozlowski
0 siblings, 1 reply; 9+ messages in thread
From: Richard Palethorpe @ 2021-08-11 14:42 UTC (permalink / raw)
To: ltp
Hello Krzysztof,
Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com> writes:
> Recent Linux kernels () charge groups also with kernel memory. This is
> not limited only to process-allocated memory but also cgroup-handling
> code memory as well.
>
> For example since kernel v5.9 with commit 3e38e0aaca9e ("mm: memcg:
> charge memcg percpu memory to the parent cgroup") creating a subgroup
> causes several kernel allocations towards this group.
>
> These additional kernel memory allocations are proportional to number of
> CPUs and number of nodes.
>
> On c4.8xlarge AWS instance with 36 cores in two nodes with v5.11 Linux
> kernel the memcg_subgroup_charge and memcg_use_hierarchy_test tests were
> failing:
>
> memcg_use_hierarchy_test 1 TINFO: timeout per run is 0h 5m 0s
> memcg_use_hierarchy_test 1 TINFO: set /dev/memcg/memory.use_hierarchy to 0 failed
> memcg_use_hierarchy_test 1 TINFO: test if one of the ancestors goes over its limit, the proces will be killed
> mkdir: cannot create directory ?subgroup?: Cannot allocate memory
> /home/ubuntu/ltp-install/testcases/bin/memcg_use_hierarchy_test.sh: 26: cd: can't cd to subgroup
> memcg_use_hierarchy_test 1 TINFO: Running memcg_process --mmap-lock1 -s 8192
> memcg_use_hierarchy_test 1 TFAIL: process is not killed
> rmdir: failed to remove 'subgroup': No such file or directory
>
> The kernel was unable to create the subgroup (mkdir returned -ENOMEM)
> due to this additional per-node kernel memory allocations.
>
> Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com>
> ---
> .../controllers/memcg/functional/memcg_lib.sh | 44 +++++++++++++++++++
> .../memcg/functional/memcg_subgroup_charge.sh | 8 +---
> .../functional/memcg_use_hierarchy_test.sh | 8 +++-
> 3 files changed, 52 insertions(+), 8 deletions(-)
>
> diff --git a/testcases/kernel/controllers/memcg/functional/memcg_lib.sh b/testcases/kernel/controllers/memcg/functional/memcg_lib.sh
> index dad66c798e19..700e9e367bff 100755
> --- a/testcases/kernel/controllers/memcg/functional/memcg_lib.sh
> +++ b/testcases/kernel/controllers/memcg/functional/memcg_lib.sh
> @@ -63,6 +63,50 @@ memcg_require_hierarchy_disabled()
> fi
> }
>
> +# Kernel memory allocated for the process is also charged. It might depend on
> +# the number of CPUs and number of nodes. For example on kernel v5.11
> +# additionally total_cpus (plus 1 or 2) pages are charged to the group via
> +# kernel memory. For a two-node machine, additional 108 pages kernel memory
> +# are charged to the group.
> +#
> +# Adjust the limit to account such per-CPU and per-node kernel memory.
> +# $1 - variable name with limit to adjust
> +memcg_adjust_limit_for_kmem()
> +{
> + [ $# -ne 1 ] && tst_brk TBROK "memcg_adjust_limit_for_kmem expects 1 parameter"
> + eval "local _limit=\$$1"
Could we do this a simpler way?
It would be much easier to read if we just returned the value which
needed to be added.
> +
> + # Total number of CPUs
> + local total_cpus=`tst_ncpus`
> +
> + # Get the number of NODES
Is it acceptable or necessary to use /sys/devices/system/node/possible
(or online) instead?
> + if [ -f "/sys/devices/system/node/has_high_memory" ]; then
> + local mem_string="`cat /sys/devices/system/node/has_high_memory`"
> + else
> + local mem_string="`cat /sys/devices/system/node/has_normal_memory`"
> + fi
> +
> + local total_nodes="`echo $mem_string | tr ',' ' '`"
> + local count=0
> + for item in $total_nodes; do
> + local delta=1
> + if [ "${item#*-*}" != "$item" ]; then
> + delta=$((${item#*-*} - ${item%*-*} + 1))
> + fi
> + count=$((count + $delta))
> + done
Or perhaps we could count the number of 'node[0-9]+' directories? I
think that would be easier to understand.
> + total_nodes=$count
> + # Additional nodes impose charging the kmem, not having regular one node
> + local node_mem=0
> + if [ $total_nodes -gt 1 ]; then
> + node_mem=$((total_nodes - 1))
> + node_mem=$((node_mem * PAGESIZE * 128))
> + fi
> +
> + eval "$1='$((_limit + 4 * PAGESIZE + total_cpus * PAGESIZE + node_mem))'"
> + return 0
> +}
Otherwise looks good.
--
Thank you,
Richard.
^ permalink raw reply [flat|nested] 9+ messages in thread
* [LTP] [RESEND PATCH 1/4] controllers/memcg: account per-node kernel memory
2021-08-11 14:42 ` Richard Palethorpe
@ 2021-08-12 6:45 ` Krzysztof Kozlowski
2021-08-12 7:53 ` Richard Palethorpe
0 siblings, 1 reply; 9+ messages in thread
From: Krzysztof Kozlowski @ 2021-08-12 6:45 UTC (permalink / raw)
To: ltp
On 11/08/2021 16:42, Richard Palethorpe wrote:
> Hello Krzysztof,
>
> Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com> writes:
>
>> Recent Linux kernels () charge groups also with kernel memory. This is
>> not limited only to process-allocated memory but also cgroup-handling
>> code memory as well.
>>
>> For example since kernel v5.9 with commit 3e38e0aaca9e ("mm: memcg:
>> charge memcg percpu memory to the parent cgroup") creating a subgroup
>> causes several kernel allocations towards this group.
>>
>> These additional kernel memory allocations are proportional to number of
>> CPUs and number of nodes.
>>
>> On c4.8xlarge AWS instance with 36 cores in two nodes with v5.11 Linux
>> kernel the memcg_subgroup_charge and memcg_use_hierarchy_test tests were
>> failing:
>>
>> memcg_use_hierarchy_test 1 TINFO: timeout per run is 0h 5m 0s
>> memcg_use_hierarchy_test 1 TINFO: set /dev/memcg/memory.use_hierarchy to 0 failed
>> memcg_use_hierarchy_test 1 TINFO: test if one of the ancestors goes over its limit, the proces will be killed
>> mkdir: cannot create directory ?subgroup?: Cannot allocate memory
>> /home/ubuntu/ltp-install/testcases/bin/memcg_use_hierarchy_test.sh: 26: cd: can't cd to subgroup
>> memcg_use_hierarchy_test 1 TINFO: Running memcg_process --mmap-lock1 -s 8192
>> memcg_use_hierarchy_test 1 TFAIL: process is not killed
>> rmdir: failed to remove 'subgroup': No such file or directory
>>
>> The kernel was unable to create the subgroup (mkdir returned -ENOMEM)
>> due to this additional per-node kernel memory allocations.
>>
>> Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com>
>> ---
>> .../controllers/memcg/functional/memcg_lib.sh | 44 +++++++++++++++++++
>> .../memcg/functional/memcg_subgroup_charge.sh | 8 +---
>> .../functional/memcg_use_hierarchy_test.sh | 8 +++-
>> 3 files changed, 52 insertions(+), 8 deletions(-)
>>
>> diff --git a/testcases/kernel/controllers/memcg/functional/memcg_lib.sh b/testcases/kernel/controllers/memcg/functional/memcg_lib.sh
>> index dad66c798e19..700e9e367bff 100755
>> --- a/testcases/kernel/controllers/memcg/functional/memcg_lib.sh
>> +++ b/testcases/kernel/controllers/memcg/functional/memcg_lib.sh
>> @@ -63,6 +63,50 @@ memcg_require_hierarchy_disabled()
>> fi
>> }
>>
>> +# Kernel memory allocated for the process is also charged. It might depend on
>> +# the number of CPUs and number of nodes. For example on kernel v5.11
>> +# additionally total_cpus (plus 1 or 2) pages are charged to the group via
>> +# kernel memory. For a two-node machine, additional 108 pages kernel memory
>> +# are charged to the group.
>> +#
>> +# Adjust the limit to account such per-CPU and per-node kernel memory.
>> +# $1 - variable name with limit to adjust
>> +memcg_adjust_limit_for_kmem()
>> +{
>> + [ $# -ne 1 ] && tst_brk TBROK "memcg_adjust_limit_for_kmem expects 1 parameter"
>> + eval "local _limit=\$$1"
>
> Could we do this a simpler way?
>
> It would be much easier to read if we just returned the value which
> needed to be added.
Sure, I can change it. Just note that the caller/user will require
slightly more code.
>
>> +
>> + # Total number of CPUs
>> + local total_cpus=`tst_ncpus`
>> +
>> + # Get the number of NODES
>
> Is it acceptable or necessary to use /sys/devices/system/node/possible
> (or online) instead?
Makes sense, I took it from existing code but it looks unnecessarily
complicated.
>
>> + if [ -f "/sys/devices/system/node/has_high_memory" ]; then
>> + local mem_string="`cat /sys/devices/system/node/has_high_memory`"
>> + else
>> + local mem_string="`cat /sys/devices/system/node/has_normal_memory`"
>> + fi
>> +
>> + local total_nodes="`echo $mem_string | tr ',' ' '`"
>> + local count=0
>> + for item in $total_nodes; do
>> + local delta=1
>> + if [ "${item#*-*}" != "$item" ]; then
>> + delta=$((${item#*-*} - ${item%*-*} + 1))
>> + fi
>> + count=$((count + $delta))
>> + done
>
> Or perhaps we could count the number of 'node[0-9]+' directories? I
> think that would be easier to understand.
I just copied the existing code, but I can try to simplify it.
>
>> + total_nodes=$count
>> + # Additional nodes impose charging the kmem, not having regular one node
>> + local node_mem=0
>> + if [ $total_nodes -gt 1 ]; then
>> + node_mem=$((total_nodes - 1))
>> + node_mem=$((node_mem * PAGESIZE * 128))
>> + fi
>> +
>> + eval "$1='$((_limit + 4 * PAGESIZE + total_cpus * PAGESIZE + node_mem))'"
>> + return 0
>> +}
>
> Otherwise looks good.
Thanks for review!
Best regards,
Krzysztof
^ permalink raw reply [flat|nested] 9+ messages in thread
* [LTP] [RESEND PATCH 1/4] controllers/memcg: account per-node kernel memory
2021-08-12 6:45 ` Krzysztof Kozlowski
@ 2021-08-12 7:53 ` Richard Palethorpe
2021-08-12 7:55 ` Krzysztof Kozlowski
0 siblings, 1 reply; 9+ messages in thread
From: Richard Palethorpe @ 2021-08-12 7:53 UTC (permalink / raw)
To: ltp
Hello,
Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com> writes:
> On 11/08/2021 16:42, Richard Palethorpe wrote:
>> Hello Krzysztof,
>>
>> Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com> writes:
>>
>>> Recent Linux kernels () charge groups also with kernel memory. This is
>>> not limited only to process-allocated memory but also cgroup-handling
>>> code memory as well.
>>>
>>> For example since kernel v5.9 with commit 3e38e0aaca9e ("mm: memcg:
>>> charge memcg percpu memory to the parent cgroup") creating a subgroup
>>> causes several kernel allocations towards this group.
>>>
>>> These additional kernel memory allocations are proportional to number of
>>> CPUs and number of nodes.
>>>
>>> On c4.8xlarge AWS instance with 36 cores in two nodes with v5.11 Linux
>>> kernel the memcg_subgroup_charge and memcg_use_hierarchy_test tests were
>>> failing:
>>>
>>> memcg_use_hierarchy_test 1 TINFO: timeout per run is 0h 5m 0s
>>> memcg_use_hierarchy_test 1 TINFO: set /dev/memcg/memory.use_hierarchy to 0 failed
>>> memcg_use_hierarchy_test 1 TINFO: test if one of the ancestors goes over its limit, the proces will be killed
>>> mkdir: cannot create directory ?subgroup?: Cannot allocate memory
>>> /home/ubuntu/ltp-install/testcases/bin/memcg_use_hierarchy_test.sh: 26: cd: can't cd to subgroup
>>> memcg_use_hierarchy_test 1 TINFO: Running memcg_process --mmap-lock1 -s 8192
>>> memcg_use_hierarchy_test 1 TFAIL: process is not killed
>>> rmdir: failed to remove 'subgroup': No such file or directory
>>>
>>> The kernel was unable to create the subgroup (mkdir returned -ENOMEM)
>>> due to this additional per-node kernel memory allocations.
>>>
>>> Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com>
>>> ---
>>> .../controllers/memcg/functional/memcg_lib.sh | 44 +++++++++++++++++++
>>> .../memcg/functional/memcg_subgroup_charge.sh | 8 +---
>>> .../functional/memcg_use_hierarchy_test.sh | 8 +++-
>>> 3 files changed, 52 insertions(+), 8 deletions(-)
>>>
>>> diff --git a/testcases/kernel/controllers/memcg/functional/memcg_lib.sh b/testcases/kernel/controllers/memcg/functional/memcg_lib.sh
>>> index dad66c798e19..700e9e367bff 100755
>>> --- a/testcases/kernel/controllers/memcg/functional/memcg_lib.sh
>>> +++ b/testcases/kernel/controllers/memcg/functional/memcg_lib.sh
>>> @@ -63,6 +63,50 @@ memcg_require_hierarchy_disabled()
>>> fi
>>> }
>>>
>>> +# Kernel memory allocated for the process is also charged. It might depend on
>>> +# the number of CPUs and number of nodes. For example on kernel v5.11
>>> +# additionally total_cpus (plus 1 or 2) pages are charged to the group via
>>> +# kernel memory. For a two-node machine, additional 108 pages kernel memory
>>> +# are charged to the group.
>>> +#
>>> +# Adjust the limit to account such per-CPU and per-node kernel memory.
>>> +# $1 - variable name with limit to adjust
>>> +memcg_adjust_limit_for_kmem()
>>> +{
>>> + [ $# -ne 1 ] && tst_brk TBROK "memcg_adjust_limit_for_kmem expects 1 parameter"
>>> + eval "local _limit=\$$1"
>>
>> Could we do this a simpler way?
>>
>> It would be much easier to read if we just returned the value which
>> needed to be added.
>
> Sure, I can change it. Just note that the caller/user will require
> slightly more code.
Thanks, yes. I think a very large code saving would be required to
justify using eval in this way.
--
Thank you,
Richard.
^ permalink raw reply [flat|nested] 9+ messages in thread
* [LTP] [RESEND PATCH 1/4] controllers/memcg: account per-node kernel memory
2021-08-12 7:53 ` Richard Palethorpe
@ 2021-08-12 7:55 ` Krzysztof Kozlowski
0 siblings, 0 replies; 9+ messages in thread
From: Krzysztof Kozlowski @ 2021-08-12 7:55 UTC (permalink / raw)
To: ltp
On 12/08/2021 09:53, Richard Palethorpe wrote:
> Hello,
>
> Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com> writes:
>
>> On 11/08/2021 16:42, Richard Palethorpe wrote:
>>> Hello Krzysztof,
>>>
>>> Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com> writes:
>>>
>>>> Recent Linux kernels () charge groups also with kernel memory. This is
>>>> not limited only to process-allocated memory but also cgroup-handling
>>>> code memory as well.
>>>>
>>>> For example since kernel v5.9 with commit 3e38e0aaca9e ("mm: memcg:
>>>> charge memcg percpu memory to the parent cgroup") creating a subgroup
>>>> causes several kernel allocations towards this group.
>>>>
>>>> These additional kernel memory allocations are proportional to number of
>>>> CPUs and number of nodes.
>>>>
>>>> On c4.8xlarge AWS instance with 36 cores in two nodes with v5.11 Linux
>>>> kernel the memcg_subgroup_charge and memcg_use_hierarchy_test tests were
>>>> failing:
>>>>
>>>> memcg_use_hierarchy_test 1 TINFO: timeout per run is 0h 5m 0s
>>>> memcg_use_hierarchy_test 1 TINFO: set /dev/memcg/memory.use_hierarchy to 0 failed
>>>> memcg_use_hierarchy_test 1 TINFO: test if one of the ancestors goes over its limit, the proces will be killed
>>>> mkdir: cannot create directory ?subgroup?: Cannot allocate memory
>>>> /home/ubuntu/ltp-install/testcases/bin/memcg_use_hierarchy_test.sh: 26: cd: can't cd to subgroup
>>>> memcg_use_hierarchy_test 1 TINFO: Running memcg_process --mmap-lock1 -s 8192
>>>> memcg_use_hierarchy_test 1 TFAIL: process is not killed
>>>> rmdir: failed to remove 'subgroup': No such file or directory
>>>>
>>>> The kernel was unable to create the subgroup (mkdir returned -ENOMEM)
>>>> due to this additional per-node kernel memory allocations.
>>>>
>>>> Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski@canonical.com>
>>>> ---
>>>> .../controllers/memcg/functional/memcg_lib.sh | 44 +++++++++++++++++++
>>>> .../memcg/functional/memcg_subgroup_charge.sh | 8 +---
>>>> .../functional/memcg_use_hierarchy_test.sh | 8 +++-
>>>> 3 files changed, 52 insertions(+), 8 deletions(-)
>>>>
>>>> diff --git a/testcases/kernel/controllers/memcg/functional/memcg_lib.sh b/testcases/kernel/controllers/memcg/functional/memcg_lib.sh
>>>> index dad66c798e19..700e9e367bff 100755
>>>> --- a/testcases/kernel/controllers/memcg/functional/memcg_lib.sh
>>>> +++ b/testcases/kernel/controllers/memcg/functional/memcg_lib.sh
>>>> @@ -63,6 +63,50 @@ memcg_require_hierarchy_disabled()
>>>> fi
>>>> }
>>>>
>>>> +# Kernel memory allocated for the process is also charged. It might depend on
>>>> +# the number of CPUs and number of nodes. For example on kernel v5.11
>>>> +# additionally total_cpus (plus 1 or 2) pages are charged to the group via
>>>> +# kernel memory. For a two-node machine, additional 108 pages kernel memory
>>>> +# are charged to the group.
>>>> +#
>>>> +# Adjust the limit to account such per-CPU and per-node kernel memory.
>>>> +# $1 - variable name with limit to adjust
>>>> +memcg_adjust_limit_for_kmem()
>>>> +{
>>>> + [ $# -ne 1 ] && tst_brk TBROK "memcg_adjust_limit_for_kmem expects 1 parameter"
>>>> + eval "local _limit=\$$1"
>>>
>>> Could we do this a simpler way?
>>>
>>> It would be much easier to read if we just returned the value which
>>> needed to be added.
>>
>> Sure, I can change it. Just note that the caller/user will require
>> slightly more code.
>
> Thanks, yes. I think a very large code saving would be required to
> justify using eval in this way.
Actually I was wrong and caller is also smaller, so new solution looks
much better. I'll send soon. Thanks for review!
Best regards,
Krzysztof
^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2021-08-12 7:55 UTC | newest]
Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-08-11 10:10 [LTP] [RESEND PATCH 0/4] controllers/memcg: fixes for CPU nodes on newer kernels Krzysztof Kozlowski
2021-08-11 10:10 ` [LTP] [RESEND PATCH 1/4] controllers/memcg: account per-node kernel memory Krzysztof Kozlowski
2021-08-11 14:42 ` Richard Palethorpe
2021-08-12 6:45 ` Krzysztof Kozlowski
2021-08-12 7:53 ` Richard Palethorpe
2021-08-12 7:55 ` Krzysztof Kozlowski
2021-08-11 10:10 ` [LTP] [RESEND PATCH 2/4] controllers/memcg: fix memcg_stat_test on two node machines Krzysztof Kozlowski
2021-08-11 10:10 ` [LTP] [RESEND PATCH 3/4] controllers/memcg: fail early to avoid possible false-positives Krzysztof Kozlowski
2021-08-11 10:10 ` [LTP] [RESEND PATCH 4/4] controllers/memcg: check status of commands using interface Krzysztof Kozlowski
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.