[LTP] [RFC] enable OOM protection for the library and test process?

* [LTP] [RFC] enable OOM protection for the library and test process?
@ 2021-12-13  8:03 Li Wang
  2021-12-13  9:32 ` Jan Stancek
  2021-12-16  3:41 ` [LTP] [PATCH 1/3] lib: add functions to adjust oom score Li Wang
  0 siblings, 2 replies; 19+ messages in thread
From: Li Wang @ 2021-12-13  8:03 UTC (permalink / raw)
  To: LTP List


Hi All,

We have observed that OOM tests occasionally end with TBROK (Test killed)
on systems with little RAM. The reason seems to be that the test process
(test_pid) gets killed earlier than the expected victim process, so it
cannot report the status correctly.

I'm thinking maybe we can purposely make the OOM killer ignore the test
process (test_pid) and the main process? (We would do this only in the
mem library, for OOM tests.)

e.g.

set oom_score_adj to -1000 for pid-305071 and main-process

oom03:
main ---> tst_run_tcases --> ... --> fork_testrun
   (pid 305071)    testrun  --> run_tests --> ... --> testoom --> oom()
            (pid 305072)    child_alloc --> child_alloc_thread --> alloc_mem


=============

3 cmdline="oom03"
...
10 mem.c:218: TINFO: start normal OOM testing.
11 mem.c:140: TINFO: expected victim is 305072.

12 mem.c:39: TINFO: thread (7fe173d1a700), allocating 3221225472 bytes.
13 mem.c:39: TINFO: thread (7fe173d1a700), allocating 3221225472 bytes.

14 tst_test.c:1410: TINFO: If you are running on slow machine, try exporting LTP_TIMEOUT_MUL > 1
15 tst_test.c:1411: TBROK: Test killed! (timeout?)

==========

[ 1117.558867] Tasks state (memory values in pages):
[ 1117.559373] [  pid  ]   uid  tgid total_vm      rss pgtables_bytes swapents oom_score_adj name
[ 1117.560167] [ 305071]     0 305071     2215       31    61440        4             0 oom03
[ 1117.560889] [ 305072]     0 305072 1577128 259389 10326016 1019452 0 oom03
...

[ 1117.596510] oom-kill:constraint=CONSTRAINT_MEMCG,nodemask=(null),cpuset=/,mems_allowed=0,oom_memcg=/ltp/test-305071,task_memcg=/ltp/test-305071,task=oom03,pid=305071,uid=0
[ 1117.597963] Memory cgroup out of memory: Killed process 305071 (oom03) total-vm:8860kB, anon-rss:124kB, file-rss:0kB, shmem-rss:0kB, UID:0 pgtables:60kB oom_score_adj:0

=============

# free -h
              total        used        free      shared  buff/cache   available
Mem:          3.6Gi       270Mi       2.3Gi        18Mi       1.1Gi       3.3Gi
Swap:         4.0Gi          0B       4.0Gi

# lscpu
Architecture:        x86_64
CPU op-mode(s):      32-bit, 64-bit
Byte Order:          Little Endian
CPU(s):              2
On-line CPU(s) list: 0,1
Thread(s) per core:  1
Core(s) per socket:  1
Socket(s):           2
NUMA node(s):        1


-- 
Regards,
Li Wang


-- 
Mailing list info: https://lists.linux.it/listinfo/ltp


* Re: [LTP] [RFC] enable OOM protection for the library and test process?
  2021-12-13  8:03 [LTP] [RFC] enable OOM protection for the library and test process? Li Wang
@ 2021-12-13  9:32 ` Jan Stancek
  2021-12-13 10:18   ` Li Wang
  2021-12-13 16:06   ` Martin Doucha
  2021-12-16  3:41 ` [LTP] [PATCH 1/3] lib: add functions to adjust oom score Li Wang
  1 sibling, 2 replies; 19+ messages in thread
From: Jan Stancek @ 2021-12-13  9:32 UTC (permalink / raw)
  To: Li Wang; +Cc: LTP List

On Mon, Dec 13, 2021 at 9:04 AM Li Wang <liwang@redhat.com> wrote:
>
> Hi All,
>
> As we observed that oom tests occasionally ended with TBROK (Test killed) on small
> RAM system, the reason seems test process(test_pid) get killed early than the expected
> victim process so that can't report the status correctly.
>
> I'm thinking maybe we can purposely make the OOM ignore test process(test_pid)
> and the main process? (achieve this only in mem library for OOM test)

There are likely more processes that could become unintended targets
(e.g. harness process)
(if we haven't tried already) Could we make expected victim process
more appealing target by tweaking its oom_score/oom_score_adj ?

>
> e.g.
>
> set oom_score_adj to -1000 for pid-305071 and main-process
>
> oom03:
> main ---> tst_run_tcases --> ... --> fork_testrun
>    (pid 305071)    testrun  --> run_tests --> ... --> testoom --> oom()
>             (pid 305072)    child_alloc --> child_alloc_thread --> alloc_mem
>
>
> =============
>
> 3 cmdline="oom03"
> ...
> 10 mem.c:218: TINFO: start normal OOM testing.
> 11 mem.c:140: TINFO: expected victim is 305072.
>
> 12 mem.c:39: TINFO: thread (7fe173d1a700), allocating 3221225472 bytes.
> 13 mem.c:39: TINFO: thread (7fe173d1a700), allocating 3221225472 bytes.
>
> 14 tst_test.c:1410: TINFO: If you are running on slow machine, try exporting LTP_TIMEOUT_MUL > 1
> 15 tst_test.c:1411: TBROK: Test killed! (timeout?)
>
> ==========
>
> [ 1117.558867] Tasks state (memory values in pages):
> [ 1117.559373] [  pid  ]   uid  tgid total_vm      rss pgtables_bytes swapents oom_score_adj name
> [ 1117.560167] [ 305071]     0 305071     2215       31    61440        4             0 oom03
> [ 1117.560889] [ 305072]     0 305072 1577128 259389 10326016 1019452 0 oom03
> ...
>
> [ 1117.596510] oom-kill:constraint=CONSTRAINT_MEMCG,nodemask=(null),cpuset=/,mems_allowed=0,oom_memcg=/ltp/test-305071,task_memcg=/ltp/test-305071,task=oom03,pid=305071,uid=0
> [ 1117.597963] Memory cgroup out of memory: Killed process 305071 (oom03) total-vm:8860kB, anon-rss:124kB, file-rss:0kB, shmem-rss:0kB, UID:0 pgtables:60kB oom_score_adj:0
>
> =============
>
> # free -h
>               total        used        free      shared  buff/cache   available
> Mem:          3.6Gi       270Mi       2.3Gi        18Mi       1.1Gi       3.3Gi
> Swap:         4.0Gi          0B       4.0Gi
>
> # lscpu
> Architecture:        x86_64
> CPU op-mode(s):      32-bit, 64-bit
> Byte Order:          Little Endian
> CPU(s):              2
> On-line CPU(s) list: 0,1
> Thread(s) per core:  1
> Core(s) per socket:  1
> Socket(s):           2
> NUMA node(s):        1
>
>
> --
> Regards,
> Li Wang
>
> --
> Mailing list info: https://lists.linux.it/listinfo/ltp



* Re: [LTP] [RFC] enable OOM protection for the library and test process?
  2021-12-13  9:32 ` Jan Stancek
@ 2021-12-13 10:18   ` Li Wang
  2021-12-13 15:08     ` Cyril Hrubis
  2021-12-13 16:06   ` Martin Doucha
  1 sibling, 1 reply; 19+ messages in thread
From: Li Wang @ 2021-12-13 10:18 UTC (permalink / raw)
  To: Jan Stancek; +Cc: LTP List


On Mon, Dec 13, 2021 at 5:32 PM Jan Stancek <jstancek@redhat.com> wrote:

> On Mon, Dec 13, 2021 at 9:04 AM Li Wang <liwang@redhat.com> wrote:
> >
> > Hi All,
> >
> > As we observed that oom tests occasionally ended with TBROK (Test killed) on small
> > RAM system, the reason seems test process(test_pid) get killed early than the expected
> > victim process so that can't report the status correctly.
> >
> > I'm thinking maybe we can purposely make the OOM ignore test process(test_pid)
> > and the main process? (achieve this only in mem library for OOM test)
>
> There are likely more processes that could become unintended targets
> (e.g. harness process)
>

Right, but I don't think LTP is responsible for taking care of the harness
process. (Sometimes people even run LTP manually.)



> (if we haven't tried already) Could we make expected victim process
> more appealing target by tweaking its oom_score/oom_score_adj ?
>

This might not be a good approach.

The OOM Killer computes oom_score with its own algorithm to choose
which process to kill. If we tweak that, it will interfere with the
validity of the OOM test. But if we only protect the lib-process,
which we know should never be killed, the test will still report
correctly for us.
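
To look at what the kernel currently computes without changing anything,
a small read-only sketch may help. It assumes a Linux /proc, and
read_proc_int and print_oom_state are hypothetical helpers made up for
illustration, not LTP functions:

```c
#include <stdio.h>
#include <sys/types.h>

/*
 * Hypothetical helper (not LTP API): read one integer from
 * /proc/<pid>/<name>, e.g. name = "oom_score" or "oom_score_adj".
 * Returns 0 on success, -1 if the file is missing or unparsable.
 */
static int read_proc_int(pid_t pid, const char *name, int *out)
{
	char path[64];
	FILE *f;
	int n, ok;

	n = snprintf(path, sizeof(path), "/proc/%d/%s", (int)pid, name);
	if (n < 0 || n >= (int)sizeof(path))
		return -1;

	f = fopen(path, "r");
	if (!f)
		return -1;

	ok = (fscanf(f, "%d", out) == 1);
	fclose(f);
	return ok ? 0 : -1;
}

/* Print the kernel-computed score next to the user-set adjustment. */
static int print_oom_state(pid_t pid)
{
	int score, adj;

	if (read_proc_int(pid, "oom_score", &score) ||
	    read_proc_int(pid, "oom_score_adj", &adj))
		return -1;

	printf("pid %d: oom_score=%d oom_score_adj=%d\n", (int)pid, score, adj);
	return 0;
}
```

Calling print_oom_state() on the expected victim and on the lib-process
shows both values side by side, without touching either score.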


-- 
Regards,
Li Wang


* Re: [LTP] [RFC] enable OOM protection for the library and test process?
  2021-12-13 10:18   ` Li Wang
@ 2021-12-13 15:08     ` Cyril Hrubis
  0 siblings, 0 replies; 19+ messages in thread
From: Cyril Hrubis @ 2021-12-13 15:08 UTC (permalink / raw)
  To: Li Wang; +Cc: LTP List

Hi!
> > (if we haven't tried already) Could we make expected victim process
> > more appealing target by tweaking its oom_score/oom_score_adj ?
> >
> 
> This might not be a good way.
> 
> Because OOM Killer counts the oom_score by itself algorithm for
> choosing which process to kill. If we tweak that, it will interfere with
> the scientificity of the OOM test. But if we only do protect the
> lib-process,
> we know that shouldn't be killed and the test will report correctly for us.

Agree here, we shouldn't really touch the score of the process that is
supposed to be killed unless we want to test different scenarios. It
would make sense to run the test with slightly different score for the
child too, but we shouldn't remove the original score.

-- 
Cyril Hrubis
chrubis@suse.cz


* Re: [LTP] [RFC] enable OOM protection for the library and test process?
  2021-12-13  9:32 ` Jan Stancek
  2021-12-13 10:18   ` Li Wang
@ 2021-12-13 16:06   ` Martin Doucha
  2021-12-13 16:15     ` Cyril Hrubis
  1 sibling, 1 reply; 19+ messages in thread
From: Martin Doucha @ 2021-12-13 16:06 UTC (permalink / raw)
  To: Jan Stancek, Li Wang; +Cc: LTP List

On 13. 12. 21 10:32, Jan Stancek wrote:
> On Mon, Dec 13, 2021 at 9:04 AM Li Wang <liwang@redhat.com> wrote:
>>
>> Hi All,
>>
>> As we observed that oom tests occasionally ended with TBROK (Test killed) on small
>> RAM system, the reason seems test process(test_pid) get killed early than the expected
>> victim process so that can't report the status correctly.
>>
>> I'm thinking maybe we can purposely make the OOM ignore test process(test_pid)
>> and the main process? (achieve this only in mem library for OOM test)
> 
> There are likely more processes that could become unintended targets
> (e.g. harness process)
> (if we haven't tried already) Could we make expected victim process
> more appealing target by tweaking its oom_score/oom_score_adj ?

I'm afraid it won't be that easy. The main cause of OOM killer going
postal and killing processes with tiny memory footprint is that a
process executing the mlock() syscall cannot be targeted by OOM killer
at all. That's a known issue in the kernel with no easy fix.

You can protect the main test process using oom_score_adj but chances
are that OOM killer will just kill PID 1 (kernel panic), or find no
killable process left (also kernel panic).

Protecting the test harness is a bad idea because oom_score_adj is
inherited by child processes and it'll affect other tests as well. Given
the nature of OOM tests, I'd rather not assume that the protection will
be properly removed at the end.
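
The inheritance is easy to observe from userspace. Below is a minimal
sketch, assuming a Linux /proc; the helper names (read_self_adj,
write_self_adj, adj_seen_by_child) are made up for illustration and are
not LTP API. It is meant to be used with a raised score, since lowering
it (for example to -1000) generally requires CAP_SYS_RESOURCE:

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Read /proc/self/oom_score_adj; returns 0 on success, -1 on failure. */
static int read_self_adj(int *out)
{
	FILE *f = fopen("/proc/self/oom_score_adj", "r");
	int ok;

	if (!f)
		return -1;
	ok = (fscanf(f, "%d", out) == 1);
	fclose(f);
	return ok ? 0 : -1;
}

/* Write /proc/self/oom_score_adj; returns 0 on success, -1 on failure. */
static int write_self_adj(int value)
{
	FILE *f = fopen("/proc/self/oom_score_adj", "w");
	int ok;

	if (!f)
		return -1;
	ok = (fprintf(f, "%d", value) > 0);
	/* The data is flushed by fclose(), so a rejected value shows up here. */
	if (fclose(f) != 0)
		ok = 0;
	return ok ? 0 : -1;
}

/*
 * Set our own oom_score_adj to `value`, fork, and report through `seen`
 * what the child finds in its own /proc/self/oom_score_adj.
 * The caller's original value is written back before returning.
 * Returns 0 on success, -1 if any step failed.
 */
static int adj_seen_by_child(int value, int *seen)
{
	int orig, fds[2], status, rc = -1;
	pid_t pid;

	if (read_self_adj(&orig) || pipe(fds))
		return -1;
	if (write_self_adj(value))
		goto out;

	pid = fork();
	if (pid == 0) {
		int child_val;
		int ok = read_self_adj(&child_val) == 0 &&
			 write(fds[1], &child_val, sizeof(child_val)) ==
				(ssize_t)sizeof(child_val);
		_exit(ok ? 0 : 1);
	}
	if (pid > 0 && waitpid(pid, &status, 0) == pid &&
	    WIFEXITED(status) && WEXITSTATUS(status) == 0 &&
	    read(fds[0], seen, sizeof(*seen)) == (ssize_t)sizeof(*seen))
		rc = 0;

	write_self_adj(orig);	/* undo the change in the parent */
out:
	close(fds[0]);
	close(fds[1]);
	return rc;
}
```

With adj_seen_by_child(500, &seen), the child should report 500: it
starts with whatever score its parent had at fork() time, and keeps it
until somebody writes a new value.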

-- 
Martin Doucha   mdoucha@suse.cz
QA Engineer for Software Maintenance
SUSE LINUX, s.r.o.
CORSO IIa
Krizikova 148/34
186 00 Prague 8
Czech Republic


* Re: [LTP] [RFC] enable OOM protection for the library and test process?
  2021-12-13 16:06   ` Martin Doucha
@ 2021-12-13 16:15     ` Cyril Hrubis
  2021-12-13 16:59       ` Martin Doucha
  2021-12-14  6:31       ` Li Wang
  0 siblings, 2 replies; 19+ messages in thread
From: Cyril Hrubis @ 2021-12-13 16:15 UTC (permalink / raw)
  To: Martin Doucha; +Cc: LTP List

Hi!
> > There are likely more processes that could become unintended targets
> > (e.g. harness process)
> > (if we haven't tried already) Could we make expected victim process
> > more appealing target by tweaking its oom_score/oom_score_adj ?
> 
> I'm afraid it won't be that easy. The main cause of OOM killer going
> postal and killing processes with tiny memory footprint is that a
> process executing the mlock() syscall cannot be targeted by OOM killer
> at all. That's a known issue in the kernel with no easy fix.

This is only a single test out of at least 10 that can trigger OOM, right?

> You can protect the main test process using oom_score_adj but chances
> are that OOM killer will just kill PID 1 (kernel panic), or find no
> killable process left (also kernel panic).
> 
> Protecting the test harness is a bad idea because oom_score_adj is
> inherited by child processes and it'll affect other tests as well. Given
> the nature of OOM tests, I'd rather not assume that the protection will
> be properly removed at the end.

This should be easily doable since the test library forks right before
it executes the test, so we have a single place where the score has to
be reset.

For new library tests there is a process that does nothing but wait for
the actual test pid to finish and kill it on timeout. It really makes
sense to protect this exact process and maybe even mlock() it into
memory.
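
As a rough sketch of that idea (not the library's actual code), the
waiting process could do something like the following on itself. The
function name and the PROTECT_* flags are hypothetical, -1000 is the
value from the proposal earlier in this thread, and both steps are
expected to fail without sufficient privileges or RLIMIT_MEMLOCK:

```c
#include <stdio.h>
#include <sys/mman.h>

#define PROTECT_OOM_ADJ 0x1	/* oom_score_adj was set to -1000 */
#define PROTECT_MLOCK   0x2	/* current and future pages are locked */

/*
 * Hypothetical helper: try to shield the calling process from the OOM
 * killer and pin its memory. Returns a bitmask of the steps that worked,
 * so the caller can decide whether partial protection is acceptable.
 */
static int protect_watchdog_process(void)
{
	int done = 0;
	FILE *f = fopen("/proc/self/oom_score_adj", "w");

	if (f) {
		int ok = fprintf(f, "%d", -1000) > 0;

		/* fclose() flushes the write; a rejected value fails here. */
		if (fclose(f) == 0 && ok)
			done |= PROTECT_OOM_ADJ;
	}

	/* Lock everything mapped now and everything mapped later. */
	if (mlockall(MCL_CURRENT | MCL_FUTURE) == 0)
		done |= PROTECT_MLOCK;

	return done;
}
```

Any child forked afterwards would inherit the -1000 score and would
need to reset it. Memory locks, by contrast, are not inherited across
fork().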

-- 
Cyril Hrubis
chrubis@suse.cz


* Re: [LTP] [RFC] enable OOM protection for the library and test process?
  2021-12-13 16:15     ` Cyril Hrubis
@ 2021-12-13 16:59       ` Martin Doucha
  2021-12-14  6:46         ` Li Wang
  2021-12-14  6:31       ` Li Wang
  1 sibling, 1 reply; 19+ messages in thread
From: Martin Doucha @ 2021-12-13 16:59 UTC (permalink / raw)
  To: Cyril Hrubis; +Cc: LTP List

On 13. 12. 21 17:15, Cyril Hrubis wrote:
> Hi!
>> I'm afraid it won't be that easy. The main cause of OOM killer going
>> postal and killing processes with tiny memory footprint is that a
>> process executing the mlock() syscall cannot be targeted by OOM killer
>> at all. That's a known issue in the kernel with no easy fix.
> 
> This is only single test out of at least 10 that can trigger OOM, right?

All 5 oom* tests in the mm runfile have an mlock() subtest. All of them can
end up killing the whole userspace by accident. And all of them
regularly do so in our automated test system.

-- 
Martin Doucha   mdoucha@suse.cz
QA Engineer for Software Maintenance
SUSE LINUX, s.r.o.
CORSO IIa
Krizikova 148/34
186 00 Prague 8
Czech Republic


* Re: [LTP] [RFC] enable OOM protection for the library and test process?
  2021-12-13 16:15     ` Cyril Hrubis
  2021-12-13 16:59       ` Martin Doucha
@ 2021-12-14  6:31       ` Li Wang
  1 sibling, 0 replies; 19+ messages in thread
From: Li Wang @ 2021-12-14  6:31 UTC (permalink / raw)
  To: Cyril Hrubis; +Cc: LTP List


On Tue, Dec 14, 2021 at 12:15 AM Cyril Hrubis <chrubis@suse.cz> wrote:


> > Protecting the test harness is a bad idea because oom_score_adj is
> > inherited by child processes and it'll affect other tests as well. Given
> > the nature of OOM tests, I'd rather not assume that the protection will
> > be properly removed at the end.
>
> This should be easily doable since the test library forks right before
> it executes the test, so we have a single place where the score has to
> be reset.
>

I think so. And we can even export the functions globally to
make it easy to enable/cancel the OOM protection for any
process at any time we want. Then, simply resetting the child
process's oom_score_adj to 0 keeps it from inheriting the
lib-process score as well.

e.g.

    void tst_enable_oom_protection(pid_t pid)
    void tst_cancel_oom_protection(pid_t pid)



>
> For new library tests there is a process that does nothing but waits for
> the actuall test pid to finish and kills it on timeout. It really makes
> sense to protect this exact process and maybe even mlock() it into the
> memory.
>

+1


-- 
Regards,
Li Wang


* Re: [LTP] [RFC] enable OOM protection for the library and test process?
  2021-12-13 16:59       ` Martin Doucha
@ 2021-12-14  6:46         ` Li Wang
  0 siblings, 0 replies; 19+ messages in thread
From: Li Wang @ 2021-12-14  6:46 UTC (permalink / raw)
  To: Martin Doucha; +Cc: LTP List

On Tue, Dec 14, 2021 at 12:59 AM Martin Doucha <mdoucha@suse.cz> wrote:

> On 13. 12. 21 17:15, Cyril Hrubis wrote:
> > Hi!
> >> I'm afraid it won't be that easy. The main cause of OOM killer going
> >> postal and killing processes with tiny memory footprint is that a
> >> process executing the mlock() syscall cannot be targeted by OOM killer
> >> at all. That's a known issue in the kernel with no easy fix.
> >
> > This is only single test out of at least 10 that can trigger OOM, right?
>
> All 5 oom* tests in the mm runfile have mlock() subtest. All of them can
> end up killing the whole userspace by accident. And all of them
> regularly do so in our automated test system.
>

It is possible, but would skipping the OOM protection for the lib-process
avoid this panic? I guess not, because the lib-process does not consume
much memory and there are many other processes that are still likely
to be killed.

So I don't think protecting the main or lib process increases the
chance of PID 1 being killed. In other words, with or without the
protection, the panic/bug can't be completely avoided, so why not do
it? Or have I misunderstood you here?

-- 
Regards,
Li Wang


* [LTP] [PATCH 1/3] lib: add functions to adjust oom score
  2021-12-13  8:03 [LTP] [RFC] enable OOM protection for the library and test process? Li Wang
  2021-12-13  9:32 ` Jan Stancek
@ 2021-12-16  3:41 ` Li Wang
  2021-12-16  3:41   ` [LTP] [PATCH 2/3] ltp: enable OOM protection for main and test harness process Li Wang
                     ` (2 more replies)
  1 sibling, 3 replies; 19+ messages in thread
From: Li Wang @ 2021-12-16  3:41 UTC (permalink / raw)
  To: ltp

This introduces functions to LTP for adjusting the oom_score_adj of a
target process, which may be helpful in OOM tests to prevent the kernel
from killing the main or lib process while the test is running.

The exported global tst_enable_oom_protection function can be used
anywhere you want protection, but please remember that if you enable
protection on a process ($PID), all of its children will inherit its
score and be ignored by the OOM Killer as well. That is why using it
in combination with tst_cancel_oom_protection is recommended.

Signed-off-by: Li Wang <liwang@redhat.com>
---
 include/tst_memutils.h | 12 ++++++++++++
 lib/tst_memutils.c     | 29 +++++++++++++++++++++++++++++
 2 files changed, 41 insertions(+)

diff --git a/include/tst_memutils.h b/include/tst_memutils.h
index f605f544e..e569aef8d 100644
--- a/include/tst_memutils.h
+++ b/include/tst_memutils.h
@@ -25,4 +25,16 @@ void tst_pollute_memory(size_t maxsize, int fillchar);
  */
 long long tst_available_mem(void);
 
+/*
+ * Enable OOM protection to prevent process($PID) being killed by OOM Killer.
+ *   echo -1000 >/proc/$PID/oom_score_adj
+ */
+void tst_enable_oom_protection(pid_t pid);
+
+/*
+ * Cancel the OOM protection for the process($PID).
+ *   echo 0 >/proc/$PID/oom_score_adj
+ */
+void tst_cancel_oom_protection(pid_t pid);
+
 #endif /* TST_MEMUTILS_H__ */
diff --git a/lib/tst_memutils.c b/lib/tst_memutils.c
index bd09cf6fa..b9b85677b 100644
--- a/lib/tst_memutils.c
+++ b/lib/tst_memutils.c
@@ -3,6 +3,7 @@
  * Copyright (c) 2020 SUSE LLC <mdoucha@suse.cz>
  */
 
+#include <stdio.h>
 #include <unistd.h>
 #include <limits.h>
 #include <sys/sysinfo.h>
@@ -91,3 +92,31 @@ long long tst_available_mem(void)
 
 	return mem_available;
 }
+
+static void set_oom_score_adj(pid_t pid, int value)
+{
+	int val;
+	char score_path[64];
+
+	if (access("/proc/self/oom_score_adj", F_OK) == -1) {
+		tst_res(TINFO, "Warning: oom_score_adj is not exist");
+		return;
+	}
+
+	sprintf(score_path, "/proc/%d/oom_score_adj", pid);
+	SAFE_FILE_PRINTF(score_path, "%d", value);
+
+	SAFE_FILE_SCANF(score_path, "%d", &val);
+	if (val != value)
+		tst_brk(TBROK, "oom_score_adj = %d, but expect %d.", val, value);
+}
+
+void tst_enable_oom_protection(pid_t pid)
+{
+	set_oom_score_adj(pid, -1000);
+}
+
+void tst_cancel_oom_protection(pid_t pid)
+{
+	set_oom_score_adj(pid, 0);
+}
-- 
2.31.1



* [LTP] [PATCH 2/3] ltp: enable OOM protection for main and test harness process
  2021-12-16  3:41 ` [LTP] [PATCH 1/3] lib: add functions to adjust oom score Li Wang
@ 2021-12-16  3:41   ` Li Wang
  2021-12-16  7:55     ` Petr Vorel
  2021-12-16  9:50     ` Martin Doucha
  2021-12-16  3:41   ` [LTP] [PATCH 3/3] oom: enable OOM protection for mem lib process Li Wang
  2021-12-16  7:49   ` [LTP] [PATCH 1/3] lib: add functions to adjust oom score Petr Vorel
  2 siblings, 2 replies; 19+ messages in thread
From: Li Wang @ 2021-12-16  3:41 UTC (permalink / raw)
  To: ltp

Invoke OOM protection in fork_testrun, since it is the key point that
distinguishes the process branches. We protect the LTP test harness ($PPID)
and the main ($PID) process from being killed by the OOM Killer, in the
hope that this helps to get a complete and correct report for all LTP tests.

Fundamental principle:

  (oom protection) ltp test harness --> library process
  (oom protection)   main --> tst_run_tcases --> ... --> fork_testrun
  (cancel protection)  testrun --> run_tests --> ... --> testname()
                         child_test --> ... --> end

Note: there might still be an argument about doing this protection for the
      test harness, because it will affect all common testcases (I mean
      non-OOM tests), but I think it is reasonably safe, as there seems to
      be little system load while they run.

Signed-off-by: Li Wang <liwang@redhat.com>
---
 lib/tst_test.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/lib/tst_test.c b/lib/tst_test.c
index ce2b8239d..f3ae48240 100644
--- a/lib/tst_test.c
+++ b/lib/tst_test.c
@@ -1441,11 +1441,15 @@ static int fork_testrun(void)
 
 	SAFE_SIGNAL(SIGINT, sigint_handler);
 
+	tst_enable_oom_protection(getppid());
+	tst_enable_oom_protection(getpid());
+
 	test_pid = fork();
 	if (test_pid < 0)
 		tst_brk(TBROK | TERRNO, "fork()");
 
 	if (!test_pid) {
+		tst_cancel_oom_protection(getpid());
 		SAFE_SIGNAL(SIGALRM, SIG_DFL);
 		SAFE_SIGNAL(SIGUSR1, SIG_DFL);
 		SAFE_SIGNAL(SIGINT, SIG_DFL);
-- 
2.31.1



* [LTP] [PATCH 3/3] oom: enable OOM protection for mem lib process
  2021-12-16  3:41 ` [LTP] [PATCH 1/3] lib: add functions to adjust oom score Li Wang
  2021-12-16  3:41   ` [LTP] [PATCH 2/3] ltp: enable OOM protection for main and test harness process Li Wang
@ 2021-12-16  3:41   ` Li Wang
  2021-12-16  7:57     ` Petr Vorel
  2021-12-16  7:49   ` [LTP] [PATCH 1/3] lib: add functions to adjust oom score Petr Vorel
  2 siblings, 1 reply; 19+ messages in thread
From: Li Wang @ 2021-12-16  3:41 UTC (permalink / raw)
  To: ltp

Simply invoke OOM protection in the mem library so that
it can collect the full state of its children.

Signed-off-by: Li Wang <liwang@redhat.com>
---
 testcases/kernel/mem/lib/mem.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/testcases/kernel/mem/lib/mem.c b/testcases/kernel/mem/lib/mem.c
index ac890491c..566e29055 100644
--- a/testcases/kernel/mem/lib/mem.c
+++ b/testcases/kernel/mem/lib/mem.c
@@ -129,8 +129,11 @@ void oom(int testcase, int lite, int retcode, int allow_sigkill)
 	pid_t pid;
 	int status, threads;
 
+	tst_enable_oom_protection(getpid());
+
 	switch (pid = SAFE_FORK()) {
 	case 0:
+		tst_cancel_oom_protection(getpid());
 		threads = MAX(1, tst_ncpus() - 1);
 		child_alloc(testcase, lite, threads);
 	default:
-- 
2.31.1



* Re: [LTP] [PATCH 1/3] lib: add functions to adjust oom score
  2021-12-16  3:41 ` [LTP] [PATCH 1/3] lib: add functions to adjust oom score Li Wang
  2021-12-16  3:41   ` [LTP] [PATCH 2/3] ltp: enable OOM protection for main and test harness process Li Wang
  2021-12-16  3:41   ` [LTP] [PATCH 3/3] oom: enable OOM protection for mem lib process Li Wang
@ 2021-12-16  7:49   ` Petr Vorel
  2021-12-17  2:02     ` Li Wang
  2 siblings, 1 reply; 19+ messages in thread
From: Petr Vorel @ 2021-12-16  7:49 UTC (permalink / raw)
  To: Li Wang; +Cc: ltp

Hi Li,

> This introduces function to LTP for adjusting the oom_score_adj of
> target process, which may be helpful in OOM tests to prevent kernel
> killing the main or lib process during test running.
very good idea.

Reviewed-by: Petr Vorel <pvorel@suse.cz>

> The exported global tst_enable_oom_protection function can be used
> at anywhere you want to protect, but please remember that if you
> do enable protection on a process($PID) that all the children will
> inherit its score and be ignored by OOM Killer as well. So that's
> why tst_cancel_oom_protection is recommended to combination in use.

BTW, are you deliberately not documenting it because it should not be
commonly used in tests? Also, although oom_score_adj inheritance should be
known to anyone who wants to add it somewhere, I'd move that note from the
commit message to the source code (into the header docs or the C API doc).

> +static void set_oom_score_adj(pid_t pid, int value)
> +{
> +	int val;
> +	char score_path[64];
> +
> +	if (access("/proc/self/oom_score_adj", F_OK) == -1) {
> +		tst_res(TINFO, "Warning: oom_score_adj is not exist");
nit: IMHO "does not exist" or just "not exist"

...

Kind regards,
Petr


* Re: [LTP] [PATCH 2/3] ltp: enable OOM protection for main and test harness process
  2021-12-16  3:41   ` [LTP] [PATCH 2/3] ltp: enable OOM protection for main and test harness process Li Wang
@ 2021-12-16  7:55     ` Petr Vorel
  2021-12-16  9:50     ` Martin Doucha
  1 sibling, 0 replies; 19+ messages in thread
From: Petr Vorel @ 2021-12-16  7:55 UTC (permalink / raw)
  To: Li Wang; +Cc: ltp

Hi Li,

Not really expert on memory, but this LGTM.

Reviewed-by: Petr Vorel <pvorel@suse.cz>

Kind regards,
Petr


* Re: [LTP] [PATCH 3/3] oom: enable OOM protection for mem lib process
  2021-12-16  3:41   ` [LTP] [PATCH 3/3] oom: enable OOM protection for mem lib process Li Wang
@ 2021-12-16  7:57     ` Petr Vorel
  0 siblings, 0 replies; 19+ messages in thread
From: Petr Vorel @ 2021-12-16  7:57 UTC (permalink / raw)
  To: Li Wang; +Cc: ltp

Hi Li,

Also LGTM.

Reviewed-by: Petr Vorel <pvorel@suse.cz>

Kind regards,
Petr


* Re: [LTP] [PATCH 2/3] ltp: enable OOM protection for main and test harness process
  2021-12-16  3:41   ` [LTP] [PATCH 2/3] ltp: enable OOM protection for main and test harness process Li Wang
  2021-12-16  7:55     ` Petr Vorel
@ 2021-12-16  9:50     ` Martin Doucha
  2021-12-17  1:50       ` Li Wang
  1 sibling, 1 reply; 19+ messages in thread
From: Martin Doucha @ 2021-12-16  9:50 UTC (permalink / raw)
  To: Li Wang, ltp

Hi,

On 16. 12. 21 4:41, Li Wang wrote:
> diff --git a/lib/tst_test.c b/lib/tst_test.c
> index ce2b8239d..f3ae48240 100644
> --- a/lib/tst_test.c
> +++ b/lib/tst_test.c
> @@ -1441,11 +1441,15 @@ static int fork_testrun(void)
>  
>  	SAFE_SIGNAL(SIGINT, sigint_handler);
>  
> +	tst_enable_oom_protection(getppid());

this is exactly what you should *NOT* do because then the OOM protection
will also be inherited by all non-LTP processes executed by the same
shell (or whatever the parent process is).

> +	tst_enable_oom_protection(getpid());
> +
>  	test_pid = fork();
>  	if (test_pid < 0)
>  		tst_brk(TBROK | TERRNO, "fork()");
>  
>  	if (!test_pid) {
> +		tst_cancel_oom_protection(getpid());
>  		SAFE_SIGNAL(SIGALRM, SIG_DFL);
>  		SAFE_SIGNAL(SIGUSR1, SIG_DFL);
>  		SAFE_SIGNAL(SIGINT, SIG_DFL);
> 


-- 
Martin Doucha   mdoucha@suse.cz
QA Engineer for Software Maintenance
SUSE LINUX, s.r.o.
CORSO IIa
Krizikova 148/34
186 00 Prague 8
Czech Republic


* Re: [LTP] [PATCH 2/3] ltp: enable OOM protection for main and test harness process
  2021-12-16  9:50     ` Martin Doucha
@ 2021-12-17  1:50       ` Li Wang
  0 siblings, 0 replies; 19+ messages in thread
From: Li Wang @ 2021-12-17  1:50 UTC (permalink / raw)
  To: Martin Doucha; +Cc: LTP List


On Thu, Dec 16, 2021 at 5:56 PM Martin Doucha <mdoucha@suse.cz> wrote:

> Hi,
>
> On 16. 12. 21 4:41, Li Wang wrote:
> > diff --git a/lib/tst_test.c b/lib/tst_test.c
> > index ce2b8239d..f3ae48240 100644
> > --- a/lib/tst_test.c
> > +++ b/lib/tst_test.c
> > @@ -1441,11 +1441,15 @@ static int fork_testrun(void)
> >
> >       SAFE_SIGNAL(SIGINT, sigint_handler);
> >
> > +     tst_enable_oom_protection(getppid());
>
> this is exactly what you should *NOT* do because then the OOM protection
> will also be inherited by all non-LTP processes executed by the same
> shell (or whatever the parent process is).
>

You are right! I previously thought the parent process was only ltp-pan
and that we only needed to cancel the protection in fork_testrun's children.
But obviously, one thing I neglected is that some shell tests will still
be affected. Furthermore, if an LTP test is run manually, the parent
will be the shell, and non-LTP processes will also inherit the score.

Thanks for pointing this out, I will remove this line in V2.

-- 
Regards,
Li Wang


* Re: [LTP] [PATCH 1/3] lib: add functions to adjust oom score
  2021-12-16  7:49   ` [LTP] [PATCH 1/3] lib: add functions to adjust oom score Petr Vorel
@ 2021-12-17  2:02     ` Li Wang
  2021-12-17  8:25       ` Petr Vorel
  0 siblings, 1 reply; 19+ messages in thread
From: Li Wang @ 2021-12-17  2:02 UTC (permalink / raw)
  To: Petr Vorel; +Cc: LTP List


Hi Petr,

On Thu, Dec 16, 2021 at 3:50 PM Petr Vorel <pvorel@suse.cz> wrote:

> Hi Li,
>
> > This introduces function to LTP for adjusting the oom_score_adj of
> > target process, which may be helpful in OOM tests to prevent kernel
> > killing the main or lib process during test running.
> very good idea.
>
> Reviewed-by: Petr Vorel <pvorel@suse.cz>
>
> > The exported global tst_enable_oom_protection function can be used
> > at anywhere you want to protect, but please remember that if you
> > do enable protection on a process($PID) that all the children will
> > inherit its score and be ignored by OOM Killer as well. So that's
> > why tst_cancel_oom_protection is recommended to combination in use.
>
> BTW deliberately not documenting it as it should not be commonly
>

Yes, it is actually not an API we recommend to users, and I think
we should really avoid using it unless we have no better choice
(at least for OOM tests, I can say this).
The main reason we use it is that the current kernel OOM killer is not
perfect; we just use it to help get a complete log for LTP.



> used in tests? Also although oom_score_adj inheritance should be known to
> person who will want to add it somewhere, I'd move it from commit message
> to
> source code (into header docs or or C API doc).
>

Sounds reasonable, will add this in V2.


>
> > +static void set_oom_score_adj(pid_t pid, int value)
> > +{
> > +     int val;
> > +     char score_path[64];
> > +
> > +     if (access("/proc/self/oom_score_adj", F_OK) == -1) {
> > +             tst_res(TINFO, "Warning: oom_score_adj is not exist");
> nit: IMHO "does not exist" or just "not exist"
>

Agree.

-- 
Regards,
Li Wang


* Re: [LTP] [PATCH 1/3] lib: add functions to adjust oom score
  2021-12-17  2:02     ` Li Wang
@ 2021-12-17  8:25       ` Petr Vorel
  0 siblings, 0 replies; 19+ messages in thread
From: Petr Vorel @ 2021-12-17  8:25 UTC (permalink / raw)
  To: Li Wang; +Cc: LTP List

Hi Li,

> > BTW deliberately not documenting it as it should not be commonly

> Yes, actually it's not a commendatory API to users, and I think
> we do really avoid using it unless we have no better choice.
> (at least for OOM tests I can tell this)
> The main reason we use it is current kernel OOM is not very
> perfect, we just use it to help get the completed log for LTP.

Understand, thx for info.


> > used in tests? Also although oom_score_adj inheritance should be known to
> > person who will want to add it somewhere, I'd move it from commit message
> > to
> > source code (into header docs or or C API doc).

> Sounds reasonable, will add this in V2.

Thx!

Kind regards,
Petr
