* [LTP] Memory requirements for ltp
From: Richard Purdie @ 2020-05-26  8:54 UTC
  To: ltp

Hi,

I work on the Yocto Project and we run ltp tests as part of our testing
infrastructure. We're having problems where the tests hang during
execution and are trying to figure out why as this is disruptive.

It appears to be the controllers tests which hang. It's also clear we
are running the tests on a system with too little memory (512MB) as
there is OOM killer activity all over the logs (as well as errors from
missing tools like nice, bc, gdb, ifconfig and others).

I did dump all the logs and output I could find into a bug for tracking
purposes, https://bugzilla.yoctoproject.org/show_bug.cgi?id=13802

Petr tells me SUSE uses 4GB for QEMU; does anyone have any other
boundaries on what works/doesn't work?

Other questions that come to mind:

Could/should ltp test for the tools it uses up front?
Are there any particular tests we should avoid as they are known to be
unreliable?

The ones we're currently running are:

"math", "syscalls", "dio", "io", "mm", "ipc", "sched", "nptl", "pty",
"containers", "controllers", 
"filecaps", "cap_bounds", "fcntl-locktests", "connectors", "commands",
"net.ipv6_lib", "input",
"fs_perms_simple", "fs", "fsx", "fs_bind"

Someone suggested I should just remove controllers but I'm not sure
that is the best way forward.
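
For reference, dropping a group would just mean trimming the -f list we
pass to runltp; roughly (log paths illustrative):

./runltp -p -q \
    -f math,syscalls,dio,io,mm,ipc,sched,nptl,pty,containers,controllers,\
filecaps,cap_bounds,fcntl-locktests,connectors,commands,net.ipv6_lib,input,\
fs_perms_simple,fs,fsx,fs_bind \
    -l /tmp/ltp.log -o /tmp/ltp.out

minus "controllers" if we go that way; individual test tags can also be
skipped with -S <skipfile> instead of dropping a whole group.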

I will test with more memory (not sure how much yet) but I'd welcome
more data if anyone has any.

Cheers,

Richard



* [LTP] Memory requirements for ltp
From: Petr Vorel @ 2020-05-28 15:13 UTC
  To: ltp

Hi Richard,

> I work on the Yocto Project and we run ltp tests as part of our testing
> infrastructure. We're having problems where the tests hang during
> execution and are trying to figure out why as this is disruptive.

> It appears to be the controllers tests which hang. It's also clear we
> are running the tests on a system with too little memory (512MB) as
> there is OOM killer activity all over the logs (as well as errors from
> missing tools like nice, bc, gdb, ifconfig and others).
TCONF messages for missing tools are ok (although many of the dependencies are
available in busybox).
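FWIW, tests already ported to the new shell API declare the commands they
need up front and the library TCONFs when one is missing. A minimal sketch
of the pattern:

TST_NEEDS_CMDS="nice bc"
TST_TESTFUNC=do_test

do_test()
{
	tst_res TPASS "all required commands present"
}

. tst_test.sh

The legacy test.sh tests have nothing like this.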

> I did dump all the logs and output I could find into a bug for tracking
> purposes, https://bugzilla.yoctoproject.org/show_bug.cgi?id=13802

> Petr tells me SUSE uses 4GB for QEMU; does anyone have any other
> boundaries on what works/doesn't work?

Maybe total memory itself isn't the only parameter; available memory
would be more interesting, and that depends on the user space in use and
the particular kernel configuration (which can differ across embedded
targets).

> Other questions that come to mind:

> Could/should ltp test for the tools it uses up front?
> Are there any particular tests we should avoid as they are known to be
> unreliable?
IMHO generally anything which hasn't been changed for 10+ years :), i.e.
tests which still use the legacy API (test.h or test.sh).
(Well, of course, some old tests are stable and new ones can be broken.)

> The ones we're currently running are:

> "math", "syscalls", "dio", "io", "mm", "ipc", "sched", "nptl", "pty",
> "containers", "controllers", 
> "filecaps", "cap_bounds", "fcntl-locktests", "connectors", "commands",
> "net.ipv6_lib", "input",
> "fs_perms_simple", "fs", "fsx", "fs_bind"
IMHO also worth running: cve (many of these tests are also in other
runtest files, but here they are indexed by CVE number), uevent (a new
one), and maybe pty (it has found some bugs).
Maybe you're also interested in testing some networking:
net_stress.interface and net_stress.ipsec_icmp are stable.

The last release added lvmtest.sh, which tests filesystems over LVM
(maybe not interesting for you due to limited disk capacity; what I like
is that it detects the filesystems present via generate_lvm_runfile.sh).

> Someone suggested I should just remove controllers but I'm not sure
> that is the best way forward.
Until the tests are fixed, it's really better to disable them.
They need to be revised: there are open bugs [1] and no support for cgroup2.

BTW, the C-based tests in testcases/kernel/mem/ use cgroups, so there
will still be some (very limited) cgroup testing. And there is an attempt
to improve these tests [2] to fix their use with cgroup2 [3].
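
Whether a system is already on the unified hierarchy can be checked
quickly; for example:

stat -fc %T /sys/fs/cgroup

prints "cgroup2fs" on cgroup2 and "tmpfs" on the usual v1 layout.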

> I will test with more memory (not sure how much yet) but I'd welcome
> more data if anyone has any.
Please report back the results of your investigation.

Kind regards,
Petr

[1] https://github.com/linux-test-project/ltp/issues?q=is%3Aissue+is%3Aopen+cgroup
[2] https://patchwork.ozlabs.org/project/ltp/list/?series=178573
[3] https://github.com/linux-test-project/ltp/issues/611


* [LTP] Memory requirements for ltp
From: Cyril Hrubis @ 2020-06-01 15:06 UTC
  To: ltp

Hi!
> I work on the Yocto Project and we run ltp tests as part of our testing
> infrastructure. We're having problems where the tests hang during
> execution and are trying to figure out why as this is disruptive.
> 
> It appears to be the controllers tests which hang. It's also clear we
> are running the tests on a system with too little memory (512MB) as
> there is OOM killer activity all over the logs (as well as errors from
> missing tools like nice, bc, gdb, ifconfig and others).

We do have plans to scale memory-intensive testcases with the system
memory, but that hasn't been put into action yet. See:

https://github.com/linux-test-project/ltp/issues/664

Generally most of the tests should run fine with 1GB of RAM, and
everything should run well with 2GB.

The cgroup stress tests create a lot of directories in the hierarchy and
attach processes to them, so they may cause OOM and timeouts on embedded
hardware. Ideally they should have some heuristic for how many processes
we can fork given the available system memory, and skip the more
intensive testcases if needed. But even estimating how much memory a
process and the cgroup hierarchy could take would not be that
trivial...
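
Just to sketch the idea in plain C (the per-child cost constant below is
completely made up and would have to be measured for a real cgroup plus
attached process):

#include <stdio.h>

/* Sketch of the heuristic: cap the number of forked children by
 * MemAvailable. The per-child cost is invented and would need to be
 * measured for a real cgroup+process pair. */
static long read_mem_available_kb(void)
{
	char line[128];
	long kb;
	FILE *f = fopen("/proc/meminfo", "r");

	if (!f)
		return -1;

	while (fgets(line, sizeof(line), f)) {
		if (sscanf(line, "MemAvailable: %ld kB", &kb) == 1) {
			fclose(f);
			return kb;
		}
	}

	fclose(f);
	return -1;
}

int main(void)
{
	const long est_child_cost_kb = 4 * 1024; /* made-up estimate */
	long avail_kb = read_mem_available_kb();
	/* use at most half of the available memory for stress children */
	long max_children = avail_kb > 0 ? avail_kb / 2 / est_child_cost_kb : 1;

	printf("MemAvailable=%ld kB -> at most %ld children\n",
	       avail_kb, max_children);
	return 0;
}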

> I did dump all the logs and output I could find into a bug for tracking
> purposes, https://bugzilla.yoctoproject.org/show_bug.cgi?id=13802
> 
> Petr tells me SUSE uses 4GB for QEMU; does anyone have any other
> boundaries on what works/doesn't work?
> 
> Other questions that come to mind:
> 
> Could/should ltp test for the tools it uses up front?

This is actually being solved, slowly: we are moving to a declarative
approach where test requirements are listed in a static structure. There
is also a parser that can extract that information and produce a JSON
file describing all (new library) tests in the LTP testsuite. However,
this is still experimental and out-of-tree at this point. But I do have a
web page demo that renders that JSON at:

http://metan.ucw.cz/outgoing/metadata.html

So in the (hopefully not so far) future the testrunner would consume
that file and could make much better decisions based on that metadata.
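
To make "declarative" concrete: in the new C library a test lists its
requirements as data in the tst_test structure, for example (a minimal
sketch, the fields are just a sample of what can be declared):

#include "tst_test.h"

static void run(void)
{
	tst_res(TPASS, "nothing to check here");
}

static struct tst_test test = {
	.test_all = run,
	.needs_root = 1,	/* TCONF when not running as root */
	.needs_tmpdir = 1,	/* library creates and removes a tmpdir */
	.min_kver = "4.14",	/* TCONF on older kernels */
};

The parser extracts exactly these fields into the JSON.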

The main motivation for me is parallel test runs: if the testrunner knows
what resources the testcases require/use, we can easily avoid them
competing for those resources and the false positives caused by this.

> Are there any particular tests we should avoid as they are known to be
> unreliable?
> 
> The ones we're currently running are:
> 
> "math", "syscalls", "dio", "io", "mm", "ipc", "sched", "nptl", "pty",
> "containers", "controllers", 
> "filecaps", "cap_bounds", "fcntl-locktests", "connectors", "commands",
> "net.ipv6_lib", "input",
> "fs_perms_simple", "fs", "fsx", "fs_bind"
> 
> Someone suggested I should just remove controllers but I'm not sure
> that is the best way forward.
> 
> I will test with more memory (not sure how much yet) but I'd welcome
> more data if anyone has any.

I would advise filtering out the oom* testcases from mm if you have
problems with the OOM killer killing the wrong processes; these testcases
are intended to trigger OOM and test that the kernel is able to recover,
but they tend to be problematic, especially on machines with little RAM.
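
The filtering itself can be done with runltp's -S option; a skipfile is
just one runtest tag per line, so something like (tag names as in
runtest/mm, do check them against your LTP version):

oom01
oom02
oom03
oom04
oom05

passed as ./runltp -f mm -S /path/to/skipfile.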

Apart from that, the rest should be reasonably safe on modern hardware,
but with less than 1GB of RAM your mileage may vary.

-- 
Cyril Hrubis
chrubis@suse.cz
