All of lore.kernel.org
 help / color / mirror / Atom feed
* [LTP] [PATCH v6 0/7] CGroup API rewrite
@ 2021-05-04 13:40 Richard Palethorpe
  2021-05-04 13:40 ` [LTP] [PATCH v6 1/7] API: Add safe openat, printfat, readat and unlinkat Richard Palethorpe
                   ` (6 more replies)
  0 siblings, 7 replies; 11+ messages in thread
From: Richard Palethorpe @ 2021-05-04 13:40 UTC (permalink / raw)
  To: ltp

Hello,

This is a complete rewrite of the CGroups API. To understand why this
is so complicated, please see the commments in tst_cgroup.h and
tst_cgroup.c.

V6:

* Make cgroup_dir_mk return void as its return value is never used.

* Constify tst_safe_file_at, remove TEST macro and remove nested
  ternary expression from safe_unlinkat.

* Make fd_path, cgroup_roots and cgroup_dir_mk static.

* Add nonnull and warn_unused_return attributes.

* fix ksm02 to get the drain group. Note that nonnull prevents this
  mistake at compile time.

* Just use strcpy cgroup_cgroup_init

* Fix madvise06 doc comment

* Reviewed-by: Cyril Hrubis <chrubis@suse.cz>

V5:

* Change group_name to a dynamic buffer and use NAME_MAX for file name
  buffer sizes.

* Compare whole string in file_find and brk if name is not in
  recognised format.

* Fix bug introduced in V4 by mixing up cgroup.subtree_control and
  cgroup.clone_children.

* Add note to for_each_dir.

* Add check for the number of conversions in cgroup_scanf. This
  requires count_scanf_conversions to become public.

* Constify

* Add license and stdio.h to tst_safe_file_at.c

V4:

* Move openat based helpers to tst_safe_file_at.h.

* Switch to userland naming of controllers (ctrl for short) instead of
  kernel's css (CGroup subsystem).

* Use more descriptive names such as dir_fd and tst_cgroup_group. Note
  that we discussed calling it tst_cgroup_node. However in the end I
  thought this was too generic and instead went with repetition.

* Split cgroup_item into cgroup_file and cgroup_ctrl. Also make the
  lookup tree definition more verbose to avoid the missing field
  warnings and setting the ctrl index at runtime.

* make enum tst_cgroup_ctrl private; use controller name in
  tst_cgroup_require.

* make struct tst_cgroup_dir private.

* Eliminate use of cgroup_get_ctrl and pass the ctrl struct around
  instead of the indx.

* Deleted test21 as it is superseded by tst_cgroup02.c.

* Remove container_of macro as it is no longer used by this patch set.

* Fix tst_cgroup_{scan,require,mount} in unlikely case it is run
  inside cleanup or some other context where tst_brk may return.

* Fix potential null-ptr-deref in cgroup_file_alias.

* Include sys/types.h in header due to use of (s)size_t.

V3:

* Replaced the object API with a string based lookup.

* Replaced tst_cgroup_css struct and name mapping functions with an
  item info tree.

* Merged the header files again as there is no longer much seperation
  between the core and item parts of the library.

* Rename some variables and functions to make them more consistent.

V2:

* Created an object (Item) API which looks a bit like the unified V2
  hierarchy. The implementation is quite verbose, but not complicated
  IMO.

* Add the ability to extend the LTP CGroup hierarchy with child
  groups. We already have a reproducer that requires such a hierarchy,
  but I have not had chance to turn it into a test case yet.

* Add documentation for the new API in test-writing-guidelines.txt.

* Convert madvise06 to the CGroups API

* Better error reporting for the *at functions. Add tst_decode_fd
  which tries to print the path an FD was opened with.

TODO/NOTES:

* There are other tests which mount CGroups in an ad-hoc way and need
  to be converted to the new API. This at least includes memcg_test_3
  and maybe cgroup_xattr.

Richard Palethorpe (7):
  API: Add safe openat, printfat, readat and unlinkat
  API: Make tst_count_scanf_conversions public
  Add new CGroups APIs
  Add new CGroups API library tests
  docs: Update CGroups API
  mem: Convert tests to new CGroups API
  madvise06: Convert to new CGroups API

 doc/test-writing-guidelines.txt               |  175 ++-
 include/safe_file_ops_fn.h                    |   10 +
 include/tst_cgroup.h                          |  190 ++-
 include/tst_safe_file_at.h                    |   61 +
 include/tst_test.h                            |    1 -
 lib/newlib_tests/.gitignore                   |    3 +-
 lib/newlib_tests/test21.c                     |   66 -
 lib/newlib_tests/tst_cgroup01.c               |   51 +
 lib/newlib_tests/tst_cgroup02.c               |   90 ++
 lib/safe_file_ops.c                           |   16 +-
 lib/tst_cgroup.c                              | 1304 +++++++++++++----
 lib/tst_safe_file_at.c                        |  197 +++
 testcases/kernel/mem/cpuset/cpuset01.c        |   34 +-
 testcases/kernel/mem/include/mem.h            |    2 +-
 testcases/kernel/mem/ksm/ksm02.c              |   14 +-
 testcases/kernel/mem/ksm/ksm03.c              |   12 +-
 testcases/kernel/mem/ksm/ksm04.c              |   17 +-
 testcases/kernel/mem/lib/mem.c                |   10 +-
 testcases/kernel/mem/oom/oom03.c              |   18 +-
 testcases/kernel/mem/oom/oom04.c              |   19 +-
 testcases/kernel/mem/oom/oom05.c              |   32 +-
 testcases/kernel/syscalls/madvise/madvise06.c |  129 +-
 22 files changed, 1879 insertions(+), 572 deletions(-)
 create mode 100644 include/tst_safe_file_at.h
 delete mode 100644 lib/newlib_tests/test21.c
 create mode 100644 lib/newlib_tests/tst_cgroup01.c
 create mode 100644 lib/newlib_tests/tst_cgroup02.c
 create mode 100644 lib/tst_safe_file_at.c

-- 
2.31.1



^ permalink raw reply	[flat|nested] 11+ messages in thread

* [LTP] [PATCH v6 1/7] API: Add safe openat, printfat, readat and unlinkat
  2021-05-04 13:40 [LTP] [PATCH v6 0/7] CGroup API rewrite Richard Palethorpe
@ 2021-05-04 13:40 ` Richard Palethorpe
  2021-05-04 13:40 ` [LTP] [PATCH v6 2/7] API: Make tst_count_scanf_conversions public Richard Palethorpe
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 11+ messages in thread
From: Richard Palethorpe @ 2021-05-04 13:40 UTC (permalink / raw)
  To: ltp

Add 'at' variants for a number of system calls and LTP SAFE API
functions. This avoids using sprintf everywhere to build paths.

Also adds tst_decode_fd which allows us to retrieve the path for an FD
for debugging purposes without having to store it ourselves. However
the proc symlink may not be available on some systems.

Signed-off-by: Richard Palethorpe <rpalethorpe@suse.com>
Reviewed-by: Petr Vorel <pvorel@suse.cz>
Reviewed-by: Cyril Hrubis <chrubis@suse.cz>
---
 include/tst_safe_file_at.h |  61 ++++++++++++
 lib/tst_safe_file_at.c     | 197 +++++++++++++++++++++++++++++++++++++
 2 files changed, 258 insertions(+)
 create mode 100644 include/tst_safe_file_at.h
 create mode 100644 lib/tst_safe_file_at.c

diff --git a/include/tst_safe_file_at.h b/include/tst_safe_file_at.h
new file mode 100644
index 000000000..8df34227f
--- /dev/null
+++ b/include/tst_safe_file_at.h
@@ -0,0 +1,61 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+/*
+ * Copyright (c) 2021 SUSE LLC <rpalethorpe@suse.com>
+ */
+
+#ifndef TST_SAFE_FILE_AT_H
+#define TST_SAFE_FILE_AT_H
+
+#include <sys/types.h>
+#include <stdarg.h>
+
+#define SAFE_OPENAT(dirfd, path, oflags, ...)			\
+	safe_openat(__FILE__, __LINE__,				\
+		    (dirfd), (path), (oflags), ## __VA_ARGS__)
+
+#define SAFE_FILE_READAT(dirfd, path, buf, nbyte)			\
+	safe_file_readat(__FILE__, __LINE__,				\
+			 (dirfd), (path), (buf), (nbyte))
+
+
+#define SAFE_FILE_PRINTFAT(dirfd, path, fmt, ...)			\
+	safe_file_printfat(__FILE__, __LINE__,				\
+			   (dirfd), (path), (fmt), __VA_ARGS__)
+
+#define SAFE_UNLINKAT(dirfd, path, flags)				\
+	safe_unlinkat(__FILE__, __LINE__, (dirfd), (path), (flags))
+
+const char *tst_decode_fd(const int fd)
+			  __attribute__((warn_unused_result));
+
+int safe_openat(const char *const file, const int lineno, const int dirfd,
+                const char *const path, const int oflags, ...)
+		__attribute__((nonnull, warn_unused_result));
+
+ssize_t safe_file_readat(const char *const file, const int lineno,
+			 const int dirfd, const char *const path,
+			 char *const buf, const size_t nbyte)
+			 __attribute__ ((nonnull));
+
+int tst_file_vprintfat(const int dirfd, const char *const path,
+		       const char *const fmt, va_list va)
+		       __attribute__((nonnull));
+int tst_file_printfat(const int dirfd, const char *const path,
+		      const char *const fmt, ...)
+		      __attribute__ ((format (printf, 3, 4), nonnull));
+
+int safe_file_vprintfat(const char *const file, const int lineno,
+			const int dirfd, const char *const path,
+			const char *const fmt, va_list va)
+			__attribute__ ((nonnull));
+
+int safe_file_printfat(const char *const file, const int lineno,
+		       const int dirfd, const char *const path,
+		       const char *const fmt, ...)
+		       __attribute__ ((format (printf, 5, 6), nonnull));
+
+int safe_unlinkat(const char *const file, const int lineno,
+		  const int dirfd, const char *const path, const int flags)
+		  __attribute__ ((nonnull));
+
+#endif
diff --git a/lib/tst_safe_file_at.c b/lib/tst_safe_file_at.c
new file mode 100644
index 000000000..ca8ef2f68
--- /dev/null
+++ b/lib/tst_safe_file_at.c
@@ -0,0 +1,197 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+/*
+ * Copyright (c) 2021 SUSE LLC <rpalethorpe@suse.com>
+ */
+
+#define _GNU_SOURCE
+#include <stdio.h>
+#include "lapi/fcntl.h"
+#include "tst_safe_file_at.h"
+
+#define TST_NO_DEFAULT_MAIN
+#include "tst_test.h"
+
+static char fd_path[PATH_MAX];
+
+const char *tst_decode_fd(const int fd)
+{
+	ssize_t ret;
+	char proc_path[32];
+
+	if (fd < 0)
+		return "!";
+
+	sprintf(proc_path, "/proc/self/fd/%d", fd);
+	ret = readlink(proc_path, fd_path, sizeof(fd_path));
+
+	if (ret < 0)
+		return "?";
+
+	fd_path[ret] = '\0';
+
+	return fd_path;
+}
+
+int safe_openat(const char *const file, const int lineno,
+		const int dirfd, const char *const path, const int oflags, ...)
+{
+	va_list ap;
+	int fd;
+	mode_t mode;
+
+	va_start(ap, oflags);
+	mode = va_arg(ap, int);
+	va_end(ap);
+
+	fd = openat(dirfd, path, oflags, mode);
+	if (fd > -1)
+		return fd;
+
+	tst_brk_(file, lineno, TBROK | TERRNO,
+		 "openat(%d<%s>, '%s', %o, %o)",
+		 dirfd, tst_decode_fd(dirfd), path, oflags, mode);
+
+	return fd;
+}
+
+ssize_t safe_file_readat(const char *const file, const int lineno,
+			 const int dirfd, const char *const path,
+			 char *const buf, const size_t nbyte)
+{
+	int fd = safe_openat(file, lineno, dirfd, path, O_RDONLY);
+	ssize_t rval;
+
+	if (fd < 0)
+		return -1;
+
+	rval = safe_read(file, lineno, NULL, 0, fd, buf, nbyte - 1);
+	if (rval < 0)
+		return -1;
+
+	close(fd);
+	buf[rval] = '\0';
+
+	if (rval >= (ssize_t)nbyte - 1) {
+		tst_brk_(file, lineno, TBROK,
+			"Buffer length %zu too small to read %d<%s>/%s",
+			nbyte, dirfd, tst_decode_fd(dirfd), path);
+	}
+
+	return rval;
+}
+
+int tst_file_vprintfat(const int dirfd, const char *const path,
+		       const char *const fmt, va_list va)
+{
+	const int fd = openat(dirfd, path, O_WRONLY);
+	int ret, errno_cpy;
+
+	if (fd < 0)
+		return -1;
+
+	ret = vdprintf(fd, fmt, va);
+	errno_cpy = errno;
+	close(fd);
+
+	if (ret < 0) {
+		errno = errno_cpy;
+		return -2;
+	}
+
+	return ret;
+}
+
+int tst_file_printfat(const int dirfd, const char *const path,
+		      const char *const fmt, ...)
+{
+	va_list va;
+	int rval;
+
+	va_start(va, fmt);
+	rval = tst_file_vprintfat(dirfd, path, fmt, va);
+	va_end(va);
+
+	return rval;
+}
+
+int safe_file_vprintfat(const char *const file, const int lineno,
+			const int dirfd, const char *const path,
+			const char *const fmt, va_list va)
+{
+	char buf[16];
+	va_list vac;
+	int rval, errno_cpy;
+
+	va_copy(vac, va);
+
+	rval = tst_file_vprintfat(dirfd, path, fmt, va);
+
+	if (rval == -2) {
+		errno_cpy = errno;
+		rval = vsnprintf(buf, sizeof(buf), fmt, vac);
+		va_end(vac);
+
+		if (rval >= (ssize_t)sizeof(buf))
+			strcpy(buf + sizeof(buf) - 5, "...");
+		else if (rval < 0)
+			buf[0] = '\0';
+
+		errno = errno_cpy;
+		tst_brk_(file, lineno, TBROK | TERRNO,
+			 "vdprintf(%d<%s>, '%s', '%s'<%s>)",
+			 dirfd, tst_decode_fd(dirfd), path, fmt, buf);
+		return -1;
+	}
+
+	va_end(vac);
+
+	if (rval == -1) {
+		tst_brk_(file, lineno, TBROK | TERRNO,
+			"openat(%d<%s>, '%s', O_WRONLY)",
+			dirfd, tst_decode_fd(dirfd), path);
+	}
+
+	return rval;
+}
+
+int safe_file_printfat(const char *const file, const int lineno,
+		       const int dirfd, const char *const path,
+		       const char *const fmt, ...)
+{
+	va_list va;
+	int rval;
+
+	va_start(va, fmt);
+	rval = safe_file_vprintfat(file, lineno, dirfd, path, fmt, va);
+	va_end(va);
+
+	return rval;
+}
+
+int safe_unlinkat(const char *const file, const int lineno,
+		  const int dirfd, const char *const path, const int flags)
+{
+	const int rval = unlinkat(dirfd, path, flags);
+	const char *flags_sym;
+
+	if (!rval)
+		return rval;
+
+	switch(flags) {
+	case AT_REMOVEDIR:
+		flags_sym = "AT_REMOVEDIR";
+		break;
+	case 0:
+		flags_sym = "0";
+		break;
+	default:
+		flags_sym = "?";
+		break;
+	}
+
+	tst_brk_(file, lineno, TBROK | TERRNO,
+		 "unlinkat(%d<%s>, '%s', %s)",
+		 dirfd, tst_decode_fd(dirfd), path, flags_sym);
+
+	return rval;
+}
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [LTP] [PATCH v6 2/7] API: Make tst_count_scanf_conversions public
  2021-05-04 13:40 [LTP] [PATCH v6 0/7] CGroup API rewrite Richard Palethorpe
  2021-05-04 13:40 ` [LTP] [PATCH v6 1/7] API: Add safe openat, printfat, readat and unlinkat Richard Palethorpe
@ 2021-05-04 13:40 ` Richard Palethorpe
  2021-05-04 13:40 ` [LTP] [PATCH v6 3/7] Add new CGroups APIs Richard Palethorpe
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 11+ messages in thread
From: Richard Palethorpe @ 2021-05-04 13:40 UTC (permalink / raw)
  To: ltp

Useful in other parts of the library like tst_cgroup.c

Signed-off-by: Richard Palethorpe <rpalethorpe@suse.com>
Reviewed-by: Cyril Hrubis <chrubis@suse.cz>
---
 include/safe_file_ops_fn.h | 10 ++++++++++
 lib/safe_file_ops.c        | 16 ++++------------
 2 files changed, 14 insertions(+), 12 deletions(-)

diff --git a/include/safe_file_ops_fn.h b/include/safe_file_ops_fn.h
index ed7d978dd..6d680967b 100644
--- a/include/safe_file_ops_fn.h
+++ b/include/safe_file_ops_fn.h
@@ -23,6 +23,16 @@
 
 #include "lapi/utime.h"
 
+/*
+ * Count number of expected assigned conversions. Any conversion starts with '%'.
+ * The '%%' matches % and no assignment is done. The %*x matches as x would do but
+ * the assignment is suppressed.
+ *
+ * NOTE: This is not 100% correct for complex scanf strings, but will do for
+ *       all of our intended usage.
+ */
+int tst_count_scanf_conversions(const char *fmt);
+
 /*
  * All-in-one function to scanf value(s) from a file.
  */
diff --git a/lib/safe_file_ops.c b/lib/safe_file_ops.c
index 0ec2ff8fe..249a512a1 100644
--- a/lib/safe_file_ops.c
+++ b/lib/safe_file_ops.c
@@ -34,15 +34,7 @@
 #include "test.h"
 #include "safe_file_ops_fn.h"
 
-/*
- * Count number of expected assigned conversions. Any conversion starts with '%'.
- * The '%%' matches % and no assignment is done. The %*x matches as x would do but
- * the assignment is suppressed.
- *
- * NOTE: This is not 100% correct for complex scanf strings, but will do for
- *       all of our intended usage.
- */
-static int count_scanf_conversions(const char *fmt)
+int tst_count_scanf_conversions(const char *fmt)
 {
 	unsigned int cnt = 0;
 	int flag = 0;
@@ -89,7 +81,7 @@ int file_scanf(const char *file, const int lineno,
 		return 1;
 	}
 
-	exp_convs = count_scanf_conversions(fmt);
+	exp_convs = tst_count_scanf_conversions(fmt);
 
 	va_start(va, fmt);
 	ret = vfscanf(f, fmt, va);
@@ -141,7 +133,7 @@ void safe_file_scanf(const char *file, const int lineno,
 		return;
 	}
 
-	exp_convs = count_scanf_conversions(fmt);
+	exp_convs = tst_count_scanf_conversions(fmt);
 
 	va_start(va, fmt);
 	ret = vfscanf(f, fmt, va);
@@ -195,7 +187,7 @@ int file_lines_scanf(const char *file, const int lineno,
 		return 1;
 	}
 
-	arg_count = count_scanf_conversions(fmt);
+	arg_count = tst_count_scanf_conversions(fmt);
 
 	while (fgets(line, BUFSIZ, fp) != NULL) {
 		va_start(ap, fmt);
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [LTP] [PATCH v6 3/7] Add new CGroups APIs
  2021-05-04 13:40 [LTP] [PATCH v6 0/7] CGroup API rewrite Richard Palethorpe
  2021-05-04 13:40 ` [LTP] [PATCH v6 1/7] API: Add safe openat, printfat, readat and unlinkat Richard Palethorpe
  2021-05-04 13:40 ` [LTP] [PATCH v6 2/7] API: Make tst_count_scanf_conversions public Richard Palethorpe
@ 2021-05-04 13:40 ` Richard Palethorpe
  2021-05-06  7:56   ` Li Wang
  2021-05-04 13:40 ` [LTP] [PATCH v6 4/7] Add new CGroups API library tests Richard Palethorpe
                   ` (3 subsequent siblings)
  6 siblings, 1 reply; 11+ messages in thread
From: Richard Palethorpe @ 2021-05-04 13:40 UTC (permalink / raw)
  To: ltp

Complete rewrite of the CGroups API which provides two layers of
indirection between the test author and the SUT's CGroup
configuration.

Signed-off-by: Richard Palethorpe <rpalethorpe@suse.com>
Reviewed-by: Cyril Hrubis <chrubis@suse.cz>
---
 include/tst_cgroup.h |  190 +++++-
 include/tst_test.h   |    1 -
 lib/tst_cgroup.c     | 1304 ++++++++++++++++++++++++++++++++----------
 3 files changed, 1154 insertions(+), 341 deletions(-)

diff --git a/include/tst_cgroup.h b/include/tst_cgroup.h
index bfd848260..bcf465a91 100644
--- a/include/tst_cgroup.h
+++ b/include/tst_cgroup.h
@@ -2,46 +2,190 @@
 /*
  * Copyright (c) 2020 Red Hat, Inc.
  * Copyright (c) 2020 Li Wang <liwang@redhat.com>
+ * Copyright (c) 2020-2021 SUSE LLC <rpalethorpe@suse.com>
+ */
+/*\
+ * [DESCRIPTION]
+ *
+ * The LTP CGroups API tries to present a consistent interface to the
+ * many possible CGroup configurations a system could have.
+ *
+ * You may ask; "Why don't you just mount a simple CGroup hierarchy,
+ * instead of scanning the current setup?". The short answer is that
+ * it is not possible unless no CGroups are currently active and
+ * almost all of our users will have CGroups active. Even if
+ * unmounting the current CGroup hierarchy is a reasonable thing to do
+ * to the sytem manager, it is highly unlikely the CGroup hierarchy
+ * will be destroyed. So users would be forced to remove their CGroup
+ * configuration and reboot the system.
+ *
+ * The core library tries to ensure an LTP CGroup exists on each
+ * hierarchy root. Inside the LTP group it ensures a 'drain' group
+ * exists and creats a test group for the current test. In the worst
+ * case we end up with a set of hierarchies like the follwoing. Where
+ * existing system-manager-created CGroups have been omitted.
+ *
+ * 	(V2 Root)	(V1 Root 1)	...	(V1 Root N)
+ * 	    |		     |			     |
+ *	  (ltp)		   (ltp)	...	   (ltp)
+ *	 /     \	  /	\		  /	\
+ *  (drain) (test-n) (drain)  (test-n)  ...  (drain)  (test-n)
+ *
+ * V2 CGroup controllers use a single unified hierarchy on a single
+ * root. Two or more V1 controllers may share a root or have their own
+ * root. However there may exist only one instance of a controller.
+ * So you can not have the same V1 controller on multiple roots.
+ *
+ * It is possible to have both a V2 hierarchy and V1 hierarchies
+ * active at the same time. Which is what is shown above. Any
+ * controllers attached to V1 hierarchies will not be available in the
+ * V2 hierarchy. The reverse is also true.
+ *
+ * Note that a single hierarchy may be mounted multiple
+ * times. Allowing it to be accessed at different locations. However
+ * subsequent mount operations will fail if the mount options are
+ * different from the first.
+ *
+ * The user may pre-create the CGroup hierarchies and the ltp CGroup,
+ * otherwise the library will try to create them. If the ltp group
+ * already exists and has appropriate permissions, then admin
+ * privileges will not be required to run the tests.
+ *
+ * Because the test may not have access to the CGroup root(s), the
+ * drain CGroup is created. This can be used to store processes which
+ * would otherwise block the destruction of the individual test CGroup
+ * or one of its descendants.
+ *
+ * The test author may create child CGroups within the test CGroup
+ * using the CGroup Item API. The library will create the new CGroup
+ * in all the relevant hierarchies.
+ *
+ * There are many differences between the V1 and V2 CGroup APIs. If a
+ * controller is on both V1 and V2, it may have different parameters
+ * and control files. Some of these control files have a different
+ * name, but similar functionality. In this case the Item API uses
+ * the V2 names and aliases them to the V1 name when appropriate.
+ *
+ * Some control files only exist on one of the versions or they can be
+ * missing due to other reasons. The Item API allows the user to check
+ * if the file exists before trying to use it.
+ *
+ * Often a control file has almost the same functionality between V1
+ * and V2. Which means it can be used in the same way most of the
+ * time, but not all. For now this is handled by exposing the API
+ * version a controller is using to allow the test author to handle
+ * edge cases. (e.g. V2 memory.swap.max accepts "max", but V1
+ * memory.memsw.limit_in_bytes does not).
  */
 
 #ifndef TST_CGROUP_H
 #define TST_CGROUP_H
 
-#define PATH_TMP_CG_MEM	"/tmp/cgroup_mem"
-#define PATH_TMP_CG_CST	"/tmp/cgroup_cst"
+#include <sys/types.h>
 
+/* CGroups Kernel API version */
 enum tst_cgroup_ver {
 	TST_CGROUP_V1 = 1,
 	TST_CGROUP_V2 = 2,
 };
 
-enum tst_cgroup_ctrl {
-	TST_CGROUP_MEMCG = 1,
-	TST_CGROUP_CPUSET = 2,
-	/* add cgroup controller */
+/* Used to specify CGroup hierarchy configuration options, allowing a
+ * test to request a particular CGroup structure.
+ */
+struct tst_cgroup_opts {
+	/* Only try to mount V1 CGroup controllers. This will not
+	 * prevent V2 from being used if it is already mounted, it
+	 * only indicates that we should mount V1 controllers if
+	 * nothing is present. By default we try to mount V2 first. */
+	int only_mount_v1:1;
 };
 
-enum tst_cgroup_ver tst_cgroup_version(void);
+/* A Control Group in LTP's aggregated hierarchy */
+struct tst_cgroup_group;
+
+/* Search the system for mounted cgroups and available
+ * controllers. Called automatically by tst_cgroup_require.
+ */
+void tst_cgroup_scan(void);
+/* Print the config detected by tst_cgroup_scan */
+void tst_cgroup_print_config(void);
+
+/* Ensure the specified controller is available in the test's default
+ * CGroup, mounting/enabling it if necessary */
+void tst_cgroup_require(const char *const ctrl_name,
+			const struct tst_cgroup_opts *const options)
+			__attribute__ ((nonnull (1)));
+
+/* Tear down any CGroups created by calls to tst_cgroup_require */
+void tst_cgroup_cleanup(void);
+
+/* Get the default CGroup for the test. It allocates memory (in a
+ * guarded buffer) so should always be called from setup
+ */
+const struct tst_cgroup_group *tst_cgroup_get_test_group(void)
+	__attribute__ ((warn_unused_result));
+/* Get the shared drain group. Also should be called from setup */
+const struct tst_cgroup_group *tst_cgroup_get_drain_group(void)
+	__attribute__ ((warn_unused_result));
+
+/* Create a descendant CGroup */
+struct tst_cgroup_group *
+tst_cgroup_group_mk(const struct tst_cgroup_group *const parent,
+		    const char *const group_name)
+		    __attribute__ ((nonnull, warn_unused_result));
+/* Remove a descendant CGroup */
+struct tst_cgroup_group *
+tst_cgroup_group_rm(struct tst_cgroup_group *const cg)
+		    __attribute__ ((nonnull, warn_unused_result));
+
+#define TST_CGROUP_VER(cg, ctrl_name) \
+	tst_cgroup_ver(__FILE__, __LINE__, (cg), (ctrl_name))
+
+enum tst_cgroup_ver tst_cgroup_ver(const char *const file, const int lineno,
+				   const struct tst_cgroup_group *const cg,
+				   const char *const ctrl_name)
+				   __attribute__ ((nonnull, warn_unused_result));
+
+#define SAFE_CGROUP_HAS(cg, file_name) \
+	safe_cgroup_has(__FILE__, __LINE__, (cg), (file_name))
+
+int safe_cgroup_has(const char *const file, const int lineno,
+		    const struct tst_cgroup_group *const cg,
+		    const char *const file_name)
+		    __attribute__ ((nonnull, warn_unused_result));
+
+#define SAFE_CGROUP_READ(cg, file_name, out, len)			\
+	safe_cgroup_read(__FILE__, __LINE__,				\
+			 (cg), (file_name), (out), (len))
+
+ssize_t safe_cgroup_read(const char *const file, const int lineno,
+			 const struct tst_cgroup_group *const cg,
+			 const char *const file_name,
+			 char *const out, const size_t len)
+			 __attribute__ ((nonnull));
+
+#define SAFE_CGROUP_PRINTF(cg, file_name, fmt, ...)			\
+	safe_cgroup_printf(__FILE__, __LINE__,				\
+			   (cg), (file_name), (fmt), __VA_ARGS__)
 
-/* To mount/umount specified cgroup controller on 'cgroup_dir' path */
-void tst_cgroup_mount(enum tst_cgroup_ctrl ctrl, const char *cgroup_dir);
-void tst_cgroup_umount(const char *cgroup_dir);
+#define SAFE_CGROUP_PRINT(cg, file_name, str)				\
+	safe_cgroup_printf(__FILE__, __LINE__, (cg), (file_name), "%s", (str))
 
-/* To move current process PID to the mounted cgroup tasks */
-void tst_cgroup_move_current(const char *cgroup_dir);
+void safe_cgroup_printf(const char *const file, const int lineno,
+			const struct tst_cgroup_group *const cg,
+			const char *const file_name,
+			const char *const fmt, ...)
+			__attribute__ ((format (printf, 5, 6), nonnull));
 
-/* To set cgroup controller knob with new value */
-void tst_cgroup_set_knob(const char *cgroup_dir, const char *knob, long value);
+#define SAFE_CGROUP_SCANF(cg, file_name, fmt, ...)			\
+	safe_cgroup_scanf(__FILE__, __LINE__,				\
+			  (cg), (file_name), (fmt), __VA_ARGS__)
 
-/* Set of functions to set knobs under the memory controller */
-void tst_cgroup_mem_set_maxbytes(const char *cgroup_dir, long memsz);
-int  tst_cgroup_mem_swapacct_enabled(const char *cgroup_dir);
-void tst_cgroup_mem_set_maxswap(const char *cgroup_dir, long memsz);
+void safe_cgroup_scanf(const char *file, const int lineno,
+		       const struct tst_cgroup_group *const cg,
+		       const char *const file_name,
+		       const char *fmt, ...)
+		       __attribute__ ((format (scanf, 5, 6), nonnull));
 
-/* Set of functions to read/write cpuset controller files content */
-void tst_cgroup_cpuset_read_files(const char *cgroup_dir, const char *filename,
-	char *retbuf, size_t retbuf_sz);
-void tst_cgroup_cpuset_write_files(const char *cgroup_dir, const char *filename,
-	const char *buf);
 
 #endif /* TST_CGROUP_H */
diff --git a/include/tst_test.h b/include/tst_test.h
index 4eee6f897..6ad355506 100644
--- a/include/tst_test.h
+++ b/include/tst_test.h
@@ -39,7 +39,6 @@
 #include "tst_capability.h"
 #include "tst_hugepage.h"
 #include "tst_assert.h"
-#include "tst_cgroup.h"
 #include "tst_lockdown.h"
 #include "tst_fips.h"
 #include "tst_taint.h"
diff --git a/lib/tst_cgroup.c b/lib/tst_cgroup.c
index 96c9524d2..a5a9f35e7 100644
--- a/lib/tst_cgroup.c
+++ b/lib/tst_cgroup.c
@@ -2,453 +2,1123 @@
 /*
  * Copyright (c) 2020 Red Hat, Inc.
  * Copyright (c) 2020 Li Wang <liwang@redhat.com>
+ * Copyright (c) 2020-2021 SUSE LLC <rpalethorpe@suse.com>
  */
 
 #define TST_NO_DEFAULT_MAIN
 
 #include <stdio.h>
+#include <stddef.h>
 #include <stdlib.h>
+#include <mntent.h>
 #include <sys/mount.h>
-#include <fcntl.h>
-#include <unistd.h>
 
 #include "tst_test.h"
-#include "tst_safe_macros.h"
-#include "tst_safe_stdio.h"
+#include "lapi/fcntl.h"
+#include "lapi/mount.h"
+#include "lapi/mkdirat.h"
+#include "tst_safe_file_at.h"
 #include "tst_cgroup.h"
-#include "tst_device.h"
 
-static enum tst_cgroup_ver tst_cg_ver;
-static int clone_children;
+struct cgroup_root;
 
-static int tst_cgroup_check(const char *cgroup)
+/* A node in a single CGroup hierarchy. It exists mainly for
+ * convenience so that we do not have to traverse the CGroup structure
+ * for frequent operations.
+ *
+ * This is actually a single-linked list not a tree. We only need to
+ * traverse from leaf towards root.
+ */
+struct cgroup_dir {
+	const char *dir_name;
+	const struct cgroup_dir *dir_parent;
+
+	/* Shortcut to root */
+	const struct cgroup_root *dir_root;
+
+	/* Subsystems (controllers) bit field. Only controllers which
+	 * were required and configured for this node are added to
+	 * this field. So it may be different from root->css_field.
+	 */
+	uint32_t ctrl_field;
+
+	/* In general we avoid having sprintfs everywhere and use
+	 * openat, linkat, etc.
+	 */
+	int dir_fd;
+
+	int we_created_it:1;
+};
+
+/* The root of a CGroup hierarchy/tree */
+struct cgroup_root {
+	enum tst_cgroup_ver ver;
+	/* A mount path */
+	char mnt_path[PATH_MAX];
+	/* Subsystems (controllers) bit field. Includes all
+	 * controllers found while scanning this root.
+	 */
+	uint32_t ctrl_field;
+
+	/* CGroup hierarchy: mnt -> ltp -> {drain, test -> ??? } We
+	 * keep a flat reference to ltp, drain and test for
+	 * convenience.
+	 */
+
+	/* Mount directory */
+	struct cgroup_dir mnt_dir;
+	/* LTP CGroup directory, contains drain and test dirs */
+	struct cgroup_dir ltp_dir;
+	/* Drain CGroup, see cgroup_cleanup */
+	struct cgroup_dir drain_dir;
+	/* CGroup for current test. Which may have children. */
+	struct cgroup_dir test_dir;
+
+	int we_mounted_it:1;
+	/* cpuset is in compatability mode */
+	int no_cpuset_prefix:1;
+};
+
+/* Controller sub-systems */
+enum cgroup_ctrl_indx {
+	CTRL_MEMORY = 1,
+	CTRL_CPUSET = 2,
+};
+#define CTRLS_MAX CTRL_CPUSET
+
+/* At most we can have one cgroup V1 tree for each controller and one
+ * (empty) v2 tree.
+ */
+#define ROOTS_MAX (CTRLS_MAX + 1)
+
+/* Describes a controller file or knob
+ *
+ * The primary purpose of this is to map V2 names to V1
+ * names.
+ */
+struct cgroup_file {
+	/* Canonical name. Is the V2 name unless an item is V1 only */
+	const char *const file_name;
+	/* V1 name or NULL if item is V2 only */
+	const char *const file_name_v1;
+
+	/* The controller this item belongs to or zero for
+	 * 'cgroup.<item>'.
+	 */
+	const enum cgroup_ctrl_indx ctrl_indx;
+};
+
+/* Describes a Controller or subsystem
+ *
+ * Internally the kernel seems to call controllers subsystems and uses
+ * the abbreviations subsys and css.
+ */
+struct cgroup_ctrl {
+	/* Userland name of the controller (e.g. 'memory' not 'memcg') */
+	const char *const ctrl_name;
+	/* List of files belonging to this controller */
+	const struct cgroup_file *const files;
+	/* Our index for the controller */
+	const enum cgroup_ctrl_indx ctrl_indx;
+
+	/* Runtime; hierarchy the controller is attached to */
+        struct cgroup_root *ctrl_root;
+	/* Runtime; whether we required the controller */
+	int we_require_it:1;
+};
+
+struct tst_cgroup_group {
+	char group_name[NAME_MAX + 1];
+	/* Maps controller ID to the tree which contains it. The V2
+	 * tree is at zero even if it contains no controllers.
+	 */
+	struct cgroup_dir *dirs_by_ctrl[ROOTS_MAX];
+	/* NULL terminated list of trees */
+	struct cgroup_dir *dirs[ROOTS_MAX + 1];
+};
+
+/* Always use first item for unified hierarchy */
+static struct cgroup_root roots[ROOTS_MAX + 1];
+
+/* Lookup tree for item names. */
+typedef struct cgroup_file files_t[];
+
+static const files_t cgroup_ctrl_files = {
+	/* procs exists on V1, however it was read-only until kernel v3.0. */
+	{ "cgroup.procs", "tasks", 0 },
+	{ "cgroup.subtree_control", NULL, 0 },
+	{ "cgroup.clone_children", "clone_children", 0 },
+	{ }
+};
+
+static const files_t memory_ctrl_files = {
+	{ "memory.current", "memory.usage_in_bytes", CTRL_MEMORY },
+	{ "memory.max", "memory.limit_in_bytes", CTRL_MEMORY },
+	{ "memory.swappiness", "memory.swappiness", CTRL_MEMORY },
+	{ "memory.swap.current", "memory.memsw.usage_in_bytes", CTRL_MEMORY },
+	{ "memory.swap.max", "memory.memsw.limit_in_bytes", CTRL_MEMORY },
+	{ "memory.kmem.usage_in_bytes", "memory.kmem.usage_in_bytes", CTRL_MEMORY },
+	{ "memory.kmem.limit_in_bytes", "memory.kmem.usage_in_bytes", CTRL_MEMORY },
+	{ }
+};
+
+static const files_t cpuset_ctrl_files = {
+	{ "cpuset.cpus", "cpuset.cpus", CTRL_CPUSET },
+	{ "cpuset.mems", "cpuset.mems", CTRL_CPUSET },
+	{ "cpuset.memory_migrate", "cpuset.memory_migrate", CTRL_CPUSET },
+	{ }
+};
+
+static struct cgroup_ctrl controllers[] = {
+	[0] = { "cgroup", cgroup_ctrl_files, 0, NULL, 0 },
+	[CTRL_MEMORY] = {
+		"memory", memory_ctrl_files, CTRL_MEMORY, NULL, 0
+	},
+	[CTRL_CPUSET] = {
+		"cpuset", cpuset_ctrl_files, CTRL_CPUSET, NULL, 0
+	},
+	{ }
+};
+
+static const struct tst_cgroup_opts default_opts = { 0 };
+
+/* We should probably allow these to be set in environment
+ * variables */
+static const char *ltp_cgroup_dir = "ltp";
+static const char *ltp_cgroup_drain_dir = "drain";
+static char test_cgroup_dir[NAME_MAX + 1];
+static const char *ltp_mount_prefix = "/tmp/cgroup_";
+static const char *ltp_v2_mount = "unified";
+
+#define first_root				\
+	(roots[0].ver ? roots : roots + 1)
+#define for_each_root(r)			\
+	for ((r) = first_root; (r)->ver; (r)++)
+#define for_each_v1_root(r)			\
+	for ((r) = roots + 1; (r)->ver; (r)++)
+#define for_each_ctrl(ctrl)			\
+	for ((ctrl) = controllers + 1; (ctrl)->ctrl_name; (ctrl)++)
+
+/* In all cases except one, this only loops once.
+ *
+ * If (ctrl) == 0 and multiple V1 (and a V2) hierarchies are mounted,
+ * then we need to loop over multiple directories. For example if we
+ * need to write to "tasks"/"cgroup.procs" which exists for each
+ * hierarchy.
+ */
+#define for_each_dir(cg, ctrl, t)					\
+	for ((t) = (ctrl) ? (cg)->dirs_by_ctrl + (ctrl) : (cg)->dirs;	\
+	     *(t);							\
+	     (t) = (ctrl) ? (cg)->dirs + ROOTS_MAX : (t) + 1)
+
+__attribute__ ((nonnull))
+static int has_ctrl(const uint32_t ctrl_field,
+		    const struct cgroup_ctrl *const ctrl)
 {
-	char line[PATH_MAX];
-	FILE *file;
-	int cg_check = 0;
+	return !!(ctrl_field & (1 << ctrl->ctrl_indx));
+}
 
-	file = SAFE_FOPEN("/proc/filesystems", "r");
-	while (fgets(line, sizeof(line), file)) {
-		if (strstr(line, cgroup) != NULL) {
-			cg_check = 1;
-			break;
-		}
+__attribute__ ((nonnull))
+static void add_ctrl(uint32_t *const ctrl_field,
+		     const struct cgroup_ctrl *const ctrl)
+{
+	*ctrl_field |= 1 << ctrl->ctrl_indx;
+}
+
+__attribute__ ((warn_unused_result))
+struct cgroup_root *tst_cgroup_root_get(void)
+{
+	return roots[0].ver ? roots : roots + 1;
+}
+
+static int cgroup_v2_mounted(void)
+{
+	return !!roots[0].ver;
+}
+
+static int cgroup_v1_mounted(void)
+{
+	return !!roots[1].ver;
+}
+
+static int cgroup_mounted(void)
+{
+	return cgroup_v2_mounted() || cgroup_v1_mounted();
+}
+
+__attribute__ ((nonnull, warn_unused_result))
+static int cgroup_ctrl_on_v2(const struct cgroup_ctrl *const ctrl)
+{
+	return ctrl->ctrl_root && ctrl->ctrl_root->ver == TST_CGROUP_V2;
+}
+
+__attribute__ ((nonnull))
+static void cgroup_dir_mk(const struct cgroup_dir *const parent,
+			  const char *const dir_name,
+			  struct cgroup_dir *const new)
+{
+	const char *dpath;
+
+	new->dir_root = parent->dir_root;
+	new->dir_name = dir_name;
+	new->dir_parent = parent;
+	new->ctrl_field = parent->ctrl_field;
+	new->we_created_it = 0;
+
+	if (!mkdirat(parent->dir_fd, dir_name, 0777)) {
+		new->we_created_it = 1;
+		goto opendir;
+	}
+
+	if (errno == EEXIST)
+		goto opendir;
+
+	dpath = tst_decode_fd(parent->dir_fd);
+
+	if (errno == EACCES) {
+		tst_brk(TCONF | TERRNO,
+			"Lack permission to make '%s/%s'; premake it or run as root",
+			dpath, dir_name);
+	} else {
+		tst_brk(TBROK | TERRNO,
+			"mkdirat(%d<%s>, '%s', 0777)",
+			parent->dir_fd, dpath, dir_name);
 	}
-	SAFE_FCLOSE(file);
 
-	return cg_check;
+opendir:
+	new->dir_fd = SAFE_OPENAT(parent->dir_fd, dir_name,
+				  O_PATH | O_DIRECTORY);
 }
 
-enum tst_cgroup_ver tst_cgroup_version(void)
+void tst_cgroup_print_config(void)
 {
-        enum tst_cgroup_ver cg_ver;
+	struct cgroup_root *root;
+	const struct cgroup_ctrl *ctrl;
 
-        if (tst_cgroup_check("cgroup2")) {
-                if (!tst_is_mounted("cgroup2") && tst_is_mounted("cgroup"))
-                        cg_ver = TST_CGROUP_V1;
-                else
-                        cg_ver = TST_CGROUP_V2;
+	tst_res(TINFO, "Detected Controllers:");
 
-                goto out;
-        }
+	for_each_ctrl(ctrl) {
+		root = ctrl->ctrl_root;
 
-        if (tst_cgroup_check("cgroup"))
-                cg_ver = TST_CGROUP_V1;
+		if (!root)
+			continue;
 
-        if (!cg_ver)
-                tst_brk(TCONF, "Cgroup is not configured");
+		tst_res(TINFO, "\t%.10s %s @ %s:%s",
+			ctrl->ctrl_name,
+			root->no_cpuset_prefix ? "[noprefix]" : "",
+			root->ver == TST_CGROUP_V1 ? "V1" : "V2",
+			root->mnt_path);
+	}
+}
+
+__attribute__ ((nonnull, warn_unused_result))
+static struct cgroup_ctrl *cgroup_find_ctrl(const char *const ctrl_name)
+{
+	struct cgroup_ctrl *ctrl = controllers;
 
-out:
-        return cg_ver;
+	while (ctrl->ctrl_name && strcmp(ctrl_name, ctrl->ctrl_name))
+		ctrl++;
+
+	if (!ctrl->ctrl_name)
+		ctrl = NULL;
+
+	return ctrl;
 }
 
-static void tst_cgroup1_mount(const char *name, const char *option,
-			const char *mnt_path, const char *new_path)
+/* Determine if a mounted cgroup hierarchy is unique and record it if so.
+ *
+ * For CGroups V2 this is very simple as there is only one
+ * hierarchy. We just record which controllers are available and check
+ * if this matches what we read from any previous mount points.
+ *
+ * For V1 the set of controllers S is partitioned into sets {P_1, P_2,
+ * ..., P_n} with one or more controllers in each partion. Each
+ * partition P_n can be mounted multiple times, but the same
+ * controller can not appear in more than one partition. Usually each
+ * partition contains a single controller, but, for example, cpu and
+ * cpuacct are often mounted together in the same partiion.
+ *
+ * Each controller partition has its own hierarchy (root) which we
+ * must track and update independently.
+ */
+__attribute__ ((nonnull))
+static void cgroup_root_scan(const char *const mnt_type,
+			     const char *const mnt_dir,
+			     char *const mnt_opts)
 {
-	char knob_path[PATH_MAX];
-	if (tst_is_mounted(mnt_path))
-		goto out;
+	struct cgroup_root *root = roots;
+	const struct cgroup_ctrl *const_ctrl;
+	struct cgroup_ctrl *ctrl;
+	uint32_t ctrl_field = 0;
+	int no_prefix = 0;
+	char buf[BUFSIZ];
+	char *tok;
+	const int mnt_dfd = SAFE_OPEN(mnt_dir, O_PATH | O_DIRECTORY);
+
+	if (!strcmp(mnt_type, "cgroup"))
+		goto v1;
+
+	SAFE_FILE_READAT(mnt_dfd, "cgroup.controllers", buf, sizeof(buf));
+
+	for (tok = strtok(buf, " "); tok; tok = strtok(NULL, " ")) {
+		if ((const_ctrl = cgroup_find_ctrl(tok)))
+			add_ctrl(&ctrl_field, const_ctrl);
+	}
 
-	SAFE_MKDIR(mnt_path, 0777);
-	if (mount(name, mnt_path, "cgroup", 0, option) == -1) {
-		if (errno == ENODEV) {
-			if (rmdir(mnt_path) == -1)
-				tst_res(TWARN | TERRNO, "rmdir %s failed", mnt_path);
-			tst_brk(TCONF,
-				 "Cgroup v1 is not configured in kernel");
-		}
-		tst_brk(TBROK | TERRNO, "mount %s", mnt_path);
+	if (root->ver && ctrl_field == root->ctrl_field)
+		goto discard;
+
+	if (root->ctrl_field)
+		tst_brk(TBROK, "Available V2 controllers are changing between scans?");
+
+	root->ver = TST_CGROUP_V2;
+
+	goto backref;
+
+v1:
+	for (tok = strtok(mnt_opts, ","); tok; tok = strtok(NULL, ",")) {
+		if ((const_ctrl = cgroup_find_ctrl(tok)))
+			add_ctrl(&ctrl_field, const_ctrl);
+
+		no_prefix |= !strcmp("noprefix", tok);
 	}
 
-	/*
-	 * We should assign one or more memory nodes to cpuset.mems and
-	 * cpuset.cpus, otherwise, echo $$ > tasks gives ?ENOSPC: no space
-	 * left on device? when trying to use cpuset.
-	 *
-	 * Or, setting cgroup.clone_children to 1 can help in automatically
-	 * inheriting memory and node setting from parent cgroup when a
-	 * child cgroup is created.
-	 */
-	if (strcmp(option, "cpuset") == 0) {
-		sprintf(knob_path, "%s/cgroup.clone_children", mnt_path);
-		SAFE_FILE_SCANF(knob_path, "%d", &clone_children);
-		SAFE_FILE_PRINTF(knob_path, "%d", 1);
+	if (!ctrl_field)
+		goto discard;
+
+	for_each_v1_root(root) {
+		if (!(ctrl_field & root->ctrl_field))
+			continue;
+
+		if (ctrl_field == root->ctrl_field)
+			goto discard;
+
+		tst_brk(TBROK,
+			"The intersection of two distinct sets of mounted controllers should be null?"
+			"Check '%s' and '%s'", root->mnt_path, mnt_dir);
+	}
+
+	if (root >= roots + ROOTS_MAX) {
+		tst_brk(TBROK,
+			"Unique controller mounts have exceeded our limit %d?",
+			ROOTS_MAX);
 	}
-out:
-	SAFE_MKDIR(new_path, 0777);
 
-	tst_res(TINFO, "Cgroup(%s) v1 mount at %s success", option, mnt_path);
+	root->ver = TST_CGROUP_V1;
+
+backref:
+	strcpy(root->mnt_path, mnt_dir);
+	root->mnt_dir.dir_root = root;
+	root->mnt_dir.dir_name = root->mnt_path;
+	root->mnt_dir.dir_fd = mnt_dfd;
+	root->ctrl_field = ctrl_field;
+	root->no_cpuset_prefix = no_prefix;
+
+	for_each_ctrl(ctrl) {
+		if (has_ctrl(root->ctrl_field, ctrl))
+			ctrl->ctrl_root = root;
+	}
+
+	return;
+
+discard:
+	close(mnt_dfd);
 }
 
-static void tst_cgroup2_mount(const char *mnt_path, const char *new_path)
+void tst_cgroup_scan(void)
 {
-	if (tst_is_mounted(mnt_path))
-		goto out;
+	struct mntent *mnt;
+	FILE *f = setmntent("/proc/self/mounts", "r");
 
-	SAFE_MKDIR(mnt_path, 0777);
-	if (mount("cgroup2", mnt_path, "cgroup2", 0, NULL) == -1) {
-		if (errno == ENODEV) {
-			if (rmdir(mnt_path) == -1)
-				tst_res(TWARN | TERRNO, "rmdir %s failed", mnt_path);
-			tst_brk(TCONF,
-				 "Cgroup v2 is not configured in kernel");
-		}
-		tst_brk(TBROK | TERRNO, "mount %s", mnt_path);
+	if (!f) {
+		tst_brk(TBROK | TERRNO, "Can't open /proc/self/mounts");
+		return;
 	}
 
-out:
-	SAFE_MKDIR(new_path, 0777);
+	mnt = getmntent(f);
+	if (!mnt) {
+		tst_brk(TBROK | TERRNO, "Can't read mounts or no mounts?");
+		return;
+	}
 
-	tst_res(TINFO, "Cgroup v2 mount at %s success", mnt_path);
+	do {
+                if (strncmp(mnt->mnt_type, "cgroup", 6))
+			continue;
+
+		cgroup_root_scan(mnt->mnt_type, mnt->mnt_dir, mnt->mnt_opts);
+	} while ((mnt = getmntent(f)));
 }
 
-static void tst_cgroupN_umount(const char *mnt_path, const char *new_path)
+static void cgroup_mount_v2(void)
 {
-	FILE *fp;
-	int fd;
-	char s_new[BUFSIZ], s[BUFSIZ], value[BUFSIZ];
-	char knob_path[PATH_MAX];
+	char mnt_path[PATH_MAX];
 
-	if (!tst_is_mounted(mnt_path))
+	sprintf(mnt_path, "%s%s", ltp_mount_prefix, ltp_v2_mount);
+
+	if (!mkdir(mnt_path, 0777)) {
+		roots[0].mnt_dir.we_created_it = 1;
+		goto mount;
+	}
+
+	if (errno == EEXIST)
+		goto mount;
+
+	if (errno == EACCES) {
+		tst_res(TINFO | TERRNO,
+			"Lack permission to make %s, premake it or run as root",
+			mnt_path);
 		return;
+	}
 
-	/* Move all processes in task(v2: cgroup.procs) to its parent node. */
-	if (tst_cg_ver & TST_CGROUP_V1)
-		sprintf(s, "%s/tasks", mnt_path);
-	if (tst_cg_ver & TST_CGROUP_V2)
-		sprintf(s, "%s/cgroup.procs", mnt_path);
-
-	fd = open(s, O_WRONLY);
-	if (fd == -1)
-		tst_res(TWARN | TERRNO, "open %s", s);
-
-	if (tst_cg_ver & TST_CGROUP_V1)
-		snprintf(s_new, BUFSIZ, "%s/tasks", new_path);
-	if (tst_cg_ver & TST_CGROUP_V2)
-		snprintf(s_new, BUFSIZ, "%s/cgroup.procs", new_path);
-
-	fp = fopen(s_new, "r");
-	if (fp == NULL)
-		tst_res(TWARN | TERRNO, "fopen %s", s_new);
-	if ((fd != -1) && (fp != NULL)) {
-		while (fgets(value, BUFSIZ, fp) != NULL)
-			if (write(fd, value, strlen(value) - 1)
-			    != (ssize_t)strlen(value) - 1)
-				tst_res(TWARN | TERRNO, "write %s", s);
-	}
-	if (tst_cg_ver & TST_CGROUP_V1) {
-		sprintf(knob_path, "%s/cpuset.cpus", mnt_path);
-		if (!access(knob_path, F_OK)) {
-			sprintf(knob_path, "%s/cgroup.clone_children", mnt_path);
-			SAFE_FILE_PRINTF(knob_path, "%d", clone_children);
-		}
+	tst_brk(TBROK | TERRNO, "mkdir(%s, 0777)", mnt_path);
+	return;
+
+mount:
+	if (!mount("cgroup2", mnt_path, "cgroup2", 0, NULL)) {
+		tst_res(TINFO, "Mounted V2 CGroups on %s", mnt_path);
+		tst_cgroup_scan();
+		roots[0].we_mounted_it = 1;
+		return;
 	}
-	if (fd != -1)
-		close(fd);
-	if (fp != NULL)
-		fclose(fp);
-	if (rmdir(new_path) == -1)
-		tst_res(TWARN | TERRNO, "rmdir %s", new_path);
-	if (umount(mnt_path) == -1)
-		tst_res(TWARN | TERRNO, "umount %s", mnt_path);
-	if (rmdir(mnt_path) == -1)
-		tst_res(TWARN | TERRNO, "rmdir %s", mnt_path);
-
-	if (tst_cg_ver & TST_CGROUP_V1)
-		tst_res(TINFO, "Cgroup v1 unmount success");
-	if (tst_cg_ver & TST_CGROUP_V2)
-		tst_res(TINFO, "Cgroup v2 unmount success");
-}
-
-struct tst_cgroup_path {
-	char *mnt_path;
-	char *new_path;
-	struct tst_cgroup_path *next;
-};
 
-static struct tst_cgroup_path *tst_cgroup_paths;
+	tst_res(TINFO | TERRNO, "Could not mount V2 CGroups on %s", mnt_path);
+
+	if (roots[0].mnt_dir.we_created_it) {
+		roots[0].mnt_dir.we_created_it = 0;
+		SAFE_RMDIR(mnt_path);
+	}
+}
 
-static void tst_cgroup_set_path(const char *cgroup_dir)
+__attribute__ ((nonnull))
+static void cgroup_mount_v1(struct cgroup_ctrl *const ctrl)
 {
-	char cgroup_new_dir[PATH_MAX];
-	struct tst_cgroup_path *tst_cgroup_path, *a;
+	char mnt_path[PATH_MAX];
+	int made_dir = 0;
 
-	if (!cgroup_dir)
-		tst_brk(TBROK, "Invalid cgroup dir, plese check cgroup_dir");
+	sprintf(mnt_path, "%s%s", ltp_mount_prefix, ctrl->ctrl_name);
 
-	sprintf(cgroup_new_dir, "%s/ltp_%d", cgroup_dir, rand());
+	if (!mkdir(mnt_path, 0777)) {
+		made_dir = 1;
+		goto mount;
+	}
 
-	/* To store cgroup path in the 'path' list */
-	tst_cgroup_path = SAFE_MALLOC(sizeof(struct tst_cgroup_path));
-	tst_cgroup_path->mnt_path = SAFE_MALLOC(strlen(cgroup_dir) + 1);
-	tst_cgroup_path->new_path = SAFE_MALLOC(strlen(cgroup_new_dir) + 1);
-	tst_cgroup_path->next = NULL;
+	if (errno == EEXIST)
+		goto mount;
 
-	if (!tst_cgroup_paths) {
-		tst_cgroup_paths = tst_cgroup_path;
-	} else {
-		a = tst_cgroup_paths;
-		do {
-			if (!a->next) {
-				a->next = tst_cgroup_path;
-				break;
-			}
-			a = a->next;
-		} while (a);
+	if (errno == EACCES) {
+		tst_res(TINFO | TERRNO,
+			"Lack permission to make %s, premake it or run as root",
+			mnt_path);
+		return;
 	}
 
-	sprintf(tst_cgroup_path->mnt_path, "%s", cgroup_dir);
-	sprintf(tst_cgroup_path->new_path, "%s", cgroup_new_dir);
+	tst_brk(TBROK | TERRNO, "mkdir(%s, 0777)", mnt_path);
+	return;
+
+mount:
+	if (mount(ctrl->ctrl_name, mnt_path, "cgroup", 0, ctrl->ctrl_name)) {
+		tst_res(TINFO | TERRNO,
+			"Could not mount V1 CGroup on %s", mnt_path);
+
+		if (made_dir)
+			SAFE_RMDIR(mnt_path);
+		return;
+	}
+
+	tst_res(TINFO, "Mounted V1 %s CGroup on %s", ctrl->ctrl_name, mnt_path);
+	tst_cgroup_scan();
+	if (!ctrl->ctrl_root)
+		return;
+
+        ctrl->ctrl_root->we_mounted_it = 1;
+	ctrl->ctrl_root->mnt_dir.we_created_it = made_dir;
+
+	if (ctrl->ctrl_indx == CTRL_MEMORY) {
+		SAFE_FILE_PRINTFAT(ctrl->ctrl_root->mnt_dir.dir_fd,
+				   "memory.use_hierarchy", "%d", 1);
+	}
+}
+
+__attribute__ ((nonnull))
+static void cgroup_copy_cpuset(const struct cgroup_root *const root)
+{
+	char knob_val[BUFSIZ];
+	int i;
+	const char *const n0[] = {"mems", "cpus"};
+	const char *const n1[] = {"cpuset.mems", "cpuset.cpus"};
+	const char *const *const fname = root->no_cpuset_prefix ? n0 : n1;
+
+	for (i = 0; i < 2; i++) {
+		SAFE_FILE_READAT(root->mnt_dir.dir_fd,
+				 fname[i], knob_val, sizeof(knob_val));
+		SAFE_FILE_PRINTFAT(root->ltp_dir.dir_fd,
+				   fname[i], "%s", knob_val);
+	}
 }
 
-static char *tst_cgroup_get_path(const char *cgroup_dir)
+/* Ensure the specified controller is available.
+ *
+ * First we check if the specified controller has a known mount point,
+ * if not then we scan the system. If we find it then we goto ensuring
+ * the LTP group exists in the hierarchy the controller is using.
+ *
+ * If we can't find the controller, then we try to create it. First we
+ * check if the V2 hierarchy/tree is mounted. If it isn't then we try
+ * mounting it and look for the controller. If it is already mounted
+ * then we know the controller is not available on V2 on this system.
+ *
+ * If we can't mount V2 or the controller is not on V2, then we try
+ * mounting it on its own V1 tree.
+ *
+ * Once we have mounted the controller somehow, we create a hierarchy
+ * of cgroups. If we are on V2 we first need to enable the controller
+ * for all children of root. Then we create hierarchy described in
+ * tst_cgroup.h.
+ *
+ * If we are using V1 cpuset then we copy the available mems and cpus
+ * from root to the ltp group and set clone_children on the ltp group
+ * to distribute these settings to the test cgroups. This means the
+ * test author does not have to copy these settings before using the
+ * cpuset.
+ *
+ */
+void tst_cgroup_require(const char *const ctrl_name,
+			const struct tst_cgroup_opts *options)
 {
-	struct tst_cgroup_path *a;
+	const char *const cgsc = "cgroup.subtree_control";
+	struct cgroup_ctrl *const ctrl = cgroup_find_ctrl(ctrl_name);
+	struct cgroup_root *root;
 
-	if (!tst_cgroup_paths)
-		return NULL;
+	if (!options)
+		options = &default_opts;
 
-	a = tst_cgroup_paths;
+	if (ctrl->we_require_it) {
+		tst_res(TWARN, "Duplicate tst_cgroup_require(%s, )",
+			ctrl->ctrl_name);
+	}
+	ctrl->we_require_it = 1;
+
+	if (ctrl->ctrl_root)
+		goto mkdirs;
+
+	tst_cgroup_scan();
+
+	if (ctrl->ctrl_root)
+		goto mkdirs;
 
-	while (strcmp(a->mnt_path, cgroup_dir) != 0){
-		if (!a->next) {
-			tst_res(TINFO, "%s is not found", cgroup_dir);
-			return NULL;
+	if (!cgroup_v2_mounted() && !options->only_mount_v1)
+		cgroup_mount_v2();
+
+	if (ctrl->ctrl_root)
+		goto mkdirs;
+
+	cgroup_mount_v1(ctrl);
+
+	if (!ctrl->ctrl_root) {
+		tst_brk(TCONF,
+			"'%s' controller required, but not available",
+			ctrl->ctrl_name);
+		return;
+	}
+
+mkdirs:
+	root = ctrl->ctrl_root;
+	add_ctrl(&root->mnt_dir.ctrl_field, ctrl);
+
+	if (cgroup_ctrl_on_v2(ctrl)) {
+		if (root->we_mounted_it) {
+			SAFE_FILE_PRINTFAT(root->mnt_dir.dir_fd,
+					   cgsc, "+%s", ctrl->ctrl_name);
+		} else {
+			tst_file_printfat(root->mnt_dir.dir_fd,
+					  cgsc, "+%s", ctrl->ctrl_name);
 		}
-		a = a->next;
-	};
+	}
+
+	if (!root->ltp_dir.dir_fd)
+		cgroup_dir_mk(&root->mnt_dir, ltp_cgroup_dir, &root->ltp_dir);
+	else
+		root->ltp_dir.ctrl_field |= root->mnt_dir.ctrl_field;
 
-	return a->new_path;
+	if (cgroup_ctrl_on_v2(ctrl)) {
+		SAFE_FILE_PRINTFAT(root->ltp_dir.dir_fd,
+				   cgsc, "+%s", ctrl->ctrl_name);
+	} else {
+		SAFE_FILE_PRINTFAT(root->ltp_dir.dir_fd,
+				   "cgroup.clone_children", "%d", 1);
+
+		if (ctrl->ctrl_indx == CTRL_CPUSET)
+			cgroup_copy_cpuset(root);
+	}
+
+	cgroup_dir_mk(&root->ltp_dir, ltp_cgroup_drain_dir, &root->drain_dir);
+
+	sprintf(test_cgroup_dir, "test-%d", getpid());
+	cgroup_dir_mk(&root->ltp_dir, test_cgroup_dir, &root->test_dir);
 }
 
-static void tst_cgroup_del_path(const char *cgroup_dir)
+static void cgroup_drain(const enum tst_cgroup_ver ver,
+			 const int source_dfd, const int dest_dfd)
 {
-	struct tst_cgroup_path *a, *b;
+	char pid_list[BUFSIZ];
+	char *tok;
+	const char *const file_name =
+		ver == TST_CGROUP_V1 ? "tasks" : "cgroup.procs";
+	int fd;
+	ssize_t ret;
 
-	if (!tst_cgroup_paths)
+	ret = SAFE_FILE_READAT(source_dfd, file_name,
+			       pid_list, sizeof(pid_list));
+	if (ret < 0)
 		return;
 
-	a = b = tst_cgroup_paths;
+	fd = SAFE_OPENAT(dest_dfd, file_name, O_WRONLY);
+	if (fd < 0)
+		return;
 
-	while (strcmp(b->mnt_path, cgroup_dir) != 0) {
-		if (!b->next) {
-			tst_res(TINFO, "%s is not found", cgroup_dir);
-			return;
-		}
-		a = b;
-		b = b->next;
-	};
+	for (tok = strtok(pid_list, "\n"); tok; tok = strtok(NULL, "\n")) {
+		ret = dprintf(fd, "%s", tok);
 
-	if (b == tst_cgroup_paths)
-		tst_cgroup_paths = b->next;
-	else
-		a->next = b->next;
+		if (ret < (ssize_t)strlen(tok))
+			tst_brk(TBROK | TERRNO, "Failed to drain %s", tok);
+	}
+	SAFE_CLOSE(fd);
+}
 
-	free(b->mnt_path);
-	free(b->new_path);
-	free(b);
+__attribute__ ((nonnull))
+static void close_path_fds(struct cgroup_root *const root)
+{
+	if (root->test_dir.dir_fd > 0)
+		SAFE_CLOSE(root->test_dir.dir_fd);
+	if (root->ltp_dir.dir_fd > 0)
+		SAFE_CLOSE(root->ltp_dir.dir_fd);
+	if (root->drain_dir.dir_fd > 0)
+		SAFE_CLOSE(root->drain_dir.dir_fd);
+	if (root->mnt_dir.dir_fd > 0)
+		SAFE_CLOSE(root->mnt_dir.dir_fd);
 }
 
-void tst_cgroup_mount(enum tst_cgroup_ctrl ctrl, const char *cgroup_dir)
+/* Maybe remove CGroups used during testing and clear our data
+ *
+ * This will never remove CGroups we did not create to allow tests to
+ * be run in parallel.
+ *
+ * Each test process is given its own unique CGroup. Unless we want to
+ * stress test the CGroup system. We should at least remove these
+ * unique per test CGroups.
+ *
+ * We probably also want to remove the LTP parent CGroup, although
+ * this may have been created by the system manager or another test
+ * (see notes on parallel testing).
+ *
+ * On systems with no initial CGroup setup we may try to destroy the
+ * CGroup roots we mounted so that they can be recreated by another
+ * test. Note that successfully unmounting a CGroup root does not
+ * necessarily indicate that it was destroyed.
+ *
+ * The ltp/drain CGroup is required for cleaning up test CGroups when
+ * we can not move them to the root CGroup. CGroups can only be
+ * removed when they have no members and only leaf or root CGroups may
+ * have processes within them. As test processes create and destroy
+ * their own CGroups they must move themselves either to root or
+ * another leaf CGroup. So we move them to drain while destroying the
+ * unique test CGroup.
+ *
+ * If we have access to root and created the LTP CGroup we then move
+ * the test process to root and destroy the drain and LTP
+ * CGroups. Otherwise we just leave the test process to die in the
+ * drain, much like many a unwanted terrapin.
+ *
+ * Finally we clear any data we have collected on CGroups. This will
+ * happen regardless of whether anything was removed.
+ */
+void tst_cgroup_cleanup(void)
 {
-	char *cgroup_new_dir;
-	char knob_path[PATH_MAX];
+	struct cgroup_root *root;
+	struct cgroup_ctrl *ctrl;
 
-	tst_cg_ver = tst_cgroup_version();
+	if (!cgroup_mounted())
+		goto clear_data;
 
-	tst_cgroup_set_path(cgroup_dir);
-	cgroup_new_dir = tst_cgroup_get_path(cgroup_dir);
+	for_each_root(root) {
+		if (!root->test_dir.dir_name)
+			continue;
 
-	if (tst_cg_ver & TST_CGROUP_V1) {
-		switch(ctrl) {
-		case TST_CGROUP_MEMCG:
-			tst_cgroup1_mount("memcg", "memory", cgroup_dir, cgroup_new_dir);
-		break;
-		case TST_CGROUP_CPUSET:
-			tst_cgroup1_mount("cpusetcg", "cpuset", cgroup_dir, cgroup_new_dir);
-		break;
-		default:
-			tst_brk(TBROK, "Invalid cgroup controller: %d", ctrl);
-		}
+		cgroup_drain(root->ver,
+			     root->test_dir.dir_fd, root->drain_dir.dir_fd);
+		SAFE_UNLINKAT(root->ltp_dir.dir_fd, root->test_dir.dir_name,
+			      AT_REMOVEDIR);
 	}
 
-	if (tst_cg_ver & TST_CGROUP_V2) {
-		tst_cgroup2_mount(cgroup_dir, cgroup_new_dir);
+	for_each_root(root) {
+		if (!root->ltp_dir.we_created_it)
+			continue;
 
-		switch(ctrl) {
-		case TST_CGROUP_MEMCG:
-			sprintf(knob_path, "%s/cgroup.subtree_control", cgroup_dir);
-			SAFE_FILE_PRINTF(knob_path, "%s", "+memory");
-		break;
-		case TST_CGROUP_CPUSET:
-			tst_brk(TCONF, "Cgroup v2 hasn't achieve cpuset subsystem");
-		break;
-		default:
-			tst_brk(TBROK, "Invalid cgroup controller: %d", ctrl);
+		cgroup_drain(root->ver,
+			     root->drain_dir.dir_fd, root->mnt_dir.dir_fd);
+
+		if (root->drain_dir.dir_name) {
+			SAFE_UNLINKAT(root->ltp_dir.dir_fd,
+				      root->drain_dir.dir_name, AT_REMOVEDIR);
 		}
+
+		if (root->ltp_dir.dir_name) {
+			SAFE_UNLINKAT(root->mnt_dir.dir_fd,
+				      root->ltp_dir.dir_name, AT_REMOVEDIR);
+		}
+	}
+
+	for_each_ctrl(ctrl) {
+		if (!cgroup_ctrl_on_v2(ctrl) || !ctrl->ctrl_root->we_mounted_it)
+			continue;
+
+		SAFE_FILE_PRINTFAT(ctrl->ctrl_root->mnt_dir.dir_fd,
+				   "cgroup.subtree_control",
+				   "-%s", ctrl->ctrl_name);
+	}
+
+	for_each_root(root) {
+		if (!root->we_mounted_it)
+			continue;
+
+		/* This probably does not result in the CGroup root
+		 * being destroyed */
+		if (umount2(root->mnt_path, MNT_DETACH))
+			continue;
+
+		SAFE_RMDIR(root->mnt_path);
+	}
+
+clear_data:
+	for_each_ctrl(ctrl) {
+		ctrl->ctrl_root = NULL;
+		ctrl->we_require_it = 0;
 	}
+
+	for_each_root(root)
+		close_path_fds(root);
+
+	memset(roots, 0, sizeof(roots));
 }
 
-void tst_cgroup_umount(const char *cgroup_dir)
+__attribute__ ((nonnull (1)))
+static void cgroup_group_init(struct tst_cgroup_group *const cg,
+			      const char *const group_name)
 {
-	char *cgroup_new_dir;
+	memset(cg, 0, sizeof(*cg));
+
+	if (!group_name)
+		return;
+
+	if (strlen(group_name) > NAME_MAX)
+		tst_brk(TBROK, "Group name is too long");
 
-	cgroup_new_dir = tst_cgroup_get_path(cgroup_dir);
-	tst_cgroupN_umount(cgroup_dir, cgroup_new_dir);
-	tst_cgroup_del_path(cgroup_dir);
+	strcpy(cg->group_name, group_name);
 }
 
-void tst_cgroup_set_knob(const char *cgroup_dir, const char *knob, long value)
+__attribute__ ((nonnull))
+static void cgroup_group_add_dir(struct tst_cgroup_group *const cg,
+				 struct cgroup_dir *const dir)
 {
-	char *cgroup_new_dir;
-	char knob_path[PATH_MAX];
+	const struct cgroup_ctrl *ctrl;
+	int i;
 
-	cgroup_new_dir = tst_cgroup_get_path(cgroup_dir);
-	sprintf(knob_path, "%s/%s", cgroup_new_dir, knob);
-	SAFE_FILE_PRINTF(knob_path, "%ld", value);
+	if (dir->dir_root->ver == TST_CGROUP_V2)
+		cg->dirs_by_ctrl[0] = dir;
+
+	for_each_ctrl(ctrl) {
+		if (has_ctrl(dir->ctrl_field, ctrl))
+			cg->dirs_by_ctrl[ctrl->ctrl_indx] = dir;
+	}
+
+	for (i = 0; cg->dirs[i]; i++);
+	cg->dirs[i] = dir;
 }
 
-void tst_cgroup_move_current(const char *cgroup_dir)
+struct tst_cgroup_group *
+tst_cgroup_group_mk(const struct tst_cgroup_group *const parent,
+		    const char *const group_name)
 {
-	if (tst_cg_ver & TST_CGROUP_V1)
-		tst_cgroup_set_knob(cgroup_dir, "tasks", getpid());
+	struct tst_cgroup_group *cg;
+	struct cgroup_dir *const *dir;
+	struct cgroup_dir *new_dir;
+
+	cg = SAFE_MALLOC(sizeof(*cg));
+	cgroup_group_init(cg, group_name);
 
-	if (tst_cg_ver & TST_CGROUP_V2)
-		tst_cgroup_set_knob(cgroup_dir, "cgroup.procs", getpid());
+	for_each_dir(parent, 0, dir) {
+		new_dir = SAFE_MALLOC(sizeof(*new_dir));
+		cgroup_dir_mk(*dir, group_name, new_dir);
+		cgroup_group_add_dir(cg, new_dir);
+	}
+
+	return cg;
 }
 
-void tst_cgroup_mem_set_maxbytes(const char *cgroup_dir, long memsz)
+struct tst_cgroup_group *tst_cgroup_group_rm(struct tst_cgroup_group *const cg)
 {
-	if (tst_cg_ver & TST_CGROUP_V1)
-		tst_cgroup_set_knob(cgroup_dir, "memory.limit_in_bytes", memsz);
+	struct cgroup_dir **dir;
+
+	for_each_dir(cg, 0, dir) {
+		close((*dir)->dir_fd);
+		SAFE_UNLINKAT((*dir)->dir_parent->dir_fd,
+			      (*dir)->dir_name,
+			      AT_REMOVEDIR);
+		free(*dir);
+	}
 
-	if (tst_cg_ver & TST_CGROUP_V2)
-		tst_cgroup_set_knob(cgroup_dir, "memory.max", memsz);
+	free(cg);
+	return NULL;
 }
 
-int tst_cgroup_mem_swapacct_enabled(const char *cgroup_dir)
+__attribute__ ((nonnull, warn_unused_result))
+static const struct cgroup_file *cgroup_file_find(const char *const file,
+						  const int lineno,
+						  const char *const file_name)
 {
-	char *cgroup_new_dir;
-	char knob_path[PATH_MAX];
+	const struct cgroup_file *cfile;
+	const struct cgroup_ctrl *ctrl;
+	char ctrl_name[32];
+	const char *const sep = strchr(file_name, '.');
+	size_t len;
+
+	if (!sep) {
+		tst_brk_(file, lineno, TBROK,
+			 "Invalid file name '%s'; did not find controller separator '.'",
+			 file_name);
+		return NULL;
+	}
 
-	cgroup_new_dir = tst_cgroup_get_path(cgroup_dir);
+	len = sep - file_name;
+	memcpy(ctrl_name, file_name, len);
+	ctrl_name[len] = '\0';
 
-	if (tst_cg_ver & TST_CGROUP_V1) {
-		sprintf(knob_path, "%s/%s",
-				cgroup_new_dir, "/memory.memsw.limit_in_bytes");
+        ctrl = cgroup_find_ctrl(ctrl_name);
 
-		if ((access(knob_path, F_OK) == -1)) {
-			if (errno == ENOENT)
-				tst_res(TCONF, "memcg swap accounting is disabled");
-			else
-				tst_brk(TBROK | TERRNO, "failed to access %s", knob_path);
-		} else {
-			return 1;
-		}
+	if (!ctrl) {
+		tst_brk_(file, lineno, TBROK,
+			 "Did not find controller '%s'\n", ctrl_name);
+		return NULL;
 	}
 
-	if (tst_cg_ver & TST_CGROUP_V2) {
-		sprintf(knob_path, "%s/%s",
-				cgroup_new_dir, "/memory.swap.max");
+	for (cfile = ctrl->files; cfile->file_name; cfile++) {
+		if (!strcmp(file_name, cfile->file_name))
+			break;
+	}
 
-		if ((access(knob_path, F_OK) == -1)) {
-			if (errno == ENOENT)
-				tst_res(TCONF, "memcg swap accounting is disabled");
-			else
-				tst_brk(TBROK | TERRNO, "failed to access %s", knob_path);
-		} else {
+	if (!cfile->file_name) {
+		tst_brk_(file, lineno, TBROK,
+			 "Did not find '%s' in '%s'\n",
+			 file_name, ctrl->ctrl_name);
+		return NULL;
+	}
+
+	return cfile;
+}
+
+enum tst_cgroup_ver tst_cgroup_ver(const char *const file, const int lineno,
+				    const struct tst_cgroup_group *const cg,
+				    const char *const ctrl_name)
+{
+	const struct cgroup_ctrl *const ctrl = cgroup_find_ctrl(ctrl_name);
+	const struct cgroup_dir *dir;
+
+	if (!strcmp(ctrl_name, "cgroup")) {
+		tst_brk_(file, lineno,
+			 TBROK,
+			 "cgroup may be present on both V1 and V2 hierarchies");
+		return 0;
+	}
+
+	if (!ctrl) {
+		tst_brk_(file, lineno,
+			 TBROK, "Unknown controller '%s'", ctrl_name);
+		return 0;
+	}
+
+	dir = cg->dirs_by_ctrl[ctrl->ctrl_indx];
+
+	if (!dir) {
+		tst_brk_(file, lineno,
+			 TBROK, "%s controller not attached to CGroup %s",
+			 ctrl_name, cg->group_name);
+		return 0;
+	}
+
+	return dir->dir_root->ver;
+}
+
+__attribute__ ((nonnull, warn_unused_result))
+static const char *cgroup_file_alias(const struct cgroup_file *const cfile,
+				     const struct cgroup_dir *const dir)
+{
+	if (dir->dir_root->ver != TST_CGROUP_V1)
+		return cfile->file_name;
+
+	if (cfile->ctrl_indx == CTRL_CPUSET &&
+	    dir->dir_root->no_cpuset_prefix &&
+	    cfile->file_name_v1) {
+		return strchr(cfile->file_name_v1, '.') + 1;
+	}
+
+	return cfile->file_name_v1;
+}
+
+int safe_cgroup_has(const char *const file, const int lineno,
+		    const struct tst_cgroup_group *cg,
+		    const char *const file_name)
+{
+	const struct cgroup_file *const cfile =
+		cgroup_file_find(file, lineno, file_name);
+	struct cgroup_dir *const *dir;
+	const char *alias;
+
+	if (!cfile)
+		return 0;
+
+	for_each_dir(cg, cfile->ctrl_indx, dir) {
+		if (!(alias = cgroup_file_alias(cfile, *dir)))
+		    continue;
+
+		if (!faccessat((*dir)->dir_fd, file_name, F_OK, 0))
 			return 1;
-		}
+
+		if (errno == ENOENT)
+			continue;
+
+		tst_brk_(file, lineno, TBROK | TERRNO,
+			 "faccessat(%d<%s>, %s, F_OK, 0)",
+			 (*dir)->dir_fd, tst_decode_fd((*dir)->dir_fd), alias);
 	}
 
 	return 0;
 }
 
-void tst_cgroup_mem_set_maxswap(const char *cgroup_dir, long memsz)
+__attribute__ ((warn_unused_result))
+static struct tst_cgroup_group *cgroup_group_from_roots(const size_t tree_off)
 {
-	if (tst_cg_ver & TST_CGROUP_V1)
-		tst_cgroup_set_knob(cgroup_dir, "memory.memsw.limit_in_bytes", memsz);
+	struct cgroup_root *root;
+	struct cgroup_dir *dir;
+	struct tst_cgroup_group *cg;
+
+	cg = tst_alloc(sizeof(*cg));
+	cgroup_group_init(cg, NULL);
 
-	if (tst_cg_ver & TST_CGROUP_V2)
-		tst_cgroup_set_knob(cgroup_dir, "memory.swap.max", memsz);
+	for_each_root(root) {
+		dir = (typeof(dir))(((char *)root) + tree_off);
+
+		if (dir->ctrl_field)
+			cgroup_group_add_dir(cg, dir);
+	}
+
+	if (cg->dirs[0]) {
+		strncpy(cg->group_name, cg->dirs[0]->dir_name, NAME_MAX);
+		return cg;
+	}
+
+	tst_brk(TBROK,
+		"No CGroups found; maybe you forgot to call tst_cgroup_require?");
+
+	return cg;
 }
 
-void tst_cgroup_cpuset_read_files(const char *cgroup_dir, const char *filename,
-	char *retbuf, size_t retbuf_sz)
+const struct tst_cgroup_group *tst_cgroup_get_test_group(void)
 {
-	int fd;
-	char *cgroup_new_dir;
-	char knob_path[PATH_MAX];
+	return cgroup_group_from_roots(offsetof(struct cgroup_root, test_dir));
+}
 
-	cgroup_new_dir = tst_cgroup_get_path(cgroup_dir);
+const struct tst_cgroup_group *tst_cgroup_get_drain_group(void)
+{
+	return cgroup_group_from_roots(offsetof(struct cgroup_root, drain_dir));
+}
 
-	/*
-	 * try either '/dev/cpuset/XXXX' or '/dev/cpuset/cpuset.XXXX'
-	 * please see Documentation/cgroups/cpusets.txt from kernel src
-	 * for details
-	 */
-	sprintf(knob_path, "%s/%s", cgroup_new_dir, filename);
-	fd = open(knob_path, O_RDONLY);
-	if (fd == -1) {
-		if (errno == ENOENT) {
-			sprintf(knob_path, "%s/cpuset.%s",
-					cgroup_new_dir, filename);
-			fd = SAFE_OPEN(knob_path, O_RDONLY);
-		} else
-			tst_brk(TBROK | TERRNO, "open %s", knob_path);
+ssize_t safe_cgroup_read(const char *const file, const int lineno,
+			 const struct tst_cgroup_group *const cg,
+			 const char *const file_name,
+			 char *const out, const size_t len)
+{
+	const struct cgroup_file *const cfile =
+		cgroup_file_find(file, lineno, file_name);
+	struct cgroup_dir *const *dir;
+	const char *alias;
+	size_t prev_len = 0;
+	char prev_buf[BUFSIZ];
+
+	for_each_dir(cg, cfile->ctrl_indx, dir) {
+		if (!(alias = cgroup_file_alias(cfile, *dir)))
+			continue;
+
+		if (prev_len)
+			memcpy(prev_buf, out, prev_len);
+
+		TEST(safe_file_readat(file, lineno,
+				      (*dir)->dir_fd, alias, out, len));
+		if (TST_RET < 0)
+			continue;
+
+		if (prev_len && memcmp(out, prev_buf, prev_len)) {
+			tst_brk_(file, lineno, TBROK,
+				 "%s has different value across roots",
+				 file_name);
+			break;
+		}
+
+		prev_len = MIN(sizeof(prev_buf), (size_t)TST_RET);
 	}
 
-	memset(retbuf, 0, retbuf_sz);
-	if (read(fd, retbuf, retbuf_sz) < 0)
-		tst_brk(TBROK | TERRNO, "read %s", knob_path);
+	out[MAX(TST_RET, 0)] = '\0';
 
-	close(fd);
+	return TST_RET;
 }
 
-void tst_cgroup_cpuset_write_files(const char *cgroup_dir, const char *filename, const char *buf)
+void safe_cgroup_printf(const char *const file, const int lineno,
+			const struct tst_cgroup_group *cg,
+			const char *const file_name,
+			const char *const fmt, ...)
 {
-	int fd;
-	char *cgroup_new_dir;
-	char knob_path[PATH_MAX];
+	const struct cgroup_file *const cfile =
+		cgroup_file_find(file, lineno, file_name);
+	struct cgroup_dir *const *dir;
+	const char *alias;
+	va_list va;
+
+	for_each_dir(cg, cfile->ctrl_indx, dir) {
+		if (!(alias = cgroup_file_alias(cfile, *dir)))
+		    continue;
+
+		va_start(va, fmt);
+		safe_file_vprintfat(file, lineno,
+				    (*dir)->dir_fd, alias, fmt, va);
+		va_end(va);
+	}
+}
 
-	cgroup_new_dir = tst_cgroup_get_path(cgroup_dir);
+void safe_cgroup_scanf(const char *const file, const int lineno,
+		       const struct tst_cgroup_group *const cg,
+		       const char *const file_name,
+		       const char *const fmt, ...)
+{
+	va_list va;
+	char buf[BUFSIZ];
+	ssize_t len = safe_cgroup_read(file, lineno,
+				       cg, file_name, buf, sizeof(buf));
+	const int conv_cnt = tst_count_scanf_conversions(fmt);
+	int ret;
+
+	if (len < 1)
+		return;
 
-	/*
-	 * try either '/dev/cpuset/XXXX' or '/dev/cpuset/cpuset.XXXX'
-	 * please see Documentation/cgroups/cpusets.txt from kernel src
-	 * for details
-	 */
-	sprintf(knob_path, "%s/%s", cgroup_new_dir, filename);
-	fd = open(knob_path, O_WRONLY);
-	if (fd == -1) {
-		if (errno == ENOENT) {
-			sprintf(knob_path, "%s/cpuset.%s", cgroup_new_dir, filename);
-			fd = SAFE_OPEN(knob_path, O_WRONLY);
-		} else
-			tst_brk(TBROK | TERRNO, "open %s", knob_path);
+	va_start(va, fmt);
+	if ((ret = vsscanf(buf, fmt, va)) < 1) {
+		tst_brk_(file, lineno, TBROK | TERRNO,
+			 "'%s': vsscanf('%s', '%s', ...)", file_name, buf, fmt);
 	}
+	va_end(va);
 
-	SAFE_WRITE(1, fd, buf, strlen(buf));
+	if (conv_cnt == ret)
+		return;
 
-	close(fd);
+	tst_brk_(file, lineno, TBROK,
+		 "'%s': vsscanf('%s', '%s', ..): Less conversions than expected: %d != %d",
+		 file_name, buf, fmt, ret, conv_cnt);
 }
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [LTP] [PATCH v6 4/7] Add new CGroups API library tests
  2021-05-04 13:40 [LTP] [PATCH v6 0/7] CGroup API rewrite Richard Palethorpe
                   ` (2 preceding siblings ...)
  2021-05-04 13:40 ` [LTP] [PATCH v6 3/7] Add new CGroups APIs Richard Palethorpe
@ 2021-05-04 13:40 ` Richard Palethorpe
  2021-05-04 13:40 ` [LTP] [PATCH v6 5/7] docs: Update CGroups API Richard Palethorpe
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 11+ messages in thread
From: Richard Palethorpe @ 2021-05-04 13:40 UTC (permalink / raw)
  To: ltp

Signed-off-by: Richard Palethorpe <rpalethorpe@suse.com>
Reviewed-by: Cyril Hrubis <chrubis@suse.cz>
---
 lib/newlib_tests/.gitignore     |  3 +-
 lib/newlib_tests/test21.c       | 66 ------------------------
 lib/newlib_tests/tst_cgroup01.c | 51 +++++++++++++++++++
 lib/newlib_tests/tst_cgroup02.c | 90 +++++++++++++++++++++++++++++++++
 4 files changed, 143 insertions(+), 67 deletions(-)
 delete mode 100644 lib/newlib_tests/test21.c
 create mode 100644 lib/newlib_tests/tst_cgroup01.c
 create mode 100644 lib/newlib_tests/tst_cgroup02.c

diff --git a/lib/newlib_tests/.gitignore b/lib/newlib_tests/.gitignore
index bebecad8b..b95ead2c2 100644
--- a/lib/newlib_tests/.gitignore
+++ b/lib/newlib_tests/.gitignore
@@ -16,6 +16,8 @@ test15
 test16
 tst_capability01
 tst_capability02
+tst_cgroup01
+tst_cgroup02
 tst_device
 tst_safe_fileops
 tst_res_hexd
@@ -24,7 +26,6 @@ test17
 test18
 test19
 test20
-test21
 test22
 tst_expiration_timer
 test_assert
diff --git a/lib/newlib_tests/test21.c b/lib/newlib_tests/test21.c
deleted file mode 100644
index f29a2f702..000000000
--- a/lib/newlib_tests/test21.c
+++ /dev/null
@@ -1,66 +0,0 @@
-// SPDX-License-Identifier: GPL-2.0-or-later
-/*
- * Copyright (c) 2020 Red Hat, Inc.
- * Copyright (c) 2020 Li Wang <liwang@redhat.com>
- */
-
-/*
- * Tests tst_cgroup.h APIs
- */
-
-#include "tst_test.h"
-#include "tst_cgroup.h"
-
-#define PATH_CGROUP1 "/mnt/liwang1"
-#define PATH_CGROUP2 "/mnt/liwang2"
-#define MEMSIZE 1024 * 1024
-
-static void do_test(void)
-{
-	pid_t pid = SAFE_FORK();
-
-	switch (pid) {
-	case 0:
-		tst_cgroup_move_current(PATH_CGROUP1);
-		tst_cgroup_mem_set_maxbytes(PATH_CGROUP1, MEMSIZE);
-		tst_cgroup_mem_set_maxswap(PATH_CGROUP1, MEMSIZE);
-
-		tst_cgroup_move_current(PATH_CGROUP2);
-
-	break;
-	default:
-		tst_cgroup_move_current(PATH_TMP_CG_CST);
-
-		tst_cgroup_move_current(PATH_TMP_CG_MEM);
-		tst_cgroup_mem_set_maxbytes(PATH_TMP_CG_MEM, MEMSIZE);
-		tst_cgroup_mem_set_maxswap(PATH_TMP_CG_MEM, MEMSIZE);
-	break;
-	}
-
-	tst_res(TPASS, "Cgroup mount test");
-}
-
-static void setup(void)
-{
-	tst_cgroup_mount(TST_CGROUP_MEMCG, PATH_TMP_CG_MEM);
-	tst_cgroup_mount(TST_CGROUP_MEMCG, PATH_CGROUP1);
-
-	tst_cgroup_mount(TST_CGROUP_CPUSET, PATH_TMP_CG_CST);
-	tst_cgroup_mount(TST_CGROUP_CPUSET, PATH_CGROUP2);
-}
-
-static void cleanup(void)
-{
-	tst_cgroup_umount(PATH_TMP_CG_MEM);
-	tst_cgroup_umount(PATH_CGROUP1);
-
-	tst_cgroup_umount(PATH_TMP_CG_CST);
-	tst_cgroup_umount(PATH_CGROUP2);
-}
-
-static struct tst_test test = {
-	.test_all = do_test,
-	.setup = setup,
-	.cleanup = cleanup,
-	.forks_child = 1,
-};
diff --git a/lib/newlib_tests/tst_cgroup01.c b/lib/newlib_tests/tst_cgroup01.c
new file mode 100644
index 000000000..54a370362
--- /dev/null
+++ b/lib/newlib_tests/tst_cgroup01.c
@@ -0,0 +1,51 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+/* Copyright (c) 2021 SUSE LLC */
+
+#include <stdio.h>
+
+#include "tst_test.h"
+#include "tst_cgroup.h"
+
+static char *only_mount_v1;
+static char *no_cleanup;
+static struct tst_option opts[] = {
+	{"v", &only_mount_v1, "-v\tOnly try to mount CGroups V1"},
+	{"n", &no_cleanup, "-n\tLeave CGroups created by test"},
+	{NULL, NULL, NULL},
+};
+struct tst_cgroup_opts cgopts;
+
+static void do_test(void)
+{
+	tst_res(TPASS, "pass");
+}
+
+static void setup(void)
+{
+	cgopts.only_mount_v1 = !!only_mount_v1,
+
+	tst_cgroup_scan();
+	tst_cgroup_print_config();
+
+	tst_cgroup_require("memory", &cgopts);
+	tst_cgroup_print_config();
+	tst_cgroup_require("cpuset", &cgopts);
+	tst_cgroup_print_config();
+}
+
+static void cleanup(void)
+{
+	if (no_cleanup) {
+		tst_res(TINFO, "no cleanup");
+	} else {
+		tst_res(TINFO, "cleanup");
+		tst_cgroup_cleanup();
+	}
+}
+
+static struct tst_test test = {
+	.test_all = do_test,
+	.setup = setup,
+	.cleanup = cleanup,
+	.options = opts,
+};
diff --git a/lib/newlib_tests/tst_cgroup02.c b/lib/newlib_tests/tst_cgroup02.c
new file mode 100644
index 000000000..1e2152064
--- /dev/null
+++ b/lib/newlib_tests/tst_cgroup02.c
@@ -0,0 +1,90 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+/* Copyright (c) 2021 SUSE LLC */
+
+#include <stdlib.h>
+#include <stdio.h>
+
+#include "tst_test.h"
+#include "tst_cgroup.h"
+
+static char *only_mount_v1;
+static char *no_cleanup;
+static struct tst_option opts[] = {
+	{"v", &only_mount_v1, "-v\tOnly try to mount CGroups V1"},
+	{"n", &no_cleanup, "-n\tLeave CGroups created by test"},
+	{NULL, NULL, NULL},
+};
+static struct tst_cgroup_opts cgopts;
+static const struct tst_cgroup_group *cg;
+static const struct tst_cgroup_group *cg_drain;
+static struct tst_cgroup_group *cg_child;
+
+static void do_test(void)
+{
+	char buf[BUFSIZ];
+	size_t mem;
+
+	if (TST_CGROUP_VER(cg, "memory") != TST_CGROUP_V1)
+		SAFE_CGROUP_PRINT(cg, "cgroup.subtree_control", "+memory");
+	if (TST_CGROUP_VER(cg, "cpuset") != TST_CGROUP_V1)
+		SAFE_CGROUP_PRINT(cg, "cgroup.subtree_control", "+cpuset");
+
+	cg_child = tst_cgroup_group_mk(cg, "child");
+	if (!SAFE_FORK()) {
+		SAFE_CGROUP_PRINTF(cg_child, "cgroup.procs", "%d", getpid());
+
+		SAFE_CGROUP_SCANF(cg_child, "memory.current", "%zu", &mem);
+		tst_res(TPASS, "child/memory.current = %zu", mem);
+		SAFE_CGROUP_PRINTF(cg_child, "memory.max",
+				   "%zu", (1UL << 24) - 1);
+		SAFE_CGROUP_PRINTF(cg_child, "memory.swap.max",
+				   "%zu", 1UL << 31);
+
+		SAFE_CGROUP_READ(cg_child, "cpuset.mems", buf, sizeof(buf));
+		tst_res(TPASS, "child/cpuset.mems = %s", buf);
+		SAFE_CGROUP_PRINT(cg_child, "cpuset.mems", buf);
+
+		exit(0);
+	}
+
+	SAFE_CGROUP_PRINTF(cg, "memory.max", "%zu", (1UL << 24) - 1);
+	SAFE_CGROUP_PRINTF(cg_child, "cgroup.procs", "%d", getpid());
+	SAFE_CGROUP_SCANF(cg, "memory.current", "%zu", &mem);
+	tst_res(TPASS, "memory.current = %zu", mem);
+
+	tst_reap_children();
+	SAFE_CGROUP_PRINTF(cg_drain, "cgroup.procs", "%d", getpid());
+	cg_child = tst_cgroup_group_rm(cg_child);
+}
+
+static void setup(void)
+{
+	cgopts.only_mount_v1 = !!only_mount_v1,
+
+	tst_cgroup_scan();
+	tst_cgroup_print_config();
+
+	tst_cgroup_require("memory", &cgopts);
+	tst_cgroup_require("cpuset", &cgopts);
+
+	cg = tst_cgroup_get_test_group();
+	cg_drain = tst_cgroup_get_drain_group();
+}
+
+static void cleanup(void)
+{
+	if (cg_child) {
+		SAFE_CGROUP_PRINTF(cg_drain, "cgroup.procs", "%d", getpid());
+		cg_child = tst_cgroup_group_rm(cg_child);
+	}
+	if (!no_cleanup)
+		tst_cgroup_cleanup();
+}
+
+static struct tst_test test = {
+	.test_all = do_test,
+	.setup = setup,
+	.cleanup = cleanup,
+	.options = opts,
+	.forks_child = 1,
+};
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [LTP] [PATCH v6 5/7] docs: Update CGroups API
  2021-05-04 13:40 [LTP] [PATCH v6 0/7] CGroup API rewrite Richard Palethorpe
                   ` (3 preceding siblings ...)
  2021-05-04 13:40 ` [LTP] [PATCH v6 4/7] Add new CGroups API library tests Richard Palethorpe
@ 2021-05-04 13:40 ` Richard Palethorpe
  2021-05-04 13:40 ` [LTP] [PATCH v6 6/7] mem: Convert tests to new " Richard Palethorpe
  2021-05-04 13:41 ` [LTP] [PATCH v6 7/7] madvise06: Convert " Richard Palethorpe
  6 siblings, 0 replies; 11+ messages in thread
From: Richard Palethorpe @ 2021-05-04 13:40 UTC (permalink / raw)
  To: ltp

Signed-off-by: Richard Palethorpe <rpalethorpe@suse.com>
Reviewed-by: Cyril Hrubis <chrubis@suse.cz>
---
 doc/test-writing-guidelines.txt | 175 +++++++++++++++++++++++++++++---
 1 file changed, 162 insertions(+), 13 deletions(-)

diff --git a/doc/test-writing-guidelines.txt b/doc/test-writing-guidelines.txt
index a77c114c1..c268b8804 100644
--- a/doc/test-writing-guidelines.txt
+++ b/doc/test-writing-guidelines.txt
@@ -2186,48 +2186,197 @@ the field value of file.
 
 2.2.36 Using Control Group
 ^^^^^^^^^^^^^^^^^^^^^^^^^^
-Some of LTP tests need Control Group in their configuration, tst_cgroup.h provides
-APIs to make cgroup unified mounting at setup phase to be possible. The method?is
-extracted from mem.h with the purpose of?extendible for further developing, and
-trying to compatible the current two versions of cgroup.
 
-Considering there are many differences between cgroup v1 and v2, here we capsulate
-the detail of cgroup mounting in high-level functions, which will be easier to use
-cgroup without caring about too much technical thing.? ?
+Some LTP tests need specific Control Group configurations. tst_cgroup.h provides
+APIs to discover and use CGroups. There are many differences between CGroups API
+V1 and V2. We encapsulate the details of configuring CGroups in high-level
+functions which follow the V2 kernel API. Allowing one to use CGroups without
+caring too much about the current system's configuration.
 
-Also, we do cgroup mount/umount work for the different hierarchy automatically.
+Also, the LTP library will automatically mount/umount and configure the CGroup
+hierarchies if that is required (e.g. if you run the tests from init with no
+system manager).
 
 [source,c]
 -------------------------------------------------------------------------------
 #include "tst_test.h"
+#include "tst_cgroup.h"
+
+static const struct tst_cgroup_group *cg;
 
 static void run(void)
 {
 	...
+	// do test under cgroup
+	...
+}
+
+static void setup(void)
+{
+	tst_cgroup_require("memory", NULL);
+	cg = tst_cgroup_get_test_group();
+	SAFE_CGROUP_PRINTF(cg, "cgroup.procs", "%d", getpid());
+	SAFE_CGROUP_PRINTF(cg, "memory.max", "%lu", MEMSIZE);
+	if (SAFE_CGROUP_HAS(cg, "memory.swap.max"))
+		SAFE_CGROUP_PRINTF(cg, "memory.swap.max", "%zu", memsw);
+}
 
-	tst_cgroup_move_current(PATH_TMP_CG_MEM);
-	tst_cgroup_mem_set_maxbytes(PATH_TMP_CG_MEM, MEMSIZE);
+static void cleanup(void)
+{
+	tst_cgroup_cleanup();
+}
 
-	// do test under cgroup
+struct tst_test test = {
+	.setup = setup,
+	.test_all = run,
+	.cleanup = cleanup,
 	...
+};
+-------------------------------------------------------------------------------
+
+Above, we first ensure the memory controller is available on the
+test's CGroup with 'tst_cgroup_require'. We then get a structure,
+'cg', which represents the test's CGroup. Note that
+'tst_cgroup_get_test_group' should not be called many times, as it is
+allocated in a guarded buffer (See section 2.2.31). Therefor it is
+best to call it once in 'setup' and not 'run' because 'run' may be
+repeated with the '-i' option.
+
+We then write the current processes PID into 'cgroup.procs', which
+moves the current process into the test's CGroup. After which we set
+the maximum memory size by writing to 'memory.max'. If the memory
+controller is mounted on CGroups V1 then the library will actually
+write to 'memory.limit_in_bytes'. As a general rule, if a file exists
+on both CGroup versions, then we use the V2 naming.
+
+Some controller features, such as 'memory.swap', can be
+disabled. Therefor we need to check if they exist before accessing
+them. This can be done with 'SAFE_CGROUP_HAS' which can be called on
+any control file or feature.
+
+Most tests only require setting a few limits similar to the above. In
+such cases the differences between V1 and V2 are hidden. Setup and
+cleanup is also mostly hidden. However things can get much worse.
+
+[source,c]
+-------------------------------------------------------------------------------
+static const struct tst_cgroup_group *cg;
+static const struct tst_cgroup_group *cg_drain;
+static struct tst_cgroup_group *cg_child;
+
+static void run(void)
+{
+	char buf[BUFSIZ];
+	size_t mem = 0;
+
+	cg_child = tst_cgroup_group_mk(cg, "child");
+	SAFE_CGROUP_PRINTF(cg_child, "cgroup.procs", "%d", getpid());
+
+	if (TST_CGROUP_VER(cg, "memory") != TST_CGROUP_V1)
+		SAFE_CGROUP_PRINT(cg, "cgroup.subtree_control", "+memory");
+	if (TST_CGROUP_VER(cg, "cpuset") != TST_CGROUP_V1)
+		SAFE_CGROUP_PRINT(cg, "cgroup.subtree_control", "+cpuset");
+
+	if (!SAFE_FORK()) {
+		SAFE_CGROUP_PRINTF(cg_child, "cgroup.procs", "%d", getpid());
+
+		if (SAFE_CGROUP_HAS(cg_child, "memory.swap"))
+			SAFE_CGROUP_SCANF(cg_child, "memory.swap.current", "%zu", &mem);
+		SAFE_CGROUP_READ(cg_child, "cpuset.mems", buf, sizeof(buf));
+
+		// Do something with cpuset.mems and memory.current values
+		...
+
+		exit(0);
+	}
+
+	tst_reap_children();
+	SAFE_CGROUP_PRINTF(cg_drain, "cgroup.procs", "%d", getpid());
+	cg_child = tst_cgroup_group_rm(cg_child);
 }
 
 static void setup(void)
 {
-	tst_cgroup_mount(TST_CGROUP_MEMCG, PATH_TMP_CG_MEM);
+	tst_cgroup_require("memory", NULL);
+	tst_cgroup_require("cpuset", NULL);
+
+	cg = tst_cgroup_get_test_group();
+	cg_drain = tst_cgroup_get_drain_group();
 }
 
 static void cleanup(void)
 {
-	tst_cgroup_umount(PATH_TMP_CG_MEM);
+	if (cg_child) {
+		SAFE_CGROUP_PRINTF(cg_drain, "cgroup.procs", "%d", getpid());
+		cg_child = tst_cgroup_group_rm(cg_child);
+	}
+
+	tst_cgroup_cleanup();
 }
 
 struct tst_test test = {
+	.setup = setup,
 	.test_all = run,
+	.cleanup = cleanup,
 	...
 };
 -------------------------------------------------------------------------------
 
+Starting with setup; we can see here that we also fetch the 'drain'
+CGroup. This is a shared group (between parallel tests) which may
+contain processes from other tests. It should have default settings and
+these should not be changed by the test. It can be used to remove
+processes from other CGroups incase the hierarchy root is not
+accessible.
+
+In 'run', we first create a child CGroup with 'tst_cgroup_mk'. As we
+create this CGroup in 'run' we should also remove it at the end of
+run. We also need to check if it exists and remove it in cleanup as
+well. Because there are 'SAFE_' functions which may jump to cleanup.
+
+We then move the main test process into the child CGroup. This is
+important as it means that before we destroy the child CGroup we have
+to move the main test process elsewhere. For that we use the 'drain'
+group.
+
+Next we enable the memory and cpuset controller configuration on the
+test CGroup's descendants (i.e. 'cg_child'). This allows each child to
+have its own settings. The file 'cgroup.subtree_control' does not
+exist on V1. Because it is possible to have both V1 and V2 active at
+the same time. We can not simply check if 'subtree_control' exists
+before writing to it. We have to check if a particular controller is
+on V2 before trying to add it to 'subtree_control'. Trying to add a V1
+controller will result in 'ENOENT'.
+
+We then fork a child process and add this to the child CGroup. Within
+the child process we try to read 'memory.swap.current'. It is possible
+that the memory controller was compiled without swap support, so it is
+necessary to check if 'memory.swap' is enabled. That is unless the
+test will never reach the point where 'memory.swap.*' are used without
+swap support.
+
+The parent process waits for the child process to be reaped before
+destroying the child CGroup. So there is no need to transfer the child
+to drain. However the parent process must be moved otherwise we will
+get 'EBUSY' when trying to remove the child CGroup.
+
+Another example of an edge case is the following.
+
+[source,c]
+-------------------------------------------------------------------------------
+	if (TST_CGROUP_VER(cg, "memory") == TST_CGROUP_V1)
+		SAFE_CGROUP_PRINTF(cg, "memory.swap.max", "%lu", ~0UL);
+	else
+		SAFE_CGROUP_PRINT(cg, "memory.swap.max", "max");
+-------------------------------------------------------------------------------
+
+CGroups V2 introduced a feature where 'memory[.swap].max' could be set to
+"max". This does not appear to work on V1 'limit_in_bytes' however. For most
+tests, simply using a large number is sufficient and there is no need to use
+"max". Importantly though, one should be careful to read both the V1 and V2
+kernel docs. The LTP library can not handle all edge cases. It does the minimal
+amount of work to make testing on both V1 and V2 feasible.
+
 2.2.37 Require minimum numbers of CPU for a testcase
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [LTP] [PATCH v6 6/7] mem: Convert tests to new CGroups API
  2021-05-04 13:40 [LTP] [PATCH v6 0/7] CGroup API rewrite Richard Palethorpe
                   ` (4 preceding siblings ...)
  2021-05-04 13:40 ` [LTP] [PATCH v6 5/7] docs: Update CGroups API Richard Palethorpe
@ 2021-05-04 13:40 ` Richard Palethorpe
  2021-05-04 13:41 ` [LTP] [PATCH v6 7/7] madvise06: Convert " Richard Palethorpe
  6 siblings, 0 replies; 11+ messages in thread
From: Richard Palethorpe @ 2021-05-04 13:40 UTC (permalink / raw)
  To: ltp

Signed-off-by: Richard Palethorpe <rpalethorpe@suse.com>
Reviewed-by: Cyril Hrubis <chrubis@suse.cz>
---
 testcases/kernel/mem/cpuset/cpuset01.c | 34 ++++++++++++--------------
 testcases/kernel/mem/include/mem.h     |  2 +-
 testcases/kernel/mem/ksm/ksm02.c       | 14 ++++++++---
 testcases/kernel/mem/ksm/ksm03.c       | 12 ++++++---
 testcases/kernel/mem/ksm/ksm04.c       | 17 +++++++------
 testcases/kernel/mem/lib/mem.c         | 10 +++-----
 testcases/kernel/mem/oom/oom03.c       | 18 ++++++++------
 testcases/kernel/mem/oom/oom04.c       | 19 ++++++++------
 testcases/kernel/mem/oom/oom05.c       | 32 ++++++++++++++----------
 9 files changed, 88 insertions(+), 70 deletions(-)

diff --git a/testcases/kernel/mem/cpuset/cpuset01.c b/testcases/kernel/mem/cpuset/cpuset01.c
index 528c3eddd..66c18f6ab 100644
--- a/testcases/kernel/mem/cpuset/cpuset01.c
+++ b/testcases/kernel/mem/cpuset/cpuset01.c
@@ -35,6 +35,8 @@
 
 #ifdef HAVE_NUMA_V2
 
+static const struct tst_cgroup_group *cg;
+
 volatile int end;
 static int *nodes;
 static int nnodes;
@@ -47,15 +49,14 @@ static long count_cpu(void);
 
 static void test_cpuset(void)
 {
-	int child, i, status;
+	int child, i;
 	unsigned long nmask[MAXNODES / BITS_PER_LONG] = { 0 };
-	char mems[BUFSIZ], buf[BUFSIZ];
+	char buf[BUFSIZ];
 
-	tst_cgroup_cpuset_read_files(PATH_TMP_CG_CST, "cpus", buf, BUFSIZ);
-	tst_cgroup_cpuset_write_files(PATH_TMP_CG_CST, "cpus", buf);
-	tst_cgroup_cpuset_read_files(PATH_TMP_CG_CST, "mems", mems, BUFSIZ);
-	tst_cgroup_cpuset_write_files(PATH_TMP_CG_CST, "mems", mems);
-	tst_cgroup_move_current(PATH_TMP_CG_CST);
+	SAFE_CGROUP_READ(cg, "cpuset.cpus", buf, sizeof(buf));
+	SAFE_CGROUP_PRINT(cg, "cpuset.cpus", buf);
+	SAFE_CGROUP_READ(cg, "cpuset.mems", buf, sizeof(buf));
+	SAFE_CGROUP_PRINT(cg, "cpuset.mems", buf);
 
 	child = SAFE_FORK();
 	if (child == 0) {
@@ -69,33 +70,30 @@ static void test_cpuset(void)
 		exit(mem_hog_cpuset(ncpus > 1 ? ncpus : 1));
 	}
 
-	snprintf(buf, BUFSIZ, "%d", nodes[0]);
-	tst_cgroup_cpuset_write_files(PATH_TMP_CG_CST, "mems", buf);
-	snprintf(buf, BUFSIZ, "%d", nodes[1]);
-	tst_cgroup_cpuset_write_files(PATH_TMP_CG_CST, "mems", buf);
+	SAFE_CGROUP_PRINTF(cg, "cpuset.mems", "%d", nodes[0]);
+	SAFE_CGROUP_PRINTF(cg, "cpuset.mems", "%d", nodes[1]);
 
-	SAFE_WAITPID(child, &status, WUNTRACED | WCONTINUED);
-	if (WEXITSTATUS(status) != 0) {
-		tst_res(TFAIL, "child exit status is %d", WEXITSTATUS(status));
-		return;
-	}
+	tst_reap_children();
 
 	tst_res(TPASS, "cpuset test pass");
 }
 
 static void setup(void)
 {
-	tst_cgroup_mount(TST_CGROUP_CPUSET, PATH_TMP_CG_CST);
+	tst_cgroup_require("cpuset", NULL);
 	ncpus = count_cpu();
 	if (get_allowed_nodes_arr(NH_MEMS | NH_CPUS, &nnodes, &nodes) < 0)
 		tst_brk(TBROK | TERRNO, "get_allowed_nodes_arr");
 	if (nnodes <= 1)
 		tst_brk(TCONF, "requires a NUMA system.");
+
+	cg = tst_cgroup_get_test_group();
+	SAFE_CGROUP_PRINTF(cg, "cgroup.procs", "%d", getpid());
 }
 
 static void cleanup(void)
 {
-	tst_cgroup_umount(PATH_TMP_CG_CST);
+	tst_cgroup_cleanup();
 }
 
 static void sighandler(int signo LTP_ATTRIBUTE_UNUSED)
diff --git a/testcases/kernel/mem/include/mem.h b/testcases/kernel/mem/include/mem.h
index 42b12a230..10712cf0c 100644
--- a/testcases/kernel/mem/include/mem.h
+++ b/testcases/kernel/mem/include/mem.h
@@ -61,7 +61,7 @@ void check_hugepage(void);
 void write_memcg(void);
 
 /* cpuset/memcg - include/tst_cgroup.h */
-void write_cpusets(const char *cgroup_dir, long nd);
+void write_cpusets(const struct tst_cgroup_group *cg, long nd);
 
 /* shared */
 unsigned int get_a_numa_node(void);
diff --git a/testcases/kernel/mem/ksm/ksm02.c b/testcases/kernel/mem/ksm/ksm02.c
index 51f5d4cca..80017df66 100644
--- a/testcases/kernel/mem/ksm/ksm02.c
+++ b/testcases/kernel/mem/ksm/ksm02.c
@@ -59,6 +59,9 @@
 #ifdef HAVE_NUMA_V2
 #include <numaif.h>
 
+static const struct tst_cgroup_group *cg;
+static const struct tst_cgroup_group *cg_drain;
+
 static void verify_ksm(void)
 {
 	unsigned long nmask[MAXNODES / BITS_PER_LONG] = { 0 };
@@ -76,9 +79,10 @@ static void verify_ksm(void)
 	}
 	create_same_memory(size, num, unit);
 
-	write_cpusets(PATH_TMP_CG_CST, node);
-	tst_cgroup_move_current(PATH_TMP_CG_CST);
+	write_cpusets(cg, node);
+	SAFE_CGROUP_PRINTF(cg, "cgroup.procs", "%d", getpid());
 	create_same_memory(size, num, unit);
+	SAFE_CGROUP_PRINTF(cg_drain, "cgroup.procs", "%d", getpid());
 }
 
 static void cleanup(void)
@@ -87,7 +91,7 @@ static void cleanup(void)
 		FILE_PRINTF(PATH_KSM "merge_across_nodes",
 				 "%d", merge_across_nodes);
 
-	tst_cgroup_umount(PATH_TMP_CG_CST);
+	tst_cgroup_cleanup();
 }
 
 static void setup(void)
@@ -103,7 +107,9 @@ static void setup(void)
 		SAFE_FILE_PRINTF(PATH_KSM "merge_across_nodes", "1");
 	}
 
-	tst_cgroup_mount(TST_CGROUP_CPUSET, PATH_TMP_CG_CST);
+	tst_cgroup_require("cpuset", NULL);
+	cg = tst_cgroup_get_test_group();
+	cg_drain = tst_cgroup_get_drain_group();
 }
 
 static struct tst_test test = {
diff --git a/testcases/kernel/mem/ksm/ksm03.c b/testcases/kernel/mem/ksm/ksm03.c
index e9949280e..83b821c81 100644
--- a/testcases/kernel/mem/ksm/ksm03.c
+++ b/testcases/kernel/mem/ksm/ksm03.c
@@ -59,10 +59,10 @@
 #include "mem.h"
 #include "ksm_common.h"
 
+static const struct tst_cgroup_group *cg;
+
 static void verify_ksm(void)
 {
-	tst_cgroup_move_current(PATH_TMP_CG_MEM);
-	tst_cgroup_mem_set_maxbytes(PATH_TMP_CG_MEM, TESTMEM);
 	create_same_memory(size, num, unit);
 }
 
@@ -78,7 +78,11 @@ static void setup(void)
 	}
 
 	parse_ksm_options(opt_sizestr, &size, opt_numstr, &num, opt_unitstr, &unit);
-	tst_cgroup_mount(TST_CGROUP_MEMCG, PATH_TMP_CG_MEM);
+
+	tst_cgroup_require("memory", NULL);
+	cg = tst_cgroup_get_test_group();
+	SAFE_CGROUP_PRINTF(cg, "cgroup.procs", "%d", getpid());
+	SAFE_CGROUP_PRINTF(cg, "memory.max", "%lu", TESTMEM);
 }
 
 static void cleanup(void)
@@ -86,7 +90,7 @@ static void cleanup(void)
 	if (access(PATH_KSM "merge_across_nodes", F_OK) == 0)
 		FILE_PRINTF(PATH_KSM "merge_across_nodes",
 				 "%d", merge_across_nodes);
-	tst_cgroup_umount(PATH_TMP_CG_MEM);
+	tst_cgroup_cleanup();
 }
 
 static struct tst_test test = {
diff --git a/testcases/kernel/mem/ksm/ksm04.c b/testcases/kernel/mem/ksm/ksm04.c
index b4ad41831..65f7e6510 100644
--- a/testcases/kernel/mem/ksm/ksm04.c
+++ b/testcases/kernel/mem/ksm/ksm04.c
@@ -59,6 +59,8 @@
 #ifdef HAVE_NUMA_V2
 #include <numaif.h>
 
+static const struct tst_cgroup_group *cg;
+
 static void verify_ksm(void)
 {
 	unsigned long nmask[MAXNODES / BITS_PER_LONG] = { 0 };
@@ -67,8 +69,7 @@ static void verify_ksm(void)
 	node = get_a_numa_node();
 	set_node(nmask, node);
 
-	tst_cgroup_move_current(PATH_TMP_CG_MEM);
-	tst_cgroup_mem_set_maxbytes(PATH_TMP_CG_MEM, TESTMEM);
+	SAFE_CGROUP_PRINTF(cg, "memory.max", "%lu", TESTMEM);
 
 	if (set_mempolicy(MPOL_BIND, nmask, MAXNODES) == -1) {
 		if (errno != ENOSYS)
@@ -79,8 +80,7 @@ static void verify_ksm(void)
 	}
 	create_same_memory(size, num, unit);
 
-	write_cpusets(PATH_TMP_CG_CST, node);
-	tst_cgroup_move_current(PATH_TMP_CG_CST);
+	write_cpusets(cg, node);
 	create_same_memory(size, num, unit);
 }
 
@@ -90,8 +90,7 @@ static void cleanup(void)
 		FILE_PRINTF(PATH_KSM "merge_across_nodes",
 				 "%d", merge_across_nodes);
 
-	tst_cgroup_umount(PATH_TMP_CG_MEM);
-	tst_cgroup_umount(PATH_TMP_CG_CST);
+	tst_cgroup_cleanup();
 }
 
 static void setup(void)
@@ -107,8 +106,10 @@ static void setup(void)
 
 	parse_ksm_options(opt_sizestr, &size, opt_numstr, &num, opt_unitstr, &unit);
 
-	tst_cgroup_mount(TST_CGROUP_MEMCG, PATH_TMP_CG_MEM);
-	tst_cgroup_mount(TST_CGROUP_CPUSET, PATH_TMP_CG_CST);
+	tst_cgroup_require("memory", NULL);
+	tst_cgroup_require("cpuset", NULL);
+	cg = tst_cgroup_get_test_group();
+	SAFE_CGROUP_PRINTF(cg, "cgroup.procs", "%d", getpid());
 }
 
 static struct tst_test test = {
diff --git a/testcases/kernel/mem/lib/mem.c b/testcases/kernel/mem/lib/mem.c
index 2de3f83a6..9f946b5c9 100644
--- a/testcases/kernel/mem/lib/mem.c
+++ b/testcases/kernel/mem/lib/mem.c
@@ -629,13 +629,11 @@ static void gather_node_cpus(char *cpus, long nd)
 	cpus[strlen(cpus) - 1] = '\0';
 }
 
-void write_cpusets(const char *cgroup_dir, long nd)
+void write_cpusets(const struct tst_cgroup_group *cg, long nd)
 {
-	char buf[BUFSIZ];
 	char cpus[BUFSIZ] = "";
 
-	snprintf(buf, BUFSIZ, "%ld", nd);
-	tst_cgroup_cpuset_write_files(cgroup_dir, "mems", buf);
+	SAFE_CGROUP_PRINTF(cg, "cpuset.mems", "%ld", nd);
 
 	gather_node_cpus(cpus, nd);
 	/*
@@ -644,11 +642,11 @@ void write_cpusets(const char *cgroup_dir, long nd)
 	 * the value of cpuset.cpus.
 	 */
 	if (strlen(cpus) != 0) {
-		tst_cgroup_cpuset_write_files(cgroup_dir, "cpus", cpus);
+		SAFE_CGROUP_PRINT(cg, "cpuset.cpus", cpus);
 	} else {
 		tst_res(TINFO, "No CPUs in the node%ld; "
 				"using only CPU0", nd);
-		tst_cgroup_cpuset_write_files(cgroup_dir, "cpus", "0");
+		SAFE_CGROUP_PRINT(cg, "cpuset.cpus", "0");
 	}
 }
 
diff --git a/testcases/kernel/mem/oom/oom03.c b/testcases/kernel/mem/oom/oom03.c
index fc860c660..939413744 100644
--- a/testcases/kernel/mem/oom/oom03.c
+++ b/testcases/kernel/mem/oom/oom03.c
@@ -36,19 +36,17 @@
 
 #ifdef HAVE_NUMA_V2
 
+static const struct tst_cgroup_group *cg;
+
 static void verify_oom(void)
 {
 #ifdef TST_ABI32
 	tst_brk(TCONF, "test is not designed for 32-bit system.");
 #endif
-
-	tst_cgroup_move_current(PATH_TMP_CG_MEM);
-	tst_cgroup_mem_set_maxbytes(PATH_TMP_CG_MEM, TESTMEM);
-
 	testoom(0, 0, ENOMEM, 1);
 
-	if (tst_cgroup_mem_swapacct_enabled(PATH_TMP_CG_MEM)) {
-		tst_cgroup_mem_set_maxswap(PATH_TMP_CG_MEM, TESTMEM);
+	if (SAFE_CGROUP_HAS(cg, "memory.swap.max")) {
+		SAFE_CGROUP_PRINTF(cg, "memory.swap.max", "%lu", TESTMEM);
 		testoom(0, 1, ENOMEM, 1);
 	}
 
@@ -65,14 +63,18 @@ static void setup(void)
 {
 	overcommit = get_sys_tune("overcommit_memory");
 	set_sys_tune("overcommit_memory", 1, 1);
-	tst_cgroup_mount(TST_CGROUP_MEMCG, PATH_TMP_CG_MEM);
+
+	tst_cgroup_require("memory", NULL);
+	cg = tst_cgroup_get_test_group();
+	SAFE_CGROUP_PRINTF(cg, "cgroup.procs", "%d", getpid());
+	SAFE_CGROUP_PRINTF(cg, "memory.max", "%lu", TESTMEM);
 }
 
 static void cleanup(void)
 {
 	if (overcommit != -1)
 		set_sys_tune("overcommit_memory", overcommit, 0);
-	tst_cgroup_umount(PATH_TMP_CG_MEM);
+	tst_cgroup_cleanup();
 }
 
 static struct tst_test test = {
diff --git a/testcases/kernel/mem/oom/oom04.c b/testcases/kernel/mem/oom/oom04.c
index 582663275..f84328f5b 100644
--- a/testcases/kernel/mem/oom/oom04.c
+++ b/testcases/kernel/mem/oom/oom04.c
@@ -36,24 +36,25 @@
 
 #ifdef HAVE_NUMA_V2
 
+static const struct tst_cgroup_group *cg;
+
 static void verify_oom(void)
 {
 #ifdef TST_ABI32
 	tst_brk(TCONF, "test is not designed for 32-bit system.");
 #endif
-
-	tst_cgroup_move_current(PATH_TMP_CG_CST);
-
 	tst_res(TINFO, "OOM on CPUSET...");
 	testoom(0, 0, ENOMEM, 1);
 
-	if (is_numa(NULL, NH_MEMS, 2)) {
+	if (is_numa(NULL, NH_MEMS, 2) &&
+	    SAFE_CGROUP_HAS(cg, "cpuset.memory_migrate")) {
 		/*
 		 * Under NUMA system, the migration of cpuset's memory
 		 * is in charge of cpuset.memory_migrate, we can write
 		 * 1 to cpuset.memory_migrate to enable the migration.
 		 */
-		tst_cgroup_cpuset_write_files(PATH_TMP_CG_CST, "memory_migrate", "1");
+		SAFE_CGROUP_PRINT(cg, "cpuset.memory_migrate", "1");
+
 		tst_res(TINFO, "OOM on CPUSET with mem migrate:");
 		testoom(0, 0, ENOMEM, 1);
 	}
@@ -69,7 +70,8 @@ static void setup(void)
 	overcommit = get_sys_tune("overcommit_memory");
 	set_sys_tune("overcommit_memory", 1, 1);
 
-	tst_cgroup_mount(TST_CGROUP_CPUSET, PATH_TMP_CG_CST);
+	tst_cgroup_require("cpuset", NULL);
+	cg = tst_cgroup_get_test_group();
 
 	/*
 	 * Some nodes do not contain memory, so use
@@ -81,14 +83,15 @@ static void setup(void)
 	if (ret < 0)
 		tst_brk(TBROK, "Failed to get a memory node "
 			      "using get_allowed_nodes()");
-	write_cpusets(PATH_TMP_CG_CST, memnode);
+	write_cpusets(cg, memnode);
+	SAFE_CGROUP_PRINTF(cg, "cgroup.procs", "%d", getpid());
 }
 
 static void cleanup(void)
 {
 	if (overcommit != -1)
 		set_sys_tune("overcommit_memory", overcommit, 0);
-	tst_cgroup_umount(PATH_TMP_CG_CST);
+	tst_cgroup_cleanup();
 }
 
 static struct tst_test test = {
diff --git a/testcases/kernel/mem/oom/oom05.c b/testcases/kernel/mem/oom/oom05.c
index 871f302e3..9c9bba7f6 100644
--- a/testcases/kernel/mem/oom/oom05.c
+++ b/testcases/kernel/mem/oom/oom05.c
@@ -36,6 +36,8 @@
 
 #ifdef HAVE_NUMA_V2
 
+static const struct tst_cgroup_group *cg;
+
 static void verify_oom(void)
 {
 #ifdef TST_ABI32
@@ -43,9 +45,6 @@ static void verify_oom(void)
 #endif
 
 	tst_res(TINFO, "OOM on CPUSET & MEMCG...");
-	tst_cgroup_move_current(PATH_TMP_CG_MEM);
-	tst_cgroup_move_current(PATH_TMP_CG_CST);
-	tst_cgroup_mem_set_maxbytes(PATH_TMP_CG_MEM, TESTMEM);
 	testoom(0, 0, ENOMEM, 1);
 
 	/*
@@ -53,22 +52,26 @@ static void verify_oom(void)
 	 * is in charge of cpuset.memory_migrate, we can write
 	 * 1 to cpuset.memory_migrate to enable the migration.
 	 */
-	if (is_numa(NULL, NH_MEMS, 2)) {
-		tst_cgroup_cpuset_write_files(PATH_TMP_CG_CST, "memory_migrate", "1");
+	if (is_numa(NULL, NH_MEMS, 2) &&
+	    SAFE_CGROUP_HAS(cg, "cpuset.memory_migrate")) {
+		SAFE_CGROUP_PRINT(cg, "cpuset.memory_migrate", "1");
 		tst_res(TINFO, "OOM on CPUSET & MEMCG with "
 				"cpuset.memory_migrate=1");
 		testoom(0, 0, ENOMEM, 1);
 	}
 
-	if (tst_cgroup_mem_swapacct_enabled(PATH_TMP_CG_MEM)) {
+	if (SAFE_CGROUP_HAS(cg, "memory.swap.max")) {
 		tst_res(TINFO, "OOM on CPUSET & MEMCG with "
 				"special memswap limitation:");
-		tst_cgroup_mem_set_maxswap(PATH_TMP_CG_MEM, TESTMEM);
+		SAFE_CGROUP_PRINTF(cg, "memory.swap.max", "%lu", TESTMEM);
 		testoom(0, 0, ENOMEM, 1);
 
 		tst_res(TINFO, "OOM on CPUSET & MEMCG with "
 				"disabled memswap limitation:");
-		tst_cgroup_mem_set_maxswap(PATH_TMP_CG_MEM, -1);
+		if (TST_CGROUP_VER(cg, "memory") == TST_CGROUP_V1)
+			SAFE_CGROUP_PRINTF(cg, "memory.swap.max", "%lu", ~0UL);
+		else
+			SAFE_CGROUP_PRINT(cg, "memory.swap.max", "max");
 		testoom(0, 0, ENOMEM, 1);
 	}
 }
@@ -83,8 +86,9 @@ void setup(void)
 	overcommit = get_sys_tune("overcommit_memory");
 	set_sys_tune("overcommit_memory", 1, 1);
 
-	tst_cgroup_mount(TST_CGROUP_MEMCG, PATH_TMP_CG_MEM);
-	tst_cgroup_mount(TST_CGROUP_CPUSET, PATH_TMP_CG_CST);
+	tst_cgroup_require("memory", NULL);
+	tst_cgroup_require("cpuset", NULL);
+	cg = tst_cgroup_get_test_group();
 
 	/*
 	 * Some nodes do not contain memory, so use
@@ -96,15 +100,17 @@ void setup(void)
 	if (ret < 0)
 		tst_brk(TBROK, "Failed to get a memory node "
 			      "using get_allowed_nodes()");
-	write_cpusets(PATH_TMP_CG_CST, memnode);
+
+	write_cpusets(cg, memnode);
+	SAFE_CGROUP_PRINTF(cg, "cgroup.procs", "%d", getpid());
+	SAFE_CGROUP_PRINTF(cg, "memory.max", "%lu", TESTMEM);
 }
 
 void cleanup(void)
 {
 	if (overcommit != -1)
 		set_sys_tune("overcommit_memory", overcommit, 0);
-	tst_cgroup_umount(PATH_TMP_CG_MEM);
-	tst_cgroup_umount(PATH_TMP_CG_CST);
+	tst_cgroup_cleanup();
 }
 
 static struct tst_test test = {
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [LTP] [PATCH v6 7/7] madvise06: Convert to new CGroups API
  2021-05-04 13:40 [LTP] [PATCH v6 0/7] CGroup API rewrite Richard Palethorpe
                   ` (5 preceding siblings ...)
  2021-05-04 13:40 ` [LTP] [PATCH v6 6/7] mem: Convert tests to new " Richard Palethorpe
@ 2021-05-04 13:41 ` Richard Palethorpe
  6 siblings, 0 replies; 11+ messages in thread
From: Richard Palethorpe @ 2021-05-04 13:41 UTC (permalink / raw)
  To: ltp

Signed-off-by: Richard Palethorpe <rpalethorpe@suse.com>
Reviewed-by: Cyril Hrubis <chrubis@suse.cz>
---
 testcases/kernel/syscalls/madvise/madvise06.c | 129 ++++++++----------
 1 file changed, 60 insertions(+), 69 deletions(-)

diff --git a/testcases/kernel/syscalls/madvise/madvise06.c b/testcases/kernel/syscalls/madvise/madvise06.c
index 56e0700d6..63d8d5452 100644
--- a/testcases/kernel/syscalls/madvise/madvise06.c
+++ b/testcases/kernel/syscalls/madvise/madvise06.c
@@ -3,39 +3,38 @@
  * Copyright (c) 2016 Red Hat, Inc.
  */
 
-/*
- * DESCRIPTION
- *
- *   Page fault occurs in spite that madvise(WILLNEED) system call is called
- *   to prefetch the page. This issue is reproduced by running a program
- *   which sequentially accesses to a shared memory and calls madvise(WILLNEED)
- *   to the next page on a page fault.
+/*\
+ * [Description]
  *
- *   This bug is present in all RHEL7 versions. It looks like this was fixed in
- *   mainline kernel > v3.15 by the following patch:
+ * Page fault occurs in spite that madvise(WILLNEED) system call is called
+ * to prefetch the page. This issue is reproduced by running a program
+ * which sequentially accesses to a shared memory and calls madvise(WILLNEED)
+ * to the next page on a page fault.
  *
- *   commit 55231e5c898c5c03c14194001e349f40f59bd300
- *   Author: Johannes Weiner <hannes@cmpxchg.org>
- *   Date:   Thu May 22 11:54:17 2014 -0700
+ * This bug is present in all RHEL7 versions. It looks like this was fixed in
+ * mainline kernel > v3.15 by the following patch:
  *
- *       mm: madvise: fix MADV_WILLNEED on shmem swapouts
+ *  commit 55231e5c898c5c03c14194001e349f40f59bd300
+ *  Author: Johannes Weiner <hannes@cmpxchg.org>
+ *  Date:   Thu May 22 11:54:17 2014 -0700
  *
- *   Two checks are performed, the first looks at how SwapCache
- *   changes during madvise. When the pages are dirtied, about half
- *   will be accounted for under Cached and the other half will be
- *   moved into Swap. When madvise is run it will cause the pages
- *   under Cached to also be moved to Swap while rotating the pages
- *   already in Swap into SwapCached. So we expect that SwapCached has
- *   roughly MEM_LIMIT bytes added to it, but for reliability the
- *   PASS_THRESHOLD is much lower than that.
+ *     mm: madvise: fix MADV_WILLNEED on shmem swapouts
  *
- *   Secondly we run madvise again, but only on the first
- *   PASS_THRESHOLD bytes to ensure these are entirely in RAM. Then we
- *   dirty these pages and check there were (almost) no page
- *   faults. Two faults are allowed incase some tasklet or something
- *   else unexpected, but irrelevant procedure, registers a fault to
- *   our process.
+ * Two checks are performed, the first looks at how SwapCache
+ * changes during madvise. When the pages are dirtied, about half
+ * will be accounted for under Cached and the other half will be
+ * moved into Swap. When madvise is run it will cause the pages
+ * under Cached to also be moved to Swap while rotating the pages
+ * already in Swap into SwapCached. So we expect that SwapCached has
+ * roughly MEM_LIMIT bytes added to it, but for reliability the
+ * PASS_THRESHOLD is much lower than that.
  *
+ * Secondly we run madvise again, but only on the first
+ * PASS_THRESHOLD bytes to ensure these are entirely in RAM. Then we
+ * dirty these pages and check there were (almost) no page
+ * faults. Two faults are allowed incase some tasklet or something
+ * else unexpected, but irrelevant procedure, registers a fault to
+ * our process.
  */
 
 #include <errno.h>
@@ -43,6 +42,7 @@
 #include <sys/mount.h>
 #include <sys/sysinfo.h>
 #include "tst_test.h"
+#include "tst_cgroup.h"
 
 #define CHUNK_SZ (400*1024*1024L)
 #define MEM_LIMIT (CHUNK_SZ / 2)
@@ -50,8 +50,7 @@
 #define PASS_THRESHOLD (CHUNK_SZ / 4)
 #define PASS_THRESHOLD_KB (PASS_THRESHOLD / 1024)
 
-#define MNT_NAME "memory"
-#define GROUP_NAME "madvise06"
+static const struct tst_cgroup_group *cg;
 
 static const char drop_caches_fname[] = "/proc/sys/vm/drop_caches";
 static int pg_sz, stat_refresh_sup;
@@ -64,17 +63,19 @@ static void check_path(const char *path)
 		tst_brk(TCONF, "file needed: %s", path);
 }
 
-#define READ_CGMEM(item)						\
-	({long tst_rval = 0;						\
-	  const char *cgpath = MNT_NAME"/"GROUP_NAME"/memory."item;	\
-	  if (!access(cgpath, R_OK))					\
-		  SAFE_FILE_LINES_SCANF(cgpath, "%ld", &tst_rval);	\
-	  tst_rval;})
+static void print_cgmem(const char *name)
+{
+	long ret;
+
+	if (!SAFE_CGROUP_HAS(cg, name))
+		return;
+
+	SAFE_CGROUP_SCANF(cg, name, "%ld", &ret);
+	tst_res(TINFO, "\t%s: %ld Kb", name, ret / 1024);
+}
 
 static void meminfo_diag(const char *point)
 {
-	long rval;
-
 	if (stat_refresh_sup)
 		SAFE_FILE_PRINTF("/proc/sys/vm/stat_refresh", "1");
 
@@ -85,16 +86,10 @@ static void meminfo_diag(const char *point)
 		SAFE_READ_MEMINFO("SwapCached:") - init_swap_cached);
 	tst_res(TINFO, "\tCached: %ld Kb",
 		SAFE_READ_MEMINFO("Cached:") - init_cached);
-	tst_res(TINFO, "\tcgmem.usage_in_bytes: %ld Kb",
-		READ_CGMEM("usage_in_bytes") / 1024);
-
-	rval = READ_CGMEM("memsw.usage_in_bytes") / 1024;
-	if (rval)
-		tst_res(TINFO, "\tcgmem.memsw.usage_in_bytes: %ld Kb", rval);
 
-	rval = READ_CGMEM("kmem.usage_in_bytes") / 1024;
-	if (rval)
-		tst_res(TINFO, "\tcgmem.kmem.usage_in_bytes: %ld Kb", rval);
+	print_cgmem("memory.current");
+	print_cgmem("memory.swap.current");
+	print_cgmem("memory.kmem.usage_in_bytes");
 }
 
 static void setup(void)
@@ -117,28 +112,24 @@ static void setup(void)
 			2 * CHUNK_SZ);
 	}
 
-	SAFE_MKDIR(MNT_NAME, 0700);
-	if (mount("memory", MNT_NAME, "cgroup", 0, "memory") == -1) {
-		if (errno == ENODEV || errno == ENOENT)
-			tst_brk(TCONF, "memory cgroup needed");
-	}
-	SAFE_MKDIR(MNT_NAME"/"GROUP_NAME, 0700);
-
 	check_path("/proc/self/oom_score_adj");
-	check_path(MNT_NAME"/"GROUP_NAME"/memory.limit_in_bytes");
-	check_path(MNT_NAME"/"GROUP_NAME"/memory.swappiness");
-	check_path(MNT_NAME"/"GROUP_NAME"/tasks");
-
 	SAFE_FILE_PRINTF("/proc/self/oom_score_adj", "%d", -1000);
-	SAFE_FILE_PRINTF(MNT_NAME"/"GROUP_NAME"/memory.limit_in_bytes", "%ld\n",
-			 MEM_LIMIT);
 
-	if (!access(MNT_NAME"/"GROUP_NAME"/memory.memsw.limit_in_bytes", W_OK)) {
-		SAFE_FILE_PRINTF(MNT_NAME"/"GROUP_NAME"/memory.memsw.limit_in_bytes",
-				 "%ld\n", MEMSW_LIMIT);
+	tst_cgroup_require("memory", NULL);
+	cg = tst_cgroup_get_test_group();
+
+	SAFE_CGROUP_PRINTF(cg, "memory.max", "%ld", MEM_LIMIT);
+	if (SAFE_CGROUP_HAS(cg, "memory.swap.max"))
+		SAFE_CGROUP_PRINTF(cg, "memory.swap.max", "%ld", MEMSW_LIMIT);
+
+	if (SAFE_CGROUP_HAS(cg, "memory.swappiness")) {
+		SAFE_CGROUP_PRINT(cg, "memory.swappiness", "60");
+	} else {
+		check_path("/proc/sys/vm/swappiness");
+		SAFE_FILE_PRINTF("/proc/sys/vm/swappiness", "%d", 60);
 	}
-	SAFE_FILE_PRINTF(MNT_NAME"/"GROUP_NAME"/memory.swappiness", "60");
-	SAFE_FILE_PRINTF(MNT_NAME"/"GROUP_NAME"/tasks", "%d\n", getpid());
+
+	SAFE_CGROUP_PRINTF(cg, "cgroup.procs", "%d", getpid());
 
 	meminfo_diag("Initial meminfo, later values are relative to this (except memcg)");
 	init_swap = SAFE_READ_MEMINFO("SwapTotal:") - SAFE_READ_MEMINFO("SwapFree:");
@@ -154,11 +145,7 @@ static void setup(void)
 
 static void cleanup(void)
 {
-	if (!access(MNT_NAME"/tasks", F_OK)) {
-		SAFE_FILE_PRINTF(MNT_NAME"/tasks", "%d\n", getpid());
-		SAFE_RMDIR(MNT_NAME"/"GROUP_NAME);
-		SAFE_UMOUNT(MNT_NAME);
-	}
+	tst_cgroup_cleanup();
 }
 
 static void dirty_pages(char *ptr, long size)
@@ -248,6 +235,10 @@ static struct tst_test test = {
 	.min_kver = "3.10.0",
 	.needs_tmpdir = 1,
 	.needs_root = 1,
+	.save_restore = (const char * const[]) {
+		"?/proc/sys/vm/swappiness",
+		NULL
+	},
 	.tags = (const struct tst_tag[]) {
 		{"linux-git", "55231e5c898c"},
 		{"linux-git", "8de15e920dc8"},
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [LTP] [PATCH v6 3/7] Add new CGroups APIs
  2021-05-04 13:40 ` [LTP] [PATCH v6 3/7] Add new CGroups APIs Richard Palethorpe
@ 2021-05-06  7:56   ` Li Wang
  2021-05-06  8:06     ` Richard Palethorpe
  2021-05-06  8:44     ` Cyril Hrubis
  0 siblings, 2 replies; 11+ messages in thread
From: Li Wang @ 2021-05-06  7:56 UTC (permalink / raw)
  To: ltp

Hi Richard,

With my pleasure for patchset:
Reviewed-by: Li Wang <liwang@redhat.com>

Also, the below typos should be corrected (someone who merges these
can help modify).

--- a/lib/tst_cgroup.c
+++ b/lib/tst_cgroup.c
@@ -147,7 +147,7 @@ static const files_t cgroup_ctrl_files = {
        /* procs exists on V1, however it was read-only until kernel v3.0. */
        { "cgroup.procs", "tasks", 0 },
        { "cgroup.subtree_control", NULL, 0 },
-       { "cgroup.clone_children", "clone_children", 0 },
+       { "cgroup.clone_children", "cgroup.clone_children", 0 },
        { }
 };

@@ -158,7 +158,7 @@ static const files_t memory_ctrl_files = {
        { "memory.swap.current", "memory.memsw.usage_in_bytes", CTRL_MEMORY },
        { "memory.swap.max", "memory.memsw.limit_in_bytes", CTRL_MEMORY },
        { "memory.kmem.usage_in_bytes", "memory.kmem.usage_in_bytes",
CTRL_MEMORY },
-       { "memory.kmem.limit_in_bytes", "memory.kmem.usage_in_bytes",
CTRL_MEMORY },
+       { "memory.kmem.limit_in_bytes", "memory.kmem.limit_in_bytes",
CTRL_MEMORY },
        { }
 };


-- 
Regards,
Li Wang


^ permalink raw reply	[flat|nested] 11+ messages in thread

* [LTP] [PATCH v6 3/7] Add new CGroups APIs
  2021-05-06  7:56   ` Li Wang
@ 2021-05-06  8:06     ` Richard Palethorpe
  2021-05-06  8:44     ` Cyril Hrubis
  1 sibling, 0 replies; 11+ messages in thread
From: Richard Palethorpe @ 2021-05-06  8:06 UTC (permalink / raw)
  To: ltp

Hello Li,

Li Wang <liwang@redhat.com> writes:

> Hi Richard,
>
> With my pleasure for patchset:
> Reviewed-by: Li Wang <liwang@redhat.com>

Thanks for the reviews!

>
> Also, the below typos should be corrected (someone who merges these
> can help modify).
>
> --- a/lib/tst_cgroup.c
> +++ b/lib/tst_cgroup.c
> @@ -147,7 +147,7 @@ static const files_t cgroup_ctrl_files = {
>         /* procs exists on V1, however it was read-only until kernel v3.0. */
>         { "cgroup.procs", "tasks", 0 },
>         { "cgroup.subtree_control", NULL, 0 },
> -       { "cgroup.clone_children", "clone_children", 0 },
> +       { "cgroup.clone_children", "cgroup.clone_children", 0 },

+1

>         { }
>  };
>
> @@ -158,7 +158,7 @@ static const files_t memory_ctrl_files = {
>         { "memory.swap.current", "memory.memsw.usage_in_bytes", CTRL_MEMORY },
>         { "memory.swap.max", "memory.memsw.limit_in_bytes", CTRL_MEMORY },
>         { "memory.kmem.usage_in_bytes", "memory.kmem.usage_in_bytes",
> CTRL_MEMORY },
> -       { "memory.kmem.limit_in_bytes", "memory.kmem.usage_in_bytes",
> CTRL_MEMORY },
> +       { "memory.kmem.limit_in_bytes", "memory.kmem.limit_in_bytes",
> CTRL_MEMORY },
>         { }
>  };


-- 
Thank you,
Richard.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [LTP] [PATCH v6 3/7] Add new CGroups APIs
  2021-05-06  7:56   ` Li Wang
  2021-05-06  8:06     ` Richard Palethorpe
@ 2021-05-06  8:44     ` Cyril Hrubis
  1 sibling, 0 replies; 11+ messages in thread
From: Cyril Hrubis @ 2021-05-06  8:44 UTC (permalink / raw)
  To: ltp

Hi!
> With my pleasure for patchset:
> Reviewed-by: Li Wang <liwang@redhat.com>
>
> Also, the below typos should be corrected (someone who merges these
> can help modify).

Thanks for the review, fixed and pushed.

-- 
Cyril Hrubis
chrubis@suse.cz

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2021-05-06  8:44 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-05-04 13:40 [LTP] [PATCH v6 0/7] CGroup API rewrite Richard Palethorpe
2021-05-04 13:40 ` [LTP] [PATCH v6 1/7] API: Add safe openat, printfat, readat and unlinkat Richard Palethorpe
2021-05-04 13:40 ` [LTP] [PATCH v6 2/7] API: Make tst_count_scanf_conversions public Richard Palethorpe
2021-05-04 13:40 ` [LTP] [PATCH v6 3/7] Add new CGroups APIs Richard Palethorpe
2021-05-06  7:56   ` Li Wang
2021-05-06  8:06     ` Richard Palethorpe
2021-05-06  8:44     ` Cyril Hrubis
2021-05-04 13:40 ` [LTP] [PATCH v6 4/7] Add new CGroups API library tests Richard Palethorpe
2021-05-04 13:40 ` [LTP] [PATCH v6 5/7] docs: Update CGroups API Richard Palethorpe
2021-05-04 13:40 ` [LTP] [PATCH v6 6/7] mem: Convert tests to new " Richard Palethorpe
2021-05-04 13:41 ` [LTP] [PATCH v6 7/7] madvise06: Convert " Richard Palethorpe

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.