All of lore.kernel.org
 help / color / mirror / Atom feed
* [LTP] [PATCH v3 0/7] CGroup API rewrite
@ 2021-04-12 14:54 Richard Palethorpe
  2021-04-12 14:55 ` [LTP] [PATCH v3 1/7] API: Add safe openat, printfat, readat and unlinkat Richard Palethorpe
                   ` (6 more replies)
  0 siblings, 7 replies; 22+ messages in thread
From: Richard Palethorpe @ 2021-04-12 14:54 UTC (permalink / raw)
  To: ltp

Hello,

This is a complete rewrite of the CGroups API. It has big changes even
compared to V2 of the patchset. To understand why this is so
complicated, please see the commments in tst_cgroup.h and
tst_cgroup.c.

V3:

* Replaced the object API with a string based lookup.

* Replaced tst_cgroup_css struct and name mapping functions with an
  item info tree.

* Merged the header files again as there is no longer much seperation
  between the core and item parts of the library.

* Rename some variables and functions to make them more consistent.

V2:

* Created an object (Item) API which looks a bit like the unified V2
  hierarchy. The implementation is quite verbose, but not complicated
  IMO.

* Add the ability to extend the LTP CGroup hierarchy with child
  groups. We already have a reproducer that requires such a hierarchy,
  but I have not had chance to turn it into a test case yet.

* Add documentation for the new API in test-writing-guidelines.txt.

* Convert madvise06 to the CGroups API

* Better error reporting for the *at functions. Add tst_decode_fd
  which tries to print the path an FD was opened with.

TODO/NOTES:

* There are other tests which mount CGroups in an ad-hoc way and need
  to be converted to the new API. This at least includes memcg_test_3
  and maybe cgroup_xattr.

Richard Palethorpe (7):
  API: Add safe openat, printfat, readat and unlinkat
  API: Add macro for the container_of trick
  Add new CGroups APIs
  Add new CGroups API library tests
  docs: Update CGroups API
  mem: Convert tests to new CGroups API
  madvise06: Convert to new CGroups API

 doc/test-writing-guidelines.txt               |  175 ++-
 include/tst_cgroup.h                          |  194 ++-
 include/tst_common.h                          |    5 +
 include/tst_safe_file_ops.h                   |   39 +
 include/tst_test.h                            |    1 -
 lib/Makefile                                  |    2 +
 lib/newlib_tests/.gitignore                   |    2 +
 lib/newlib_tests/test21.c                     |   45 +-
 lib/newlib_tests/tst_cgroup01.c               |   51 +
 lib/newlib_tests/tst_cgroup02.c               |   90 ++
 lib/tst_cgroup.c                              | 1205 ++++++++++++-----
 lib/tst_safe_file_ops.c                       |  171 +++
 testcases/kernel/mem/cpuset/cpuset01.c        |   34 +-
 testcases/kernel/mem/include/mem.h            |    2 +-
 testcases/kernel/mem/ksm/ksm02.c              |   13 +-
 testcases/kernel/mem/ksm/ksm03.c              |   12 +-
 testcases/kernel/mem/ksm/ksm04.c              |   17 +-
 testcases/kernel/mem/lib/mem.c                |   10 +-
 testcases/kernel/mem/oom/oom03.c              |   18 +-
 testcases/kernel/mem/oom/oom04.c              |   19 +-
 testcases/kernel/mem/oom/oom05.c              |   32 +-
 testcases/kernel/syscalls/madvise/madvise06.c |   82 +-
 22 files changed, 1720 insertions(+), 499 deletions(-)
 create mode 100644 lib/newlib_tests/tst_cgroup01.c
 create mode 100644 lib/newlib_tests/tst_cgroup02.c
 create mode 100644 lib/tst_safe_file_ops.c

-- 
2.30.2


^ permalink raw reply	[flat|nested] 22+ messages in thread

* [LTP] [PATCH v3 1/7] API: Add safe openat, printfat, readat and unlinkat
  2021-04-12 14:54 [LTP] [PATCH v3 0/7] CGroup API rewrite Richard Palethorpe
@ 2021-04-12 14:55 ` Richard Palethorpe
  2021-04-16  6:59   ` Li Wang
  2021-04-12 14:55 ` [LTP] [PATCH v3 2/7] API: Add macro for the container_of trick Richard Palethorpe
                   ` (5 subsequent siblings)
  6 siblings, 1 reply; 22+ messages in thread
From: Richard Palethorpe @ 2021-04-12 14:55 UTC (permalink / raw)
  To: ltp

Add 'at' variants for a number of system calls and LTP SAFE API
functions. This avoids using sprintf everywhere to build paths.

Also adds tst_decode_fd which allows us to retrieve the path for an FD
for debugging purposes without having to store it ourselves. However
the proc symlink may not be available on some systems.

Signed-off-by: Richard Palethorpe <rpalethorpe@suse.com>
---
 include/tst_safe_file_ops.h |  39 ++++++++
 lib/tst_safe_file_ops.c     | 171 ++++++++++++++++++++++++++++++++++++
 2 files changed, 210 insertions(+)
 create mode 100644 lib/tst_safe_file_ops.c

diff --git a/include/tst_safe_file_ops.h b/include/tst_safe_file_ops.h
index 223eddd1f..dff6a793c 100644
--- a/include/tst_safe_file_ops.h
+++ b/include/tst_safe_file_ops.h
@@ -57,4 +57,43 @@
 #define TST_MOUNT_OVERLAY() \
 	(mount_overlay(__FILE__, __LINE__, 0) == 0)
 
+#define SAFE_OPENAT(dirfd, path, oflags, ...)			\
+	safe_openat(__FILE__, __LINE__,				\
+		    (dirfd), (path), (oflags), ## __VA_ARGS__)
+
+#define SAFE_FILE_READAT(dirfd, path, buf, nbyte)			\
+	safe_file_readat(__FILE__, __LINE__,				\
+			 (dirfd), (path), (buf), (nbyte))
+
+
+#define SAFE_FILE_PRINTFAT(dirfd, path, fmt, ...)			\
+	safe_file_printfat(__FILE__, __LINE__,				\
+			   (dirfd), (path), (fmt), __VA_ARGS__)
+
+#define SAFE_UNLINKAT(dirfd, path, flags)				\
+	safe_unlinkat(__FILE__, __LINE__, (dirfd), (path), (flags))
+
+char *tst_decode_fd(int fd);
+
+int safe_openat(const char *file, const int lineno,
+		int dirfd, const char *path, int oflags, ...);
+
+ssize_t safe_file_readat(const char *file, const int lineno,
+			 int dirfd, const char *path, char *buf, size_t nbyte);
+
+int tst_file_vprintfat(int dirfd, const char *path, const char *fmt, va_list va);
+int tst_file_printfat(int dirfd, const char *path, const char *fmt, ...)
+			__attribute__ ((format (printf, 3, 4)));
+
+int safe_file_vprintfat(const char *file, const int lineno,
+			int dirfd, const char *path,
+			const char *fmt, va_list va);
+
+int safe_file_printfat(const char *file, const int lineno,
+		       int dirfd, const char *path, const char *fmt, ...)
+			__attribute__ ((format (printf, 5, 6)));
+
+int safe_unlinkat(const char *file, const int lineno,
+		  int dirfd, const char *path, int flags);
+
 #endif /* TST_SAFE_FILE_OPS */
diff --git a/lib/tst_safe_file_ops.c b/lib/tst_safe_file_ops.c
new file mode 100644
index 000000000..af4157476
--- /dev/null
+++ b/lib/tst_safe_file_ops.c
@@ -0,0 +1,171 @@
+#define _GNU_SOURCE
+#include <stdio.h>
+#include <stdlib.h>
+#include "lapi/fcntl.h"
+
+#define TST_NO_DEFAULT_MAIN
+#include "tst_test.h"
+
+char fd_path[PATH_MAX];
+
+char *tst_decode_fd(int fd)
+{
+	ssize_t ret;
+	char proc_path[32];
+
+	if (fd < 0)
+		return "!";
+
+	sprintf(proc_path, "/proc/self/fd/%d", fd);
+	ret = readlink(proc_path, fd_path, sizeof(fd_path));
+
+	if (ret < 0)
+		return "?";
+
+	fd_path[ret] = '\0';
+
+	return fd_path;
+}
+
+int safe_openat(const char *file, const int lineno,
+		int dirfd, const char *path, int oflags, ...)
+{
+	va_list ap;
+	int fd;
+	mode_t mode;
+
+	va_start(ap, oflags);
+	mode = va_arg(ap, int);
+	va_end(ap);
+
+	fd = openat(dirfd, path, oflags, mode);
+	if (fd > -1)
+		return fd;
+
+	tst_brk_(file, lineno, TBROK | TERRNO,
+		 "openat(%d<%s>, '%s', %o, %o)",
+		 dirfd, tst_decode_fd(dirfd), path, oflags, mode);
+
+	return fd;
+}
+
+ssize_t safe_file_readat(const char *file, const int lineno,
+			 int dirfd, const char *path, char *buf, size_t nbyte)
+{
+	int fd = safe_openat(file, lineno, dirfd, path, O_RDONLY);
+	ssize_t rval;
+
+	if (fd < 0)
+		return -1;
+
+	rval = safe_read(file, lineno, NULL, 0, fd, buf, nbyte - 1);
+	if (rval < 0)
+		return -1;
+
+	close(fd);
+	buf[rval] = '\0';
+
+	if (rval >= (ssize_t)nbyte - 1) {
+		tst_brk_(file, lineno, TBROK,
+			"Buffer length %zu too small to read %d<%s>/%s",
+			nbyte, dirfd, tst_decode_fd(dirfd), path);
+	}
+
+	return rval;
+}
+
+int tst_file_vprintfat(int dirfd, const char *path, const char *fmt, va_list va)
+{
+	int fd = openat(dirfd, path, O_WRONLY);
+
+	if (fd < 0)
+		return -1;
+
+	TEST(vdprintf(fd, fmt, va));
+	close(fd);
+
+	if (TST_RET < 0) {
+		errno = TST_ERR;
+		return -2;
+	}
+
+	return TST_RET;
+}
+
+int tst_file_printfat(int dirfd, const char *path, const char *fmt, ...)
+{
+	va_list va;
+	int rval;
+
+	va_start(va, fmt);
+	rval = tst_file_vprintfat(dirfd, path, fmt, va);
+	va_end(va);
+
+	return rval;
+}
+
+int safe_file_vprintfat(const char *file, const int lineno,
+			int dirfd, const char *path,
+			const char *fmt, va_list va)
+{
+	char buf[16];
+	va_list vac;
+	int rval;
+
+	va_copy(vac, va);
+
+	TEST(tst_file_vprintfat(dirfd, path, fmt, va));
+
+	if (TST_RET == -2) {
+		rval = vsnprintf(buf, sizeof(buf), fmt, vac);
+		va_end(vac);
+
+		if (rval >= (ssize_t)sizeof(buf))
+			strcpy(buf + sizeof(buf) - 5, "...");
+
+		tst_brk_(file, lineno, TBROK | TTERRNO,
+			 "vdprintf(%d<%s>, '%s', '%s'<%s>)",
+			 dirfd, tst_decode_fd(dirfd), path, fmt,
+			 rval > 0 ? buf : "???");
+		return -1;
+	}
+
+	va_end(vac);
+
+	if (TST_RET == -1) {
+		tst_brk_(file, lineno, TBROK | TTERRNO,
+			"openat(%d<%s>, '%s', O_WRONLY)",
+			dirfd, tst_decode_fd(dirfd), path);
+	}
+
+	return TST_RET;
+}
+
+int safe_file_printfat(const char *file, const int lineno,
+		       int dirfd, const char *path,
+		       const char *fmt, ...)
+{
+	va_list va;
+	int rval;
+
+	va_start(va, fmt);
+	rval = safe_file_vprintfat(file, lineno, dirfd, path, fmt, va);
+	va_end(va);
+
+	return rval;
+}
+
+int safe_unlinkat(const char *file, const int lineno,
+		  int dirfd, const char *path, int flags)
+{
+	int rval = unlinkat(dirfd, path, flags);
+
+	if (rval > -1)
+		return rval;
+
+	tst_brk_(file, lineno, TBROK | TERRNO,
+		 "unlinkat(%d<%s>, '%s', %s)", dirfd, tst_decode_fd(dirfd), path,
+		 flags == AT_REMOVEDIR ? "AT_REMOVEDIR" : (flags ? "?" : "0"));
+
+	return rval;
+}
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [LTP] [PATCH v3 2/7] API: Add macro for the container_of trick
  2021-04-12 14:54 [LTP] [PATCH v3 0/7] CGroup API rewrite Richard Palethorpe
  2021-04-12 14:55 ` [LTP] [PATCH v3 1/7] API: Add safe openat, printfat, readat and unlinkat Richard Palethorpe
@ 2021-04-12 14:55 ` Richard Palethorpe
  2021-04-16  7:01   ` Li Wang
  2021-04-12 14:55 ` [LTP] [PATCH v3 3/7] Add new CGroups APIs Richard Palethorpe
                   ` (4 subsequent siblings)
  6 siblings, 1 reply; 22+ messages in thread
From: Richard Palethorpe @ 2021-04-12 14:55 UTC (permalink / raw)
  To: ltp

Signed-off-by: Richard Palethorpe <rpalethorpe@suse.com>
---
 include/tst_common.h | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/include/tst_common.h b/include/tst_common.h
index fd7a900d4..317925d1d 100644
--- a/include/tst_common.h
+++ b/include/tst_common.h
@@ -83,4 +83,9 @@
 #define TST_RES_SUPPORTS_TCONF_TFAIL_TINFO_TPASS_TWARN(condition) \
 	TST_BUILD_BUG_ON(condition)
 
+#define tst_container_of(ptr, type, member) ({		     \
+	const typeof( ((type *)0)->member ) *__mptr = (ptr); \
+	(type *)( (char *)__mptr - offsetof(type,member) );  \
+})
+
 #endif /* TST_COMMON_H__ */
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [LTP] [PATCH v3 3/7] Add new CGroups APIs
  2021-04-12 14:54 [LTP] [PATCH v3 0/7] CGroup API rewrite Richard Palethorpe
  2021-04-12 14:55 ` [LTP] [PATCH v3 1/7] API: Add safe openat, printfat, readat and unlinkat Richard Palethorpe
  2021-04-12 14:55 ` [LTP] [PATCH v3 2/7] API: Add macro for the container_of trick Richard Palethorpe
@ 2021-04-12 14:55 ` Richard Palethorpe
  2021-04-14 15:39   ` Cyril Hrubis
  2021-04-16  6:57   ` Li Wang
  2021-04-12 14:55 ` [LTP] [PATCH v3 4/7] Add new CGroups API library tests Richard Palethorpe
                   ` (3 subsequent siblings)
  6 siblings, 2 replies; 22+ messages in thread
From: Richard Palethorpe @ 2021-04-12 14:55 UTC (permalink / raw)
  To: ltp

Complete rewrite of the CGroups API which provides two layers of
indirection between the test author and the SUT's CGroup
configuration.

Signed-off-by: Richard Palethorpe <rpalethorpe@suse.com>
---
 include/tst_cgroup.h |  194 ++++++-
 include/tst_test.h   |    1 -
 lib/Makefile         |    2 +
 lib/tst_cgroup.c     | 1205 +++++++++++++++++++++++++++++++-----------
 4 files changed, 1056 insertions(+), 346 deletions(-)

diff --git a/include/tst_cgroup.h b/include/tst_cgroup.h
index bfd848260..d6842d641 100644
--- a/include/tst_cgroup.h
+++ b/include/tst_cgroup.h
@@ -2,46 +2,194 @@
 /*
  * Copyright (c) 2020 Red Hat, Inc.
  * Copyright (c) 2020 Li Wang <liwang@redhat.com>
+ * Copyright (c) 2020-2021 SUSE LLC <rpalethorpe@suse.com>
  */
+/*\
+ * [DESCRIPTION]
+ *
+ * The LTP CGroups API tries to present a consistent interface to the
+ * many possible CGroup configurations a system could have.
+ *
+ * You may ask; "Why don't you just mount a simple CGroup hierarchy,
+ * instead of scanning the current setup?". The short answer is that
+ * it is not possible unless no CGroups are currently active and
+ * almost all of our users will have CGroups active. Even if
+ * unmounting the current CGroup hierarchy is a reasonable thing to do
+ * to the sytem manager, it is highly unlikely the CGroup hierarchy
+ * will be destroyed. So users would be forced to remove their CGroup
+ * configuration and reboot the system.
+ *
+ * The core library tries to ensure an LTP CGroup exists on each
+ * hierarchy root. Inside the LTP group it ensures a 'drain' group
+ * exists and creats a test group for the current test. In the worst
+ * case we end up with a set of hierarchies like the follwoing. Where
+ * existing system-manager-created CGroups have been omitted.
+ *
+ * 	(V2 Root)	(V1 Root 1)	...	(V1 Root N)
+ * 	    |		     |			     |
+ *	  (ltp)		   (ltp)	...	   (ltp)
+ *	 /     \	  /	\		  /	\
+ *  (drain) (test-n) (drain)  (test-n)  ...  (drain)  (test-n)
+ *
+ * V2 CGroup controllers use a single unified hierarchy on a single
+ * root. Two or more V1 controllers may share a root or have their own
+ * root. However there may exist only one instance of a controller.
+ * So you can not have the same V1 controller on multiple roots.
+ *
+ * It is possible to have both a V2 hierarchy and V1 hierarchies
+ * active at the same time. Which is what is shown above. Any
+ * controllers attached to V1 hierarchies will not be available in the
+ * V2 hierarchy. The reverse is also true.
+ *
+ * Note that a single hierarchy may be mounted multiple
+ * times. Allowing it to be accessed at different locations. However
+ * subsequent mount operations will fail if the mount options are
+ * different from the first.
+ *
+ * The user may pre-create the CGroup hierarchies and the ltp CGroup,
+ * otherwise the library will try to create them. If the ltp group
+ * already exists and has appropriate permissions, then admin
+ * privileges will not be required to run the tests.
+ *
+ * Because the test may not have access to the CGroup root(s), the
+ * drain CGroup is created. This can be used to store processes which
+ * would otherwise block the destruction of the individual test CGroup
+ * or one of its descendants.
+ *
+ * The test author may create child CGroups within the test CGroup
+ * using the CGroup Item API. The library will create the new CGroup
+ * in all the relevant hierarchies.
+ *
+ * There are many differences between the V1 and V2 CGroup APIs. If a
+ * controller is on both V1 and V2, it may have different parameters
+ * and control files. Some of these control files have a different
+ * name, but similar functionality. In this case the Item API uses
+ * the V2 names and aliases them to the V1 name when appropriate.
+ *
+ * Some control files only exist on one of the versions or they can be
+ * missing due to other reasons. The Item API allows the user to check
+ * if the file exists before trying to use it.
+ *
+ * Often a control file has almost the same functionality between V1
+ * and V2. Which means it can be used in the same way most of the
+ * time, but not all. For now this is handled by exposing the API
+ * version a controller is using to allow the test author to handle
+ * edge cases. (e.g. V2 memory.swap.max accepts "max", but V1
+ * memory.memsw.limit_in_bytes does not).
+\*/
 
 #ifndef TST_CGROUP_H
 #define TST_CGROUP_H
 
-#define PATH_TMP_CG_MEM	"/tmp/cgroup_mem"
-#define PATH_TMP_CG_CST	"/tmp/cgroup_cst"
-
+/* CGroups Kernel API version */
 enum tst_cgroup_ver {
 	TST_CGROUP_V1 = 1,
 	TST_CGROUP_V2 = 2,
 };
 
-enum tst_cgroup_ctrl {
-	TST_CGROUP_MEMCG = 1,
+/* Controller sub-systems */
+enum tst_cgroup_css {
+	TST_CGROUP_MEMORY = 1,
 	TST_CGROUP_CPUSET = 2,
-	/* add cgroup controller */
 };
+#define TST_CGROUP_MAX TST_CGROUP_CPUSET
+
+/* At most we can have one cgroup V1 tree for each controller and one
+ * (empty) v2 tree.
+ */
+#define TST_CGROUP_MAX_TREES (TST_CGROUP_MAX + 1)
+
+
+/* Used to specify CGroup hierarchy configuration options, allowing a
+ * test to request a particular CGroup structure.
+ */
+struct tst_cgroup_opts {
+	/* Only try to mount V1 CGroup controllers. This will not
+	 * prevent V2 from being used if it is already mounted, it
+	 * only indicates that we should mount V1 controllers if
+	 * nothing is present. By default we try to mount V2 first. */
+	int only_mount_v1:1;
+};
+
+struct tst_cgroup_tree;
+
+
+/* A Control Group in LTP's aggregated hierarchy */
+struct tst_cgroup {
+	const char *name;
+	/* Maps controller ID to the tree which contains it. The V2
+	 * tree is at zero even if it contains no controllers.
+	 */
+	struct tst_cgroup_tree *trees_by_css[TST_CGROUP_MAX_TREES];
+	/* NULL terminated list of trees */
+	struct tst_cgroup_tree *trees[TST_CGROUP_MAX_TREES + 1];
+};
+
+/* Search the system for mounted cgroups and available
+ * controllers. Called automatically by tst_cgroup_require.
+ */
+void tst_cgroup_scan(void);
+/* Print the config detected by tst_cgroup_scan */
+void tst_cgroup_print_config(void);
+
+/* Ensure the specified controller is available in the test's default
+ * CGroup, mounting/enabling it if necessary */
+void tst_cgroup_require(enum tst_cgroup_css type,
+			const struct tst_cgroup_opts *options);
+
+/* Tear down any CGroups created by calls to tst_cgroup_require */
+void tst_cgroup_cleanup(void);
+
+/* Get the default CGroup for the test. It allocates memory (in a
+ * guarded buffer) so should always be called from setup
+ */
+const struct tst_cgroup *tst_cgroup_get_test(void);
+/* Get the shared drain group. Also should be called from setup */
+const struct tst_cgroup *tst_cgroup_get_drain(void);
+/* Create a descendant CGroup */
+struct tst_cgroup *tst_cgroup_mk(const struct tst_cgroup *parent,
+				 const char *name);
+/* Remove a descendant CGroup */
+struct tst_cgroup *tst_cgroup_rm(struct tst_cgroup *cg);
+
+#define SAFE_CGROUP_VER(cg, name) \
+	safe_cgroup_ver(__FILE__, __LINE__, (cg), (name))
+
+enum tst_cgroup_ver safe_cgroup_ver(const char *file, const int lineno,
+				    const struct tst_cgroup *cg,
+				    const char *name);
+
+#define SAFE_CGROUP_HAS(cg, name) \
+	safe_cgroup_has(__FILE__, __LINE__, (cg), (name))
+
+int safe_cgroup_has(const char *file, const int lineno,
+		    const struct tst_cgroup *cg, const char *name);
+
+#define SAFE_CGROUP_READ(cg, name, out, len)				\
+	safe_cgroup_read(__FILE__, __LINE__, (cg), (name), (out), (len))
+
+ssize_t safe_cgroup_read(const char *file, const int lineno,
+			 const struct tst_cgroup *cg, const char *name,
+			 char *out, size_t len);
 
-enum tst_cgroup_ver tst_cgroup_version(void);
+#define SAFE_CGROUP_PRINTF(cg, name, fmt, ...)				\
+	safe_cgroup_printf(__FILE__, __LINE__, (cg), (name), (fmt), __VA_ARGS__)
 
-/* To mount/umount specified cgroup controller on 'cgroup_dir' path */
-void tst_cgroup_mount(enum tst_cgroup_ctrl ctrl, const char *cgroup_dir);
-void tst_cgroup_umount(const char *cgroup_dir);
+#define SAFE_CGROUP_PRINT(cg, name, str)				\
+	safe_cgroup_printf(__FILE__, __LINE__, (cg), (name), "%s", (str))
 
-/* To move current process PID to the mounted cgroup tasks */
-void tst_cgroup_move_current(const char *cgroup_dir);
+void safe_cgroup_printf(const char *file, const int lineno,
+			const struct tst_cgroup *cg, const char *name,
+			const char *fmt, ...)
+			__attribute__ ((format (printf, 5, 6)));
 
-/* To set cgroup controller knob with new value */
-void tst_cgroup_set_knob(const char *cgroup_dir, const char *knob, long value);
+#define SAFE_CGROUP_SCANF(cg, name, fmt, ...)				\
+	safe_cgroup_scanf(__FILE__, __LINE__, (cg), (name), (fmt), __VA_ARGS__)
 
-/* Set of functions to set knobs under the memory controller */
-void tst_cgroup_mem_set_maxbytes(const char *cgroup_dir, long memsz);
-int  tst_cgroup_mem_swapacct_enabled(const char *cgroup_dir);
-void tst_cgroup_mem_set_maxswap(const char *cgroup_dir, long memsz);
+void safe_cgroup_scanf(const char *file, const int lineno,
+		       const struct tst_cgroup *cg, const char *name,
+		       const char *fmt, ...)
+		       __attribute__ ((format (scanf, 5, 6)));
 
-/* Set of functions to read/write cpuset controller files content */
-void tst_cgroup_cpuset_read_files(const char *cgroup_dir, const char *filename,
-	char *retbuf, size_t retbuf_sz);
-void tst_cgroup_cpuset_write_files(const char *cgroup_dir, const char *filename,
-	const char *buf);
 
 #endif /* TST_CGROUP_H */
diff --git a/include/tst_test.h b/include/tst_test.h
index 1fbebe752..62ab2981f 100644
--- a/include/tst_test.h
+++ b/include/tst_test.h
@@ -39,7 +39,6 @@
 #include "tst_capability.h"
 #include "tst_hugepage.h"
 #include "tst_assert.h"
-#include "tst_cgroup.h"
 #include "tst_lockdown.h"
 #include "tst_fips.h"
 #include "tst_taint.h"
diff --git a/lib/Makefile b/lib/Makefile
index f019432e8..6f641ee9a 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -38,6 +38,8 @@ pc_file			:= $(DESTDIR)/$(datarootdir)/pkgconfig/ltp.pc
 
 INSTALL_TARGETS		:= $(pc_file)
 
+tst_cgroup.o: CFLAGS += -Wno-missing-field-initializers
+
 $(pc_file):
 	test -d "$(@D)" || mkdir -p "$(@D)"
 	install -m $(INSTALL_MODE) "$(builddir)/$(@F)" "$@"
diff --git a/lib/tst_cgroup.c b/lib/tst_cgroup.c
index 96c9524d2..40c9a9bec 100644
--- a/lib/tst_cgroup.c
+++ b/lib/tst_cgroup.c
@@ -2,453 +2,1014 @@
 /*
  * Copyright (c) 2020 Red Hat, Inc.
  * Copyright (c) 2020 Li Wang <liwang@redhat.com>
+ * Copyright (c) 2020-2021 SUSE LLC <rpalethorpe@suse.com>
  */
 
 #define TST_NO_DEFAULT_MAIN
 
 #include <stdio.h>
+#include <stddef.h>
 #include <stdlib.h>
+#include <mntent.h>
 #include <sys/mount.h>
-#include <fcntl.h>
-#include <unistd.h>
 
 #include "tst_test.h"
-#include "tst_safe_macros.h"
-#include "tst_safe_stdio.h"
+#include "lapi/fcntl.h"
+#include "lapi/mount.h"
+#include "lapi/mkdirat.h"
 #include "tst_cgroup.h"
-#include "tst_device.h"
 
-static enum tst_cgroup_ver tst_cg_ver;
-static int clone_children;
+/* CGroup Core Implementation
+ *
+ * CGroup Item Implementation is towards the bottom.
+ */
 
-static int tst_cgroup_check(const char *cgroup)
-{
-	char line[PATH_MAX];
-	FILE *file;
-	int cg_check = 0;
+struct cgroup_root;
 
-	file = SAFE_FOPEN("/proc/filesystems", "r");
-	while (fgets(line, sizeof(line), file)) {
-		if (strstr(line, cgroup) != NULL) {
-			cg_check = 1;
-			break;
+/* A node in a single CGroup hierarchy. It exists mainly for
+ * convenience so that we do not have to traverse the CGroup structure
+ * for frequent operations.
+ *
+ * This is actually a single-linked list not a tree. We only need to
+ * traverse from leaf towards root.
+ */
+struct tst_cgroup_tree {
+	const char *name;
+	const struct tst_cgroup_tree *parent;
+
+	/* Shortcut to root */
+	const struct cgroup_root *root;
+
+	/* Subsystems (controllers) bit field. Only controllers which
+	 * were required and configured for this node are added to
+	 * this field. So it may be different from root->css_field.
+	 */
+	uint32_t css_field;
+
+	/* In general we avoid having sprintfs everywhere and use
+	 * openat, linkat, etc.
+	 */
+	int dir;
+
+	int we_created_it:1;
+};
+
+/* The root of a CGroup hierarchy/tree */
+struct cgroup_root {
+	enum tst_cgroup_ver ver;
+	/* A mount path */
+	char path[PATH_MAX/2];
+	/* Subsystems (controllers) bit field. Includes all
+	 * controllers found while scanningthis root.
+	 */
+	uint32_t css_field;
+
+	/* CGroup hierarchy: mnt -> ltp -> {drain, test -> ??? } We
+	 * keep a flat reference to ltp, drain and test for
+	 * convenience.
+	 */
+
+	/* Mount directory */
+	struct tst_cgroup_tree mnt;
+	/* LTP CGroup directory, contains drain and test dirs */
+	struct tst_cgroup_tree ltp;
+	/* Drain CGroup, see cgroup_cleanup */
+	struct tst_cgroup_tree drain;
+	/* CGroup for current test. Which may have children. */
+	struct tst_cgroup_tree test;
+
+	int we_mounted_it:1;
+	/* cpuset is in compatability mode */
+	int no_prefix:1;
+};
+
+/* Always use first item for unified hierarchy */
+struct cgroup_root roots[TST_CGROUP_MAX_TREES + 1];
+
+/* Describes some things that are part of a CGroup
+ *
+ * Usually trunk nodes are controllers and leaves are files exported
+ * by the controllers. Sometimes trunk nodes are components of a
+ * controller (e.g. memory.swap).
+ *
+ * The primary purpose of this is to map V2 names to V1
+ * names. Secondarily we can map name prefixes to controller IDs and
+ * figure out which hierarchy the item should be present on and
+ * whether the current configuration requires yet further work arounds
+ * (e.g. if cpuset is mounted in compatablity mode).
+ */
+struct cgroup_item {
+	/* Canonical name. Is the V2 name unless an item is V1 only */
+	const char *name;
+	/* V1 name or NULL if item is V2 only */
+	const char *name_v1;
+	/* Array of child items or NULL */
+	struct cgroup_item *inner;
+
+	/* The controller this item belongs to or zero for
+	 * 'cgroup.<item>'. Leaf items are statically initialised as
+	 * zero then set at runtime.
+	 */
+	enum tst_cgroup_css css_indx;
+
+	struct cgroup_root *root;
+
+	int we_require_it:1;
+};
+
+/* Lookup tree for item names. */
+typedef struct cgroup_item items_t[];
+static items_t items = {
+	[0] = { "cgroup", .inner = (items_t){
+			{ "cgroup.procs", "tasks" },
+			{ "cgroup.subtree_control" },
+			{ "cgroup.clone_children", "clone_children" },
+			{ NULL }
 		}
-	}
-	SAFE_FCLOSE(file);
+	},
+	[TST_CGROUP_MEMORY] = { "memory", .inner = (items_t){
+			{ "memory.current", "memory.usage_in_bytes" },
+			{ "memory.max", "memory.limit_in_bytes" },
+			{ "memory.swappiness", "memory.swappiness" },
+			{ "memory.swap.current", "memory.memsw.usage_in_bytes" },
+			{ "memory.swap.max", "memory.memsw.limit_in_bytes" },
+			{ "memory.kmem.usage_in_bytes", "memory.kmem.usage_in_bytes" },
+			{ "memory.kmem.limit_in_bytes", "memory.kmem.usage_in_bytes" },
+			{ NULL }
+		},
+	  .css_indx = TST_CGROUP_MEMORY
+	},
+	[TST_CGROUP_CPUSET] = { "cpuset", .inner = (items_t){
+			{ "cpuset.cpus", "cpuset.cpus" },
+			{ "cpuset.mems", "cpuset.mems" },
+			{ "cpuset.memory_migrate", "cpuset.memory_migrate" },
+			{ NULL }
+		},
+	  .css_indx = TST_CGROUP_CPUSET
+	},
+	{ NULL }
+};
 
-	return cg_check;
+static const struct tst_cgroup_opts default_opts = { 0 };
+
+/* We should probably allow these to be set in environment
+ * variables */
+static const char *ltp_cgroup_dir = "ltp";
+static const char *ltp_cgroup_drain_dir = "drain";
+static char test_cgroup_dir[PATH_MAX/4];
+static const char *ltp_mount_prefix = "/tmp/cgroup_";
+static const char *ltp_v2_mount = "unified";
+
+#define first_root				\
+	(roots[0].ver ? roots : roots + 1)
+#define for_each_root(r)			\
+	for ((r) = first_root; (r)->ver; (r)++)
+#define for_each_v1_root(r)			\
+	for ((r) = roots + 1; (r)->ver; (r)++)
+#define for_each_css(css)			\
+	for ((css) = items + 1; (css)->name; (css)++)
+
+/* Controller items may only be in a single tree. So when (ss) > 0
+ * we only loop once.
+ */
+#define for_each_tree(cg, css, t)					\
+	for ((t) = (css) ? (cg)->trees_by_css + (css) : (cg)->trees;	\
+	     *(t);							\
+	     (t) = (css) ? (cg)->trees + TST_CGROUP_MAX_TREES : (t) + 1)
+
+static int has_css(uint32_t css_field, enum tst_cgroup_css type)
+{
+	return !!(css_field & (1 << type));
 }
 
-enum tst_cgroup_ver tst_cgroup_version(void)
+static void add_css(uint32_t *css_field, const struct cgroup_item *css)
 {
-        enum tst_cgroup_ver cg_ver;
+	*css_field |= 1 << css->css_indx;
+}
 
-        if (tst_cgroup_check("cgroup2")) {
-                if (!tst_is_mounted("cgroup2") && tst_is_mounted("cgroup"))
-                        cg_ver = TST_CGROUP_V1;
-                else
-                        cg_ver = TST_CGROUP_V2;
+struct cgroup_root *tst_cgroup_root_get(void)
+{
+	return roots[0].ver ? roots : roots + 1;
+}
 
-                goto out;
-        }
+static int cgroup_v2_mounted(void)
+{
+	return !!roots[0].ver;
+}
+
+static int cgroup_v1_mounted(void)
+{
+	return !!roots[1].ver;
+}
 
-        if (tst_cgroup_check("cgroup"))
-                cg_ver = TST_CGROUP_V1;
+static int cgroup_mounted(void)
+{
+	return cgroup_v2_mounted() || cgroup_v1_mounted();
+}
 
-        if (!cg_ver)
-                tst_brk(TCONF, "Cgroup is not configured");
+static struct cgroup_item *cgroup_get_css(enum tst_cgroup_css type)
+{
+	return items + type;
+}
 
-out:
-        return cg_ver;
+static int cgroup_css_on_v2(const struct cgroup_item *css)
+{
+	return css->root && css->root->ver == TST_CGROUP_V2;
 }
 
-static void tst_cgroup1_mount(const char *name, const char *option,
-			const char *mnt_path, const char *new_path)
+int tst_cgroup_tree_mk(const struct tst_cgroup_tree *parent,
+		       const char *name,
+		       struct tst_cgroup_tree *new)
 {
-	char knob_path[PATH_MAX];
-	if (tst_is_mounted(mnt_path))
-		goto out;
+	char *dpath;
 
-	SAFE_MKDIR(mnt_path, 0777);
-	if (mount(name, mnt_path, "cgroup", 0, option) == -1) {
-		if (errno == ENODEV) {
-			if (rmdir(mnt_path) == -1)
-				tst_res(TWARN | TERRNO, "rmdir %s failed", mnt_path);
-			tst_brk(TCONF,
-				 "Cgroup v1 is not configured in kernel");
-		}
-		tst_brk(TBROK | TERRNO, "mount %s", mnt_path);
+	new->root = parent->root;
+	new->name = name;
+	new->parent = parent;
+	new->css_field = parent->css_field;
+	new->we_created_it = 0;
+
+	if (!mkdirat(parent->dir, name, 0777)) {
+		new->we_created_it = 1;
+		goto opendir;
 	}
 
-	/*
-	 * We should assign one or more memory nodes to cpuset.mems and
-	 * cpuset.cpus, otherwise, echo $$ > tasks gives ?ENOSPC: no space
-	 * left on device? when trying to use cpuset.
-	 *
-	 * Or, setting cgroup.clone_children to 1 can help in automatically
-	 * inheriting memory and node setting from parent cgroup when a
-	 * child cgroup is created.
-	 */
-	if (strcmp(option, "cpuset") == 0) {
-		sprintf(knob_path, "%s/cgroup.clone_children", mnt_path);
-		SAFE_FILE_SCANF(knob_path, "%d", &clone_children);
-		SAFE_FILE_PRINTF(knob_path, "%d", 1);
+	if (errno == EEXIST)
+		goto opendir;
+
+	dpath = tst_decode_fd(parent->dir);
+
+	if (errno == EACCES) {
+		tst_brk(TCONF | TERRNO,
+			"Lack permission to make '%s/%s'; premake it or run as root",
+			dpath, name);
+	} else {
+		tst_brk(TBROK | TERRNO,
+			"mkdirat(%d<%s>, '%s', 0777)", parent->dir, dpath, name);
 	}
-out:
-	SAFE_MKDIR(new_path, 0777);
 
-	tst_res(TINFO, "Cgroup(%s) v1 mount at %s success", option, mnt_path);
+	return -1;
+opendir:
+	new->dir = SAFE_OPENAT(parent->dir, name, O_PATH | O_DIRECTORY);
+
+	return 0;
 }
 
-static void tst_cgroup2_mount(const char *mnt_path, const char *new_path)
+void tst_cgroup_print_config(void)
 {
-	if (tst_is_mounted(mnt_path))
-		goto out;
+	struct cgroup_root *t;
+	struct cgroup_item *css;
 
-	SAFE_MKDIR(mnt_path, 0777);
-	if (mount("cgroup2", mnt_path, "cgroup2", 0, NULL) == -1) {
-		if (errno == ENODEV) {
-			if (rmdir(mnt_path) == -1)
-				tst_res(TWARN | TERRNO, "rmdir %s failed", mnt_path);
-			tst_brk(TCONF,
-				 "Cgroup v2 is not configured in kernel");
-		}
-		tst_brk(TBROK | TERRNO, "mount %s", mnt_path);
+	tst_res(TINFO, "Detected Controllers:");
+
+	for_each_css(css) {
+		t = css->root;
+
+		if (!t)
+			continue;
+
+		tst_res(TINFO, "\t%.10s %s @ %s:%s",
+			css->name,
+			t->no_prefix ? "[noprefix]" : "",
+			t->ver == TST_CGROUP_V1 ? "V1" : "V2",
+			t->path);
+	}
+}
+
+static const struct cgroup_item *cgroup_find_css(const char *name)
+{
+	struct cgroup_item *next = items;
+
+	while (next->name && strcmp(name, next->name))
+		next++;
+
+	if (!next->name)
+		next = NULL;
+
+	return next;
+}
+
+/* Determine if a mounted cgroup hierarchy (tree) is unique and record it if so.
+ *
+ * For CGroups V2 this is very simple as there is only one
+ * hierarchy. We just record which controllers are available and check
+ * if this matches what we read from any previous mount points.
+ *
+ * For V1 the set of controllers S is partitioned into sets {P_1, P_2,
+ * ..., P_n} with one or more controllers in each partion. Each
+ * partition P_n can be mounted multiple times, but the same
+ * controller can not appear in more than one partition. Usually each
+ * partition contains a single controller, but, for example, cpu and
+ * cpuacct are often mounted together in the same partiion.
+ *
+ * Each controller partition has its own hierarchy which we must track
+ * and update independently.
+ */
+static void cgroup_root_scan(const char *type, const char *path, char *opts)
+{
+	struct cgroup_root *t = roots;
+	const struct cgroup_item *css;
+	struct cgroup_item *ss;
+	uint32_t css_field = 0;
+	int no_prefix = 0;
+	char buf[BUFSIZ];
+	char *tok;
+	int dfd = SAFE_OPEN(path, O_PATH | O_DIRECTORY);
+
+	if (!strcmp(type, "cgroup"))
+		goto v1;
+
+	SAFE_FILE_READAT(dfd, "cgroup.controllers", buf, sizeof(buf));
+
+	for (tok = strtok(buf, " "); tok; tok = strtok(NULL, " ")) {
+		if ((css = cgroup_find_css(tok)))
+			add_css(&css_field, css);
+	}
+
+	if (t->ver && css_field == t->css_field)
+		goto discard;
+
+	if (t->css_field)
+		tst_brk(TBROK, "Available V2 controllers are changing between scans?");
+
+	t->ver = TST_CGROUP_V2;
+
+	goto backref;
+
+v1:
+	for (tok = strtok(opts, ","); tok; tok = strtok(NULL, ",")) {
+		if ((css = cgroup_find_css(tok)))
+			add_css(&css_field, css);
+
+		no_prefix |= !strcmp("noprefix", tok);
 	}
 
-out:
-	SAFE_MKDIR(new_path, 0777);
+	if (!css_field)
+		goto discard;
 
-	tst_res(TINFO, "Cgroup v2 mount at %s success", mnt_path);
+	for_each_v1_root(t) {
+		if (!(css_field & t->css_field))
+			continue;
+
+		if (css_field == t->css_field)
+			goto discard;
+
+		tst_brk(TBROK,
+			"The intersection of two distinct sets of mounted controllers should be null?"
+			"Check '%s' and '%s'", t->path, path);
+	}
+
+	if (t >= roots + TST_CGROUP_MAX_TREES) {
+		tst_brk(TBROK, "Unique controller mounts have exceeded our limit %d?",
+			TST_CGROUP_MAX_TREES);
+	}
+
+	t->ver = TST_CGROUP_V1;
+
+backref:
+	strcpy(t->path, path);
+	t->mnt.root = t;
+	t->mnt.name = t->path;
+	t->mnt.dir = dfd;
+	t->css_field = css_field;
+	t->no_prefix = no_prefix;
+
+	for_each_css(ss) {
+		if (has_css(t->css_field, ss->css_indx))
+			ss->root = t;
+	}
+
+	return;
+
+discard:
+	close(dfd);
 }
 
-static void tst_cgroupN_umount(const char *mnt_path, const char *new_path)
+void tst_cgroup_scan(void)
 {
-	FILE *fp;
-	int fd;
-	char s_new[BUFSIZ], s[BUFSIZ], value[BUFSIZ];
-	char knob_path[PATH_MAX];
+	struct mntent *mnt;
+	FILE *f = setmntent("/proc/self/mounts", "r");
+
+	if (!f)
+		tst_brk(TBROK | TERRNO, "Can't open /proc/self/mounts");
+
+	mnt = getmntent(f);
+	if (!mnt)
+		tst_brk(TBROK | TERRNO, "Can't read mounts or no mounts?");
 
-	if (!tst_is_mounted(mnt_path))
+	do {
+		if (strncmp(mnt->mnt_type, "cgroup", 6))
+			continue;
+
+		cgroup_root_scan(mnt->mnt_type, mnt->mnt_dir, mnt->mnt_opts);
+	} while ((mnt = getmntent(f)));
+}
+
+static void cgroup_mount_v2(void)
+{
+	char path[PATH_MAX];
+
+	sprintf(path, "%s%s", ltp_mount_prefix, ltp_v2_mount);
+
+	if (!mkdir(path, 0777)) {
+		roots[0].mnt.we_created_it = 1;
+		goto mount;
+	}
+
+	if (errno == EEXIST)
+		goto mount;
+
+	if (errno == EACCES) {
+		tst_res(TINFO | TERRNO,
+			"Lack permission to make %s, premake it or run as root",
+			path);
 		return;
+	}
 
-	/* Move all processes in task(v2: cgroup.procs) to its parent node. */
-	if (tst_cg_ver & TST_CGROUP_V1)
-		sprintf(s, "%s/tasks", mnt_path);
-	if (tst_cg_ver & TST_CGROUP_V2)
-		sprintf(s, "%s/cgroup.procs", mnt_path);
-
-	fd = open(s, O_WRONLY);
-	if (fd == -1)
-		tst_res(TWARN | TERRNO, "open %s", s);
-
-	if (tst_cg_ver & TST_CGROUP_V1)
-		snprintf(s_new, BUFSIZ, "%s/tasks", new_path);
-	if (tst_cg_ver & TST_CGROUP_V2)
-		snprintf(s_new, BUFSIZ, "%s/cgroup.procs", new_path);
-
-	fp = fopen(s_new, "r");
-	if (fp == NULL)
-		tst_res(TWARN | TERRNO, "fopen %s", s_new);
-	if ((fd != -1) && (fp != NULL)) {
-		while (fgets(value, BUFSIZ, fp) != NULL)
-			if (write(fd, value, strlen(value) - 1)
-			    != (ssize_t)strlen(value) - 1)
-				tst_res(TWARN | TERRNO, "write %s", s);
-	}
-	if (tst_cg_ver & TST_CGROUP_V1) {
-		sprintf(knob_path, "%s/cpuset.cpus", mnt_path);
-		if (!access(knob_path, F_OK)) {
-			sprintf(knob_path, "%s/cgroup.clone_children", mnt_path);
-			SAFE_FILE_PRINTF(knob_path, "%d", clone_children);
-		}
+	tst_brk(TBROK | TERRNO, "mkdir(%s, 0777)", path);
+
+mount:
+	if (!mount("cgroup2", path, "cgroup2", 0, NULL)) {
+		tst_res(TINFO, "Mounted V2 CGroups on %s", path);
+		tst_cgroup_scan();
+		roots[0].we_mounted_it = 1;
+		return;
 	}
-	if (fd != -1)
-		close(fd);
-	if (fp != NULL)
-		fclose(fp);
-	if (rmdir(new_path) == -1)
-		tst_res(TWARN | TERRNO, "rmdir %s", new_path);
-	if (umount(mnt_path) == -1)
-		tst_res(TWARN | TERRNO, "umount %s", mnt_path);
-	if (rmdir(mnt_path) == -1)
-		tst_res(TWARN | TERRNO, "rmdir %s", mnt_path);
-
-	if (tst_cg_ver & TST_CGROUP_V1)
-		tst_res(TINFO, "Cgroup v1 unmount success");
-	if (tst_cg_ver & TST_CGROUP_V2)
-		tst_res(TINFO, "Cgroup v2 unmount success");
-}
-
-struct tst_cgroup_path {
-	char *mnt_path;
-	char *new_path;
-	struct tst_cgroup_path *next;
-};
 
-static struct tst_cgroup_path *tst_cgroup_paths;
+	tst_res(TINFO | TERRNO, "Could not mount V2 CGroups on %s", path);
 
-static void tst_cgroup_set_path(const char *cgroup_dir)
+	if (roots[0].mnt.we_created_it) {
+		roots[0].mnt.we_created_it = 0;
+		SAFE_RMDIR(path);
+	}
+}
+
+static void cgroup_mount_v1(enum tst_cgroup_css type)
 {
-	char cgroup_new_dir[PATH_MAX];
-	struct tst_cgroup_path *tst_cgroup_path, *a;
+	struct cgroup_item *css = cgroup_get_css(type);
+	char path[PATH_MAX];
+	int made_dir = 0;
 
-	if (!cgroup_dir)
-		tst_brk(TBROK, "Invalid cgroup dir, plese check cgroup_dir");
+	sprintf(path, "%s%s", ltp_mount_prefix, css->name);
 
-	sprintf(cgroup_new_dir, "%s/ltp_%d", cgroup_dir, rand());
+	if (!mkdir(path, 0777)) {
+		made_dir = 1;
+		goto mount;
+	}
 
-	/* To store cgroup path in the 'path' list */
-	tst_cgroup_path = SAFE_MALLOC(sizeof(struct tst_cgroup_path));
-	tst_cgroup_path->mnt_path = SAFE_MALLOC(strlen(cgroup_dir) + 1);
-	tst_cgroup_path->new_path = SAFE_MALLOC(strlen(cgroup_new_dir) + 1);
-	tst_cgroup_path->next = NULL;
+	if (errno == EEXIST)
+		goto mount;
 
-	if (!tst_cgroup_paths) {
-		tst_cgroup_paths = tst_cgroup_path;
-	} else {
-		a = tst_cgroup_paths;
-		do {
-			if (!a->next) {
-				a->next = tst_cgroup_path;
-				break;
-			}
-			a = a->next;
-		} while (a);
+	if (errno == EACCES) {
+		tst_res(TINFO | TERRNO,
+			"Lack permission to make %s, premake it or run as root",
+			path);
+		return;
+	}
+
+	tst_brk(TBROK | TERRNO, "mkdir(%s, 0777)", path);
+
+mount:
+	if (mount(css->name, path, "cgroup", 0, css->name)) {
+		tst_res(TINFO | TERRNO,
+			"Could not mount V1 CGroup on %s", path);
+
+		if (made_dir)
+			SAFE_RMDIR(path);
+		return;
 	}
 
-	sprintf(tst_cgroup_path->mnt_path, "%s", cgroup_dir);
-	sprintf(tst_cgroup_path->new_path, "%s", cgroup_new_dir);
+	tst_res(TINFO, "Mounted V1 %s CGroup on %s", css->name, path);
+	tst_cgroup_scan();
+	css->root->we_mounted_it = 1;
+	css->root->mnt.we_created_it = made_dir;
+
+	if (type == TST_CGROUP_MEMORY) {
+		SAFE_FILE_PRINTFAT(css->root->mnt.dir,
+				   "memory.use_hierarchy", "%d", 1);
+	}
 }
 
-static char *tst_cgroup_get_path(const char *cgroup_dir)
+static void cgroup_copy_cpuset(const struct cgroup_root *t)
 {
-	struct tst_cgroup_path *a;
+	char buf[BUFSIZ];
+	int i;
+	const char *n0[] = {"mems", "cpus"};
+	const char *n1[] = {"cpuset.mems", "cpuset.cpus"};
+	const char **fname = t->no_prefix ? n0 : n1;
+
+	for (i = 0; i < 2; i++) {
+		SAFE_FILE_READAT(t->mnt.dir, fname[i], buf, sizeof(buf));
+		SAFE_FILE_PRINTFAT(t->ltp.dir, fname[i], "%s", buf);
+	}
+}
 
-	if (!tst_cgroup_paths)
-		return NULL;
+/* Ensure the specified controller is available.
+ *
+ * First we check if the specified controller has a known mount point,
+ * if not then we scan the system. If we find it then we goto ensuring
+ * the LTP group exists in the hierarchy the controller is using.
+ *
+ * If we can't find the controller, then we try to create it. First we
+ * check if the V2 hierarchy/tree is mounted. If it isn't then we try
+ * mounting it and look for the controller. If it is already mounted
+ * then we know the controller is not available on V2 on this system.
+ *
+ * If we can't mount V2 or the controller is not on V2, then we try
+ * mounting it on its own V1 tree.
+ *
+ * Once we have mounted the controller somehow, we create a hierarchy
+ * of cgroups. If we are on V2 we first need to enable the controller
+ * for all children of root. Then we create hierarchy described in
+ * tst_cgroup.h.
+ *
+ * If we are using V1 cpuset then we copy the available mems and cpus
+ * from root to the ltp group and set clone_children on the ltp group
+ * to distribute these settings to the test cgroups. This means the
+ * test author does not have to copy these settings before using the
+ * cpuset.
+ *
+ */
+void tst_cgroup_require(enum tst_cgroup_css type,
+			const struct tst_cgroup_opts *options)
+{
+	const char *const cgsc = "cgroup.subtree_control";
+	struct cgroup_item *css = cgroup_get_css(type);
+	struct cgroup_root *t;
 
-	a = tst_cgroup_paths;
+	if (!options)
+		options = &default_opts;
 
-	while (strcmp(a->mnt_path, cgroup_dir) != 0){
-		if (!a->next) {
-			tst_res(TINFO, "%s is not found", cgroup_dir);
-			return NULL;
-		}
-		a = a->next;
-	};
+	if (css->we_require_it)
+		tst_res(TWARN, "Duplicate tst_cgroup_require(%s, )", css->name);
+	css->we_require_it = 1;
+
+	if (css->root)
+		goto mkdirs;
+
+	tst_cgroup_scan();
+
+	if (css->root)
+		goto mkdirs;
 
-	return a->new_path;
+	if (!cgroup_v2_mounted() && !options->only_mount_v1)
+		cgroup_mount_v2();
+
+	if (css->root)
+		goto mkdirs;
+
+	cgroup_mount_v1(type);
+
+	if (!css->root) {
+		tst_brk(TCONF,
+			"'%s' controller required, but not available", css->name);
+	}
+
+mkdirs:
+	t = css->root;
+	add_css(&t->mnt.css_field, css);
+
+	if (cgroup_css_on_v2(css)) {
+		if (t->we_mounted_it)
+			SAFE_FILE_PRINTFAT(t->mnt.dir, cgsc, "+%s", css->name);
+		else
+			tst_file_printfat(t->mnt.dir, cgsc, "+%s", css->name);
+	}
+
+	if (!t->ltp.dir)
+		tst_cgroup_tree_mk(&t->mnt, ltp_cgroup_dir, &t->ltp);
+	else
+		t->ltp.css_field |= t->mnt.css_field;
+
+	if (cgroup_css_on_v2(css)) {
+		SAFE_FILE_PRINTFAT(t->ltp.dir, cgsc, "+%s", css->name);
+	} else {
+		SAFE_FILE_PRINTFAT(t->ltp.dir, "cgroup.clone_children",
+				   "%d", 1);
+
+		if (type == TST_CGROUP_CPUSET)
+			cgroup_copy_cpuset(t);
+	}
+
+	tst_cgroup_tree_mk(&t->ltp, ltp_cgroup_drain_dir, &t->drain);
+
+	sprintf(test_cgroup_dir, "test-%d", getpid());
+	tst_cgroup_tree_mk(&t->ltp, test_cgroup_dir, &t->test);
 }
 
-static void tst_cgroup_del_path(const char *cgroup_dir)
+static void cgroup_drain(enum tst_cgroup_ver ver, int source, int dest)
 {
-	struct tst_cgroup_path *a, *b;
+	char buf[BUFSIZ];
+	char *tok;
+	const char *fname = ver == TST_CGROUP_V1 ? "tasks" : "cgroup.procs";
+	int fd;
+	ssize_t ret;
 
-	if (!tst_cgroup_paths)
+	ret = SAFE_FILE_READAT(source, fname, buf, sizeof(buf));
+	if (ret < 0)
 		return;
 
-	a = b = tst_cgroup_paths;
+	fd = SAFE_OPENAT(dest, fname, O_WRONLY);
+	if (fd < 0)
+		return;
 
-	while (strcmp(b->mnt_path, cgroup_dir) != 0) {
-		if (!b->next) {
-			tst_res(TINFO, "%s is not found", cgroup_dir);
-			return;
-		}
-		a = b;
-		b = b->next;
-	};
+	for (tok = strtok(buf, "\n"); tok; tok = strtok(NULL, "\n")) {
+		ret = dprintf(fd, "%s", tok);
 
-	if (b == tst_cgroup_paths)
-		tst_cgroup_paths = b->next;
-	else
-		a->next = b->next;
+		if (ret < (ssize_t)strlen(tok))
+			tst_brk(TBROK | TERRNO, "Failed to drain %s", tok);
+	}
+	SAFE_CLOSE(fd);
+}
 
-	free(b->mnt_path);
-	free(b->new_path);
-	free(b);
+static void close_path_fds(struct cgroup_root *t)
+{
+	if (t->test.dir > 0)
+		SAFE_CLOSE(t->test.dir);
+	if (t->ltp.dir > 0)
+		SAFE_CLOSE(t->ltp.dir);
+	if (t->drain.dir > 0)
+		SAFE_CLOSE(t->drain.dir);
+	if (t->mnt.dir > 0)
+		SAFE_CLOSE(t->mnt.dir);
 }
 
-void tst_cgroup_mount(enum tst_cgroup_ctrl ctrl, const char *cgroup_dir)
+/* Maybe remove CGroups used during testing and clear our data
+ *
+ * This will never remove CGroups we did not create to allow tests to
+ * be run in parallel (see enum tst_cgroup_cleanup).
+ *
+ * Each test process is given its own unique CGroup. Unless we want to
+ * stress test the CGroup system. We should at least remove these
+ * unique test CGroups.
+ *
+ * We probably also want to remove the LTP parent CGroup, although
+ * this may have been created by the system manager or another test
+ * (see notes on parallel testing).
+ *
+ * On systems with no initial CGroup setup we may try to destroy the
+ * CGroup roots we mounted so that they can be recreated by another
+ * test. Note that successfully unmounting a CGroup root does not
+ * necessarily indicate that it was destroyed.
+ *
+ * The ltp/drain CGroup is required for cleaning up test CGroups when
+ * we can not move them to the root CGroup. CGroups can only be
+ * removed when they have no members and only leaf or root CGroups may
+ * have processes within them. As test processes create and destroy
+ * their own CGroups they must move themselves either to root or
+ * another leaf CGroup. So we move them to drain while destroying the
+ * unique test CGroup.
+ *
+ * If we have access to root and created the LTP CGroup we then move
+ * the test process to root and destroy the drain and LTP
+ * CGroups. Otherwise we just leave the test process to die in the
+ * drain, much like many a unwanted terrapin.
+ *
+ * Finally we clear any data we have collected on CGroups. This will
+ * happen regardless of whether anything was removed.
+ */
+void tst_cgroup_cleanup(void)
 {
-	char *cgroup_new_dir;
-	char knob_path[PATH_MAX];
+	struct cgroup_root *t;
+	struct cgroup_item *css;
 
-	tst_cg_ver = tst_cgroup_version();
+	if (!cgroup_mounted())
+		goto clear_data;
 
-	tst_cgroup_set_path(cgroup_dir);
-	cgroup_new_dir = tst_cgroup_get_path(cgroup_dir);
+	for_each_root(t) {
+		if (!t->test.name)
+			continue;
 
-	if (tst_cg_ver & TST_CGROUP_V1) {
-		switch(ctrl) {
-		case TST_CGROUP_MEMCG:
-			tst_cgroup1_mount("memcg", "memory", cgroup_dir, cgroup_new_dir);
-		break;
-		case TST_CGROUP_CPUSET:
-			tst_cgroup1_mount("cpusetcg", "cpuset", cgroup_dir, cgroup_new_dir);
-		break;
-		default:
-			tst_brk(TBROK, "Invalid cgroup controller: %d", ctrl);
-		}
+		cgroup_drain(t->ver, t->test.dir, t->drain.dir);
+		SAFE_UNLINKAT(t->ltp.dir, t->test.name, AT_REMOVEDIR);
 	}
 
-	if (tst_cg_ver & TST_CGROUP_V2) {
-		tst_cgroup2_mount(cgroup_dir, cgroup_new_dir);
+	for_each_root(t) {
+		if (!t->ltp.we_created_it)
+			continue;
 
-		switch(ctrl) {
-		case TST_CGROUP_MEMCG:
-			sprintf(knob_path, "%s/cgroup.subtree_control", cgroup_dir);
-			SAFE_FILE_PRINTF(knob_path, "%s", "+memory");
-		break;
-		case TST_CGROUP_CPUSET:
-			tst_brk(TCONF, "Cgroup v2 hasn't achieve cpuset subsystem");
-		break;
-		default:
-			tst_brk(TBROK, "Invalid cgroup controller: %d", ctrl);
-		}
+		cgroup_drain(t->ver, t->drain.dir, t->mnt.dir);
+
+		if (t->drain.name)
+			SAFE_UNLINKAT(t->ltp.dir, t->drain.name, AT_REMOVEDIR);
+
+		if (t->ltp.name)
+			SAFE_UNLINKAT(t->mnt.dir, t->ltp.name, AT_REMOVEDIR);
+	}
+
+	for_each_css(css) {
+		if (!cgroup_css_on_v2(css) || !css->root->we_mounted_it)
+			continue;
+
+		SAFE_FILE_PRINTFAT(css->root->mnt.dir, "cgroup.subtree_control",
+				   "-%s", css->name);
+	}
+
+	for_each_root(t) {
+		if (!t->we_mounted_it)
+			continue;
+
+		/* This probably does not result in the CGroup root
+		 * being destroyed */
+		if (umount2(t->path, MNT_DETACH))
+			continue;
+
+		SAFE_RMDIR(t->path);
+	}
+
+clear_data:
+	for_each_css(css) {
+		css->root = NULL;
+		css->we_require_it = 0;
 	}
+
+	for_each_root(t)
+		close_path_fds(t);
+
+	memset(roots, 0, sizeof(roots));
 }
 
-void tst_cgroup_umount(const char *cgroup_dir)
+static void cgroup_init(struct tst_cgroup *cg, const char *name)
 {
-	char *cgroup_new_dir;
+	memset(cg, 0, sizeof(*cg));
+	cg->name = name;
+}
+
+static void cgroup_add(struct tst_cgroup *cg,
+		       struct tst_cgroup_tree *tree)
+{
+	const struct cgroup_item *css;
+	int i;
+
+	if (tree->root->ver == TST_CGROUP_V2)
+		cg->trees_by_css[0] = tree;
 
-	cgroup_new_dir = tst_cgroup_get_path(cgroup_dir);
-	tst_cgroupN_umount(cgroup_dir, cgroup_new_dir);
-	tst_cgroup_del_path(cgroup_dir);
+	for_each_css(css) {
+		if (has_css(tree->css_field, css->css_indx))
+			cg->trees_by_css[css->css_indx] = tree;
+	}
+
+	for (i = 0; cg->trees[i]; i++);
+	cg->trees[i] = tree;
 }
 
-void tst_cgroup_set_knob(const char *cgroup_dir, const char *knob, long value)
+struct tst_cgroup *tst_cgroup_mk(const struct tst_cgroup *parent,
+				 const char *name)
 {
-	char *cgroup_new_dir;
-	char knob_path[PATH_MAX];
+	struct tst_cgroup *cg;
+	struct tst_cgroup_tree *const *t;
+	struct tst_cgroup_tree *nt;
 
-	cgroup_new_dir = tst_cgroup_get_path(cgroup_dir);
-	sprintf(knob_path, "%s/%s", cgroup_new_dir, knob);
-	SAFE_FILE_PRINTF(knob_path, "%ld", value);
+	cg = SAFE_MALLOC(sizeof(*cg));
+	cgroup_init(cg, name);
+
+	for_each_tree(parent, 0, t) {
+		nt = SAFE_MALLOC(sizeof(*nt));
+		tst_cgroup_tree_mk(*t, name, nt);
+		cgroup_add(cg, nt);
+	}
+
+	return cg;
 }
 
-void tst_cgroup_move_current(const char *cgroup_dir)
+struct tst_cgroup *tst_cgroup_rm(struct tst_cgroup *cg)
 {
-	if (tst_cg_ver & TST_CGROUP_V1)
-		tst_cgroup_set_knob(cgroup_dir, "tasks", getpid());
+	struct tst_cgroup_tree **t;
+
+	for_each_tree(cg, 0, t) {
+		close((*t)->dir);
+		SAFE_UNLINKAT((*t)->parent->dir, (*t)->name, AT_REMOVEDIR);
+		free(*t);
+	}
 
-	if (tst_cg_ver & TST_CGROUP_V2)
-		tst_cgroup_set_knob(cgroup_dir, "cgroup.procs", getpid());
+	free(cg);
+	return NULL;
 }
 
-void tst_cgroup_mem_set_maxbytes(const char *cgroup_dir, long memsz)
+/* Traverse the item tree to find an item. Also fixes up the indx field. */
+static const struct cgroup_item *
+cgroup_item_find(const char *file, const int lineno, const char *name)
 {
-	if (tst_cg_ver & TST_CGROUP_V1)
-		tst_cgroup_set_knob(cgroup_dir, "memory.limit_in_bytes", memsz);
+	struct cgroup_item *item;
+	const struct cgroup_item *css;
+	char buf[32];
+	const char *mem_name;
+	size_t len = MIN(sizeof(buf) - 1, strcspn(name, "."));
 
-	if (tst_cg_ver & TST_CGROUP_V2)
-		tst_cgroup_set_knob(cgroup_dir, "memory.max", memsz);
+	memcpy(buf, name, len);
+	buf[len] = '\0';
+
+	css = cgroup_find_css(buf);
+
+	if (!css) {
+		tst_brk_(file, lineno, TBROK,
+			 "Did not find controller '%s'\n", buf);
+		return NULL;
+	}
+
+	name += len + 1;
+
+	for (item = css->inner; item->name; item++) {
+		mem_name = item->name + len + 1;
+
+		if (!strcmp(name, mem_name))
+			break;
+	}
+
+	if (!item->name) {
+		tst_brk_(file, lineno, TBROK,
+			 "Did not find '%s' in '%s'\n", name, css->name);
+		return NULL;
+	}
+
+	item->css_indx = css->css_indx;
+
+	return item;
 }
 
-int tst_cgroup_mem_swapacct_enabled(const char *cgroup_dir)
+enum tst_cgroup_ver safe_cgroup_ver(const char *file, const int lineno,
+				    const struct tst_cgroup *cg,
+				    const char *name)
 {
-	char *cgroup_new_dir;
-	char knob_path[PATH_MAX];
+	const struct cgroup_item *const it = cgroup_find_css(name);
+	const struct tst_cgroup_tree *t;
+
+	if (!strcmp(name, "cgroup")) {
+		tst_brk_(file, lineno,
+			 TBROK,
+			 "cgroup may be present on both V1 and V2 hierarchies");
+		return 0;
+	}
 
-	cgroup_new_dir = tst_cgroup_get_path(cgroup_dir);
+	if (!it) {
+		tst_brk_(file, lineno,
+			 TBROK, "Unknown controller '%s'", name);
+		return 0;
+	}
 
-	if (tst_cg_ver & TST_CGROUP_V1) {
-		sprintf(knob_path, "%s/%s",
-				cgroup_new_dir, "/memory.memsw.limit_in_bytes");
+	t = cg->trees_by_css[it->css_indx];
 
-		if ((access(knob_path, F_OK) == -1)) {
-			if (errno == ENOENT)
-				tst_res(TCONF, "memcg swap accounting is disabled");
-			else
-				tst_brk(TBROK | TERRNO, "failed to access %s", knob_path);
-		} else {
-			return 1;
-		}
+	if (!t) {
+		tst_brk_(file, lineno,
+			 TBROK, "%s controller not attached to CGroup %s",
+			 name, cg->name);
+		return 0;
 	}
 
-	if (tst_cg_ver & TST_CGROUP_V2) {
-		sprintf(knob_path, "%s/%s",
-				cgroup_new_dir, "/memory.swap.max");
+	return t->root->ver;
+}
+
+static const char *cgroup_item_alias(const struct cgroup_item *const it,
+				     const struct tst_cgroup_tree *const tree)
+{
+	if (tree->root->ver != TST_CGROUP_V1)
+		return it->name;
+
+	if (it->css_indx == TST_CGROUP_CPUSET && tree->root->no_prefix)
+		return strchr(it->name_v1, '.') + 1;
+
+	return it->name_v1;
+}
+
+int safe_cgroup_has(const char *file, const int lineno,
+		    const struct tst_cgroup *cg, const char *name)
+{
+	const struct cgroup_item *const it =
+		cgroup_item_find(file, lineno, name);
+	struct tst_cgroup_tree *const *t;
+	const char *alias;
+
+	if (!it)
+		return 0;
+
+	for_each_tree(cg, it->css_indx, t) {
+		if (!(alias = cgroup_item_alias(it, *t)))
+		    continue;
 
-		if ((access(knob_path, F_OK) == -1)) {
-			if (errno == ENOENT)
-				tst_res(TCONF, "memcg swap accounting is disabled");
-			else
-				tst_brk(TBROK | TERRNO, "failed to access %s", knob_path);
-		} else {
+		if (!faccessat((*t)->dir, name, F_OK, 0))
 			return 1;
-		}
+
+		if (errno == ENOENT)
+			continue;
+
+		tst_brk_(file, lineno, TBROK | TERRNO,
+			 "faccessat(%d<%s>, %s, F_OK, 0)",
+			 (*t)->dir, tst_decode_fd((*t)->dir), alias);
 	}
 
 	return 0;
 }
 
-void tst_cgroup_mem_set_maxswap(const char *cgroup_dir, long memsz)
+static struct tst_cgroup *cgroup_from_roots(size_t tree_off)
 {
-	if (tst_cg_ver & TST_CGROUP_V1)
-		tst_cgroup_set_knob(cgroup_dir, "memory.memsw.limit_in_bytes", memsz);
+	struct cgroup_root *r;
+	struct tst_cgroup_tree *t;
+	struct tst_cgroup *cg;
 
-	if (tst_cg_ver & TST_CGROUP_V2)
-		tst_cgroup_set_knob(cgroup_dir, "memory.swap.max", memsz);
-}
+	cg = tst_alloc(sizeof(*cg));
+	cgroup_init(cg, NULL);
 
-void tst_cgroup_cpuset_read_files(const char *cgroup_dir, const char *filename,
-	char *retbuf, size_t retbuf_sz)
-{
-	int fd;
-	char *cgroup_new_dir;
-	char knob_path[PATH_MAX];
+	for_each_root(r) {
+		t = (typeof(t))(((char *)r) + tree_off);
 
-	cgroup_new_dir = tst_cgroup_get_path(cgroup_dir);
+		if (t->css_field)
+			cgroup_add(cg, t);
+	}
 
-	/*
-	 * try either '/dev/cpuset/XXXX' or '/dev/cpuset/cpuset.XXXX'
-	 * please see Documentation/cgroups/cpusets.txt from kernel src
-	 * for details
-	 */
-	sprintf(knob_path, "%s/%s", cgroup_new_dir, filename);
-	fd = open(knob_path, O_RDONLY);
-	if (fd == -1) {
-		if (errno == ENOENT) {
-			sprintf(knob_path, "%s/cpuset.%s",
-					cgroup_new_dir, filename);
-			fd = SAFE_OPEN(knob_path, O_RDONLY);
-		} else
-			tst_brk(TBROK | TERRNO, "open %s", knob_path);
+	if (cg->trees[0]) {
+		cg->name = cg->trees[0]->name;
+		return cg;
 	}
 
-	memset(retbuf, 0, retbuf_sz);
-	if (read(fd, retbuf, retbuf_sz) < 0)
-		tst_brk(TBROK | TERRNO, "read %s", knob_path);
+	tst_brk(TBROK,
+		"No CGroups found; maybe you forgot to call tst_cgroup_require?");
 
-	close(fd);
+	return cg;
 }
 
-void tst_cgroup_cpuset_write_files(const char *cgroup_dir, const char *filename, const char *buf)
+const struct tst_cgroup *tst_cgroup_get_test(void)
 {
-	int fd;
-	char *cgroup_new_dir;
-	char knob_path[PATH_MAX];
+	return cgroup_from_roots(offsetof(struct cgroup_root, test));
+}
 
-	cgroup_new_dir = tst_cgroup_get_path(cgroup_dir);
+const struct tst_cgroup *tst_cgroup_get_drain(void)
+{
+	return cgroup_from_roots(offsetof(struct cgroup_root, drain));
+}
 
-	/*
-	 * try either '/dev/cpuset/XXXX' or '/dev/cpuset/cpuset.XXXX'
-	 * please see Documentation/cgroups/cpusets.txt from kernel src
-	 * for details
-	 */
-	sprintf(knob_path, "%s/%s", cgroup_new_dir, filename);
-	fd = open(knob_path, O_WRONLY);
-	if (fd == -1) {
-		if (errno == ENOENT) {
-			sprintf(knob_path, "%s/cpuset.%s", cgroup_new_dir, filename);
-			fd = SAFE_OPEN(knob_path, O_WRONLY);
-		} else
-			tst_brk(TBROK | TERRNO, "open %s", knob_path);
+ssize_t safe_cgroup_read(const char *file, const int lineno,
+			 const struct tst_cgroup *cg, const char *name,
+			 char *out, size_t len)
+{
+	size_t prev_len = 0;
+	const struct cgroup_item *const it =
+		cgroup_item_find(file, lineno, name);
+	struct tst_cgroup_tree *const *t;
+	const char *alias;
+	char buf[BUFSIZ];
+
+	for_each_tree(cg, it->css_indx, t) {
+		if (!(alias = cgroup_item_alias(it, *t)))
+			continue;
+
+		if (prev_len)
+			memcpy(buf, out, prev_len);
+
+		TEST(safe_file_readat(file, lineno,
+				      (*t)->dir, alias, out, len));
+		if (TST_RET < 0)
+			continue;
+
+		if (prev_len && memcmp(out, buf, prev_len)) {
+			tst_brk_(file, lineno, TBROK,
+				 "%s has different value across roots",
+				 name);
+			break;
+		}
+
+		prev_len = MIN(sizeof(buf), (size_t)TST_RET);
+	}
+
+	out[MAX(TST_RET, 0)] = '\0';
+
+	return TST_RET;
+}
+
+void safe_cgroup_printf(const char *file, const int lineno,
+			const struct tst_cgroup *cg, const char *name,
+			const char *fmt, ...)
+{
+	const struct cgroup_item *const it =
+		cgroup_item_find(file, lineno, name);
+	struct tst_cgroup_tree *const *t;
+	const char *alias;
+	va_list va;
+
+	for_each_tree(cg, it->css_indx, t) {
+		if (!(alias = cgroup_item_alias(it, *t)))
+		    continue;
+
+		va_start(va, fmt);
+		safe_file_vprintfat(file, lineno, (*t)->dir, alias, fmt, va);
+		va_end(va);
 	}
+}
 
-	SAFE_WRITE(1, fd, buf, strlen(buf));
+void safe_cgroup_scanf(const char *file, const int lineno,
+		       const struct tst_cgroup *cg, const char *name,
+		       const char *fmt, ...)
+{
+	va_list va;
+	char buf[BUFSIZ];
+	ssize_t len = safe_cgroup_read(file, lineno, cg, name, buf, sizeof(buf));
 
-	close(fd);
+	if (len < 1)
+		return;
+
+	va_start(va, fmt);
+	if (vsscanf(buf, fmt, va) < 1) {
+		tst_brk_(file, lineno, TBROK | TERRNO,
+			 "'%s': vsscanf('%s', '%s', ...)", name, buf, fmt);
+	}
+	va_end(va);
 }
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [LTP] [PATCH v3 4/7] Add new CGroups API library tests
  2021-04-12 14:54 [LTP] [PATCH v3 0/7] CGroup API rewrite Richard Palethorpe
                   ` (2 preceding siblings ...)
  2021-04-12 14:55 ` [LTP] [PATCH v3 3/7] Add new CGroups APIs Richard Palethorpe
@ 2021-04-12 14:55 ` Richard Palethorpe
  2021-04-16  7:22   ` Li Wang
  2021-04-12 14:55 ` [LTP] [PATCH v3 5/7] docs: Update CGroups API Richard Palethorpe
                   ` (2 subsequent siblings)
  6 siblings, 1 reply; 22+ messages in thread
From: Richard Palethorpe @ 2021-04-12 14:55 UTC (permalink / raw)
  To: ltp

Signed-off-by: Richard Palethorpe <rpalethorpe@suse.com>
---
 lib/newlib_tests/.gitignore     |  2 +
 lib/newlib_tests/test21.c       | 45 ++++++++---------
 lib/newlib_tests/tst_cgroup01.c | 51 +++++++++++++++++++
 lib/newlib_tests/tst_cgroup02.c | 90 +++++++++++++++++++++++++++++++++
 4 files changed, 163 insertions(+), 25 deletions(-)
 create mode 100644 lib/newlib_tests/tst_cgroup01.c
 create mode 100644 lib/newlib_tests/tst_cgroup02.c

diff --git a/lib/newlib_tests/.gitignore b/lib/newlib_tests/.gitignore
index 6c2612259..98611e65d 100644
--- a/lib/newlib_tests/.gitignore
+++ b/lib/newlib_tests/.gitignore
@@ -16,6 +16,8 @@ test15
 test16
 tst_capability01
 tst_capability02
+tst_cgroup01
+tst_cgroup02
 tst_device
 tst_safe_fileops
 tst_res_hexd
diff --git a/lib/newlib_tests/test21.c b/lib/newlib_tests/test21.c
index f29a2f702..643cdfd5b 100644
--- a/lib/newlib_tests/test21.c
+++ b/lib/newlib_tests/test21.c
@@ -8,54 +8,49 @@
  * Tests tst_cgroup.h APIs
  */
 
+#include <stdlib.h>
+
 #include "tst_test.h"
 #include "tst_cgroup.h"
 
-#define PATH_CGROUP1 "/mnt/liwang1"
-#define PATH_CGROUP2 "/mnt/liwang2"
 #define MEMSIZE 1024 * 1024
 
+static const struct tst_cgroup *cg;
+
 static void do_test(void)
 {
 	pid_t pid = SAFE_FORK();
 
 	switch (pid) {
 	case 0:
-		tst_cgroup_move_current(PATH_CGROUP1);
-		tst_cgroup_mem_set_maxbytes(PATH_CGROUP1, MEMSIZE);
-		tst_cgroup_mem_set_maxswap(PATH_CGROUP1, MEMSIZE);
-
-		tst_cgroup_move_current(PATH_CGROUP2);
-
-	break;
+		SAFE_CGROUP_PRINTF(cg, "cgroup.procs", "%d", getpid());
+		SAFE_CGROUP_PRINTF(cg, "memory.max", "%d", MEMSIZE);
+		if (SAFE_CGROUP_HAS(cg, "memory.swap.max")) {
+			SAFE_CGROUP_PRINTF(cg, "memory.swap.max",
+					   "%d", MEMSIZE);
+		}
+		exit(0);
+		break;
 	default:
-		tst_cgroup_move_current(PATH_TMP_CG_CST);
-
-		tst_cgroup_move_current(PATH_TMP_CG_MEM);
-		tst_cgroup_mem_set_maxbytes(PATH_TMP_CG_MEM, MEMSIZE);
-		tst_cgroup_mem_set_maxswap(PATH_TMP_CG_MEM, MEMSIZE);
+		SAFE_CGROUP_PRINTF(cg, "cgroup.procs", "%d", getpid());
 	break;
 	}
 
-	tst_res(TPASS, "Cgroup mount test");
+	tst_reap_children();
+	tst_res(TPASS, "Cgroup test");
 }
 
 static void setup(void)
 {
-	tst_cgroup_mount(TST_CGROUP_MEMCG, PATH_TMP_CG_MEM);
-	tst_cgroup_mount(TST_CGROUP_MEMCG, PATH_CGROUP1);
-
-	tst_cgroup_mount(TST_CGROUP_CPUSET, PATH_TMP_CG_CST);
-	tst_cgroup_mount(TST_CGROUP_CPUSET, PATH_CGROUP2);
+	/* Omitting the below causes a warning */
+	/* tst_cgroup_require(TST_CGROUP_CPUSET, NULL); */
+	tst_cgroup_require(TST_CGROUP_MEMORY, NULL);
+	cg = tst_cgroup_get_test();
 }
 
 static void cleanup(void)
 {
-	tst_cgroup_umount(PATH_TMP_CG_MEM);
-	tst_cgroup_umount(PATH_CGROUP1);
-
-	tst_cgroup_umount(PATH_TMP_CG_CST);
-	tst_cgroup_umount(PATH_CGROUP2);
+	tst_cgroup_cleanup();
 }
 
 static struct tst_test test = {
diff --git a/lib/newlib_tests/tst_cgroup01.c b/lib/newlib_tests/tst_cgroup01.c
new file mode 100644
index 000000000..ec80c6d5c
--- /dev/null
+++ b/lib/newlib_tests/tst_cgroup01.c
@@ -0,0 +1,51 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+/* Copyright (c) 2021 SUSE LLC */
+
+#include <stdio.h>
+
+#include "tst_test.h"
+#include "tst_cgroup.h"
+
+static char *only_mount_v1;
+static char *no_cleanup;
+static struct tst_option opts[] = {
+	{"v", &only_mount_v1, "-v\tOnly try to mount CGroups V1"},
+	{"n", &no_cleanup, "-n\tLeave CGroups created by test"},
+	{NULL, NULL, NULL},
+};
+struct tst_cgroup_opts cgopts;
+
+static void do_test(void)
+{
+	tst_res(TPASS, "pass");
+}
+
+static void setup(void)
+{
+	cgopts.only_mount_v1 = !!only_mount_v1,
+
+	tst_cgroup_scan();
+	tst_cgroup_print_config();
+
+	tst_cgroup_require(TST_CGROUP_MEMORY, &cgopts);
+	tst_cgroup_print_config();
+	tst_cgroup_require(TST_CGROUP_CPUSET, &cgopts);
+	tst_cgroup_print_config();
+}
+
+static void cleanup(void)
+{
+	if (no_cleanup) {
+		tst_res(TINFO, "no cleanup");
+	} else {
+		tst_res(TINFO, "cleanup");
+		tst_cgroup_cleanup();
+	}
+}
+
+static struct tst_test test = {
+	.test_all = do_test,
+	.setup = setup,
+	.cleanup = cleanup,
+	.options = opts,
+};
diff --git a/lib/newlib_tests/tst_cgroup02.c b/lib/newlib_tests/tst_cgroup02.c
new file mode 100644
index 000000000..be5bea1d8
--- /dev/null
+++ b/lib/newlib_tests/tst_cgroup02.c
@@ -0,0 +1,90 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+/* Copyright (c) 2021 SUSE LLC */
+
+#include <stdlib.h>
+#include <stdio.h>
+
+#include "tst_test.h"
+#include "tst_cgroup.h"
+
+static char *only_mount_v1;
+static char *no_cleanup;
+static struct tst_option opts[] = {
+	{"v", &only_mount_v1, "-v\tOnly try to mount CGroups V1"},
+	{"n", &no_cleanup, "-n\tLeave CGroups created by test"},
+	{NULL, NULL, NULL},
+};
+static struct tst_cgroup_opts cgopts;
+static const struct tst_cgroup *cg;
+static const struct tst_cgroup *cg_drain;
+static struct tst_cgroup *cg_child;
+
+static void do_test(void)
+{
+	char buf[BUFSIZ];
+	size_t mem;
+
+	if (SAFE_CGROUP_VER(cg, "memory") != TST_CGROUP_V1)
+		SAFE_CGROUP_PRINT(cg, "cgroup.subtree_control", "+memory");
+	if (SAFE_CGROUP_VER(cg, "cpuset") != TST_CGROUP_V1)
+		SAFE_CGROUP_PRINT(cg, "cgroup.subtree_control", "+cpuset");
+
+	cg_child = tst_cgroup_mk(cg, "child");
+	if (!SAFE_FORK()) {
+		SAFE_CGROUP_PRINTF(cg_child, "cgroup.procs", "%d", getpid());
+
+		SAFE_CGROUP_SCANF(cg_child, "memory.current", "%zu", &mem);
+		tst_res(TPASS, "child/memory.current = %zu", mem);
+		SAFE_CGROUP_PRINTF(cg_child, "memory.max",
+				   "%zu", (1UL << 24) - 1);
+		SAFE_CGROUP_PRINTF(cg_child, "memory.swap.max",
+				   "%zu", 1UL << 31);
+
+		SAFE_CGROUP_READ(cg_child, "cpuset.mems", buf, sizeof(buf));
+		tst_res(TPASS, "child/cpuset.mems = %s", buf);
+		SAFE_CGROUP_PRINT(cg_child, "cpuset.mems", buf);
+
+		exit(0);
+	}
+
+	SAFE_CGROUP_PRINTF(cg, "memory.max", "%zu", (1UL << 24) - 1);
+	SAFE_CGROUP_PRINTF(cg_child, "cgroup.procs", "%d", getpid());
+	SAFE_CGROUP_SCANF(cg, "memory.current", "%zu", &mem);
+	tst_res(TPASS, "memory.current = %zu", mem);
+
+	tst_reap_children();
+	SAFE_CGROUP_PRINTF(cg_drain, "cgroup.procs", "%d", getpid());
+	cg_child = tst_cgroup_rm(cg_child);
+}
+
+static void setup(void)
+{
+	cgopts.only_mount_v1 = !!only_mount_v1,
+
+	tst_cgroup_scan();
+	tst_cgroup_print_config();
+
+	tst_cgroup_require(TST_CGROUP_MEMORY, &cgopts);
+	tst_cgroup_require(TST_CGROUP_CPUSET, &cgopts);
+
+	cg = tst_cgroup_get_test();
+	cg_drain = tst_cgroup_get_drain();
+}
+
+static void cleanup(void)
+{
+	if (cg_child) {
+		SAFE_CGROUP_PRINTF(cg_drain, "cgroup.procs", "%d", getpid());
+		cg_child = tst_cgroup_rm(cg_child);
+	}
+	if (!no_cleanup)
+		tst_cgroup_cleanup();
+}
+
+static struct tst_test test = {
+	.test_all = do_test,
+	.setup = setup,
+	.cleanup = cleanup,
+	.options = opts,
+	.forks_child = 1,
+};
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [LTP] [PATCH v3 5/7] docs: Update CGroups API
  2021-04-12 14:54 [LTP] [PATCH v3 0/7] CGroup API rewrite Richard Palethorpe
                   ` (3 preceding siblings ...)
  2021-04-12 14:55 ` [LTP] [PATCH v3 4/7] Add new CGroups API library tests Richard Palethorpe
@ 2021-04-12 14:55 ` Richard Palethorpe
  2021-04-16  8:11   ` Li Wang
  2021-04-12 14:55 ` [LTP] [PATCH v3 6/7] mem: Convert tests to new " Richard Palethorpe
  2021-04-12 14:55 ` [LTP] [PATCH v3 7/7] madvise06: Convert " Richard Palethorpe
  6 siblings, 1 reply; 22+ messages in thread
From: Richard Palethorpe @ 2021-04-12 14:55 UTC (permalink / raw)
  To: ltp

Signed-off-by: Richard Palethorpe <rpalethorpe@suse.com>
---
 doc/test-writing-guidelines.txt | 175 +++++++++++++++++++++++++++++---
 1 file changed, 162 insertions(+), 13 deletions(-)

diff --git a/doc/test-writing-guidelines.txt b/doc/test-writing-guidelines.txt
index 469df1184..9c94a282b 100644
--- a/doc/test-writing-guidelines.txt
+++ b/doc/test-writing-guidelines.txt
@@ -2167,48 +2167,197 @@ the field value of file.
 
 2.2.36 Using Control Group
 ^^^^^^^^^^^^^^^^^^^^^^^^^^
-Some of LTP tests need Control Group in their configuration, tst_cgroup.h provides
-APIs to make cgroup unified mounting at setup phase to be possible. The method?is
-extracted from mem.h with the purpose of?extendible for further developing, and
-trying to compatible the current two versions of cgroup.
 
-Considering there are many differences between cgroup v1 and v2, here we capsulate
-the detail of cgroup mounting in high-level functions, which will be easier to use
-cgroup without caring about too much technical thing.? ?
+Some LTP tests need specific Control Group configurations. tst_cgroup.h provides
+APIs to discover and use CGroups. There are many differences between CGroups API
+V1 and V2. We encapsulate the details of configuring CGroups in high-level
+functions which follow the V2 kernel API. Allowing one to use CGroups without
+caring too much about the current system's configuration.
 
-Also, we do cgroup mount/umount work for the different hierarchy automatically.
+Also, the LTP library will automatically mount/umount and configure the CGroup
+hierarchies if that is required (e.g. if you run the tests from init with no
+system manager).
 
 [source,c]
 -------------------------------------------------------------------------------
 #include "tst_test.h"
+#include "tst_cgroup.h"
+
+static const struct tst_cgroup *cg;
 
 static void run(void)
 {
 	...
+	// do test under cgroup
+	...
+}
+
+static void setup(void)
+{
+	tst_cgroup_require(TST_CGROUP_MEMORY, NULL);
+	cg = tst_cgroup_get_test();
+	SAFE_CGROUP_PRINTF(cg, "cgroup.procs", "%d", getpid());
+	SAFE_CGROUP_PRINTF(cg, "memory.max", "%lu", MEMSIZE);
+	if (SAFE_CGROUP_HAS(cg, "memory.swap.max"))
+		SAFE_CGROUP_PRINTF(cg, "memory.swap.max", "%zu", memsw);
+}
 
-	tst_cgroup_move_current(PATH_TMP_CG_MEM);
-	tst_cgroup_mem_set_maxbytes(PATH_TMP_CG_MEM, MEMSIZE);
+static void cleanup(void)
+{
+	tst_cgroup_cleanup();
+}
 
-	// do test under cgroup
+struct tst_test test = {
+	.setup = setup,
+	.test_all = run,
+	.cleanup = cleanup,
 	...
+};
+-------------------------------------------------------------------------------
+
+Above, we first ensure the memory controller is available on the
+test's CGroup with 'tst_cgroup_require'. We then get a structure,
+'cg', which represents the test's CGroup. Note that
+'tst_cgroup_get_test' should not be called many times, as it is
+allocated in a guarded buffer (See section 2.2.31). Therefor it is
+best to call it once in 'setup' and not 'run' because 'run' may be
+repeated with the '-i' option.
+
+We then write the current processes PID into 'cgroup.procs', which
+moves the current process into the test's CGroup. After which we set
+the maximum memory size by writing to 'memory.max'. If the memory
+controller is mounted on CGroups V1 then the library will actually
+write to 'memory.limit_in_bytes'. As a general rule, if a file exists
+on both CGroup versions, then we use the V2 naming.
+
+Some controller features, such as 'memory.swap', can be
+disabled. Therefor we need to check if they exist before accessing
+them. This can be done with 'TST_CGROUP_HAS' which can be called on
+any control file or feature.
+
+Most tests only require setting a few limits similar to the above. In
+such cases the differences between V1 and V2 are hidden. Setup and
+cleanup is also mostly hidden. However things can get much worse.
+
+[source,c]
+-------------------------------------------------------------------------------
+static const struct tst_cgroup *cg;
+static const struct tst_cgroup *cg_drain;
+static struct tst_cgroup *cg_child;
+
+static void run(void)
+{
+	char buf[BUFSIZ];
+	size_t mem = 0;
+
+	cg_child = tst_cgroup_mk(cg, "child");
+	SAFE_CGROUP_PRINTF(cg_child, "cgroup.procs", "%d", getpid());
+
+	if (SAFE_CGROUP_VER(cg, "memory") != TST_CGROUP_V1)
+		SAFE_CGROUP_PRINT(cg, "cgroup.subtree_control", "+memory");
+	if (SAFE_CGROUP_VER(cg, "cpuset") != TST_CGROUP_V1)
+		SAFE_CGROUP_PRINT(cg, "cgroup.subtree_control", "+cpuset");
+
+	if (!SAFE_FORK()) {
+		SAFE_CGROUP_PRINTF(cg_child, "cgroup.procs", "%d", getpid());
+
+		if (SAFE_CGROUP_HAS(cg_child, "memory.swap"))
+			SAFE_CGROUP_SCANF(cg_child, "memory.swap.current", "%zu", &mem);
+		SAFE_CGROUP_READ(cg_child, "cpuset.mems", buf, sizeof(buf));
+
+		// Do something with cpuset.mems and memory.current values
+		...
+
+		exit(0);
+	}
+
+	tst_reap_children();
+	SAFE_CGROUP_PRINTF(cg_drain, "cgroup.procs", "%d", getpid());
+	cg_child = tst_cgroup_rm(cg_child);
 }
 
 static void setup(void)
 {
-	tst_cgroup_mount(TST_CGROUP_MEMCG, PATH_TMP_CG_MEM);
+	tst_cgroup_require(TST_CGROUP_MEMORY, NULL);
+	tst_cgroup_require(TST_CGROUP_CPUSET, NULL);
+
+	cg = tst_cgroup_get_test();
+	cg_drain = tst_cgroup_get_drain();
 }
 
 static void cleanup(void)
 {
-	tst_cgroup_umount(PATH_TMP_CG_MEM);
+	if (cg_child) {
+		SAFE_CGROUP_PRINTF(cg_drain, "cgroup.procs", "%d", getpid());
+		cg_child = tst_cgroup_rm(cg_child);
+	}
+
+	tst_cgroup_cleanup();
 }
 
 struct tst_test test = {
+	.setup = setup,
 	.test_all = run,
+	.cleanup = cleanup,
 	...
 };
 -------------------------------------------------------------------------------
 
+Starting with setup; we can see here that we also fetch the 'drain'
+CGroup. This is a shared group (between parallel tests) which may
+contain processes from other tests. It should have default settings and
+these should not be changed by the test. It can be used to remove
+processes from other CGroups incase the hierarchy root is not
+accessible.
+
+In 'run', we first create a child CGroup with 'tst_cgroup_mk'. As we
+create this CGroup in 'run' we should also remove it at the end of
+run. We also need to check if it exists and remove it in cleanup as
+well. Because there are 'SAFE_' functions which may jump to cleanup.
+
+We then move the main test process into the child CGroup. This is
+important as it means that before we destroy the child CGroup we have
+to move the main test process elsewhere. For that we use the 'drain'
+group.
+
+Next we enable the memory and cpuset controller configuration on the
+test CGroup's descendants (i.e. 'cg_child'). This allows each child to
+have its own settings. The file 'cgroup.subtree_control' does not
+exist on V1. Because it is possible to have both V1 and V2 active at
+the same time. We can not simply check if 'subtree_control' exists
+before writing to it. We have to check if a particular controller is
+on V2 before trying to add it to 'subtree_control'. Trying to add a V1
+controller will result in 'ENOENT'.
+
+We then fork a child process and add this to the child CGroup. Within
+the child process we try to read 'memory.swap.current'. It is possible
+that the memory controller was compiled without swap support, so it is
+necessary to check if 'memory.swap' is enabled. That is unless the
+test will never reach the point where 'memory.swap.*' are used without
+swap support.
+
+The parent process waits for the child process to be reaped before
+destroying the child CGroup. So there is no need to transfer the child
+to drain. However the parent process must be moved otherwise we will
+get 'EBUSY' when trying to remove the child CGroup.
+
+Another example of an edge case is the following.
+
+[source,c]
+-------------------------------------------------------------------------------
+	if (tst_cgroup_ver(cg, "memory") == TST_CGROUP_V1)
+		SAFE_CGROUP_PRINTF(cg, "memory.swap.max", "%lu", ~0UL);
+	else
+		SAFE_CGROUP_PRINT(cg, "memory.swap.max", "max");
+-------------------------------------------------------------------------------
+
+CGroups V2 introduced a feature where 'memory[.swap].max' could be set to
+"max". This does not appear to work on V1 'limit_in_bytes' however. For most
+tests, simply using a large number is sufficient and there is no need to use
+"max". Importantly though, one should be careful to read both the V1 and V2
+kernel docs. The LTP library can not handle all edge cases. It does the minimal
+amount of work to make testing on both V1 and V2 feasible.
+
 2.2.37 Require minimum numbers of CPU for a testcase
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [LTP] [PATCH v3 6/7] mem: Convert tests to new CGroups API
  2021-04-12 14:54 [LTP] [PATCH v3 0/7] CGroup API rewrite Richard Palethorpe
                   ` (4 preceding siblings ...)
  2021-04-12 14:55 ` [LTP] [PATCH v3 5/7] docs: Update CGroups API Richard Palethorpe
@ 2021-04-12 14:55 ` Richard Palethorpe
  2021-04-12 14:55 ` [LTP] [PATCH v3 7/7] madvise06: Convert " Richard Palethorpe
  6 siblings, 0 replies; 22+ messages in thread
From: Richard Palethorpe @ 2021-04-12 14:55 UTC (permalink / raw)
  To: ltp

Signed-off-by: Richard Palethorpe <rpalethorpe@suse.com>
---
 testcases/kernel/mem/cpuset/cpuset01.c | 34 ++++++++++++--------------
 testcases/kernel/mem/include/mem.h     |  2 +-
 testcases/kernel/mem/ksm/ksm02.c       | 13 +++++++---
 testcases/kernel/mem/ksm/ksm03.c       | 12 ++++++---
 testcases/kernel/mem/ksm/ksm04.c       | 17 +++++++------
 testcases/kernel/mem/lib/mem.c         | 10 +++-----
 testcases/kernel/mem/oom/oom03.c       | 18 ++++++++------
 testcases/kernel/mem/oom/oom04.c       | 19 ++++++++------
 testcases/kernel/mem/oom/oom05.c       | 32 ++++++++++++++----------
 9 files changed, 87 insertions(+), 70 deletions(-)

diff --git a/testcases/kernel/mem/cpuset/cpuset01.c b/testcases/kernel/mem/cpuset/cpuset01.c
index 528c3eddd..5a14a0a6b 100644
--- a/testcases/kernel/mem/cpuset/cpuset01.c
+++ b/testcases/kernel/mem/cpuset/cpuset01.c
@@ -35,6 +35,8 @@
 
 #ifdef HAVE_NUMA_V2
 
+static const struct tst_cgroup *cg;
+
 volatile int end;
 static int *nodes;
 static int nnodes;
@@ -47,15 +49,14 @@ static long count_cpu(void);
 
 static void test_cpuset(void)
 {
-	int child, i, status;
+	int child, i;
 	unsigned long nmask[MAXNODES / BITS_PER_LONG] = { 0 };
-	char mems[BUFSIZ], buf[BUFSIZ];
+	char buf[BUFSIZ];
 
-	tst_cgroup_cpuset_read_files(PATH_TMP_CG_CST, "cpus", buf, BUFSIZ);
-	tst_cgroup_cpuset_write_files(PATH_TMP_CG_CST, "cpus", buf);
-	tst_cgroup_cpuset_read_files(PATH_TMP_CG_CST, "mems", mems, BUFSIZ);
-	tst_cgroup_cpuset_write_files(PATH_TMP_CG_CST, "mems", mems);
-	tst_cgroup_move_current(PATH_TMP_CG_CST);
+	SAFE_CGROUP_READ(cg, "cpuset.cpus", buf, sizeof(buf));
+	SAFE_CGROUP_PRINT(cg, "cpuset.cpus", buf);
+	SAFE_CGROUP_READ(cg, "cpuset.mems", buf, sizeof(buf));
+	SAFE_CGROUP_PRINT(cg, "cpuset.mems", buf);
 
 	child = SAFE_FORK();
 	if (child == 0) {
@@ -69,33 +70,30 @@ static void test_cpuset(void)
 		exit(mem_hog_cpuset(ncpus > 1 ? ncpus : 1));
 	}
 
-	snprintf(buf, BUFSIZ, "%d", nodes[0]);
-	tst_cgroup_cpuset_write_files(PATH_TMP_CG_CST, "mems", buf);
-	snprintf(buf, BUFSIZ, "%d", nodes[1]);
-	tst_cgroup_cpuset_write_files(PATH_TMP_CG_CST, "mems", buf);
+	SAFE_CGROUP_PRINTF(cg, "cpuset.mems", "%d", nodes[0]);
+	SAFE_CGROUP_PRINTF(cg, "cpuset.mems", "%d", nodes[1]);
 
-	SAFE_WAITPID(child, &status, WUNTRACED | WCONTINUED);
-	if (WEXITSTATUS(status) != 0) {
-		tst_res(TFAIL, "child exit status is %d", WEXITSTATUS(status));
-		return;
-	}
+	tst_reap_children();
 
 	tst_res(TPASS, "cpuset test pass");
 }
 
 static void setup(void)
 {
-	tst_cgroup_mount(TST_CGROUP_CPUSET, PATH_TMP_CG_CST);
+	tst_cgroup_require(TST_CGROUP_CPUSET, NULL);
 	ncpus = count_cpu();
 	if (get_allowed_nodes_arr(NH_MEMS | NH_CPUS, &nnodes, &nodes) < 0)
 		tst_brk(TBROK | TERRNO, "get_allowed_nodes_arr");
 	if (nnodes <= 1)
 		tst_brk(TCONF, "requires a NUMA system.");
+
+	cg = tst_cgroup_get_test();
+	SAFE_CGROUP_PRINTF(cg, "cgroup.procs", "%d", getpid());
 }
 
 static void cleanup(void)
 {
-	tst_cgroup_umount(PATH_TMP_CG_CST);
+	tst_cgroup_cleanup();
 }
 
 static void sighandler(int signo LTP_ATTRIBUTE_UNUSED)
diff --git a/testcases/kernel/mem/include/mem.h b/testcases/kernel/mem/include/mem.h
index 42b12a230..549507b0d 100644
--- a/testcases/kernel/mem/include/mem.h
+++ b/testcases/kernel/mem/include/mem.h
@@ -61,7 +61,7 @@ void check_hugepage(void);
 void write_memcg(void);
 
 /* cpuset/memcg - include/tst_cgroup.h */
-void write_cpusets(const char *cgroup_dir, long nd);
+void write_cpusets(const struct tst_cgroup *cg, long nd);
 
 /* shared */
 unsigned int get_a_numa_node(void);
diff --git a/testcases/kernel/mem/ksm/ksm02.c b/testcases/kernel/mem/ksm/ksm02.c
index 51f5d4cca..b4c584bab 100644
--- a/testcases/kernel/mem/ksm/ksm02.c
+++ b/testcases/kernel/mem/ksm/ksm02.c
@@ -59,6 +59,9 @@
 #ifdef HAVE_NUMA_V2
 #include <numaif.h>
 
+static const struct tst_cgroup *cg;
+static const struct tst_cgroup *cg_drain;
+
 static void verify_ksm(void)
 {
 	unsigned long nmask[MAXNODES / BITS_PER_LONG] = { 0 };
@@ -76,9 +79,10 @@ static void verify_ksm(void)
 	}
 	create_same_memory(size, num, unit);
 
-	write_cpusets(PATH_TMP_CG_CST, node);
-	tst_cgroup_move_current(PATH_TMP_CG_CST);
+	write_cpusets(cg, node);
+	SAFE_CGROUP_PRINTF(cg, "cgroup.procs", "%d", getpid());
 	create_same_memory(size, num, unit);
+	SAFE_CGROUP_PRINTF(cg_drain, "cgroup.procs", "%d", getpid());
 }
 
 static void cleanup(void)
@@ -87,7 +91,7 @@ static void cleanup(void)
 		FILE_PRINTF(PATH_KSM "merge_across_nodes",
 				 "%d", merge_across_nodes);
 
-	tst_cgroup_umount(PATH_TMP_CG_CST);
+	tst_cgroup_cleanup();
 }
 
 static void setup(void)
@@ -103,7 +107,8 @@ static void setup(void)
 		SAFE_FILE_PRINTF(PATH_KSM "merge_across_nodes", "1");
 	}
 
-	tst_cgroup_mount(TST_CGROUP_CPUSET, PATH_TMP_CG_CST);
+	tst_cgroup_require(TST_CGROUP_CPUSET, NULL);
+	cg = tst_cgroup_get_test();
 }
 
 static struct tst_test test = {
diff --git a/testcases/kernel/mem/ksm/ksm03.c b/testcases/kernel/mem/ksm/ksm03.c
index e9949280e..032bb607e 100644
--- a/testcases/kernel/mem/ksm/ksm03.c
+++ b/testcases/kernel/mem/ksm/ksm03.c
@@ -59,10 +59,10 @@
 #include "mem.h"
 #include "ksm_common.h"
 
+static const struct tst_cgroup *cg;
+
 static void verify_ksm(void)
 {
-	tst_cgroup_move_current(PATH_TMP_CG_MEM);
-	tst_cgroup_mem_set_maxbytes(PATH_TMP_CG_MEM, TESTMEM);
 	create_same_memory(size, num, unit);
 }
 
@@ -78,7 +78,11 @@ static void setup(void)
 	}
 
 	parse_ksm_options(opt_sizestr, &size, opt_numstr, &num, opt_unitstr, &unit);
-	tst_cgroup_mount(TST_CGROUP_MEMCG, PATH_TMP_CG_MEM);
+
+	tst_cgroup_require(TST_CGROUP_MEMORY, NULL);
+	cg = tst_cgroup_get_test();
+	SAFE_CGROUP_PRINTF(cg, "cgroup.procs", "%d", getpid());
+	SAFE_CGROUP_PRINTF(cg, "memory.max", "%lu", TESTMEM);
 }
 
 static void cleanup(void)
@@ -86,7 +90,7 @@ static void cleanup(void)
 	if (access(PATH_KSM "merge_across_nodes", F_OK) == 0)
 		FILE_PRINTF(PATH_KSM "merge_across_nodes",
 				 "%d", merge_across_nodes);
-	tst_cgroup_umount(PATH_TMP_CG_MEM);
+	tst_cgroup_cleanup();
 }
 
 static struct tst_test test = {
diff --git a/testcases/kernel/mem/ksm/ksm04.c b/testcases/kernel/mem/ksm/ksm04.c
index b4ad41831..980e7c99c 100644
--- a/testcases/kernel/mem/ksm/ksm04.c
+++ b/testcases/kernel/mem/ksm/ksm04.c
@@ -59,6 +59,8 @@
 #ifdef HAVE_NUMA_V2
 #include <numaif.h>
 
+static const struct tst_cgroup *cg;
+
 static void verify_ksm(void)
 {
 	unsigned long nmask[MAXNODES / BITS_PER_LONG] = { 0 };
@@ -67,8 +69,7 @@ static void verify_ksm(void)
 	node = get_a_numa_node();
 	set_node(nmask, node);
 
-	tst_cgroup_move_current(PATH_TMP_CG_MEM);
-	tst_cgroup_mem_set_maxbytes(PATH_TMP_CG_MEM, TESTMEM);
+	SAFE_CGROUP_PRINTF(cg, "memory.max", "%lu", TESTMEM);
 
 	if (set_mempolicy(MPOL_BIND, nmask, MAXNODES) == -1) {
 		if (errno != ENOSYS)
@@ -79,8 +80,7 @@ static void verify_ksm(void)
 	}
 	create_same_memory(size, num, unit);
 
-	write_cpusets(PATH_TMP_CG_CST, node);
-	tst_cgroup_move_current(PATH_TMP_CG_CST);
+	write_cpusets(cg, node);
 	create_same_memory(size, num, unit);
 }
 
@@ -90,8 +90,7 @@ static void cleanup(void)
 		FILE_PRINTF(PATH_KSM "merge_across_nodes",
 				 "%d", merge_across_nodes);
 
-	tst_cgroup_umount(PATH_TMP_CG_MEM);
-	tst_cgroup_umount(PATH_TMP_CG_CST);
+	tst_cgroup_cleanup();
 }
 
 static void setup(void)
@@ -107,8 +106,10 @@ static void setup(void)
 
 	parse_ksm_options(opt_sizestr, &size, opt_numstr, &num, opt_unitstr, &unit);
 
-	tst_cgroup_mount(TST_CGROUP_MEMCG, PATH_TMP_CG_MEM);
-	tst_cgroup_mount(TST_CGROUP_CPUSET, PATH_TMP_CG_CST);
+	tst_cgroup_require(TST_CGROUP_MEMORY, NULL);
+	tst_cgroup_require(TST_CGROUP_CPUSET, NULL);
+	cg = tst_cgroup_get_test();
+	SAFE_CGROUP_PRINTF(cg, "cgroup.procs", "%d", getpid());
 }
 
 static struct tst_test test = {
diff --git a/testcases/kernel/mem/lib/mem.c b/testcases/kernel/mem/lib/mem.c
index 2de3f83a6..2af2ef20c 100644
--- a/testcases/kernel/mem/lib/mem.c
+++ b/testcases/kernel/mem/lib/mem.c
@@ -629,13 +629,11 @@ static void gather_node_cpus(char *cpus, long nd)
 	cpus[strlen(cpus) - 1] = '\0';
 }
 
-void write_cpusets(const char *cgroup_dir, long nd)
+void write_cpusets(const struct tst_cgroup *cg, long nd)
 {
-	char buf[BUFSIZ];
 	char cpus[BUFSIZ] = "";
 
-	snprintf(buf, BUFSIZ, "%ld", nd);
-	tst_cgroup_cpuset_write_files(cgroup_dir, "mems", buf);
+	SAFE_CGROUP_PRINTF(cg, "cpuset.mems", "%ld", nd);
 
 	gather_node_cpus(cpus, nd);
 	/*
@@ -644,11 +642,11 @@ void write_cpusets(const char *cgroup_dir, long nd)
 	 * the value of cpuset.cpus.
 	 */
 	if (strlen(cpus) != 0) {
-		tst_cgroup_cpuset_write_files(cgroup_dir, "cpus", cpus);
+		SAFE_CGROUP_PRINT(cg, "cpuset.cpus", cpus);
 	} else {
 		tst_res(TINFO, "No CPUs in the node%ld; "
 				"using only CPU0", nd);
-		tst_cgroup_cpuset_write_files(cgroup_dir, "cpus", "0");
+		SAFE_CGROUP_PRINT(cg, "cpuset.cpus", "0");
 	}
 }
 
diff --git a/testcases/kernel/mem/oom/oom03.c b/testcases/kernel/mem/oom/oom03.c
index fc860c660..cd927ba11 100644
--- a/testcases/kernel/mem/oom/oom03.c
+++ b/testcases/kernel/mem/oom/oom03.c
@@ -36,19 +36,17 @@
 
 #ifdef HAVE_NUMA_V2
 
+static const struct tst_cgroup *cg;
+
 static void verify_oom(void)
 {
 #ifdef TST_ABI32
 	tst_brk(TCONF, "test is not designed for 32-bit system.");
 #endif
-
-	tst_cgroup_move_current(PATH_TMP_CG_MEM);
-	tst_cgroup_mem_set_maxbytes(PATH_TMP_CG_MEM, TESTMEM);
-
 	testoom(0, 0, ENOMEM, 1);
 
-	if (tst_cgroup_mem_swapacct_enabled(PATH_TMP_CG_MEM)) {
-		tst_cgroup_mem_set_maxswap(PATH_TMP_CG_MEM, TESTMEM);
+	if (SAFE_CGROUP_HAS(cg, "memory.swap.max")) {
+		SAFE_CGROUP_PRINTF(cg, "memory.swap.max", "%lu", TESTMEM);
 		testoom(0, 1, ENOMEM, 1);
 	}
 
@@ -65,14 +63,18 @@ static void setup(void)
 {
 	overcommit = get_sys_tune("overcommit_memory");
 	set_sys_tune("overcommit_memory", 1, 1);
-	tst_cgroup_mount(TST_CGROUP_MEMCG, PATH_TMP_CG_MEM);
+
+	tst_cgroup_require(TST_CGROUP_MEMORY, NULL);
+	cg = tst_cgroup_get_test();
+	SAFE_CGROUP_PRINTF(cg, "cgroup.procs", "%d", getpid());
+	SAFE_CGROUP_PRINTF(cg, "memory.max", "%lu", TESTMEM);
 }
 
 static void cleanup(void)
 {
 	if (overcommit != -1)
 		set_sys_tune("overcommit_memory", overcommit, 0);
-	tst_cgroup_umount(PATH_TMP_CG_MEM);
+	tst_cgroup_cleanup();
 }
 
 static struct tst_test test = {
diff --git a/testcases/kernel/mem/oom/oom04.c b/testcases/kernel/mem/oom/oom04.c
index 582663275..bad7b582b 100644
--- a/testcases/kernel/mem/oom/oom04.c
+++ b/testcases/kernel/mem/oom/oom04.c
@@ -36,24 +36,25 @@
 
 #ifdef HAVE_NUMA_V2
 
+static const struct tst_cgroup *cg;
+
 static void verify_oom(void)
 {
 #ifdef TST_ABI32
 	tst_brk(TCONF, "test is not designed for 32-bit system.");
 #endif
-
-	tst_cgroup_move_current(PATH_TMP_CG_CST);
-
 	tst_res(TINFO, "OOM on CPUSET...");
 	testoom(0, 0, ENOMEM, 1);
 
-	if (is_numa(NULL, NH_MEMS, 2)) {
+	if (is_numa(NULL, NH_MEMS, 2) &&
+	    SAFE_CGROUP_HAS(cg, "cpuset.memory_migrate")) {
 		/*
 		 * Under NUMA system, the migration of cpuset's memory
 		 * is in charge of cpuset.memory_migrate, we can write
 		 * 1 to cpuset.memory_migrate to enable the migration.
 		 */
-		tst_cgroup_cpuset_write_files(PATH_TMP_CG_CST, "memory_migrate", "1");
+		SAFE_CGROUP_PRINT(cg, "cpuset.memory_migrate", "1");
+
 		tst_res(TINFO, "OOM on CPUSET with mem migrate:");
 		testoom(0, 0, ENOMEM, 1);
 	}
@@ -69,7 +70,8 @@ static void setup(void)
 	overcommit = get_sys_tune("overcommit_memory");
 	set_sys_tune("overcommit_memory", 1, 1);
 
-	tst_cgroup_mount(TST_CGROUP_CPUSET, PATH_TMP_CG_CST);
+	tst_cgroup_require(TST_CGROUP_CPUSET, NULL);
+	cg = tst_cgroup_get_test();
 
 	/*
 	 * Some nodes do not contain memory, so use
@@ -81,14 +83,15 @@ static void setup(void)
 	if (ret < 0)
 		tst_brk(TBROK, "Failed to get a memory node "
 			      "using get_allowed_nodes()");
-	write_cpusets(PATH_TMP_CG_CST, memnode);
+	write_cpusets(cg, memnode);
+	SAFE_CGROUP_PRINTF(cg, "cgroup.procs", "%d", getpid());
 }
 
 static void cleanup(void)
 {
 	if (overcommit != -1)
 		set_sys_tune("overcommit_memory", overcommit, 0);
-	tst_cgroup_umount(PATH_TMP_CG_CST);
+	tst_cgroup_cleanup();
 }
 
 static struct tst_test test = {
diff --git a/testcases/kernel/mem/oom/oom05.c b/testcases/kernel/mem/oom/oom05.c
index 871f302e3..3be765df9 100644
--- a/testcases/kernel/mem/oom/oom05.c
+++ b/testcases/kernel/mem/oom/oom05.c
@@ -36,6 +36,8 @@
 
 #ifdef HAVE_NUMA_V2
 
+static const struct tst_cgroup *cg;
+
 static void verify_oom(void)
 {
 #ifdef TST_ABI32
@@ -43,9 +45,6 @@ static void verify_oom(void)
 #endif
 
 	tst_res(TINFO, "OOM on CPUSET & MEMCG...");
-	tst_cgroup_move_current(PATH_TMP_CG_MEM);
-	tst_cgroup_move_current(PATH_TMP_CG_CST);
-	tst_cgroup_mem_set_maxbytes(PATH_TMP_CG_MEM, TESTMEM);
 	testoom(0, 0, ENOMEM, 1);
 
 	/*
@@ -53,22 +52,26 @@ static void verify_oom(void)
 	 * is in charge of cpuset.memory_migrate, we can write
 	 * 1 to cpuset.memory_migrate to enable the migration.
 	 */
-	if (is_numa(NULL, NH_MEMS, 2)) {
-		tst_cgroup_cpuset_write_files(PATH_TMP_CG_CST, "memory_migrate", "1");
+	if (is_numa(NULL, NH_MEMS, 2) &&
+	    SAFE_CGROUP_HAS(cg, "cpuset.memory_migrate")) {
+		SAFE_CGROUP_PRINT(cg, "cpuset.memory_migrate", "1");
 		tst_res(TINFO, "OOM on CPUSET & MEMCG with "
 				"cpuset.memory_migrate=1");
 		testoom(0, 0, ENOMEM, 1);
 	}
 
-	if (tst_cgroup_mem_swapacct_enabled(PATH_TMP_CG_MEM)) {
+	if (SAFE_CGROUP_HAS(cg, "memory.swap.max")) {
 		tst_res(TINFO, "OOM on CPUSET & MEMCG with "
 				"special memswap limitation:");
-		tst_cgroup_mem_set_maxswap(PATH_TMP_CG_MEM, TESTMEM);
+		SAFE_CGROUP_PRINTF(cg, "memory.swap.max", "%lu", TESTMEM);
 		testoom(0, 0, ENOMEM, 1);
 
 		tst_res(TINFO, "OOM on CPUSET & MEMCG with "
 				"disabled memswap limitation:");
-		tst_cgroup_mem_set_maxswap(PATH_TMP_CG_MEM, -1);
+		if (SAFE_CGROUP_VER(cg, "memory") == TST_CGROUP_V1)
+			SAFE_CGROUP_PRINTF(cg, "memory.swap.max", "%lu", ~0UL);
+		else
+			SAFE_CGROUP_PRINT(cg, "memory.swap.max", "max");
 		testoom(0, 0, ENOMEM, 1);
 	}
 }
@@ -83,8 +86,9 @@ void setup(void)
 	overcommit = get_sys_tune("overcommit_memory");
 	set_sys_tune("overcommit_memory", 1, 1);
 
-	tst_cgroup_mount(TST_CGROUP_MEMCG, PATH_TMP_CG_MEM);
-	tst_cgroup_mount(TST_CGROUP_CPUSET, PATH_TMP_CG_CST);
+	tst_cgroup_require(TST_CGROUP_MEMORY, NULL);
+	tst_cgroup_require(TST_CGROUP_CPUSET, NULL);
+	cg = tst_cgroup_get_test();
 
 	/*
 	 * Some nodes do not contain memory, so use
@@ -96,15 +100,17 @@ void setup(void)
 	if (ret < 0)
 		tst_brk(TBROK, "Failed to get a memory node "
 			      "using get_allowed_nodes()");
-	write_cpusets(PATH_TMP_CG_CST, memnode);
+
+	write_cpusets(cg, memnode);
+	SAFE_CGROUP_PRINTF(cg, "cgroup.procs", "%d", getpid());
+	SAFE_CGROUP_PRINTF(cg, "memory.max", "%lu", TESTMEM);
 }
 
 void cleanup(void)
 {
 	if (overcommit != -1)
 		set_sys_tune("overcommit_memory", overcommit, 0);
-	tst_cgroup_umount(PATH_TMP_CG_MEM);
-	tst_cgroup_umount(PATH_TMP_CG_CST);
+	tst_cgroup_cleanup();
 }
 
 static struct tst_test test = {
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [LTP] [PATCH v3 7/7] madvise06: Convert to new CGroups API
  2021-04-12 14:54 [LTP] [PATCH v3 0/7] CGroup API rewrite Richard Palethorpe
                   ` (5 preceding siblings ...)
  2021-04-12 14:55 ` [LTP] [PATCH v3 6/7] mem: Convert tests to new " Richard Palethorpe
@ 2021-04-12 14:55 ` Richard Palethorpe
  6 siblings, 0 replies; 22+ messages in thread
From: Richard Palethorpe @ 2021-04-12 14:55 UTC (permalink / raw)
  To: ltp

Signed-off-by: Richard Palethorpe <rpalethorpe@suse.com>
---
 testcases/kernel/syscalls/madvise/madvise06.c | 82 +++++++++----------
 1 file changed, 37 insertions(+), 45 deletions(-)

diff --git a/testcases/kernel/syscalls/madvise/madvise06.c b/testcases/kernel/syscalls/madvise/madvise06.c
index 56e0700d6..47642faf4 100644
--- a/testcases/kernel/syscalls/madvise/madvise06.c
+++ b/testcases/kernel/syscalls/madvise/madvise06.c
@@ -3,8 +3,8 @@
  * Copyright (c) 2016 Red Hat, Inc.
  */
 
-/*
- * DESCRIPTION
+/*\
+ * [DESCRIPTION]
  *
  *   Page fault occurs in spite that madvise(WILLNEED) system call is called
  *   to prefetch the page. This issue is reproduced by running a program
@@ -36,13 +36,14 @@
  *   else unexpected, but irrelevant procedure, registers a fault to
  *   our process.
  *
- */
+\*/
 
 #include <errno.h>
 #include <stdio.h>
 #include <sys/mount.h>
 #include <sys/sysinfo.h>
 #include "tst_test.h"
+#include "tst_cgroup.h"
 
 #define CHUNK_SZ (400*1024*1024L)
 #define MEM_LIMIT (CHUNK_SZ / 2)
@@ -50,8 +51,7 @@
 #define PASS_THRESHOLD (CHUNK_SZ / 4)
 #define PASS_THRESHOLD_KB (PASS_THRESHOLD / 1024)
 
-#define MNT_NAME "memory"
-#define GROUP_NAME "madvise06"
+static const struct tst_cgroup *cg;
 
 static const char drop_caches_fname[] = "/proc/sys/vm/drop_caches";
 static int pg_sz, stat_refresh_sup;
@@ -64,17 +64,19 @@ static void check_path(const char *path)
 		tst_brk(TCONF, "file needed: %s", path);
 }
 
-#define READ_CGMEM(item)						\
-	({long tst_rval = 0;						\
-	  const char *cgpath = MNT_NAME"/"GROUP_NAME"/memory."item;	\
-	  if (!access(cgpath, R_OK))					\
-		  SAFE_FILE_LINES_SCANF(cgpath, "%ld", &tst_rval);	\
-	  tst_rval;})
+static void print_cgmem(const char *name)
+{
+	long ret;
+
+	if (!SAFE_CGROUP_HAS(cg, name))
+		return;
+
+	SAFE_CGROUP_SCANF(cg, name, "%ld", &ret);
+	tst_res(TINFO, "\t%s: %ld Kb", name, ret / 1024);
+}
 
 static void meminfo_diag(const char *point)
 {
-	long rval;
-
 	if (stat_refresh_sup)
 		SAFE_FILE_PRINTF("/proc/sys/vm/stat_refresh", "1");
 
@@ -85,16 +87,10 @@ static void meminfo_diag(const char *point)
 		SAFE_READ_MEMINFO("SwapCached:") - init_swap_cached);
 	tst_res(TINFO, "\tCached: %ld Kb",
 		SAFE_READ_MEMINFO("Cached:") - init_cached);
-	tst_res(TINFO, "\tcgmem.usage_in_bytes: %ld Kb",
-		READ_CGMEM("usage_in_bytes") / 1024);
-
-	rval = READ_CGMEM("memsw.usage_in_bytes") / 1024;
-	if (rval)
-		tst_res(TINFO, "\tcgmem.memsw.usage_in_bytes: %ld Kb", rval);
 
-	rval = READ_CGMEM("kmem.usage_in_bytes") / 1024;
-	if (rval)
-		tst_res(TINFO, "\tcgmem.kmem.usage_in_bytes: %ld Kb", rval);
+	print_cgmem("memory.current");
+	print_cgmem("memory.swap.current");
+	print_cgmem("memory.kmem.usage_in_bytes");
 }
 
 static void setup(void)
@@ -117,28 +113,24 @@ static void setup(void)
 			2 * CHUNK_SZ);
 	}
 
-	SAFE_MKDIR(MNT_NAME, 0700);
-	if (mount("memory", MNT_NAME, "cgroup", 0, "memory") == -1) {
-		if (errno == ENODEV || errno == ENOENT)
-			tst_brk(TCONF, "memory cgroup needed");
-	}
-	SAFE_MKDIR(MNT_NAME"/"GROUP_NAME, 0700);
-
 	check_path("/proc/self/oom_score_adj");
-	check_path(MNT_NAME"/"GROUP_NAME"/memory.limit_in_bytes");
-	check_path(MNT_NAME"/"GROUP_NAME"/memory.swappiness");
-	check_path(MNT_NAME"/"GROUP_NAME"/tasks");
-
 	SAFE_FILE_PRINTF("/proc/self/oom_score_adj", "%d", -1000);
-	SAFE_FILE_PRINTF(MNT_NAME"/"GROUP_NAME"/memory.limit_in_bytes", "%ld\n",
-			 MEM_LIMIT);
 
-	if (!access(MNT_NAME"/"GROUP_NAME"/memory.memsw.limit_in_bytes", W_OK)) {
-		SAFE_FILE_PRINTF(MNT_NAME"/"GROUP_NAME"/memory.memsw.limit_in_bytes",
-				 "%ld\n", MEMSW_LIMIT);
+	tst_cgroup_require(TST_CGROUP_MEMORY, NULL);
+	cg = tst_cgroup_get_test();
+
+	SAFE_CGROUP_PRINTF(cg, "memory.max", "%ld", MEM_LIMIT);
+	if (SAFE_CGROUP_HAS(cg, "memory.swap.max"))
+		SAFE_CGROUP_PRINTF(cg, "memory.swap.max", "%ld", MEMSW_LIMIT);
+
+	if (SAFE_CGROUP_HAS(cg, "memory.swappiness")) {
+		SAFE_CGROUP_PRINT(cg, "memory.swappiness", "60");
+	} else {
+		check_path("/proc/sys/vm/swappiness");
+		SAFE_FILE_PRINTF("/proc/sys/vm/swappiness", "%d", 60);
 	}
-	SAFE_FILE_PRINTF(MNT_NAME"/"GROUP_NAME"/memory.swappiness", "60");
-	SAFE_FILE_PRINTF(MNT_NAME"/"GROUP_NAME"/tasks", "%d\n", getpid());
+
+	SAFE_CGROUP_PRINTF(cg, "cgroup.procs", "%d", getpid());
 
 	meminfo_diag("Initial meminfo, later values are relative to this (except memcg)");
 	init_swap = SAFE_READ_MEMINFO("SwapTotal:") - SAFE_READ_MEMINFO("SwapFree:");
@@ -154,11 +146,7 @@ static void setup(void)
 
 static void cleanup(void)
 {
-	if (!access(MNT_NAME"/tasks", F_OK)) {
-		SAFE_FILE_PRINTF(MNT_NAME"/tasks", "%d\n", getpid());
-		SAFE_RMDIR(MNT_NAME"/"GROUP_NAME);
-		SAFE_UMOUNT(MNT_NAME);
-	}
+	tst_cgroup_cleanup();
 }
 
 static void dirty_pages(char *ptr, long size)
@@ -248,6 +236,10 @@ static struct tst_test test = {
 	.min_kver = "3.10.0",
 	.needs_tmpdir = 1,
 	.needs_root = 1,
+	.save_restore = (const char * const[]) {
+		"?/proc/sys/vm/swappiness",
+		NULL
+	},
 	.tags = (const struct tst_tag[]) {
 		{"linux-git", "55231e5c898c"},
 		{"linux-git", "8de15e920dc8"},
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [LTP] [PATCH v3 3/7] Add new CGroups APIs
  2021-04-12 14:55 ` [LTP] [PATCH v3 3/7] Add new CGroups APIs Richard Palethorpe
@ 2021-04-14 15:39   ` Cyril Hrubis
  2021-04-15 13:10     ` Richard Palethorpe
  2021-04-16  6:57   ` Li Wang
  1 sibling, 1 reply; 22+ messages in thread
From: Cyril Hrubis @ 2021-04-14 15:39 UTC (permalink / raw)
  To: ltp

Hi!
> Complete rewrite of the CGroups API which provides two layers of
> indirection between the test author and the SUT's CGroup
> configuration.

I've spend quite some time reading the code and I think that the
generall idea and implementation is good.

There is quite a bit of minor details that I think may be done better,
see comments.

> Signed-off-by: Richard Palethorpe <rpalethorpe@suse.com>
> ---
>  include/tst_cgroup.h |  194 ++++++-
>  include/tst_test.h   |    1 -
>  lib/Makefile         |    2 +
>  lib/tst_cgroup.c     | 1205 +++++++++++++++++++++++++++++++-----------
>  4 files changed, 1056 insertions(+), 346 deletions(-)
> 
> diff --git a/include/tst_cgroup.h b/include/tst_cgroup.h
> index bfd848260..d6842d641 100644
> --- a/include/tst_cgroup.h
> +++ b/include/tst_cgroup.h
> @@ -2,46 +2,194 @@
>  /*
>   * Copyright (c) 2020 Red Hat, Inc.
>   * Copyright (c) 2020 Li Wang <liwang@redhat.com>
> + * Copyright (c) 2020-2021 SUSE LLC <rpalethorpe@suse.com>
>   */
> +/*\
> + * [DESCRIPTION]
> + *
> + * The LTP CGroups API tries to present a consistent interface to the
> + * many possible CGroup configurations a system could have.
> + *
> + * You may ask; "Why don't you just mount a simple CGroup hierarchy,
> + * instead of scanning the current setup?". The short answer is that
> + * it is not possible unless no CGroups are currently active and
> + * almost all of our users will have CGroups active. Even if
> + * unmounting the current CGroup hierarchy is a reasonable thing to do
> + * to the sytem manager, it is highly unlikely the CGroup hierarchy
> + * will be destroyed. So users would be forced to remove their CGroup
> + * configuration and reboot the system.
> + *
> + * The core library tries to ensure an LTP CGroup exists on each
> + * hierarchy root. Inside the LTP group it ensures a 'drain' group
> + * exists and creats a test group for the current test. In the worst
> + * case we end up with a set of hierarchies like the follwoing. Where
> + * existing system-manager-created CGroups have been omitted.
> + *
> + * 	(V2 Root)	(V1 Root 1)	...	(V1 Root N)
> + * 	    |		     |			     |
> + *	  (ltp)		   (ltp)	...	   (ltp)
> + *	 /     \	  /	\		  /	\
> + *  (drain) (test-n) (drain)  (test-n)  ...  (drain)  (test-n)
> + *
> + * V2 CGroup controllers use a single unified hierarchy on a single
> + * root. Two or more V1 controllers may share a root or have their own
> + * root. However there may exist only one instance of a controller.
> + * So you can not have the same V1 controller on multiple roots.
> + *
> + * It is possible to have both a V2 hierarchy and V1 hierarchies
> + * active at the same time. Which is what is shown above. Any
> + * controllers attached to V1 hierarchies will not be available in the
> + * V2 hierarchy. The reverse is also true.
> + *
> + * Note that a single hierarchy may be mounted multiple
> + * times. Allowing it to be accessed at different locations. However
> + * subsequent mount operations will fail if the mount options are
> + * different from the first.
> + *
> + * The user may pre-create the CGroup hierarchies and the ltp CGroup,
> + * otherwise the library will try to create them. If the ltp group
> + * already exists and has appropriate permissions, then admin
> + * privileges will not be required to run the tests.
> + *
> + * Because the test may not have access to the CGroup root(s), the
> + * drain CGroup is created. This can be used to store processes which
> + * would otherwise block the destruction of the individual test CGroup
> + * or one of its descendants.
> + *
> + * The test author may create child CGroups within the test CGroup
> + * using the CGroup Item API. The library will create the new CGroup
> + * in all the relevant hierarchies.
> + *
> + * There are many differences between the V1 and V2 CGroup APIs. If a
> + * controller is on both V1 and V2, it may have different parameters
> + * and control files. Some of these control files have a different
> + * name, but similar functionality. In this case the Item API uses
> + * the V2 names and aliases them to the V1 name when appropriate.
> + *
> + * Some control files only exist on one of the versions or they can be
> + * missing due to other reasons. The Item API allows the user to check
> + * if the file exists before trying to use it.
> + *
> + * Often a control file has almost the same functionality between V1
> + * and V2. Which means it can be used in the same way most of the
> + * time, but not all. For now this is handled by exposing the API
> + * version a controller is using to allow the test author to handle
> + * edge cases. (e.g. V2 memory.swap.max accepts "max", but V1
> + * memory.memsw.limit_in_bytes does not).
> +\*/
>  
>  #ifndef TST_CGROUP_H
>  #define TST_CGROUP_H
>  
> -#define PATH_TMP_CG_MEM	"/tmp/cgroup_mem"
> -#define PATH_TMP_CG_CST	"/tmp/cgroup_cst"
> -
> +/* CGroups Kernel API version */
>  enum tst_cgroup_ver {
>  	TST_CGROUP_V1 = 1,
>  	TST_CGROUP_V2 = 2,
>  };
>  
> -enum tst_cgroup_ctrl {
> -	TST_CGROUP_MEMCG = 1,
> +/* Controller sub-systems */
> +enum tst_cgroup_css {
> +	TST_CGROUP_MEMORY = 1,
>  	TST_CGROUP_CPUSET = 2,
> -	/* add cgroup controller */
>  };

I spend a bit of time figuring out what css means, can't we just use
controler in the code instead? It's a bit longer but more obvious.

Or is this consitent with kernel naming?

> +#define TST_CGROUP_MAX TST_CGROUP_CPUSET
> +
> +/* At most we can have one cgroup V1 tree for each controller and one
> + * (empty) v2 tree.
> + */
> +#define TST_CGROUP_MAX_TREES (TST_CGROUP_MAX + 1)
> +
> +
> +/* Used to specify CGroup hierarchy configuration options, allowing a
> + * test to request a particular CGroup structure.
> + */
> +struct tst_cgroup_opts {
> +	/* Only try to mount V1 CGroup controllers. This will not
> +	 * prevent V2 from being used if it is already mounted, it
> +	 * only indicates that we should mount V1 controllers if
> +	 * nothing is present. By default we try to mount V2 first. */
> +	int only_mount_v1:1;
> +};

Do we need to pass the flags in a structure?

This is not an API that has to be future proof, when we find out we need
more than a few bitflags we can always change it.

> +struct tst_cgroup_tree;
> +
> +
> +/* A Control Group in LTP's aggregated hierarchy */
> +struct tst_cgroup {
> +	const char *name;
> +	/* Maps controller ID to the tree which contains it. The V2
> +	 * tree is at zero even if it contains no controllers.
> +	 */
> +	struct tst_cgroup_tree *trees_by_css[TST_CGROUP_MAX_TREES];
> +	/* NULL terminated list of trees */
> +	struct tst_cgroup_tree *trees[TST_CGROUP_MAX_TREES + 1];
> +};

So this is an array of directories in different trees, can we please
name the strucuture better. What about tst_cgroup_nodes or
tst_cgroup_dirs?

> +/* Search the system for mounted cgroups and available
> + * controllers. Called automatically by tst_cgroup_require.
> + */
> +void tst_cgroup_scan(void);
> +/* Print the config detected by tst_cgroup_scan */
> +void tst_cgroup_print_config(void);
> +
> +/* Ensure the specified controller is available in the test's default
> + * CGroup, mounting/enabling it if necessary */
> +void tst_cgroup_require(enum tst_cgroup_css type,
> +			const struct tst_cgroup_opts *options);

This is the only function in the API that uses the enum cgroup_css. I
think that it may as well be better if we just passed a string to this
function as well and made the enum strictly internal.

I guess that we do not want to split the knob names so that we would
have to pass the controller as enum and knob name to make the API
consistently use the enum so this would be a better option.

> +/* Tear down any CGroups created by calls to tst_cgroup_require */
> +void tst_cgroup_cleanup(void);
> +
> +/* Get the default CGroup for the test. It allocates memory (in a
> + * guarded buffer) so should always be called from setup
> + */
> +const struct tst_cgroup *tst_cgroup_get_test(void);
> +/* Get the shared drain group. Also should be called from setup */
> +const struct tst_cgroup *tst_cgroup_get_drain(void);

I think that these functions could be named better, it's mildly
confusing, "get test what?".

Maybe tst_cgroup_get_test_group() ?

Or tst_cgroup_get(enum tst_cgroup_group group) ?

> +/* Create a descendant CGroup */
> +struct tst_cgroup *tst_cgroup_mk(const struct tst_cgroup *parent,
> +				 const char *name);

The name parameter should probably be called node_name or dir_name, it
could be easily confused with the name parameter rest of the function
use to describe a "controller.knob" pair.

> +/* Remove a descendant CGroup */
> +struct tst_cgroup *tst_cgroup_rm(struct tst_cgroup *cg);
> +
> +#define SAFE_CGROUP_VER(cg, name) \
> +	safe_cgroup_ver(__FILE__, __LINE__, (cg), (name))

> +enum tst_cgroup_ver safe_cgroup_ver(const char *file, const int lineno,
> +				    const struct tst_cgroup *cg,
> +				    const char *name);

The 'name' parameter should really be controller_name or just
controller.

Also I was a bit confused why version request may even fail.

I guess that it's fine if we tst_brk() on invalid controller name even
without calling this function SAFE_*.

We haven't defined that anywhere but SAFE_FUNCTIONS() are things that
may fail at runtime because of failures propagated from the operating
system itself.

> +#define SAFE_CGROUP_HAS(cg, name) \
> +	safe_cgroup_has(__FILE__, __LINE__, (cg), (name))
> +
> +int safe_cgroup_has(const char *file, const int lineno,
> +		    const struct tst_cgroup *cg, const char *name);
> +
> +#define SAFE_CGROUP_READ(cg, name, out, len)				\
> +	safe_cgroup_read(__FILE__, __LINE__, (cg), (name), (out), (len))
> +
> +ssize_t safe_cgroup_read(const char *file, const int lineno,
> +			 const struct tst_cgroup *cg, const char *name,
> +			 char *out, size_t len);
>  
> -enum tst_cgroup_ver tst_cgroup_version(void);
> +#define SAFE_CGROUP_PRINTF(cg, name, fmt, ...)				\
> +	safe_cgroup_printf(__FILE__, __LINE__, (cg), (name), (fmt), __VA_ARGS__)
>  
> -/* To mount/umount specified cgroup controller on 'cgroup_dir' path */
> -void tst_cgroup_mount(enum tst_cgroup_ctrl ctrl, const char *cgroup_dir);
> -void tst_cgroup_umount(const char *cgroup_dir);
> +#define SAFE_CGROUP_PRINT(cg, name, str)				\
> +	safe_cgroup_printf(__FILE__, __LINE__, (cg), (name), "%s", (str))
>  
> -/* To move current process PID to the mounted cgroup tasks */
> -void tst_cgroup_move_current(const char *cgroup_dir);
> +void safe_cgroup_printf(const char *file, const int lineno,
> +			const struct tst_cgroup *cg, const char *name,
> +			const char *fmt, ...)
> +			__attribute__ ((format (printf, 5, 6)));
>  
> -/* To set cgroup controller knob with new value */
> -void tst_cgroup_set_knob(const char *cgroup_dir, const char *knob, long value);
> +#define SAFE_CGROUP_SCANF(cg, name, fmt, ...)				\
> +	safe_cgroup_scanf(__FILE__, __LINE__, (cg), (name), (fmt), __VA_ARGS__)
>  
> -/* Set of functions to set knobs under the memory controller */
> -void tst_cgroup_mem_set_maxbytes(const char *cgroup_dir, long memsz);
> -int  tst_cgroup_mem_swapacct_enabled(const char *cgroup_dir);
> -void tst_cgroup_mem_set_maxswap(const char *cgroup_dir, long memsz);
> +void safe_cgroup_scanf(const char *file, const int lineno,
> +		       const struct tst_cgroup *cg, const char *name,
> +		       const char *fmt, ...)
> +		       __attribute__ ((format (scanf, 5, 6)));

And for all the read/printf/scanf functions the name parameter should be
named knob or something that describes that we are operating on
individual files in a directory somewhere in the tree.

> -/* Set of functions to read/write cpuset controller files content */
> -void tst_cgroup_cpuset_read_files(const char *cgroup_dir, const char *filename,
> -	char *retbuf, size_t retbuf_sz);
> -void tst_cgroup_cpuset_write_files(const char *cgroup_dir, const char *filename,
> -	const char *buf);
>  
>  #endif /* TST_CGROUP_H */
> diff --git a/include/tst_test.h b/include/tst_test.h
> index 1fbebe752..62ab2981f 100644
> --- a/include/tst_test.h
> +++ b/include/tst_test.h
> @@ -39,7 +39,6 @@
>  #include "tst_capability.h"
>  #include "tst_hugepage.h"
>  #include "tst_assert.h"
> -#include "tst_cgroup.h"
>  #include "tst_lockdown.h"
>  #include "tst_fips.h"
>  #include "tst_taint.h"
> diff --git a/lib/Makefile b/lib/Makefile
> index f019432e8..6f641ee9a 100644
> --- a/lib/Makefile
> +++ b/lib/Makefile
> @@ -38,6 +38,8 @@ pc_file			:= $(DESTDIR)/$(datarootdir)/pkgconfig/ltp.pc
>  
>  INSTALL_TARGETS		:= $(pc_file)
>  
> +tst_cgroup.o: CFLAGS += -Wno-missing-field-initializers

I'm not sure we want to work around warnings like this.

>  $(pc_file):
>  	test -d "$(@D)" || mkdir -p "$(@D)"
>  	install -m $(INSTALL_MODE) "$(builddir)/$(@F)" "$@"
> diff --git a/lib/tst_cgroup.c b/lib/tst_cgroup.c
> index 96c9524d2..40c9a9bec 100644
> --- a/lib/tst_cgroup.c
> +++ b/lib/tst_cgroup.c
> @@ -2,453 +2,1014 @@
>  /*
>   * Copyright (c) 2020 Red Hat, Inc.
>   * Copyright (c) 2020 Li Wang <liwang@redhat.com>
> + * Copyright (c) 2020-2021 SUSE LLC <rpalethorpe@suse.com>
>   */
>  
>  #define TST_NO_DEFAULT_MAIN
>  
>  #include <stdio.h>
> +#include <stddef.h>
>  #include <stdlib.h>
> +#include <mntent.h>
>  #include <sys/mount.h>
> -#include <fcntl.h>
> -#include <unistd.h>
>  
>  #include "tst_test.h"
> -#include "tst_safe_macros.h"
> -#include "tst_safe_stdio.h"
> +#include "lapi/fcntl.h"
> +#include "lapi/mount.h"
> +#include "lapi/mkdirat.h"
>  #include "tst_cgroup.h"
> -#include "tst_device.h"
>  
> -static enum tst_cgroup_ver tst_cg_ver;
> -static int clone_children;
> +/* CGroup Core Implementation
> + *
> + * CGroup Item Implementation is towards the bottom.
> + */
>  
> -static int tst_cgroup_check(const char *cgroup)
> -{
> -	char line[PATH_MAX];
> -	FILE *file;
> -	int cg_check = 0;
> +struct cgroup_root;
>  
> -	file = SAFE_FOPEN("/proc/filesystems", "r");
> -	while (fgets(line, sizeof(line), file)) {
> -		if (strstr(line, cgroup) != NULL) {
> -			cg_check = 1;
> -			break;
> +/* A node in a single CGroup hierarchy. It exists mainly for
> + * convenience so that we do not have to traverse the CGroup structure
> + * for frequent operations.
> + *
> + * This is actually a single-linked list not a tree. We only need to
> + * traverse from leaf towards root.
> + */
> +struct tst_cgroup_tree {

Why isn't this called node or dir? Either of these would be more
fitting.

Also the tst_cgroup structure could use a better name, the tst_cgroup is
actually an array of pointers to directories.

> +	const char *name;

This should be dir_name, can we please avoid having all string pointers
called just 'name' I find it horribly confusing...

> +	const struct tst_cgroup_tree *parent;
> +
> +	/* Shortcut to root */
> +	const struct cgroup_root *root;
> +
> +	/* Subsystems (controllers) bit field. Only controllers which
> +	 * were required and configured for this node are added to
> +	 * this field. So it may be different from root->css_field.
> +	 */
> +	uint32_t css_field;
> +
> +	/* In general we avoid having sprintfs everywhere and use
> +	 * openat, linkat, etc.
> +	 */
> +	int dir;

Can we name this dir_fd so it's obvious what it is?

> +	int we_created_it:1;
> +};
> +
> +/* The root of a CGroup hierarchy/tree */
> +struct cgroup_root {
> +	enum tst_cgroup_ver ver;
> +	/* A mount path */
> +	char path[PATH_MAX/2];

Why PATH_MAX/2 ?

Also why don't we call it mount_path[] instead of having a lousy comment
that people have to look up when reading code?

> +	/* Subsystems (controllers) bit field. Includes all
> +	 * controllers found while scanningthis root.
> +	 */
> +	uint32_t css_field;
> +
> +	/* CGroup hierarchy: mnt -> ltp -> {drain, test -> ??? } We
> +	 * keep a flat reference to ltp, drain and test for
> +	 * convenience.
> +	 */
> +
> +	/* Mount directory */
> +	struct tst_cgroup_tree mnt;
> +	/* LTP CGroup directory, contains drain and test dirs */
> +	struct tst_cgroup_tree ltp;
> +	/* Drain CGroup, see cgroup_cleanup */
> +	struct tst_cgroup_tree drain;
> +	/* CGroup for current test. Which may have children. */
> +	struct tst_cgroup_tree test;

Here as well, it's sligtly easier to understand if we suffix the names
with what it actually is. If we rename struct tst_cgroup_tree to
tst_cgroup_node this should be ltp_node, drain_node, etc.

> +	int we_mounted_it:1;
> +	/* cpuset is in compatability mode */
> +	int no_prefix:1;
> +};
> +
> +/* Always use first item for unified hierarchy */
> +struct cgroup_root roots[TST_CGROUP_MAX_TREES + 1];
> +
> +/* Describes some things that are part of a CGroup
> + *
> + * Usually trunk nodes are controllers and leaves are files exported
> + * by the controllers. Sometimes trunk nodes are components of a
> + * controller (e.g. memory.swap).
> + *
> + * The primary purpose of this is to map V2 names to V1
> + * names. Secondarily we can map name prefixes to controller IDs and
> + * figure out which hierarchy the item should be present on and
> + * whether the current configuration requires yet further work arounds
> + * (e.g. if cpuset is mounted in compatablity mode).
> + */
> +struct cgroup_item {
> +	/* Canonical name. Is the V2 name unless an item is V1 only */
> +	const char *name;
> +	/* V1 name or NULL if item is V2 only */
> +	const char *name_v1;
> +	/* Array of child items or NULL */
> +	struct cgroup_item *inner;
> +
> +	/* The controller this item belongs to or zero for
> +	 * 'cgroup.<item>'. Leaf items are statically initialised as
> +	 * zero then set at runtime.
> +	 */
> +	enum tst_cgroup_css css_indx;
> +
> +	struct cgroup_root *root;
> +
> +	int we_require_it:1;
> +};
> +
> +/* Lookup tree for item names. */
> +typedef struct cgroup_item items_t[];
> +static items_t items = {
> +	[0] = { "cgroup", .inner = (items_t){
> +			{ "cgroup.procs", "tasks" },
> +			{ "cgroup.subtree_control" },
> +			{ "cgroup.clone_children", "clone_children" },
> +			{ NULL }
>  		}
> -	}
> -	SAFE_FCLOSE(file);
> +	},
> +	[TST_CGROUP_MEMORY] = { "memory", .inner = (items_t){
> +			{ "memory.current", "memory.usage_in_bytes" },
> +			{ "memory.max", "memory.limit_in_bytes" },
> +			{ "memory.swappiness", "memory.swappiness" },
> +			{ "memory.swap.current", "memory.memsw.usage_in_bytes" },
> +			{ "memory.swap.max", "memory.memsw.limit_in_bytes" },
> +			{ "memory.kmem.usage_in_bytes", "memory.kmem.usage_in_bytes" },
> +			{ "memory.kmem.limit_in_bytes", "memory.kmem.usage_in_bytes" },
> +			{ NULL }
> +		},
> +	  .css_indx = TST_CGROUP_MEMORY
> +	},
> +	[TST_CGROUP_CPUSET] = { "cpuset", .inner = (items_t){
> +			{ "cpuset.cpus", "cpuset.cpus" },
> +			{ "cpuset.mems", "cpuset.mems" },
> +			{ "cpuset.memory_migrate", "cpuset.memory_migrate" },
> +			{ NULL }
> +		},
> +	  .css_indx = TST_CGROUP_CPUSET
> +	},
> +	{ NULL }
> +};

Item is a very generic word, this is a list of known knobs and map
between v1 and v2. So maybe cgroup_knob_map or just cgroup_knobs ?

-- 
Cyril Hrubis
chrubis@suse.cz

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [LTP] [PATCH v3 3/7] Add new CGroups APIs
  2021-04-14 15:39   ` Cyril Hrubis
@ 2021-04-15 13:10     ` Richard Palethorpe
  2021-04-16  5:00       ` Li Wang
  0 siblings, 1 reply; 22+ messages in thread
From: Richard Palethorpe @ 2021-04-15 13:10 UTC (permalink / raw)
  To: ltp

Hello,

Cyril Hrubis <chrubis@suse.cz> writes:

> Hi!
>> Complete rewrite of the CGroups API which provides two layers of
>> indirection between the test author and the SUT's CGroup
>> configuration.
>
> I've spend quite some time reading the code and I think that the
> generall idea and implementation is good.
>
> There is quite a bit of minor details that I think may be done better,
> see comments.

Thanks for the review.
>>  
>> -#define PATH_TMP_CG_MEM	"/tmp/cgroup_mem"
>> -#define PATH_TMP_CG_CST	"/tmp/cgroup_cst"
>> -
>> +/* CGroups Kernel API version */
>>  enum tst_cgroup_ver {
>>  	TST_CGROUP_V1 = 1,
>>  	TST_CGROUP_V2 = 2,
>>  };
>>  
>> -enum tst_cgroup_ctrl {
>> -	TST_CGROUP_MEMCG = 1,
>> +/* Controller sub-systems */
>> +enum tst_cgroup_css {
>> +	TST_CGROUP_MEMORY = 1,
>>  	TST_CGROUP_CPUSET = 2,
>> -	/* add cgroup controller */
>>  };
>
> I spend a bit of time figuring out what css means, can't we just use
> controler in the code instead? It's a bit longer but more obvious.
>
> Or is this consitent with kernel naming?

The kernel seems to refer to controllers as subsystems and css appears
to mean "CGroup subsystem". I'm not sure if subsystems are necessarily
controllers, but it seems that way. Also "controller" has too many
letters in it. Indeed it makes some function names very long and I found
that harder to read.

I suppose to be totally consistent with the kernel we should randomly
name some things css and others subsys. :-p

>
>> +#define TST_CGROUP_MAX TST_CGROUP_CPUSET
>> +
>> +/* At most we can have one cgroup V1 tree for each controller and one
>> + * (empty) v2 tree.
>> + */
>> +#define TST_CGROUP_MAX_TREES (TST_CGROUP_MAX + 1)
>> +
>> +
>> +/* Used to specify CGroup hierarchy configuration options, allowing a
>> + * test to request a particular CGroup structure.
>> + */
>> +struct tst_cgroup_opts {
>> +	/* Only try to mount V1 CGroup controllers. This will not
>> +	 * prevent V2 from being used if it is already mounted, it
>> +	 * only indicates that we should mount V1 controllers if
>> +	 * nothing is present. By default we try to mount V2 first. */
>> +	int only_mount_v1:1;
>> +};
>
> Do we need to pass the flags in a structure?
>
> This is not an API that has to be future proof, when we find out we need
> more than a few bitflags we can always change it.

We may need to add xattr to this and there are many other
options. However most tests will just have NULL or 0, 0... So we could
move it into a vararg or have a tst_require2?

>
>> +struct tst_cgroup_tree;
>> +
>> +
>> +/* A Control Group in LTP's aggregated hierarchy */
>> +struct tst_cgroup {
>> +	const char *name;
>> +	/* Maps controller ID to the tree which contains it. The V2
>> +	 * tree is at zero even if it contains no controllers.
>> +	 */
>> +	struct tst_cgroup_tree *trees_by_css[TST_CGROUP_MAX_TREES];
>> +	/* NULL terminated list of trees */
>> +	struct tst_cgroup_tree *trees[TST_CGROUP_MAX_TREES + 1];
>> +};
>
> So this is an array of directories in different trees, can we please
> name the strucuture better. What about tst_cgroup_nodes or
> tst_cgroup_dirs?

I guess tst_cgroup_tree always represents a directory. So I would go
with tst_cgroup_dir.

>
>> +/* Search the system for mounted cgroups and available
>> + * controllers. Called automatically by tst_cgroup_require.
>> + */
>> +void tst_cgroup_scan(void);
>> +/* Print the config detected by tst_cgroup_scan */
>> +void tst_cgroup_print_config(void);
>> +
>> +/* Ensure the specified controller is available in the test's default
>> + * CGroup, mounting/enabling it if necessary */
>> +void tst_cgroup_require(enum tst_cgroup_css type,
>> +			const struct tst_cgroup_opts *options);
>
> This is the only function in the API that uses the enum cgroup_css. I
> think that it may as well be better if we just passed a string to this
> function as well and made the enum strictly internal.
>
> I guess that we do not want to split the knob names so that we would
> have to pass the controller as enum and knob name to make the API
> consistently use the enum so this would be a better option.

+1

>
>> +/* Tear down any CGroups created by calls to tst_cgroup_require */
>> +void tst_cgroup_cleanup(void);
>> +
>> +/* Get the default CGroup for the test. It allocates memory (in a
>> + * guarded buffer) so should always be called from setup
>> + */
>> +const struct tst_cgroup *tst_cgroup_get_test(void);
>> +/* Get the shared drain group. Also should be called from setup */
>> +const struct tst_cgroup *tst_cgroup_get_drain(void);
>
> I think that these functions could be named better, it's mildly
> confusing, "get test what?".
>
> Maybe tst_cgroup_get_test_group() ?

+1

>
> Or tst_cgroup_get(enum tst_cgroup_group group) ?
>
>> +/* Create a descendant CGroup */
>> +struct tst_cgroup *tst_cgroup_mk(const struct tst_cgroup *parent,
>> +				 const char *name);
>
> The name parameter should probably be called node_name or dir_name, it
> could be easily confused with the name parameter rest of the function
> use to describe a "controller.knob" pair.

+1

>
>> +/* Remove a descendant CGroup */
>> +struct tst_cgroup *tst_cgroup_rm(struct tst_cgroup *cg);
>> +
>> +#define SAFE_CGROUP_VER(cg, name) \
>> +	safe_cgroup_ver(__FILE__, __LINE__, (cg), (name))
>
>> +enum tst_cgroup_ver safe_cgroup_ver(const char *file, const int lineno,
>> +				    const struct tst_cgroup *cg,
>> +				    const char *name);
>
> The 'name' parameter should really be controller_name or just
> controller.
>
> Also I was a bit confused why version request may even fail.
>
> I guess that it's fine if we tst_brk() on invalid controller name even
> without calling this function SAFE_*.
>
> We haven't defined that anywhere but SAFE_FUNCTIONS() are things that
> may fail at runtime because of failures propagated from the operating
> system itself.

+1
>
>> +#define SAFE_CGROUP_HAS(cg, name) \
>> +	safe_cgroup_has(__FILE__, __LINE__, (cg), (name))
>> +
>> +int safe_cgroup_has(const char *file, const int lineno,
>> +		    const struct tst_cgroup *cg, const char *name);
>> +
>> +#define SAFE_CGROUP_READ(cg, name, out, len)				\
>> +	safe_cgroup_read(__FILE__, __LINE__, (cg), (name), (out), (len))
>> +
>> +ssize_t safe_cgroup_read(const char *file, const int lineno,
>> +			 const struct tst_cgroup *cg, const char *name,
>> +			 char *out, size_t len);
>>  
>> -enum tst_cgroup_ver tst_cgroup_version(void);
>> +#define SAFE_CGROUP_PRINTF(cg, name, fmt, ...)				\
>> +	safe_cgroup_printf(__FILE__, __LINE__, (cg), (name), (fmt), __VA_ARGS__)
>>  
>> -/* To mount/umount specified cgroup controller on 'cgroup_dir' path */
>> -void tst_cgroup_mount(enum tst_cgroup_ctrl ctrl, const char *cgroup_dir);
>> -void tst_cgroup_umount(const char *cgroup_dir);
>> +#define SAFE_CGROUP_PRINT(cg, name, str)				\
>> +	safe_cgroup_printf(__FILE__, __LINE__, (cg), (name), "%s", (str))
>>  
>> -/* To move current process PID to the mounted cgroup tasks */
>> -void tst_cgroup_move_current(const char *cgroup_dir);
>> +void safe_cgroup_printf(const char *file, const int lineno,
>> +			const struct tst_cgroup *cg, const char *name,
>> +			const char *fmt, ...)
>> +			__attribute__ ((format (printf, 5, 6)));
>>  
>> -/* To set cgroup controller knob with new value */
>> -void tst_cgroup_set_knob(const char *cgroup_dir, const char *knob, long value);
>> +#define SAFE_CGROUP_SCANF(cg, name, fmt, ...)				\
>> +	safe_cgroup_scanf(__FILE__, __LINE__, (cg), (name), (fmt), __VA_ARGS__)
>>  
>> -/* Set of functions to set knobs under the memory controller */
>> -void tst_cgroup_mem_set_maxbytes(const char *cgroup_dir, long memsz);
>> -int  tst_cgroup_mem_swapacct_enabled(const char *cgroup_dir);
>> -void tst_cgroup_mem_set_maxswap(const char *cgroup_dir, long memsz);
>> +void safe_cgroup_scanf(const char *file, const int lineno,
>> +		       const struct tst_cgroup *cg, const char *name,
>> +		       const char *fmt, ...)
>> +		       __attribute__ ((format (scanf, 5, 6)));
>
> And for all the read/printf/scanf functions the name parameter should be
> named knob or something that describes that we are operating on
> individual files in a directory somewhere in the tree.

+1

>
>> -/* Set of functions to read/write cpuset controller files content */
>> -void tst_cgroup_cpuset_read_files(const char *cgroup_dir, const char *filename,
>> -	char *retbuf, size_t retbuf_sz);
>> -void tst_cgroup_cpuset_write_files(const char *cgroup_dir, const char *filename,
>> -	const char *buf);
>>  
>>  #endif /* TST_CGROUP_H */
>> diff --git a/include/tst_test.h b/include/tst_test.h
>> index 1fbebe752..62ab2981f 100644
>> --- a/include/tst_test.h
>> +++ b/include/tst_test.h
>> @@ -39,7 +39,6 @@
>>  #include "tst_capability.h"
>>  #include "tst_hugepage.h"
>>  #include "tst_assert.h"
>> -#include "tst_cgroup.h"
>>  #include "tst_lockdown.h"
>>  #include "tst_fips.h"
>>  #include "tst_taint.h"
>> diff --git a/lib/Makefile b/lib/Makefile
>> index f019432e8..6f641ee9a 100644
>> --- a/lib/Makefile
>> +++ b/lib/Makefile
>> @@ -38,6 +38,8 @@ pc_file			:= $(DESTDIR)/$(datarootdir)/pkgconfig/ltp.pc
>>  
>>  INSTALL_TARGETS		:= $(pc_file)
>>  
>> +tst_cgroup.o: CFLAGS += -Wno-missing-field-initializers
>
> I'm not sure we want to work around warnings like this.

I suppose this is a result of cgroup_item being overly generic.

>
>>  $(pc_file):
>>  	test -d "$(@D)" || mkdir -p "$(@D)"
>>  	install -m $(INSTALL_MODE) "$(builddir)/$(@F)" "$@"
>> diff --git a/lib/tst_cgroup.c b/lib/tst_cgroup.c
>> index 96c9524d2..40c9a9bec 100644
>> --- a/lib/tst_cgroup.c
>> +++ b/lib/tst_cgroup.c
>> @@ -2,453 +2,1014 @@
>>  /*
>>   * Copyright (c) 2020 Red Hat, Inc.
>>   * Copyright (c) 2020 Li Wang <liwang@redhat.com>
>> + * Copyright (c) 2020-2021 SUSE LLC <rpalethorpe@suse.com>
>>   */
>>  
>>  #define TST_NO_DEFAULT_MAIN
>>  
>>  #include <stdio.h>
>> +#include <stddef.h>
>>  #include <stdlib.h>
>> +#include <mntent.h>
>>  #include <sys/mount.h>
>> -#include <fcntl.h>
>> -#include <unistd.h>
>>  
>>  #include "tst_test.h"
>> -#include "tst_safe_macros.h"
>> -#include "tst_safe_stdio.h"
>> +#include "lapi/fcntl.h"
>> +#include "lapi/mount.h"
>> +#include "lapi/mkdirat.h"
>>  #include "tst_cgroup.h"
>> -#include "tst_device.h"
>>  
>> -static enum tst_cgroup_ver tst_cg_ver;
>> -static int clone_children;
>> +/* CGroup Core Implementation
>> + *
>> + * CGroup Item Implementation is towards the bottom.
>> + */
>>  
>> -static int tst_cgroup_check(const char *cgroup)
>> -{
>> -	char line[PATH_MAX];
>> -	FILE *file;
>> -	int cg_check = 0;
>> +struct cgroup_root;
>>  
>> -	file = SAFE_FOPEN("/proc/filesystems", "r");
>> -	while (fgets(line, sizeof(line), file)) {
>> -		if (strstr(line, cgroup) != NULL) {
>> -			cg_check = 1;
>> -			break;
>> +/* A node in a single CGroup hierarchy. It exists mainly for
>> + * convenience so that we do not have to traverse the CGroup structure
>> + * for frequent operations.
>> + *
>> + * This is actually a single-linked list not a tree. We only need to
>> + * traverse from leaf towards root.
>> + */
>> +struct tst_cgroup_tree {
>
> Why isn't this called node or dir? Either of these would be more
> fitting.
>
> Also the tst_cgroup structure could use a better name, the tst_cgroup is
> actually an array of pointers to directories.

As far as the test author is concerned it represents an actual "Control
Group". We keep arrays of directories to join together multiple V1
controllers on different hierarchies, but that is an implementation
detail.

If anything it should be called tst_cgroup_group, but I don't like the
repetition. :-p

>
>> +	const char *name;
>
> This should be dir_name, can we please avoid having all string pointers
> called just 'name' I find it horribly confusing...
>
>> +	const struct tst_cgroup_tree *parent;
>> +
>> +	/* Shortcut to root */
>> +	const struct cgroup_root *root;
>> +
>> +	/* Subsystems (controllers) bit field. Only controllers which
>> +	 * were required and configured for this node are added to
>> +	 * this field. So it may be different from root->css_field.
>> +	 */
>> +	uint32_t css_field;
>> +
>> +	/* In general we avoid having sprintfs everywhere and use
>> +	 * openat, linkat, etc.
>> +	 */
>> +	int dir;
>
> Can we name this dir_fd so it's obvious what it is?
>
>> +	int we_created_it:1;
>> +};
>> +
>> +/* The root of a CGroup hierarchy/tree */
>> +struct cgroup_root {
>> +	enum tst_cgroup_ver ver;
>> +	/* A mount path */
>> +	char path[PATH_MAX/2];
>
> Why PATH_MAX/2 ?

This was to prevent static anlyses warnings with sprintf IIRC. It can
not be PATH_MAX unless the mount point contents are only expected to be
accessed with openat. I suppose that since I removed the sprintfs we can
use the full PATH_MAX to be consistent.

>> +/* Lookup tree for item names. */
>> +typedef struct cgroup_item items_t[];
>> +static items_t items = {
>> +	[0] = { "cgroup", .inner = (items_t){
>> +			{ "cgroup.procs", "tasks" },
>> +			{ "cgroup.subtree_control" },
>> +			{ "cgroup.clone_children", "clone_children" },
>> +			{ NULL }
>>  		}
>> -	}
>> -	SAFE_FCLOSE(file);
>> +	},
>> +	[TST_CGROUP_MEMORY] = { "memory", .inner = (items_t){
>> +			{ "memory.current", "memory.usage_in_bytes" },
>> +			{ "memory.max", "memory.limit_in_bytes" },
>> +			{ "memory.swappiness", "memory.swappiness" },
>> +			{ "memory.swap.current", "memory.memsw.usage_in_bytes" },
>> +			{ "memory.swap.max", "memory.memsw.limit_in_bytes" },
>> +			{ "memory.kmem.usage_in_bytes", "memory.kmem.usage_in_bytes" },
>> +			{ "memory.kmem.limit_in_bytes", "memory.kmem.usage_in_bytes" },
>> +			{ NULL }
>> +		},
>> +	  .css_indx = TST_CGROUP_MEMORY
>> +	},
>> +	[TST_CGROUP_CPUSET] = { "cpuset", .inner = (items_t){
>> +			{ "cpuset.cpus", "cpuset.cpus" },
>> +			{ "cpuset.mems", "cpuset.mems" },
>> +			{ "cpuset.memory_migrate", "cpuset.memory_migrate" },
>> +			{ NULL }
>> +		},
>> +	  .css_indx = TST_CGROUP_CPUSET
>> +	},
>> +	{ NULL }
>> +};
>
> Item is a very generic word, this is a list of known knobs and map
> between v1 and v2. So maybe cgroup_knob_map or just cgroup_knobs ?

I suppose that cgroup_item can be seperated into just two structs now
that I removed controller sub items like "memory.swap". This might help
solve the missing initializer warnings as well.

I would not describe "memory.current" as a knob. It is a read only file
and I think of knobs as something you can write to. So I would prefer
cgroup_file and cgroup_subsys.

-- 
Thank you,
Richard.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [LTP] [PATCH v3 3/7] Add new CGroups APIs
  2021-04-15 13:10     ` Richard Palethorpe
@ 2021-04-16  5:00       ` Li Wang
  2021-04-26 16:39         ` Richard Palethorpe
  0 siblings, 1 reply; 22+ messages in thread
From: Li Wang @ 2021-04-16  5:00 UTC (permalink / raw)
  To: ltp

Hi,

>> -enum tst_cgroup_ctrl {
> >> -    TST_CGROUP_MEMCG = 1,
> >> +/* Controller sub-systems */
> >> +enum tst_cgroup_css {
> >> +    TST_CGROUP_MEMORY = 1,
> >>      TST_CGROUP_CPUSET = 2,
> >> -    /* add cgroup controller */
> >>  };
> >
> > I spend a bit of time figuring out what css means, can't we just use
> > controler in the code instead? It's a bit longer but more obvious.
> >
> > Or is this consitent with kernel naming?
>
> The kernel seems to refer to controllers as subsystems and css appears
> to mean "CGroup subsystem". I'm not sure if subsystems are necessarily
> controllers, but it seems that way. Also "controller" has too many
> letters in it. Indeed it makes some function names very long and I found
> that harder to read.


I also feel css looks oddly, if "controller" is too long we can go back
with "ctrl",
then we might need a one-line comment for the name abbreviation.

e.g.  previous "/* add cgroup controller */" is make no sense but a hint
for people
to know this is cgroup controller and could be extended in the future.


> I suppose to be totally consistent with the kernel we should randomly
> name some things css and others subsys. :-p
>

Not very necessary, this test API is facing for userland,
understanding easily
for users is more important I think.


>> +struct tst_cgroup_opts {
> >> +    /* Only try to mount V1 CGroup controllers. This will not
> >> +     * prevent V2 from being used if it is already mounted, it
> >> +     * only indicates that we should mount V1 controllers if
> >> +     * nothing is present. By default we try to mount V2 first. */
> >> +    int only_mount_v1:1;
> >> +};
> >
> > Do we need to pass the flags in a structure?
> >
> > This is not an API that has to be future proof, when we find out we need
> > more than a few bitflags we can always change it.
>
> We may need to add xattr to this and there are many other
> options. However most tests will just have NULL or 0, 0... So we could
> move it into a vararg or have a tst_require2?
>

My 2cent:

I slightly think tst_cgroup_opts is good stuff, it gives a possibility
to let users hook the cgoup mount way or, more options they
needed (maybe for customized users). And, in most situations
that test just has NULL, that's fine.

@Richard,
do you want to reserve the ?no_cleanup? in this structure as well?
I noticed you leave it in tst_cgroup02 test.



> >> +    struct tst_cgroup_tree *trees[TST_CGROUP_MAX_TREES + 1];
> >> +};
> >
> > So this is an array of directories in different trees, can we please
> > name the strucuture better. What about tst_cgroup_nodes or
> > tst_cgroup_dirs?
>
> I guess tst_cgroup_tree always represents a directory. So I would go
> with tst_cgroup_dir.
>

+1

>> +struct tst_cgroup_tree {
> >
> > Why isn't this called node or dir? Either of these would be more
> > fitting.
> >
> > Also the tst_cgroup structure could use a better name, the tst_cgroup is
> > actually an array of pointers to directories.
>
> As far as the test author is concerned it represents an actual "Control
> Group". We keep arrays of directories to join together multiple V1
> controllers on different hierarchies, but that is an implementation
> detail.
>
> If anything it should be called tst_cgroup_group, but I don't like the
> repetition. :-p
>

Personally, I think tst_cgroup_node may be better since we have described
a tree for cgroup.  root, node, and leaves(files) should be more vivid.


>> +
> >> +    /* In general we avoid having sprintfs everywhere and use
> >> +     * openat, linkat, etc.
> >> +     */
> >> +    int dir;
> >
> > Can we name this dir_fd so it's obvious what it is?
>

+1
I stand by Cyril at this point.


>> +/* Lookup tree for item names. */
> >> +typedef struct cgroup_item items_t[];
> >> +static items_t items = {
> >> +    [0] = { "cgroup", .inner = (items_t){
> >> +                    { "cgroup.procs", "tasks" },
> >> +                    { "cgroup.subtree_control" },
> >> +                    { "cgroup.clone_children", "clone_children" },
> >> +                    { NULL }
> >>              }
> >> -    }
> >> -    SAFE_FCLOSE(file);
> >> +    },
> >> +    [TST_CGROUP_MEMORY] = { "memory", .inner = (items_t){
> >> +                    { "memory.current", "memory.usage_in_bytes" },
> >> +                    { "memory.max", "memory.limit_in_bytes" },
> >> +                    { "memory.swappiness", "memory.swappiness" },
> >> +                    { "memory.swap.current",
> "memory.memsw.usage_in_bytes" },
> >> +                    { "memory.swap.max", "memory.memsw.limit_in_bytes"
> },
> >> +                    { "memory.kmem.usage_in_bytes",
> "memory.kmem.usage_in_bytes" },
> >> +                    { "memory.kmem.limit_in_bytes",
> "memory.kmem.usage_in_bytes" },
> >> +                    { NULL }
> >> +            },
> >> +      .css_indx = TST_CGROUP_MEMORY
> >> +    },
> >> +    [TST_CGROUP_CPUSET] = { "cpuset", .inner = (items_t){
> >> +                    { "cpuset.cpus", "cpuset.cpus" },
> >> +                    { "cpuset.mems", "cpuset.mems" },
> >> +                    { "cpuset.memory_migrate", "cpuset.memory_migrate"
> },
> >> +                    { NULL }
> >> +            },
> >> +      .css_indx = TST_CGROUP_CPUSET
> >> +    },
> >> +    { NULL }
> >> +};
> >
> > Item is a very generic word, this is a list of known knobs and map
> > between v1 and v2. So maybe cgroup_knob_map or just cgroup_knobs ?
>
> I suppose that cgroup_item can be seperated into just two structs now
> that I removed controller sub items like "memory.swap". This might help
> solve the missing initializer warnings as well.


To separate into two structs sounds good, the 'cgroup_item' contains both
trunk node and controller files make things mess up.


> I would not describe "memory.current" as a knob. It is a read only file
> and I think of knobs as something you can write to. So I would prefer
> cgroup_file and cgroup_subsys.


+1
cgroup_file and cgroup_subsys(or cgroup_controller) all very nice.

-- 
Regards,
Li Wang
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.linux.it/pipermail/ltp/attachments/20210416/de8068fe/attachment-0001.htm>

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [LTP] [PATCH v3 3/7] Add new CGroups APIs
  2021-04-12 14:55 ` [LTP] [PATCH v3 3/7] Add new CGroups APIs Richard Palethorpe
  2021-04-14 15:39   ` Cyril Hrubis
@ 2021-04-16  6:57   ` Li Wang
  2021-04-26 16:01     ` Richard Palethorpe
  1 sibling, 1 reply; 22+ messages in thread
From: Li Wang @ 2021-04-16  6:57 UTC (permalink / raw)
  To: ltp

 Hi Richard,

Thanks for your work, the whole design looks good.

Minor suggestions as below:


> +static const char *ltp_cgroup_dir = "ltp";
> +static const char *ltp_cgroup_drain_dir = "drain";
> +static char test_cgroup_dir[PATH_MAX/4];
> +static const char *ltp_mount_prefix = "/tmp/cgroup_";
> +static const char *ltp_v2_mount = "unified";
> +
> +#define first_root                             \
> +       (roots[0].ver ? roots : roots + 1)
> +#define for_each_root(r)                       \
> +       for ((r) = first_root; (r)->ver; (r)++)
> +#define for_each_v1_root(r)                    \
> +       for ((r) = roots + 1; (r)->ver; (r)++)
>

If you go through the whole files you will find, there are some places use
'r' to represent the root but other uses 't', even in functions that
include ?t?
to represent a tree.

That probably a minor issue to make people get lost:).

So I'd suggest a unified abbreviation in all:

r  --> root
t  --> tree

if a function has root and tree, plz avoid using abbreviations, just use
itself.



> +#define for_each_css(css)                      \
> +       for ((css) = items + 1; (css)->name; (css)++)
> +
> +/* Controller items may only be in a single tree. So when (ss) > 0
> + * we only loop once.
> + */
> +#define for_each_tree(cg, css, t)                                      \
> +       for ((t) = (css) ? (cg)->trees_by_css + (css) : (cg)->trees;    \
> +            *(t);                                                      \
> +            (t) = (css) ? (cg)->trees + TST_CGROUP_MAX_TREES : (t) + 1)
> +
> +static int has_css(uint32_t css_field, enum tst_cgroup_css type)
> +{
> +       return !!(css_field & (1 << type));
>  }
>
>


> +struct tst_cgroup *tst_cgroup_mk(const struct tst_cgroup *parent,
> +                                const char *name)
>  {
>

Since we have already got the parent tst_cgroup, it means we know which
controllers/type has been mounted.

Can we add the required controllers (e.g. "+memory")  to subtree_control
automatically at here, rather than doing it manually before creating
children?
(then we can remove the line#27~30 in tst_cgroup02.c)


-- 
Regards,
Li Wang
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.linux.it/pipermail/ltp/attachments/20210416/bbe7cbfd/attachment.htm>

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [LTP] [PATCH v3 1/7] API: Add safe openat, printfat, readat and unlinkat
  2021-04-12 14:55 ` [LTP] [PATCH v3 1/7] API: Add safe openat, printfat, readat and unlinkat Richard Palethorpe
@ 2021-04-16  6:59   ` Li Wang
  2021-04-26 15:07     ` Richard Palethorpe
  0 siblings, 1 reply; 22+ messages in thread
From: Li Wang @ 2021-04-16  6:59 UTC (permalink / raw)
  To: ltp

Hi Richard,

On Mon, Apr 12, 2021 at 10:55 PM Richard Palethorpe <rpalethorpe@suse.com>
wrote:

> Add 'at' variants for a number of system calls and LTP SAFE API
> functions. This avoids using sprintf everywhere to build paths.
>
> Also adds tst_decode_fd which allows us to retrieve the path for an FD
> for debugging purposes without having to store it ourselves. However
> the proc symlink may not be available on some systems.
>
> Signed-off-by: Richard Palethorpe <rpalethorpe@suse.com>
> ---
>  include/tst_safe_file_ops.h |  39 ++++++++
>  lib/tst_safe_file_ops.c     | 171 ++++++++++++++++++++++++++++++++++++
>  2 files changed, 210 insertions(+)
>  create mode 100644 lib/tst_safe_file_ops.c
>
> diff --git a/include/tst_safe_file_ops.h b/include/tst_safe_file_ops.h
> index 223eddd1f..dff6a793c 100644
> --- a/include/tst_safe_file_ops.h
> +++ b/include/tst_safe_file_ops.h
> @@ -57,4 +57,43 @@
>  #define TST_MOUNT_OVERLAY() \
>         (mount_overlay(__FILE__, __LINE__, 0) == 0)
>
> +#define SAFE_OPENAT(dirfd, path, oflags, ...)                  \
> +       safe_openat(__FILE__, __LINE__,                         \
> +                   (dirfd), (path), (oflags), ## __VA_ARGS__)
> +
> +#define SAFE_FILE_READAT(dirfd, path, buf, nbyte)                      \
> +       safe_file_readat(__FILE__, __LINE__,                            \
> +                        (dirfd), (path), (buf), (nbyte))
> +
> +
> +#define SAFE_FILE_PRINTFAT(dirfd, path, fmt, ...)                      \
> +       safe_file_printfat(__FILE__, __LINE__,                          \
> +                          (dirfd), (path), (fmt), __VA_ARGS__)
> +
> +#define SAFE_UNLINKAT(dirfd, path, flags)                              \
> +       safe_unlinkat(__FILE__, __LINE__, (dirfd), (path), (flags))
> +
>

The above macros are suggested to leave in this "tst_safe_file_ops.h" file.

But, the function prototypes below should be moved to "safe_file_ops_fn.h",
because that purposely to separate macros and function in different places.
(I remember I had commented this in V2, probably you were missing it:)



> +char *tst_decode_fd(int fd);
> +
> +int safe_openat(const char *file, const int lineno,
> +               int dirfd, const char *path, int oflags, ...);
> +
> +ssize_t safe_file_readat(const char *file, const int lineno,
> +                        int dirfd, const char *path, char *buf, size_t
> nbyte);
> +
> +int tst_file_vprintfat(int dirfd, const char *path, const char *fmt,
> va_list va);
> +int tst_file_printfat(int dirfd, const char *path, const char *fmt, ...)
> +                       __attribute__ ((format (printf, 3, 4)));
> +
> +int safe_file_vprintfat(const char *file, const int lineno,
> +                       int dirfd, const char *path,
> +                       const char *fmt, va_list va);
> +
> +int safe_file_printfat(const char *file, const int lineno,
> +                      int dirfd, const char *path, const char *fmt, ...)
> +                       __attribute__ ((format (printf, 5, 6)));
> +
> +int safe_unlinkat(const char *file, const int lineno,
> +                 int dirfd, const char *path, int flags);

+
>  #endif /* TST_SAFE_FILE_OPS */
>



> diff --git a/lib/tst_safe_file_ops.c b/lib/tst_safe_file_ops.c
> new file mode 100644
> index 000000000..af4157476
> --- /dev/null
> +++ b/lib/tst_safe_file_ops.c
>

And, we'd better achieve all the functions in "lib/safe_file_ops.c"
but not create a separate new C file.

-- 
Regards,
Li Wang
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.linux.it/pipermail/ltp/attachments/20210416/8b6ad04b/attachment.htm>

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [LTP] [PATCH v3 2/7] API: Add macro for the container_of trick
  2021-04-12 14:55 ` [LTP] [PATCH v3 2/7] API: Add macro for the container_of trick Richard Palethorpe
@ 2021-04-16  7:01   ` Li Wang
  2021-04-26 15:15     ` Richard Palethorpe
  0 siblings, 1 reply; 22+ messages in thread
From: Li Wang @ 2021-04-16  7:01 UTC (permalink / raw)
  To: ltp

Richard Palethorpe <rpalethorpe@suse.com> wrote:

Signed-off-by: Richard Palethorpe <rpalethorpe@suse.com>
>
Reviewed-by: Li Wang <liwang@redhat.com>


> ---
>  include/tst_common.h | 5 +++++
>  1 file changed, 5 insertions(+)
>
> diff --git a/include/tst_common.h b/include/tst_common.h
> index fd7a900d4..317925d1d 100644
> --- a/include/tst_common.h
> +++ b/include/tst_common.h
> @@ -83,4 +83,9 @@
>  #define TST_RES_SUPPORTS_TCONF_TFAIL_TINFO_TPASS_TWARN(condition) \
>         TST_BUILD_BUG_ON(condition)
>
> +#define tst_container_of(ptr, type, member) ({              \
> +       const typeof( ((type *)0)->member ) *__mptr = (ptr); \
> +       (type *)( (char *)__mptr - offsetof(type,member) );  \
> +})
>

I'd suggest defining it as uppercase 'TST_CONTAINER_OF(...)' to respect
other macro's naming policy in tst_common.h.

-- 
Regards,
Li Wang
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.linux.it/pipermail/ltp/attachments/20210416/d63e627f/attachment-0001.htm>

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [LTP] [PATCH v3 4/7] Add new CGroups API library tests
  2021-04-12 14:55 ` [LTP] [PATCH v3 4/7] Add new CGroups API library tests Richard Palethorpe
@ 2021-04-16  7:22   ` Li Wang
  0 siblings, 0 replies; 22+ messages in thread
From: Li Wang @ 2021-04-16  7:22 UTC (permalink / raw)
  To: ltp

>  lib/newlib_tests/.gitignore     |  2 +
>  lib/newlib_tests/test21.c       | 45 ++++++++---------
>  lib/newlib_tests/tst_cgroup01.c | 51 +++++++++++++++++++
>  lib/newlib_tests/tst_cgroup02.c | 90 +++++++++++++++++++++++++++++++++
>

Since test21 is covered by tst_cgroup02, I'd suggest deleting test21.c
directly.

-- 
Regards,
Li Wang
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.linux.it/pipermail/ltp/attachments/20210416/63ff23f7/attachment.htm>

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [LTP] [PATCH v3 5/7] docs: Update CGroups API
  2021-04-12 14:55 ` [LTP] [PATCH v3 5/7] docs: Update CGroups API Richard Palethorpe
@ 2021-04-16  8:11   ` Li Wang
  2021-04-26 16:44     ` Richard Palethorpe
  0 siblings, 1 reply; 22+ messages in thread
From: Li Wang @ 2021-04-16  8:11 UTC (permalink / raw)
  To: ltp

> +static void run(void)
> +{
> +       char buf[BUFSIZ];
> +       size_t mem = 0;
> +
> +       cg_child = tst_cgroup_mk(cg, "child");
> +       SAFE_CGROUP_PRINTF(cg_child, "cgroup.procs", "%d", getpid());
> +
> +       if (SAFE_CGROUP_VER(cg, "memory") != TST_CGROUP_V1)
> +               SAFE_CGROUP_PRINT(cg, "cgroup.subtree_control", "+memory");
> +       if (SAFE_CGROUP_VER(cg, "cpuset") != TST_CGROUP_V1)
> +               SAFE_CGROUP_PRINT(cg, "cgroup.subtree_control", "+cpuset");
>

Kind reminder:

If you decide to add controllers automatically (as I suggested in patch3/7)
in
tst_cgroup_mk(), then these lines should be removed.


> +Another example of an edge case is the following.
> +
> +[source,c]
>
> +-------------------------------------------------------------------------------
> +       if (tst_cgroup_ver(cg, "memory") == TST_CGROUP_V1)
> +               SAFE_CGROUP_PRINTF(cg, "memory.swap.max", "%lu", ~0UL);
> +       else
> +               SAFE_CGROUP_PRINT(cg, "memory.swap.max", "max");
>

typo PRINT --> PRINTF ^.

Btw, these documented works are quite awesome!

-- 
Regards,
Li Wang
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.linux.it/pipermail/ltp/attachments/20210416/6b4c0b42/attachment.htm>

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [LTP] [PATCH v3 1/7] API: Add safe openat, printfat, readat and unlinkat
  2021-04-16  6:59   ` Li Wang
@ 2021-04-26 15:07     ` Richard Palethorpe
  0 siblings, 0 replies; 22+ messages in thread
From: Richard Palethorpe @ 2021-04-26 15:07 UTC (permalink / raw)
  To: ltp

Hello Li,

Li Wang <liwang@redhat.com> writes:

> Hi Richard,
>
> On Mon, Apr 12, 2021 at 10:55 PM Richard Palethorpe <rpalethorpe@suse.com>
> wrote:
>
>> Add 'at' variants for a number of system calls and LTP SAFE API
>> functions. This avoids using sprintf everywhere to build paths.
>>
>> Also adds tst_decode_fd which allows us to retrieve the path for an FD
>> for debugging purposes without having to store it ourselves. However
>> the proc symlink may not be available on some systems.
>>
>> Signed-off-by: Richard Palethorpe <rpalethorpe@suse.com>
>> ---
>>  include/tst_safe_file_ops.h |  39 ++++++++
>>  lib/tst_safe_file_ops.c     | 171 ++++++++++++++++++++++++++++++++++++
>>  2 files changed, 210 insertions(+)
>>  create mode 100644 lib/tst_safe_file_ops.c
>>
>> diff --git a/include/tst_safe_file_ops.h b/include/tst_safe_file_ops.h
>> index 223eddd1f..dff6a793c 100644
>> --- a/include/tst_safe_file_ops.h
>> +++ b/include/tst_safe_file_ops.h
>> @@ -57,4 +57,43 @@
>>  #define TST_MOUNT_OVERLAY() \
>>         (mount_overlay(__FILE__, __LINE__, 0) == 0)
>>
>> +#define SAFE_OPENAT(dirfd, path, oflags, ...)                  \
>> +       safe_openat(__FILE__, __LINE__,                         \
>> +                   (dirfd), (path), (oflags), ## __VA_ARGS__)
>> +
>> +#define SAFE_FILE_READAT(dirfd, path, buf, nbyte)                      \
>> +       safe_file_readat(__FILE__, __LINE__,                            \
>> +                        (dirfd), (path), (buf), (nbyte))
>> +
>> +
>> +#define SAFE_FILE_PRINTFAT(dirfd, path, fmt, ...)                      \
>> +       safe_file_printfat(__FILE__, __LINE__,                          \
>> +                          (dirfd), (path), (fmt), __VA_ARGS__)
>> +
>> +#define SAFE_UNLINKAT(dirfd, path, flags)                              \
>> +       safe_unlinkat(__FILE__, __LINE__, (dirfd), (path), (flags))
>> +
>>
>
> The above macros are suggested to leave in this "tst_safe_file_ops.h"
> file.

I think this is just for legacy reasons. To separate new and old API
function definitions. These *at functions will never be in the old API
though.

>
> But, the function prototypes below should be moved to "safe_file_ops_fn.h",
> because that purposely to separate macros and function in different places.
> (I remember I had commented this in V2, probably you were missing it:)
>

Probably the best thing to do is create tst_safe_file_at.{c,h} and move
these functions there. This would be more consistent with the new API.

I think I was trying to think of a better solution, but then forgot
about it.

>
>
>> +char *tst_decode_fd(int fd);
>> +
>> +int safe_openat(const char *file, const int lineno,
>> +               int dirfd, const char *path, int oflags, ...);
>> +
>> +ssize_t safe_file_readat(const char *file, const int lineno,
>> +                        int dirfd, const char *path, char *buf, size_t
>> nbyte);
>> +
>> +int tst_file_vprintfat(int dirfd, const char *path, const char *fmt,
>> va_list va);
>> +int tst_file_printfat(int dirfd, const char *path, const char *fmt, ...)
>> +                       __attribute__ ((format (printf, 3, 4)));
>> +
>> +int safe_file_vprintfat(const char *file, const int lineno,
>> +                       int dirfd, const char *path,
>> +                       const char *fmt, va_list va);
>> +
>> +int safe_file_printfat(const char *file, const int lineno,
>> +                      int dirfd, const char *path, const char *fmt, ...)
>> +                       __attribute__ ((format (printf, 5, 6)));
>> +
>> +int safe_unlinkat(const char *file, const int lineno,
>> +                 int dirfd, const char *path, int flags);
>
> +
>>  #endif /* TST_SAFE_FILE_OPS */
>>
>
>
>
>> diff --git a/lib/tst_safe_file_ops.c b/lib/tst_safe_file_ops.c
>> new file mode 100644
>> index 000000000..af4157476
>> --- /dev/null
>> +++ b/lib/tst_safe_file_ops.c
>>
>
> And, we'd better achieve all the functions in "lib/safe_file_ops.c"
> but not create a separate new C file.

This would mean they all have a useless argument for the old API.

-- 
Thank you,
Richard.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [LTP] [PATCH v3 2/7] API: Add macro for the container_of trick
  2021-04-16  7:01   ` Li Wang
@ 2021-04-26 15:15     ` Richard Palethorpe
  2021-04-27 11:03       ` Cyril Hrubis
  0 siblings, 1 reply; 22+ messages in thread
From: Richard Palethorpe @ 2021-04-26 15:15 UTC (permalink / raw)
  To: ltp

Hi,

Li Wang <liwang@redhat.com> writes:

> Richard Palethorpe <rpalethorpe@suse.com> wrote:
>
> Signed-off-by: Richard Palethorpe <rpalethorpe@suse.com>
>>
> Reviewed-by: Li Wang <liwang@redhat.com>
>
>
>> ---
>>  include/tst_common.h | 5 +++++
>>  1 file changed, 5 insertions(+)
>>
>> diff --git a/include/tst_common.h b/include/tst_common.h
>> index fd7a900d4..317925d1d 100644
>> --- a/include/tst_common.h
>> +++ b/include/tst_common.h
>> @@ -83,4 +83,9 @@
>>  #define TST_RES_SUPPORTS_TCONF_TFAIL_TINFO_TPASS_TWARN(condition) \
>>         TST_BUILD_BUG_ON(condition)
>>
>> +#define tst_container_of(ptr, type, member) ({              \
>> +       const typeof( ((type *)0)->member ) *__mptr = (ptr); \
>> +       (type *)( (char *)__mptr - offsetof(type,member) );  \
>> +})
>>
>
> I'd suggest defining it as uppercase 'TST_CONTAINER_OF(...)' to respect
> other macro's naming policy in tst_common.h.

I don't mind either way. I suspect it is lower case to match offsetof
and maybe it is expected to become a compiler intrinsic. Perhaps we
should remove the tst_. WDYT Cyril??

-- 
Thank you,
Richard.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [LTP] [PATCH v3 3/7] Add new CGroups APIs
  2021-04-16  6:57   ` Li Wang
@ 2021-04-26 16:01     ` Richard Palethorpe
  0 siblings, 0 replies; 22+ messages in thread
From: Richard Palethorpe @ 2021-04-26 16:01 UTC (permalink / raw)
  To: ltp

Hello,

Li Wang <liwang@redhat.com> writes:

>  Hi Richard,
>
> Thanks for your work, the whole design looks good.
>
> Minor suggestions as below:
>
>
>> +static const char *ltp_cgroup_dir = "ltp";
>> +static const char *ltp_cgroup_drain_dir = "drain";
>> +static char test_cgroup_dir[PATH_MAX/4];
>> +static const char *ltp_mount_prefix = "/tmp/cgroup_";
>> +static const char *ltp_v2_mount = "unified";
>> +
>> +#define first_root                             \
>> +       (roots[0].ver ? roots : roots + 1)
>> +#define for_each_root(r)                       \
>> +       for ((r) = first_root; (r)->ver; (r)++)
>> +#define for_each_v1_root(r)                    \
>> +       for ((r) = roots + 1; (r)->ver; (r)++)
>>
>
> If you go through the whole files you will find, there are some places use
> 'r' to represent the root but other uses 't', even in functions that
> include ?t?
> to represent a tree.
>
> That probably a minor issue to make people get lost:).
>
> So I'd suggest a unified abbreviation in all:
>
> r  --> root
> t  --> tree
>
> if a function has root and tree, plz avoid using abbreviations, just use
> itself.

Yes, I should do something about this naming. I think this is related to
other issues Cyril mentioned in the naming of tree items.

>
>
>
>> +#define for_each_css(css)                      \
>> +       for ((css) = items + 1; (css)->name; (css)++)
>> +
>> +/* Controller items may only be in a single tree. So when (ss) > 0
>> + * we only loop once.
>> + */
>> +#define for_each_tree(cg, css, t)                                      \
>> +       for ((t) = (css) ? (cg)->trees_by_css + (css) : (cg)->trees;    \
>> +            *(t);                                                      \
>> +            (t) = (css) ? (cg)->trees + TST_CGROUP_MAX_TREES : (t) + 1)
>> +
>> +static int has_css(uint32_t css_field, enum tst_cgroup_css type)
>> +{
>> +       return !!(css_field & (1 << type));
>>  }
>>
>>
>
>
>> +struct tst_cgroup *tst_cgroup_mk(const struct tst_cgroup *parent,
>> +                                const char *name)
>>  {
>>
>
> Since we have already got the parent tst_cgroup, it means we know which
> controllers/type has been mounted.
>
> Can we add the required controllers (e.g. "+memory")  to subtree_control
> automatically at here, rather than doing it manually before creating
> children?
> (then we can remove the line#27~30 in tst_cgroup02.c)

It is possible and more consistent with V1 behavior. I suppose users
could remove controllers again if they wanted to test some V2 specific
configuration.

OTOH we could leave this for a later patch when we actually have some
real tests containing child groups? I'm not fully sure what the results
of doing this are and we may just have to reverse it.

-- 
Thank you,
Richard.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [LTP] [PATCH v3 3/7] Add new CGroups APIs
  2021-04-16  5:00       ` Li Wang
@ 2021-04-26 16:39         ` Richard Palethorpe
  0 siblings, 0 replies; 22+ messages in thread
From: Richard Palethorpe @ 2021-04-26 16:39 UTC (permalink / raw)
  To: ltp


Li Wang <liwang@redhat.com> writes:

> Hi,
>
>>> -enum tst_cgroup_ctrl {
>> >> -    TST_CGROUP_MEMCG = 1,
>> >> +/* Controller sub-systems */
>> >> +enum tst_cgroup_css {
>> >> +    TST_CGROUP_MEMORY = 1,
>> >>      TST_CGROUP_CPUSET = 2,
>> >> -    /* add cgroup controller */
>> >>  };
>> >
>> > I spend a bit of time figuring out what css means, can't we just use
>> > controler in the code instead? It's a bit longer but more obvious.
>> >
>> > Or is this consitent with kernel naming?
>>
>> The kernel seems to refer to controllers as subsystems and css appears
>> to mean "CGroup subsystem". I'm not sure if subsystems are necessarily
>> controllers, but it seems that way. Also "controller" has too many
>> letters in it. Indeed it makes some function names very long and I found
>> that harder to read.
>
>
> I also feel css looks oddly, if "controller" is too long we can go back
> with "ctrl",
> then we might need a one-line comment for the name abbreviation.
>
> e.g.  previous "/* add cgroup controller */" is make no sense but a hint
> for people
> to know this is cgroup controller and could be extended in the future.

OK ctrl it is and I will create a note about internal kernel naming.

>
>
>> I suppose to be totally consistent with the kernel we should randomly
>> name some things css and others subsys. :-p
>>
>
> Not very necessary, this test API is facing for userland,
> understanding easily
> for users is more important I think.
>
>
>>> +struct tst_cgroup_opts {
>> >> +    /* Only try to mount V1 CGroup controllers. This will not
>> >> +     * prevent V2 from being used if it is already mounted, it
>> >> +     * only indicates that we should mount V1 controllers if
>> >> +     * nothing is present. By default we try to mount V2 first. */
>> >> +    int only_mount_v1:1;
>> >> +};
>> >
>> > Do we need to pass the flags in a structure?
>> >
>> > This is not an API that has to be future proof, when we find out we need
>> > more than a few bitflags we can always change it.
>>
>> We may need to add xattr to this and there are many other
>> options. However most tests will just have NULL or 0, 0... So we could
>> move it into a vararg or have a tst_require2?
>>
>
> My 2cent:
>
> I slightly think tst_cgroup_opts is good stuff, it gives a possibility
> to let users hook the cgoup mount way or, more options they
> needed (maybe for customized users). And, in most situations
> that test just has NULL, that's fine.

+1

>
> @Richard,
> do you want to reserve the ?no_cleanup? in this structure as well?
> I noticed you leave it in tst_cgroup02 test.

I think this is only necessary if we add a new flag to all tests in
tst_test to prevent CGroup cleanup. It should be easy to add this in a
later patch.

>
>
>
>> >> +    struct tst_cgroup_tree *trees[TST_CGROUP_MAX_TREES + 1];
>> >> +};
>> >
>> > So this is an array of directories in different trees, can we please
>> > name the strucuture better. What about tst_cgroup_nodes or
>> > tst_cgroup_dirs?
>>
>> I guess tst_cgroup_tree always represents a directory. So I would go
>> with tst_cgroup_dir.
>>
>
> +1
>
>>> +struct tst_cgroup_tree {
>> >
>> > Why isn't this called node or dir? Either of these would be more
>> > fitting.
>> >
>> > Also the tst_cgroup structure could use a better name, the tst_cgroup is
>> > actually an array of pointers to directories.
>>
>> As far as the test author is concerned it represents an actual "Control
>> Group". We keep arrays of directories to join together multiple V1
>> controllers on different hierarchies, but that is an implementation
>> detail.
>>
>> If anything it should be called tst_cgroup_group, but I don't like the
>> repetition. :-p
>>
>
> Personally, I think tst_cgroup_node may be better since we have described
> a tree for cgroup.  root, node, and leaves(files) should be more vivid.
>

OK

>
>
>>> +
>> >> +    /* In general we avoid having sprintfs everywhere and use
>> >> +     * openat, linkat, etc.
>> >> +     */
>> >> +    int dir;
>> >
>> > Can we name this dir_fd so it's obvious what it is?
>>
>
> +1
> I stand by Cyril at this point.
>
>
>>> +/* Lookup tree for item names. */
>> >> +typedef struct cgroup_item items_t[];
>> >> +static items_t items = {
>> >> +    [0] = { "cgroup", .inner = (items_t){
>> >> +                    { "cgroup.procs", "tasks" },
>> >> +                    { "cgroup.subtree_control" },
>> >> +                    { "cgroup.clone_children", "clone_children" },
>> >> +                    { NULL }
>> >>              }
>> >> -    }
>> >> -    SAFE_FCLOSE(file);
>> >> +    },
>> >> +    [TST_CGROUP_MEMORY] = { "memory", .inner = (items_t){
>> >> +                    { "memory.current", "memory.usage_in_bytes" },
>> >> +                    { "memory.max", "memory.limit_in_bytes" },
>> >> +                    { "memory.swappiness", "memory.swappiness" },
>> >> +                    { "memory.swap.current",
>> "memory.memsw.usage_in_bytes" },
>> >> +                    { "memory.swap.max", "memory.memsw.limit_in_bytes"
>> },
>> >> +                    { "memory.kmem.usage_in_bytes",
>> "memory.kmem.usage_in_bytes" },
>> >> +                    { "memory.kmem.limit_in_bytes",
>> "memory.kmem.usage_in_bytes" },
>> >> +                    { NULL }
>> >> +            },
>> >> +      .css_indx = TST_CGROUP_MEMORY
>> >> +    },
>> >> +    [TST_CGROUP_CPUSET] = { "cpuset", .inner = (items_t){
>> >> +                    { "cpuset.cpus", "cpuset.cpus" },
>> >> +                    { "cpuset.mems", "cpuset.mems" },
>> >> +                    { "cpuset.memory_migrate", "cpuset.memory_migrate"
>> },
>> >> +                    { NULL }
>> >> +            },
>> >> +      .css_indx = TST_CGROUP_CPUSET
>> >> +    },
>> >> +    { NULL }
>> >> +};
>> >
>> > Item is a very generic word, this is a list of known knobs and map
>> > between v1 and v2. So maybe cgroup_knob_map or just cgroup_knobs ?
>>
>> I suppose that cgroup_item can be seperated into just two structs now
>> that I removed controller sub items like "memory.swap". This might help
>> solve the missing initializer warnings as well.
>
>
> To separate into two structs sounds good, the 'cgroup_item' contains both
> trunk node and controller files make things mess up.
>
>
>> I would not describe "memory.current" as a knob. It is a read only file
>> and I think of knobs as something you can write to. So I would prefer
>> cgroup_file and cgroup_subsys.
>
>
> +1
> cgroup_file and cgroup_subsys(or cgroup_controller) all very nice.

I will try cgroup_file and cgroup_ctrl.

-- 
Thank you,
Richard.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [LTP] [PATCH v3 5/7] docs: Update CGroups API
  2021-04-16  8:11   ` Li Wang
@ 2021-04-26 16:44     ` Richard Palethorpe
  0 siblings, 0 replies; 22+ messages in thread
From: Richard Palethorpe @ 2021-04-26 16:44 UTC (permalink / raw)
  To: ltp

Hello,

Li Wang <liwang@redhat.com> writes:

>> +static void run(void)
>> +{
>> +       char buf[BUFSIZ];
>> +       size_t mem = 0;
>> +
>> +       cg_child = tst_cgroup_mk(cg, "child");
>> +       SAFE_CGROUP_PRINTF(cg_child, "cgroup.procs", "%d", getpid());
>> +
>> +       if (SAFE_CGROUP_VER(cg, "memory") != TST_CGROUP_V1)
>> +               SAFE_CGROUP_PRINT(cg, "cgroup.subtree_control", "+memory");
>> +       if (SAFE_CGROUP_VER(cg, "cpuset") != TST_CGROUP_V1)
>> +               SAFE_CGROUP_PRINT(cg, "cgroup.subtree_control", "+cpuset");
>>
>
> Kind reminder:
>
> If you decide to add controllers automatically (as I suggested in patch3/7)
> in
> tst_cgroup_mk(), then these lines should be removed.
>
>
>> +Another example of an edge case is the following.
>> +
>> +[source,c]
>>
>> +-------------------------------------------------------------------------------
>> +       if (tst_cgroup_ver(cg, "memory") == TST_CGROUP_V1)
>> +               SAFE_CGROUP_PRINTF(cg, "memory.swap.max", "%lu", ~0UL);
>> +       else
>> +               SAFE_CGROUP_PRINT(cg, "memory.swap.max", "max");
>>
>
> typo PRINT --> PRINTF ^.

This function actually exists :-p

>
> Btw, these documented works are quite awesome!

Thanks!

-- 
Thank you,
Richard.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [LTP] [PATCH v3 2/7] API: Add macro for the container_of trick
  2021-04-26 15:15     ` Richard Palethorpe
@ 2021-04-27 11:03       ` Cyril Hrubis
  0 siblings, 0 replies; 22+ messages in thread
From: Cyril Hrubis @ 2021-04-27 11:03 UTC (permalink / raw)
  To: ltp

Hi!
> >> ---
> >>  include/tst_common.h | 5 +++++
> >>  1 file changed, 5 insertions(+)
> >>
> >> diff --git a/include/tst_common.h b/include/tst_common.h
> >> index fd7a900d4..317925d1d 100644
> >> --- a/include/tst_common.h
> >> +++ b/include/tst_common.h
> >> @@ -83,4 +83,9 @@
> >>  #define TST_RES_SUPPORTS_TCONF_TFAIL_TINFO_TPASS_TWARN(condition) \
> >>         TST_BUILD_BUG_ON(condition)
> >>
> >> +#define tst_container_of(ptr, type, member) ({              \
> >> +       const typeof( ((type *)0)->member ) *__mptr = (ptr); \
> >> +       (type *)( (char *)__mptr - offsetof(type,member) );  \
> >> +})
> >>
> >
> > I'd suggest defining it as uppercase 'TST_CONTAINER_OF(...)' to respect
> > other macro's naming policy in tst_common.h.
> 
> I don't mind either way. I suspect it is lower case to match offsetof
> and maybe it is expected to become a compiler intrinsic. Perhaps we
> should remove the tst_. WDYT Cyril??

I think that it makes sense to use the same name as in the Linux kernel
since the macro is well known there. I would go just with container_of()
in this case.

-- 
Cyril Hrubis
chrubis@suse.cz

^ permalink raw reply	[flat|nested] 22+ messages in thread

end of thread, other threads:[~2021-04-27 11:03 UTC | newest]

Thread overview: 22+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-04-12 14:54 [LTP] [PATCH v3 0/7] CGroup API rewrite Richard Palethorpe
2021-04-12 14:55 ` [LTP] [PATCH v3 1/7] API: Add safe openat, printfat, readat and unlinkat Richard Palethorpe
2021-04-16  6:59   ` Li Wang
2021-04-26 15:07     ` Richard Palethorpe
2021-04-12 14:55 ` [LTP] [PATCH v3 2/7] API: Add macro for the container_of trick Richard Palethorpe
2021-04-16  7:01   ` Li Wang
2021-04-26 15:15     ` Richard Palethorpe
2021-04-27 11:03       ` Cyril Hrubis
2021-04-12 14:55 ` [LTP] [PATCH v3 3/7] Add new CGroups APIs Richard Palethorpe
2021-04-14 15:39   ` Cyril Hrubis
2021-04-15 13:10     ` Richard Palethorpe
2021-04-16  5:00       ` Li Wang
2021-04-26 16:39         ` Richard Palethorpe
2021-04-16  6:57   ` Li Wang
2021-04-26 16:01     ` Richard Palethorpe
2021-04-12 14:55 ` [LTP] [PATCH v3 4/7] Add new CGroups API library tests Richard Palethorpe
2021-04-16  7:22   ` Li Wang
2021-04-12 14:55 ` [LTP] [PATCH v3 5/7] docs: Update CGroups API Richard Palethorpe
2021-04-16  8:11   ` Li Wang
2021-04-26 16:44     ` Richard Palethorpe
2021-04-12 14:55 ` [LTP] [PATCH v3 6/7] mem: Convert tests to new " Richard Palethorpe
2021-04-12 14:55 ` [LTP] [PATCH v3 7/7] madvise06: Convert " Richard Palethorpe

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.