* [PATCH] cgroup-v2: Add taskstats counters in cgroup.stat
@ 2021-03-11  6:17 Chengming Zhou
  2021-03-11 11:10 ` Tejun Heo
  0 siblings, 1 reply; 2+ messages in thread
From: Chengming Zhou @ 2021-03-11  6:17 UTC (permalink / raw)
  To: tj, lizefan.x, hannes, corbet, mingo, peterz, juri.lelli,
	vincent.guittot
  Cc: dietmar.eggemann, rostedt, bsegall, mgorman, bristot, cgroups,
	linux-doc, linux-kernel, zhouchengming, songmuchun

On cgroup v1, the netlink CGROUPSTATS_CMD_GET interface returns the
taskstats of a cgroup, but v2 has no equivalent interface. This makes
it difficult to calculate the per-cgroup CPU load in cadvisor or to
implement per-cgroup proc files, such as /proc/loadavg, in lxcfs.

The psi subsystem already maintains these counters, so this patch
sums them up and exports them in the cgroup.stat interface.

Signed-off-by: Chengming Zhou <zhouchengming@bytedance.com>
---
 Documentation/admin-guide/cgroup-v2.rst |  9 +++++++
 include/linux/psi.h                     |  1 +
 kernel/cgroup/cgroup.c                  |  3 +++
 kernel/sched/psi.c                      | 34 +++++++++++++++++++++++++
 4 files changed, 47 insertions(+)

diff --git a/Documentation/admin-guide/cgroup-v2.rst b/Documentation/admin-guide/cgroup-v2.rst
index 64c62b979f2f..4184e749f687 100644
--- a/Documentation/admin-guide/cgroup-v2.rst
+++ b/Documentation/admin-guide/cgroup-v2.rst
@@ -923,6 +923,15 @@ All cgroup core files are prefixed with "cgroup."
 		A dying cgroup can consume system resources not exceeding
 		limits, which were active at the moment of cgroup deletion.
 
+	  nr_iowait_tasks
+	    Total number of tasks in iowait.
+
+	  nr_memstall_tasks
+	    Total number of tasks in memstall.
+
+	  nr_running_tasks
+	    Total number of runnable tasks.
+
   cgroup.freeze
 	A read-write single value file which exists on non-root cgroups.
 	Allowed values are "0" and "1". The default is "0".
diff --git a/include/linux/psi.h b/include/linux/psi.h
index 7361023f3fdd..ea98239424ca 100644
--- a/include/linux/psi.h
+++ b/include/linux/psi.h
@@ -30,6 +30,7 @@ int psi_show(struct seq_file *s, struct psi_group *group, enum psi_res res);
 int psi_cgroup_alloc(struct cgroup *cgrp);
 void psi_cgroup_free(struct cgroup *cgrp);
 void cgroup_move_task(struct task_struct *p, struct css_set *to);
+void psi_taskstat_show(struct seq_file *m, struct cgroup *cgrp);
 
 struct psi_trigger *psi_trigger_create(struct psi_group *group,
 			char *buf, size_t nbytes, enum psi_res res);
diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c
index 9153b20e5cc6..2724ae318a3b 100644
--- a/kernel/cgroup/cgroup.c
+++ b/kernel/cgroup/cgroup.c
@@ -3502,6 +3502,9 @@ static int cgroup_stat_show(struct seq_file *seq, void *v)
 	seq_printf(seq, "nr_dying_descendants %d\n",
 		   cgroup->nr_dying_descendants);
 
+#ifdef CONFIG_PSI
+	psi_taskstat_show(seq, cgroup);
+#endif
 	return 0;
 }
 
diff --git a/kernel/sched/psi.c b/kernel/sched/psi.c
index 967732c0766c..0ae8bd278ca4 100644
--- a/kernel/sched/psi.c
+++ b/kernel/sched/psi.c
@@ -1000,6 +1000,40 @@ void cgroup_move_task(struct task_struct *task, struct css_set *to)
 
 	task_rq_unlock(rq, task, &rf);
 }
+
+void psi_taskstat_show(struct seq_file *m, struct cgroup *cgrp)
+{
+	struct psi_group *group;
+	int cpu;
+	int s;
+	unsigned int taskstat[NR_PSI_TASK_COUNTS - 1] = { 0, };
+
+	if (static_branch_likely(&psi_disabled))
+		return;
+
+	group = cgroup_ino(cgrp) == 1 ? &psi_system : &cgrp->psi;
+
+	for_each_possible_cpu(cpu) {
+		struct psi_group_cpu *groupc = per_cpu_ptr(group->pcpu, cpu);
+		unsigned int tasks[NR_PSI_TASK_COUNTS];
+		unsigned int seq;
+
+		do {
+			seq = read_seqcount_begin(&groupc->seq);
+			memcpy(tasks, groupc->tasks, sizeof(groupc->tasks));
+		} while (read_seqcount_retry(&groupc->seq, seq));
+
+		for (s = 0; s < NR_ONCPU; s++)
+			taskstat[s] += tasks[s];
+	}
+
+	seq_printf(m, "nr_iowait_tasks %u\n"
+		   "nr_memstall_tasks %u\n"
+		   "nr_running_tasks %u\n",
+		   taskstat[NR_IOWAIT],
+		   taskstat[NR_MEMSTALL],
+		   taskstat[NR_RUNNING]);
+}
 #endif /* CONFIG_CGROUPS */
 
 int psi_show(struct seq_file *m, struct psi_group *group, enum psi_res res)
-- 
2.25.1


* Re: [PATCH] cgroup-v2: Add taskstats counters in cgroup.stat
  2021-03-11  6:17 [PATCH] cgroup-v2: Add taskstats counters in cgroup.stat Chengming Zhou
@ 2021-03-11 11:10 ` Tejun Heo
  0 siblings, 0 replies; 2+ messages in thread
From: Tejun Heo @ 2021-03-11 11:10 UTC (permalink / raw)
  To: Chengming Zhou
  Cc: lizefan.x, hannes, corbet, mingo, peterz, juri.lelli,
	vincent.guittot, dietmar.eggemann, rostedt, bsegall, mgorman,
	bristot, cgroups, linux-doc, linux-kernel, songmuchun

On Thu, Mar 11, 2021 at 02:17:52PM +0800, Chengming Zhou wrote:
> We have the netlink CGROUPSTATS_CMD_GET interface to get taskstats
> of the cgroup on v1, but haven't the equivalent interface on v2,
> making it difficult to calculate the per-cgroup cpu load in cadvisor
> or implement the cgroup proc interface in lxcfs, like /proc/loadavg.

So, this is what the PSI metrics are for, and we've been using them for that
for quite a while now. I'd much prefer not to add something duplicative (and
incomplete).

Thanks.

-- 
tejun
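
[Editorial note] For readers unfamiliar with the PSI metrics mentioned in the
reply: cgroup v2 exposes them per cgroup in the cpu.pressure, memory.pressure
and io.pressure files, one line per stall type. The following is a minimal
userspace sketch of how a tool such as cadvisor or lxcfs might read one of
those lines. The names psi_line, psi_parse_line and psi_read_some are
hypothetical helpers invented for this illustration; they are not part of the
patch or of any kernel or library API.

```c
#include <stdio.h>
#include <string.h>

/*
 * One line of a cgroup v2 pressure file, for example:
 *   some avg10=0.12 avg60=0.05 avg300=0.01 total=12345
 * (hypothetical struct for this sketch)
 */
struct psi_line {
	char kind[8];                /* "some" or "full" */
	double avg10, avg60, avg300; /* percentage of time stalled per window */
	unsigned long long total;    /* cumulative stall time in microseconds */
};

/* Parse one pressure line. Returns 0 on success, -1 on malformed input. */
int psi_parse_line(const char *line, struct psi_line *out)
{
	int n = sscanf(line, "%7s avg10=%lf avg60=%lf avg300=%lf total=%llu",
		       out->kind, &out->avg10, &out->avg60, &out->avg300,
		       &out->total);

	if (n != 5)
		return -1;
	if (strcmp(out->kind, "some") != 0 && strcmp(out->kind, "full") != 0)
		return -1;
	return 0;
}

/*
 * Read the "some" line from the pressure file at 'path', e.g. a cgroup's
 * cpu.pressure file. Returns 0 on success, -1 if the file cannot be
 * opened or contains no valid "some" line.
 */
int psi_read_some(const char *path, struct psi_line *out)
{
	char buf[256];
	int ret = -1;
	FILE *f = fopen(path, "r");

	if (!f)
		return -1;

	while (fgets(buf, sizeof(buf), f)) {
		if (psi_parse_line(buf, out) == 0 &&
		    strcmp(out->kind, "some") == 0) {
			ret = 0;
			break;
		}
	}

	fclose(f);
	return ret;
}
```

A caller would pass the path of a pressure file under its cgroup mount point
to psi_read_some() and read avg10, avg60 and avg300 from the result. Note that
these are stall-time percentages rather than task counts, so they are not a
drop-in replacement for the nr_*_tasks counters proposed in the patch.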
