* [PATCH 0/6] sched: support schedstats for RT sched class
@ 2020-12-01 11:54 Yafang Shao
  2020-12-01 11:54 ` [PATCH 1/6] sched: don't include stats.h in sched.h Yafang Shao
                   ` (5 more replies)
  0 siblings, 6 replies; 19+ messages in thread
From: Yafang Shao @ 2020-12-01 11:54 UTC (permalink / raw)
  To: mgorman, mingo, peterz, juri.lelli, vincent.guittot,
	dietmar.eggemann, rostedt, bsegall, bristot, qianjun.kernel
  Cc: linux-kernel, linux-rt-users, Yafang Shao

We want to measure the latency of RT tasks in our production
environment with the schedstats facility, but currently schedstats is
only supported for the fair sched class. This patchset enables it for
the RT sched class as well.

- Structure

Before we can make schedstats available to the RT sched class, we first
have to make struct sched_statistics independent of the fair sched class.


The struct sched_statistics holds the scheduler statistics of a
task_struct or a task_group, so we can move it into struct task_struct
and struct task_group to achieve the goal.

Below is a detailed explanation of the changes in the structs.

The struct before this patch:

struct task_struct {            |-> struct sched_entity {
    ...                         |       ...
    struct sched_entity *se; ---|       struct sched_statistics statistics;
    struct sched_rt_entity *rt;         ...
    ...                                 ...
};                                  };

struct task_group {             |--> se[0]->statistics : schedstats of CPU0
    ...                         |
 #ifdef CONFIG_FAIR_GROUP_SCHED |
    struct sched_entity **se; --|--> se[1]->statistics : schedstats of CPU1
                                |
 #endif                         |
                                |--> se[N]->statistics : schedstats of CPUn

 #ifdef CONFIG_RT_GROUP_SCHED
    struct sched_rt_entity  **rt_se; (N/A)
 #endif
    ...
};

The '**se' in task_group is allocated by the fair sched class, which
makes it hard to reuse for the other sched classes.

The struct after this patch:
struct task_struct {
    ...
    struct sched_statistics statistics;
    ...
    struct sched_entity *se;
    struct sched_rt_entity *rt;
    ...
};

struct task_group {                    |---> stats[0] : of CPU0
    ...                                |
    struct sched_statistics **stats; --|---> stats[1] : of CPU1
    ...                                |
                                       |---> stats[n] : of CPUn
 #ifdef CONFIG_FAIR_GROUP_SCHED
    struct sched_entity **se;
 #endif
 #ifdef CONFIG_RT_GROUP_SCHED
    struct sched_rt_entity  **rt_se;
 #endif
    ...
};

After the patch, both se and rt_se can easily get at the
sched_statistics through a task_struct or a task_group.

- Function Interface

The original prototype of the schedstats helpers is

update_stats_wait_*(struct cfs_rq *cfs_rq, struct sched_entity *se)

The cfs_rq in these helpers is used to get the rq_clock, and the se is
used to get the struct sched_statistics and the struct task_struct. In
order to make these helpers available to all sched classes, we can pass
the rq, sched_statistics and task_struct directly instead.

Then the new helpers are

update_stats_wait_*(struct rq *rq, struct task_struct *p,
                    struct sched_statistics *stats)

which are independent of fair sched class.

To avoid growing vmlinux too large and to avoid introducing overhead
when !schedstat_enabled(), some new out-of-line helpers, called only
after schedstat_enabled() has been checked, are also introduced, as
suggested by Mel. These helpers live in sched/stats.c:

__update_stats_wait_*(struct rq *rq, struct task_struct *p,
                      struct sched_statistics *stats)
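
As an illustration, a per-class wrapper then only needs to resolve the
task_struct and sched_statistics pointers before calling the common
helper. A minimal sketch of the RT-side wait_start wrapper (the real
one is added in patch #6):

static inline void
update_stats_wait_start_rt(struct rt_rq *rt_rq, struct sched_rt_entity *rt_se)
{
	struct sched_statistics *stats = NULL;
	struct task_struct *p = NULL;

	if (!schedstat_enabled())
		return;

	/* Group entities have no task_struct; p stays NULL for them. */
	if (rt_entity_is_task(rt_se))
		p = rt_task_of(rt_se);

	/* Resolve &p->stats or tg->stats[cpu] for this entity. */
	__schedstat_from_sched_rt_entity(rt_se, &stats);

	__update_stats_wait_start(rq_of_rt_rq(rt_rq), p, stats);
}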

- Implementation

After we make struct sched_statistics and its helpers independent of
the fair sched class, we can easily use the schedstats facility for the
RT sched class.

The schedstat usage in the RT sched class is similar to that in the
fair sched class, for example:
                fair                            RT
enqueue         update_stats_enqueue_fair       update_stats_enqueue_rt
dequeue         update_stats_dequeue_fair       update_stats_dequeue_rt
put_prev_task   update_stats_wait_start_fair    update_stats_wait_start_rt
set_next_task   update_stats_wait_end_fair      update_stats_wait_end_rt

- Usage
 
The user can get the schedstats information in the same way as for the
fair sched class. For example:
		fair				RT
task show	/proc/[pid]/sched               /proc/[pid]/sched
group show	cpu.stat in cgroup		cpu.stat in cgroup

The sched:sched_stat_{wait, sleep, iowait, blocked, runtime} tracepoints
can be used to trace RT tasks as well.
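
For example, schedstats can be switched on at run time and the RT-task
tracepoints consumed via ftrace. A possible session (assuming debugfs
is mounted at the usual location):

  # sysctl kernel.sched_schedstats=1    # or boot with schedstats=enable
  # echo 1 > /sys/kernel/debug/tracing/events/sched/sched_stat_wait/enable
  # cat /sys/kernel/debug/tracing/trace_pipe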

Detailed examples of the output of schedstats for RT tasks are in patch #6.


Changes since RFC:
- improvements of the schedstats helpers, per Mel.
- make struct sched_statistics independent of the fair sched class

Yafang Shao (6):
  sched: don't include stats.h in sched.h
  sched, fair: use __schedstat_set() in set_next_entity()
  sched: make struct sched_statistics independent of fair sched class
  sched: make schedstats helpers independent of fair sched class
  sched, rt: support sched_stat_runtime tracepoint for RT sched class
  sched, rt: support schedstats for RT sched class

 include/linux/sched.h    |   3 +-
 kernel/sched/core.c      |  25 +++--
 kernel/sched/deadline.c  |   5 +-
 kernel/sched/debug.c     |  82 +++++++--------
 kernel/sched/fair.c      | 209 +++++++++++++++------------------------
 kernel/sched/idle.c      |   1 +
 kernel/sched/rt.c        | 142 +++++++++++++++++++++++++-
 kernel/sched/sched.h     |   9 +-
 kernel/sched/stats.c     | 105 ++++++++++++++++++++
 kernel/sched/stats.h     |  80 +++++++++++++++
 kernel/sched/stop_task.c |   5 +-
 11 files changed, 476 insertions(+), 190 deletions(-)

-- 
2.18.4



* [PATCH 1/6] sched: don't include stats.h in sched.h
  2020-12-01 11:54 [PATCH 0/6] sched: support schedstats for RT sched class Yafang Shao
@ 2020-12-01 11:54 ` Yafang Shao
  2020-12-01 11:54 ` [PATCH 2/6] sched, fair: use __schedstat_set() in set_next_entity() Yafang Shao
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 19+ messages in thread
From: Yafang Shao @ 2020-12-01 11:54 UTC (permalink / raw)
  To: mgorman, mingo, peterz, juri.lelli, vincent.guittot,
	dietmar.eggemann, rostedt, bsegall, bristot, qianjun.kernel
  Cc: linux-kernel, linux-rt-users, Yafang Shao

This patch is a preparation for the follow-up patches. In the follow-up
patches some common helpers will be defined in stats.h, and these
common helpers require some definitions from sched.h, so let's move
stats.h out of sched.h.

The source files which require stats.h now include it explicitly. Since
stats.h in turn includes sched.h, an include guard is added to sched.h
to allow that.

Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
---
 kernel/sched/core.c      | 1 +
 kernel/sched/deadline.c  | 1 +
 kernel/sched/debug.c     | 1 +
 kernel/sched/fair.c      | 1 +
 kernel/sched/idle.c      | 1 +
 kernel/sched/rt.c        | 2 +-
 kernel/sched/sched.h     | 6 +++++-
 kernel/sched/stats.c     | 1 +
 kernel/sched/stats.h     | 2 ++
 kernel/sched/stop_task.c | 1 +
 10 files changed, 15 insertions(+), 2 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index d2003a7d5ab5..fd76628778f7 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -11,6 +11,7 @@
 #undef CREATE_TRACE_POINTS
 
 #include "sched.h"
+#include "stats.h"
 
 #include <linux/nospec.h>
 
diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
index f232305dcefe..7a0124f81a4f 100644
--- a/kernel/sched/deadline.c
+++ b/kernel/sched/deadline.c
@@ -16,6 +16,7 @@
  *                    Fabio Checconi <fchecconi@gmail.com>
  */
 #include "sched.h"
+#include "stats.h"
 #include "pelt.h"
 
 struct dl_bandwidth def_dl_bandwidth;
diff --git a/kernel/sched/debug.c b/kernel/sched/debug.c
index 2357921580f9..9758aa1bba1e 100644
--- a/kernel/sched/debug.c
+++ b/kernel/sched/debug.c
@@ -7,6 +7,7 @@
  * Copyright(C) 2007, Red Hat, Inc., Ingo Molnar
  */
 #include "sched.h"
+#include "stats.h"
 
 static DEFINE_SPINLOCK(sched_debug_lock);
 
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 8917d2d715ef..8ff1daa3d9bb 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -21,6 +21,7 @@
  *  Copyright (C) 2007 Red Hat, Inc., Peter Zijlstra
  */
 #include "sched.h"
+#include "stats.h"
 
 /*
  * Targeted preemption latency for CPU-bound tasks:
diff --git a/kernel/sched/idle.c b/kernel/sched/idle.c
index 24d0ee26377d..95c02cbca04a 100644
--- a/kernel/sched/idle.c
+++ b/kernel/sched/idle.c
@@ -7,6 +7,7 @@
  *        tasks which are handled in sched/fair.c )
  */
 #include "sched.h"
+#include "stats.h"
 
 #include <trace/events/power.h>
 
diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c
index 49ec096a8aa1..af772ac0f32d 100644
--- a/kernel/sched/rt.c
+++ b/kernel/sched/rt.c
@@ -4,7 +4,7 @@
  * policies)
  */
 #include "sched.h"
-
+#include "stats.h"
 #include "pelt.h"
 
 int sched_rr_timeslice = RR_TIMESLICE;
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index df80bfcea92e..871544bb9a38 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -2,6 +2,9 @@
 /*
  * Scheduler internal types and methods:
  */
+#ifndef _KERNEL_SCHED_SCHED_H
+#define _KERNEL_SCHED_SCHED_H
+
 #include <linux/sched.h>
 
 #include <linux/sched/autogroup.h>
@@ -1538,7 +1541,6 @@ extern void flush_smp_call_function_from_idle(void);
 static inline void flush_smp_call_function_from_idle(void) { }
 #endif
 
-#include "stats.h"
 #include "autogroup.h"
 
 #ifdef CONFIG_CGROUP_SCHED
@@ -2633,3 +2635,5 @@ static inline bool is_per_cpu_kthread(struct task_struct *p)
 
 void swake_up_all_locked(struct swait_queue_head *q);
 void __prepare_to_swait(struct swait_queue_head *q, struct swait_queue *wait);
+
+#endif	/* _KERNEL_SCHED_SCHED_H */
diff --git a/kernel/sched/stats.c b/kernel/sched/stats.c
index 750fb3c67eed..844bd9dbfbf0 100644
--- a/kernel/sched/stats.c
+++ b/kernel/sched/stats.c
@@ -3,6 +3,7 @@
  * /proc/schedstat implementation
  */
 #include "sched.h"
+#include "stats.h"
 
 /*
  * Current schedstat API version.
diff --git a/kernel/sched/stats.h b/kernel/sched/stats.h
index 33d0daf83842..c23b653ffc53 100644
--- a/kernel/sched/stats.h
+++ b/kernel/sched/stats.h
@@ -2,6 +2,8 @@
 
 #ifdef CONFIG_SCHEDSTATS
 
+#include "sched.h"
+
 /*
  * Expects runqueue lock to be held for atomicity of update
  */
diff --git a/kernel/sched/stop_task.c b/kernel/sched/stop_task.c
index ceb5b6b12561..a5d289049388 100644
--- a/kernel/sched/stop_task.c
+++ b/kernel/sched/stop_task.c
@@ -8,6 +8,7 @@
  * See kernel/stop_machine.c
  */
 #include "sched.h"
+#include "stats.h"
 
 #ifdef CONFIG_SMP
 static int
-- 
2.18.4



* [PATCH 2/6] sched, fair: use __schedstat_set() in set_next_entity()
  2020-12-01 11:54 [PATCH 0/6] sched: support schedstats for RT sched class Yafang Shao
  2020-12-01 11:54 ` [PATCH 1/6] sched: don't include stats.h in sched.h Yafang Shao
@ 2020-12-01 11:54 ` Yafang Shao
  2020-12-01 12:28   ` Mel Gorman
  2020-12-01 11:54 ` [PATCH 3/6] sched: make struct sched_statistics independent of fair sched class Yafang Shao
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 19+ messages in thread
From: Yafang Shao @ 2020-12-01 11:54 UTC (permalink / raw)
  To: mgorman, mingo, peterz, juri.lelli, vincent.guittot,
	dietmar.eggemann, rostedt, bsegall, bristot, qianjun.kernel
  Cc: linux-kernel, linux-rt-users, Yafang Shao

schedstat_enabled() has already been checked here, so we can use
__schedstat_set() directly.
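
For reference, the two macros in kernel/sched/stats.h differ only in
the schedstat_enabled() guard, roughly:

#define __schedstat_set(var, val)	do { var = (val); } while (0)
#define   schedstat_set(var, val)	do { if (schedstat_enabled()) { var = (val); } } while (0)

Since set_next_entity() has already tested schedstat_enabled() on this
path, the plain variant avoids evaluating the static branch twice.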

Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
---
 kernel/sched/fair.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 8ff1daa3d9bb..7e7c03cede94 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -4414,7 +4414,7 @@ set_next_entity(struct cfs_rq *cfs_rq, struct sched_entity *se)
 	 */
 	if (schedstat_enabled() &&
 	    rq_of(cfs_rq)->cfs.load.weight >= 2*se->load.weight) {
-		schedstat_set(se->statistics.slice_max,
+		__schedstat_set(se->statistics.slice_max,
 			max((u64)schedstat_val(se->statistics.slice_max),
 			    se->sum_exec_runtime - se->prev_sum_exec_runtime));
 	}
-- 
2.18.4



* [PATCH 3/6] sched: make struct sched_statistics independent of fair sched class
  2020-12-01 11:54 [PATCH 0/6] sched: support schedstats for RT sched class Yafang Shao
  2020-12-01 11:54 ` [PATCH 1/6] sched: don't include stats.h in sched.h Yafang Shao
  2020-12-01 11:54 ` [PATCH 2/6] sched, fair: use __schedstat_set() in set_next_entity() Yafang Shao
@ 2020-12-01 11:54 ` Yafang Shao
  2020-12-01 12:41   ` Mel Gorman
                     ` (2 more replies)
  2020-12-01 11:54 ` [PATCH 4/6] sched: make schedstats helpers " Yafang Shao
                   ` (2 subsequent siblings)
  5 siblings, 3 replies; 19+ messages in thread
From: Yafang Shao @ 2020-12-01 11:54 UTC (permalink / raw)
  To: mgorman, mingo, peterz, juri.lelli, vincent.guittot,
	dietmar.eggemann, rostedt, bsegall, bristot, qianjun.kernel
  Cc: linux-kernel, linux-rt-users, Yafang Shao

If we want to use the schedstats facility for other sched classes, we
should either move struct sched_statistics out of struct sched_entity
or add it into the other sched entity structs as well. Obviously the
latter is bad because it requires more space. So we should move it into
a common struct which can be used by all sched classes.

The struct sched_statistics holds the scheduler statistics of a
task_struct or a task_group, so we can move it into struct task_struct
and struct task_group to achieve the goal.

Below is a detailed explanation of the changes in the structs.

- Before this patch

struct task_struct {            |-> struct sched_entity {
    ...                         |       ...
    struct sched_entity *se; ---|       struct sched_statistics statistics;
    struct sched_rt_entity *rt;         ...
    ...                                 ...
};                                  };

struct task_group {             |--> se[0]->statistics : schedstats of CPU0
    ...                         |
 #ifdef CONFIG_FAIR_GROUP_SCHED |
    struct sched_entity **se; --|--> se[1]->statistics : schedstats of CPU1
                                |
 #endif                         |
                                |--> se[N]->statistics : schedstats of CPUn

 #ifdef CONFIG_RT_GROUP_SCHED
    struct sched_rt_entity  **rt_se; (N/A)
 #endif
    ...
};

The '**se' in task_group is allocated by the fair sched class, which
makes it hard to reuse for the other sched classes.

- After this patch

struct task_struct {
    ...
    struct sched_statistics statistics;
    ...
    struct sched_entity *se;
    struct sched_rt_entity *rt;
    ...
};

struct task_group {                    |---> stats[0] : of CPU0
    ...                                |
    struct sched_statistics **stats; --|---> stats[1] : of CPU1
    ...                                |
                                       |---> stats[n] : of CPUn
 #ifdef CONFIG_FAIR_GROUP_SCHED
    struct sched_entity **se;
 #endif
 #ifdef CONFIG_RT_GROUP_SCHED
    struct sched_rt_entity  **rt_se;
 #endif
    ...
};

After the patch, both se and rt_se can easily get at the
sched_statistics through a task_struct or a task_group.
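
One consequence is that group statistics can now be read without going
through the fair-class entities at all. For example, the wait_sum
aggregation in cpu_cfs_stat_show() (see the hunk below) becomes:

	u64 ws = 0;
	int i;

	for_each_possible_cpu(i)
		ws += schedstat_val(tg->stats[i]->wait_sum);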

Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
---
 include/linux/sched.h    |   3 +-
 kernel/sched/core.c      |  24 ++++---
 kernel/sched/deadline.c  |   4 +-
 kernel/sched/debug.c     |  81 +++++++++++-----------
 kernel/sched/fair.c      | 142 ++++++++++++++++++++++++++-------------
 kernel/sched/rt.c        |   4 +-
 kernel/sched/sched.h     |   3 +
 kernel/sched/stats.h     |  46 +++++++++++++
 kernel/sched/stop_task.c |   4 +-
 9 files changed, 206 insertions(+), 105 deletions(-)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index 063cd120b459..f8e969be4bee 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -462,8 +462,6 @@ struct sched_entity {
 
 	u64				nr_migrations;
 
-	struct sched_statistics		statistics;
-
 #ifdef CONFIG_FAIR_GROUP_SCHED
 	int				depth;
 	struct sched_entity		*parent;
@@ -681,6 +679,7 @@ struct task_struct {
 	unsigned int			rt_priority;
 
 	const struct sched_class	*sched_class;
+	struct sched_statistics         stats;
 	struct sched_entity		se;
 	struct sched_rt_entity		rt;
 #ifdef CONFIG_CGROUP_SCHED
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index fd76628778f7..081b4f1f2cb4 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -2427,11 +2427,11 @@ ttwu_stat(struct task_struct *p, int cpu, int wake_flags)
 #ifdef CONFIG_SMP
 	if (cpu == rq->cpu) {
 		__schedstat_inc(rq->ttwu_local);
-		__schedstat_inc(p->se.statistics.nr_wakeups_local);
+		__schedstat_inc(p->stats.nr_wakeups_local);
 	} else {
 		struct sched_domain *sd;
 
-		__schedstat_inc(p->se.statistics.nr_wakeups_remote);
+		__schedstat_inc(p->stats.nr_wakeups_remote);
 		rcu_read_lock();
 		for_each_domain(rq->cpu, sd) {
 			if (cpumask_test_cpu(cpu, sched_domain_span(sd))) {
@@ -2443,14 +2443,14 @@ ttwu_stat(struct task_struct *p, int cpu, int wake_flags)
 	}
 
 	if (wake_flags & WF_MIGRATED)
-		__schedstat_inc(p->se.statistics.nr_wakeups_migrate);
+		__schedstat_inc(p->stats.nr_wakeups_migrate);
 #endif /* CONFIG_SMP */
 
 	__schedstat_inc(rq->ttwu_count);
-	__schedstat_inc(p->se.statistics.nr_wakeups);
+	__schedstat_inc(p->stats.nr_wakeups);
 
 	if (wake_flags & WF_SYNC)
-		__schedstat_inc(p->se.statistics.nr_wakeups_sync);
+		__schedstat_inc(p->stats.nr_wakeups_sync);
 }
 
 /*
@@ -3075,7 +3075,7 @@ static void __sched_fork(unsigned long clone_flags, struct task_struct *p)
 
 #ifdef CONFIG_SCHEDSTATS
 	/* Even if schedstat is disabled, there should not be garbage */
-	memset(&p->se.statistics, 0, sizeof(p->se.statistics));
+	memset(&p->stats, 0, sizeof(p->stats));
 #endif
 
 	RB_CLEAR_NODE(&p->dl.rb_node);
@@ -7347,9 +7347,9 @@ void normalize_rt_tasks(void)
 			continue;
 
 		p->se.exec_start = 0;
-		schedstat_set(p->se.statistics.wait_start,  0);
-		schedstat_set(p->se.statistics.sleep_start, 0);
-		schedstat_set(p->se.statistics.block_start, 0);
+		schedstat_set(p->stats.wait_start,  0);
+		schedstat_set(p->stats.sleep_start, 0);
+		schedstat_set(p->stats.block_start, 0);
 
 		if (!dl_task(p) && !rt_task(p)) {
 			/*
@@ -7439,6 +7439,7 @@ static void sched_free_group(struct task_group *tg)
 {
 	free_fair_sched_group(tg);
 	free_rt_sched_group(tg);
+	free_tg_schedstats(tg);
 	autogroup_free(tg);
 	kmem_cache_free(task_group_cache, tg);
 }
@@ -7458,6 +7459,9 @@ struct task_group *sched_create_group(struct task_group *parent)
 	if (!alloc_rt_sched_group(tg, parent))
 		goto err;
 
+	if (!alloc_tg_schedstats(tg))
+		goto err;
+
 	alloc_uclamp_sched_group(tg, parent);
 
 	return tg;
@@ -8144,7 +8148,7 @@ static int cpu_cfs_stat_show(struct seq_file *sf, void *v)
 		int i;
 
 		for_each_possible_cpu(i)
-			ws += schedstat_val(tg->se[i]->statistics.wait_sum);
+			ws += schedstat_val(tg->stats[i]->wait_sum);
 
 		seq_printf(sf, "wait_sum %llu\n", ws);
 	}
diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
index 7a0124f81a4f..315d140b8f0e 100644
--- a/kernel/sched/deadline.c
+++ b/kernel/sched/deadline.c
@@ -1245,8 +1245,8 @@ static void update_curr_dl(struct rq *rq)
 		return;
 	}
 
-	schedstat_set(curr->se.statistics.exec_max,
-		      max(curr->se.statistics.exec_max, delta_exec));
+	schedstat_set(curr->stats.exec_max,
+		      max(curr->stats.exec_max, delta_exec));
 
 	curr->se.sum_exec_runtime += delta_exec;
 	account_group_exec_runtime(curr, delta_exec);
diff --git a/kernel/sched/debug.c b/kernel/sched/debug.c
index 9758aa1bba1e..8c646fcb89de 100644
--- a/kernel/sched/debug.c
+++ b/kernel/sched/debug.c
@@ -444,16 +444,16 @@ static void print_cfs_group_stats(struct seq_file *m, int cpu, struct task_group
 	PN(se->sum_exec_runtime);
 
 	if (schedstat_enabled()) {
-		PN_SCHEDSTAT(se->statistics.wait_start);
-		PN_SCHEDSTAT(se->statistics.sleep_start);
-		PN_SCHEDSTAT(se->statistics.block_start);
-		PN_SCHEDSTAT(se->statistics.sleep_max);
-		PN_SCHEDSTAT(se->statistics.block_max);
-		PN_SCHEDSTAT(se->statistics.exec_max);
-		PN_SCHEDSTAT(se->statistics.slice_max);
-		PN_SCHEDSTAT(se->statistics.wait_max);
-		PN_SCHEDSTAT(se->statistics.wait_sum);
-		P_SCHEDSTAT(se->statistics.wait_count);
+		PN_SCHEDSTAT(tg->stats[cpu]->wait_start);
+		PN_SCHEDSTAT(tg->stats[cpu]->sleep_start);
+		PN_SCHEDSTAT(tg->stats[cpu]->block_start);
+		PN_SCHEDSTAT(tg->stats[cpu]->sleep_max);
+		PN_SCHEDSTAT(tg->stats[cpu]->block_max);
+		PN_SCHEDSTAT(tg->stats[cpu]->exec_max);
+		PN_SCHEDSTAT(tg->stats[cpu]->slice_max);
+		PN_SCHEDSTAT(tg->stats[cpu]->wait_max);
+		PN_SCHEDSTAT(tg->stats[cpu]->wait_sum);
+		P_SCHEDSTAT(tg->stats[cpu]->wait_count);
 	}
 
 	P(se->load.weight);
@@ -499,9 +499,9 @@ print_task(struct seq_file *m, struct rq *rq, struct task_struct *p)
 		p->prio);
 
 	SEQ_printf(m, "%9Ld.%06ld %9Ld.%06ld %9Ld.%06ld",
-		SPLIT_NS(schedstat_val_or_zero(p->se.statistics.wait_sum)),
+		SPLIT_NS(schedstat_val_or_zero(p->stats.wait_sum)),
 		SPLIT_NS(p->se.sum_exec_runtime),
-		SPLIT_NS(schedstat_val_or_zero(p->se.statistics.sum_sleep_runtime)));
+		SPLIT_NS(schedstat_val_or_zero(p->stats.sum_sleep_runtime)));
 
 #ifdef CONFIG_NUMA_BALANCING
 	SEQ_printf(m, " %d %d", task_node(p), task_numa_group_id(p));
@@ -938,34 +938,33 @@ void proc_sched_show_task(struct task_struct *p, struct pid_namespace *ns,
 
 	if (schedstat_enabled()) {
 		u64 avg_atom, avg_per_cpu;
-
-		PN_SCHEDSTAT(se.statistics.sum_sleep_runtime);
-		PN_SCHEDSTAT(se.statistics.wait_start);
-		PN_SCHEDSTAT(se.statistics.sleep_start);
-		PN_SCHEDSTAT(se.statistics.block_start);
-		PN_SCHEDSTAT(se.statistics.sleep_max);
-		PN_SCHEDSTAT(se.statistics.block_max);
-		PN_SCHEDSTAT(se.statistics.exec_max);
-		PN_SCHEDSTAT(se.statistics.slice_max);
-		PN_SCHEDSTAT(se.statistics.wait_max);
-		PN_SCHEDSTAT(se.statistics.wait_sum);
-		P_SCHEDSTAT(se.statistics.wait_count);
-		PN_SCHEDSTAT(se.statistics.iowait_sum);
-		P_SCHEDSTAT(se.statistics.iowait_count);
-		P_SCHEDSTAT(se.statistics.nr_migrations_cold);
-		P_SCHEDSTAT(se.statistics.nr_failed_migrations_affine);
-		P_SCHEDSTAT(se.statistics.nr_failed_migrations_running);
-		P_SCHEDSTAT(se.statistics.nr_failed_migrations_hot);
-		P_SCHEDSTAT(se.statistics.nr_forced_migrations);
-		P_SCHEDSTAT(se.statistics.nr_wakeups);
-		P_SCHEDSTAT(se.statistics.nr_wakeups_sync);
-		P_SCHEDSTAT(se.statistics.nr_wakeups_migrate);
-		P_SCHEDSTAT(se.statistics.nr_wakeups_local);
-		P_SCHEDSTAT(se.statistics.nr_wakeups_remote);
-		P_SCHEDSTAT(se.statistics.nr_wakeups_affine);
-		P_SCHEDSTAT(se.statistics.nr_wakeups_affine_attempts);
-		P_SCHEDSTAT(se.statistics.nr_wakeups_passive);
-		P_SCHEDSTAT(se.statistics.nr_wakeups_idle);
+		PN_SCHEDSTAT(stats.sum_sleep_runtime);
+		PN_SCHEDSTAT(stats.wait_start);
+		PN_SCHEDSTAT(stats.sleep_start);
+		PN_SCHEDSTAT(stats.block_start);
+		PN_SCHEDSTAT(stats.sleep_max);
+		PN_SCHEDSTAT(stats.block_max);
+		PN_SCHEDSTAT(stats.exec_max);
+		PN_SCHEDSTAT(stats.slice_max);
+		PN_SCHEDSTAT(stats.wait_max);
+		PN_SCHEDSTAT(stats.wait_sum);
+		P_SCHEDSTAT(stats.wait_count);
+		PN_SCHEDSTAT(stats.iowait_sum);
+		P_SCHEDSTAT(stats.iowait_count);
+		P_SCHEDSTAT(stats.nr_migrations_cold);
+		P_SCHEDSTAT(stats.nr_failed_migrations_affine);
+		P_SCHEDSTAT(stats.nr_failed_migrations_running);
+		P_SCHEDSTAT(stats.nr_failed_migrations_hot);
+		P_SCHEDSTAT(stats.nr_forced_migrations);
+		P_SCHEDSTAT(stats.nr_wakeups);
+		P_SCHEDSTAT(stats.nr_wakeups_sync);
+		P_SCHEDSTAT(stats.nr_wakeups_migrate);
+		P_SCHEDSTAT(stats.nr_wakeups_local);
+		P_SCHEDSTAT(stats.nr_wakeups_remote);
+		P_SCHEDSTAT(stats.nr_wakeups_affine);
+		P_SCHEDSTAT(stats.nr_wakeups_affine_attempts);
+		P_SCHEDSTAT(stats.nr_wakeups_passive);
+		P_SCHEDSTAT(stats.nr_wakeups_idle);
 
 		avg_atom = p->se.sum_exec_runtime;
 		if (nr_switches)
@@ -1031,6 +1030,6 @@ void proc_sched_show_task(struct task_struct *p, struct pid_namespace *ns,
 void proc_sched_set_task(struct task_struct *p)
 {
 #ifdef CONFIG_SCHEDSTATS
-	memset(&p->se.statistics, 0, sizeof(p->se.statistics));
+	memset(&p->stats, 0, sizeof(p->stats));
 #endif
 }
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 7e7c03cede94..14d8df308d44 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -837,12 +837,44 @@ static void update_tg_load_avg(struct cfs_rq *cfs_rq)
 }
 #endif /* CONFIG_SMP */
 
+static inline void
+__schedstat_from_sched_entity(struct sched_entity *se,
+			      struct sched_statistics **stats)
+{
+	struct task_struct *p;
+	struct task_group *tg;
+	struct cfs_rq *cfs_rq;
+	int cpu;
+
+	if (entity_is_task(se)) {
+		p = task_of(se);
+		*stats = &p->stats;
+	} else {
+		cfs_rq = group_cfs_rq(se);
+		tg = cfs_rq->tg;
+		cpu = cpu_of(rq_of(cfs_rq));
+		*stats = tg->stats[cpu];
+	}
+}
+
+static inline void
+schedstat_from_sched_entity(struct sched_entity *se,
+			    struct sched_statistics **stats)
+{
+	if (!schedstat_enabled())
+		return;
+
+	__schedstat_from_sched_entity(se, stats);
+}
+
 /*
  * Update the current task's runtime statistics.
  */
 static void update_curr(struct cfs_rq *cfs_rq)
 {
 	struct sched_entity *curr = cfs_rq->curr;
+	struct sched_statistics *stats = NULL;
+
 	u64 now = rq_clock_task(rq_of(cfs_rq));
 	u64 delta_exec;
 
@@ -855,8 +887,12 @@ static void update_curr(struct cfs_rq *cfs_rq)
 
 	curr->exec_start = now;
 
-	schedstat_set(curr->statistics.exec_max,
-		      max(delta_exec, curr->statistics.exec_max));
+	if (schedstat_enabled()) {
+		__schedstat_from_sched_entity(curr, &stats);
+
+		__schedstat_set(stats->exec_max,
+		      max(delta_exec, stats->exec_max));
+	}
 
 	curr->sum_exec_runtime += delta_exec;
 	schedstat_add(cfs_rq->exec_clock, delta_exec);
@@ -883,67 +919,78 @@ static void update_curr_fair(struct rq *rq)
 static inline void
 update_stats_wait_start(struct cfs_rq *cfs_rq, struct sched_entity *se)
 {
+	struct sched_statistics *stats = NULL;
 	u64 wait_start, prev_wait_start;
 
 	if (!schedstat_enabled())
 		return;
 
+	__schedstat_from_sched_entity(se, &stats);
+
 	wait_start = rq_clock(rq_of(cfs_rq));
-	prev_wait_start = schedstat_val(se->statistics.wait_start);
+	prev_wait_start = schedstat_val(stats->wait_start);
 
 	if (entity_is_task(se) && task_on_rq_migrating(task_of(se)) &&
 	    likely(wait_start > prev_wait_start))
 		wait_start -= prev_wait_start;
 
-	__schedstat_set(se->statistics.wait_start, wait_start);
+	__schedstat_set(stats->wait_start, wait_start);
 }
 
 static inline void
 update_stats_wait_end(struct cfs_rq *cfs_rq, struct sched_entity *se)
 {
-	struct task_struct *p;
+	struct sched_statistics *stats = NULL;
+	struct task_struct *p = NULL;
 	u64 delta;
 
 	if (!schedstat_enabled())
 		return;
 
-	delta = rq_clock(rq_of(cfs_rq)) - schedstat_val(se->statistics.wait_start);
+	__schedstat_from_sched_entity(se, &stats);
 
+	delta = rq_clock(rq_of(cfs_rq)) - schedstat_val(stats->wait_start);
 	if (entity_is_task(se)) {
 		p = task_of(se);
+
 		if (task_on_rq_migrating(p)) {
 			/*
 			 * Preserve migrating task's wait time so wait_start
 			 * time stamp can be adjusted to accumulate wait time
 			 * prior to migration.
 			 */
-			__schedstat_set(se->statistics.wait_start, delta);
+			__schedstat_set(stats->wait_start, delta);
+
 			return;
 		}
+
 		trace_sched_stat_wait(p, delta);
 	}
 
-	__schedstat_set(se->statistics.wait_max,
-		      max(schedstat_val(se->statistics.wait_max), delta));
-	__schedstat_inc(se->statistics.wait_count);
-	__schedstat_add(se->statistics.wait_sum, delta);
-	__schedstat_set(se->statistics.wait_start, 0);
+	__schedstat_set(stats->wait_max,
+			max(schedstat_val(stats->wait_max), delta));
+	__schedstat_inc(stats->wait_count);
+	__schedstat_add(stats->wait_sum, delta);
+	__schedstat_set(stats->wait_start, 0);
 }
 
 static inline void
 update_stats_enqueue_sleeper(struct cfs_rq *cfs_rq, struct sched_entity *se)
 {
-	struct task_struct *tsk = NULL;
+	struct sched_statistics *stats = NULL;
+	struct task_struct *p = NULL;
 	u64 sleep_start, block_start;
 
 	if (!schedstat_enabled())
 		return;
 
-	sleep_start = schedstat_val(se->statistics.sleep_start);
-	block_start = schedstat_val(se->statistics.block_start);
-
 	if (entity_is_task(se))
-		tsk = task_of(se);
+		p = task_of(se);
+
+	__schedstat_from_sched_entity(se, &stats);
+
+	sleep_start = schedstat_val(stats->sleep_start);
+	block_start = schedstat_val(stats->block_start);
 
 	if (sleep_start) {
 		u64 delta = rq_clock(rq_of(cfs_rq)) - sleep_start;
@@ -951,15 +998,15 @@ update_stats_enqueue_sleeper(struct cfs_rq *cfs_rq, struct sched_entity *se)
 		if ((s64)delta < 0)
 			delta = 0;
 
-		if (unlikely(delta > schedstat_val(se->statistics.sleep_max)))
-			__schedstat_set(se->statistics.sleep_max, delta);
+		if (unlikely(delta > schedstat_val(stats->sleep_max)))
+			__schedstat_set(stats->sleep_max, delta);
 
-		__schedstat_set(se->statistics.sleep_start, 0);
-		__schedstat_add(se->statistics.sum_sleep_runtime, delta);
+		__schedstat_set(stats->sleep_start, 0);
+		__schedstat_add(stats->sum_sleep_runtime, delta);
 
-		if (tsk) {
-			account_scheduler_latency(tsk, delta >> 10, 1);
-			trace_sched_stat_sleep(tsk, delta);
+		if (p) {
+			account_scheduler_latency(p, delta >> 10, 1);
+			trace_sched_stat_sleep(p, delta);
 		}
 	}
 	if (block_start) {
@@ -968,20 +1015,20 @@ update_stats_enqueue_sleeper(struct cfs_rq *cfs_rq, struct sched_entity *se)
 		if ((s64)delta < 0)
 			delta = 0;
 
-		if (unlikely(delta > schedstat_val(se->statistics.block_max)))
-			__schedstat_set(se->statistics.block_max, delta);
+		if (unlikely(delta > schedstat_val(stats->block_max)))
+			__schedstat_set(stats->block_max, delta);
 
-		__schedstat_set(se->statistics.block_start, 0);
-		__schedstat_add(se->statistics.sum_sleep_runtime, delta);
+		__schedstat_set(stats->block_start, 0);
+		__schedstat_add(stats->sum_sleep_runtime, delta);
 
-		if (tsk) {
-			if (tsk->in_iowait) {
-				__schedstat_add(se->statistics.iowait_sum, delta);
-				__schedstat_inc(se->statistics.iowait_count);
-				trace_sched_stat_iowait(tsk, delta);
+		if (p) {
+			if (p->in_iowait) {
+				__schedstat_add(stats->iowait_sum, delta);
+				__schedstat_inc(stats->iowait_count);
+				trace_sched_stat_iowait(p, delta);
 			}
 
-			trace_sched_stat_blocked(tsk, delta);
+			trace_sched_stat_blocked(p, delta);
 
 			/*
 			 * Blocking time is in units of nanosecs, so shift by
@@ -990,10 +1037,10 @@ update_stats_enqueue_sleeper(struct cfs_rq *cfs_rq, struct sched_entity *se)
 			 */
 			if (unlikely(prof_on == SLEEP_PROFILING)) {
 				profile_hits(SLEEP_PROFILING,
-						(void *)get_wchan(tsk),
+						(void *)get_wchan(p),
 						delta >> 20);
 			}
-			account_scheduler_latency(tsk, delta >> 10, 0);
+			account_scheduler_latency(p, delta >> 10, 0);
 		}
 	}
 }
@@ -1036,10 +1083,10 @@ update_stats_dequeue(struct cfs_rq *cfs_rq, struct sched_entity *se, int flags)
 		struct task_struct *tsk = task_of(se);
 
 		if (tsk->state & TASK_INTERRUPTIBLE)
-			__schedstat_set(se->statistics.sleep_start,
+			__schedstat_set(tsk->stats.sleep_start,
 				      rq_clock(rq_of(cfs_rq)));
 		if (tsk->state & TASK_UNINTERRUPTIBLE)
-			__schedstat_set(se->statistics.block_start,
+			__schedstat_set(tsk->stats.block_start,
 				      rq_clock(rq_of(cfs_rq)));
 	}
 }
@@ -4392,6 +4439,8 @@ check_preempt_tick(struct cfs_rq *cfs_rq, struct sched_entity *curr)
 static void
 set_next_entity(struct cfs_rq *cfs_rq, struct sched_entity *se)
 {
+	struct sched_statistics *stats = NULL;
+
 	/* 'current' is not kept within the tree. */
 	if (se->on_rq) {
 		/*
@@ -4414,8 +4463,9 @@ set_next_entity(struct cfs_rq *cfs_rq, struct sched_entity *se)
 	 */
 	if (schedstat_enabled() &&
 	    rq_of(cfs_rq)->cfs.load.weight >= 2*se->load.weight) {
-		__schedstat_set(se->statistics.slice_max,
-			max((u64)schedstat_val(se->statistics.slice_max),
+		__schedstat_from_sched_entity(se, &stats);
+		__schedstat_set(stats->slice_max,
+			max((u64)schedstat_val(stats->slice_max),
 			    se->sum_exec_runtime - se->prev_sum_exec_runtime));
 	}
 
@@ -5862,12 +5912,12 @@ static int wake_affine(struct sched_domain *sd, struct task_struct *p,
 	if (sched_feat(WA_WEIGHT) && target == nr_cpumask_bits)
 		target = wake_affine_weight(sd, p, this_cpu, prev_cpu, sync);
 
-	schedstat_inc(p->se.statistics.nr_wakeups_affine_attempts);
+	schedstat_inc(p->stats.nr_wakeups_affine_attempts);
 	if (target == nr_cpumask_bits)
 		return prev_cpu;
 
 	schedstat_inc(sd->ttwu_move_affine);
-	schedstat_inc(p->se.statistics.nr_wakeups_affine);
+	schedstat_inc(p->stats.nr_wakeups_affine);
 	return target;
 }
 
@@ -7533,7 +7583,7 @@ int can_migrate_task(struct task_struct *p, struct lb_env *env)
 	if (!cpumask_test_cpu(env->dst_cpu, p->cpus_ptr)) {
 		int cpu;
 
-		schedstat_inc(p->se.statistics.nr_failed_migrations_affine);
+		schedstat_inc(p->stats.nr_failed_migrations_affine);
 
 		env->flags |= LBF_SOME_PINNED;
 
@@ -7564,7 +7614,7 @@ int can_migrate_task(struct task_struct *p, struct lb_env *env)
 	env->flags &= ~LBF_ALL_PINNED;
 
 	if (task_running(env->src_rq, p)) {
-		schedstat_inc(p->se.statistics.nr_failed_migrations_running);
+		schedstat_inc(p->stats.nr_failed_migrations_running);
 		return 0;
 	}
 
@@ -7582,12 +7632,12 @@ int can_migrate_task(struct task_struct *p, struct lb_env *env)
 	    env->sd->nr_balance_failed > env->sd->cache_nice_tries) {
 		if (tsk_cache_hot == 1) {
 			schedstat_inc(env->sd->lb_hot_gained[env->idle]);
-			schedstat_inc(p->se.statistics.nr_forced_migrations);
+			schedstat_inc(p->stats.nr_forced_migrations);
 		}
 		return 1;
 	}
 
-	schedstat_inc(p->se.statistics.nr_failed_migrations_hot);
+	schedstat_inc(p->stats.nr_failed_migrations_hot);
 	return 0;
 }
 
diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c
index af772ac0f32d..2d543a270dfe 100644
--- a/kernel/sched/rt.c
+++ b/kernel/sched/rt.c
@@ -1008,8 +1008,8 @@ static void update_curr_rt(struct rq *rq)
 	if (unlikely((s64)delta_exec <= 0))
 		return;
 
-	schedstat_set(curr->se.statistics.exec_max,
-		      max(curr->se.statistics.exec_max, delta_exec));
+	schedstat_set(curr->stats.exec_max,
+		      max(curr->stats.exec_max, delta_exec));
 
 	curr->se.sum_exec_runtime += delta_exec;
 	account_group_exec_runtime(curr, delta_exec);
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index 871544bb9a38..b1cdb942c67d 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -387,6 +387,9 @@ struct cfs_bandwidth {
 struct task_group {
 	struct cgroup_subsys_state css;
 
+	/* schedstats of this group on each CPU */
+	struct sched_statistics **stats;
+
 #ifdef CONFIG_FAIR_GROUP_SCHED
 	/* schedulable entities of this group on each CPU */
 	struct sched_entity	**se;
diff --git a/kernel/sched/stats.h b/kernel/sched/stats.h
index c23b653ffc53..87242968712e 100644
--- a/kernel/sched/stats.h
+++ b/kernel/sched/stats.h
@@ -42,6 +42,42 @@ rq_sched_info_dequeued(struct rq *rq, unsigned long long delta)
 #define   schedstat_val(var)		(var)
 #define   schedstat_val_or_zero(var)	((schedstat_enabled()) ? (var) : 0)
 
+static inline void free_tg_schedstats(struct task_group *tg)
+{
+	int i;
+
+	for_each_possible_cpu(i) {
+		if (tg->stats)
+			kfree(tg->stats[i]);
+	}
+
+	kfree(tg->stats);
+}
+
+static inline int alloc_tg_schedstats(struct task_group *tg)
+{
+	struct sched_statistics *stats;
+	int i;
+
+	/*
+	 * This memory should be allocated whether schedstat_enabled() or
+	 * not.
+	 */
+	tg->stats = kcalloc(nr_cpu_ids, sizeof(stats), GFP_KERNEL);
+	if (!tg->stats)
+		return 0;
+
+	for_each_possible_cpu(i) {
+		stats = kzalloc_node(sizeof(struct sched_statistics),
+				     GFP_KERNEL, cpu_to_node(i));
+		if (!stats)
+			return 0;
+		tg->stats[i] = stats;
+	}
+
+	return 1;
+}
+
 #else /* !CONFIG_SCHEDSTATS: */
 static inline void rq_sched_info_arrive  (struct rq *rq, unsigned long long delta) { }
 static inline void rq_sched_info_dequeued(struct rq *rq, unsigned long long delta) { }
@@ -55,6 +91,16 @@ static inline void rq_sched_info_depart  (struct rq *rq, unsigned long long delt
 # define   schedstat_set(var, val)	do { } while (0)
 # define   schedstat_val(var)		0
 # define   schedstat_val_or_zero(var)	0
+
+static inline void free_tg_schedstats(struct task_group *tg)
+{
+}
+
+static inline int alloc_tg_schedstats(struct task_group *tg)
+{
+	return 1;
+}
+
 #endif /* CONFIG_SCHEDSTATS */
 
 #ifdef CONFIG_PSI
diff --git a/kernel/sched/stop_task.c b/kernel/sched/stop_task.c
index a5d289049388..f35cd67a0881 100644
--- a/kernel/sched/stop_task.c
+++ b/kernel/sched/stop_task.c
@@ -70,8 +70,8 @@ static void put_prev_task_stop(struct rq *rq, struct task_struct *prev)
 	if (unlikely((s64)delta_exec < 0))
 		delta_exec = 0;
 
-	schedstat_set(curr->se.statistics.exec_max,
-			max(curr->se.statistics.exec_max, delta_exec));
+	schedstat_set(curr->stats.exec_max,
+			max(curr->stats.exec_max, delta_exec));
 
 	curr->se.sum_exec_runtime += delta_exec;
 	account_group_exec_runtime(curr, delta_exec);
-- 
2.18.4



* [PATCH 4/6] sched: make schedstats helpers independent of fair sched class
  2020-12-01 11:54 [PATCH 0/6] sched: support schedstats for RT sched class Yafang Shao
                   ` (2 preceding siblings ...)
  2020-12-01 11:54 ` [PATCH 3/6] sched: make struct sched_statistics independent of fair sched class Yafang Shao
@ 2020-12-01 11:54 ` Yafang Shao
  2020-12-01 12:46   ` Mel Gorman
  2020-12-01 13:58   ` kernel test robot
  2020-12-01 11:54 ` [PATCH 5/6] sched, rt: support sched_stat_runtime tracepoint for RT " Yafang Shao
  2020-12-01 11:54 ` [PATCH 6/6] sched, rt: support schedstats " Yafang Shao
  5 siblings, 2 replies; 19+ messages in thread
From: Yafang Shao @ 2020-12-01 11:54 UTC (permalink / raw)
  To: mgorman, mingo, peterz, juri.lelli, vincent.guittot,
	dietmar.eggemann, rostedt, bsegall, bristot, qianjun.kernel
  Cc: linux-kernel, linux-rt-users, Yafang Shao

The original prototype of the schedstats helpers is

update_stats_wait_*(struct cfs_rq *cfs_rq, struct sched_entity *se)

The cfs_rq in these helpers is used to get the rq_clock, and the se is
used to get the struct sched_statistics and the struct task_struct. In
order to make these helpers available to all sched classes, we can pass
the rq, sched_statistics and task_struct directly instead.

Then the new helpers are

update_stats_wait_*(struct rq *rq, struct task_struct *p,
		    struct sched_statistics *stats)

which are independent of fair sched class.

To avoid growing vmlinux too large and to avoid introducing overhead
when !schedstat_enabled(), some new out-of-line helpers, called only
after schedstat_enabled() has been checked, are also introduced, as
suggested by Mel. These helpers live in sched/stats.c:

__update_stats_wait_*(struct rq *rq, struct task_struct *p,
		      struct sched_statistics *stats)

Cc: Mel Gorman <mgorman@suse.de>
Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
---
 kernel/sched/fair.c  | 140 +++++++------------------------------------
 kernel/sched/stats.c | 104 ++++++++++++++++++++++++++++++++
 kernel/sched/stats.h |  32 ++++++++++
 3 files changed, 157 insertions(+), 119 deletions(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 14d8df308d44..b869a83fac29 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -917,69 +917,44 @@ static void update_curr_fair(struct rq *rq)
 }
 
 static inline void
-update_stats_wait_start(struct cfs_rq *cfs_rq, struct sched_entity *se)
+update_stats_wait_start_fair(struct cfs_rq *cfs_rq, struct sched_entity *se)
 {
 	struct sched_statistics *stats = NULL;
-	u64 wait_start, prev_wait_start;
+	struct task_struct *p = NULL;
 
 	if (!schedstat_enabled())
 		return;
 
-	__schedstat_from_sched_entity(se, &stats);
-
-	wait_start = rq_clock(rq_of(cfs_rq));
-	prev_wait_start = schedstat_val(stats->wait_start);
+	if (entity_is_task(se))
+		p = task_of(se);
 
-	if (entity_is_task(se) && task_on_rq_migrating(task_of(se)) &&
-	    likely(wait_start > prev_wait_start))
-		wait_start -= prev_wait_start;
+	__schedstat_from_sched_entity(se, &stats);
 
-	__schedstat_set(stats->wait_start, wait_start);
+	__update_stats_wait_start(rq_of(cfs_rq), p, stats);
 }
 
 static inline void
-update_stats_wait_end(struct cfs_rq *cfs_rq, struct sched_entity *se)
+update_stats_wait_end_fair(struct cfs_rq *cfs_rq, struct sched_entity *se)
 {
 	struct sched_statistics *stats = NULL;
 	struct task_struct *p = NULL;
-	u64 delta;
 
 	if (!schedstat_enabled())
 		return;
 
-	__schedstat_from_sched_entity(se, &stats);
-
-	delta = rq_clock(rq_of(cfs_rq)) - schedstat_val(stats->wait_start);
-	if (entity_is_task(se)) {
+	if (entity_is_task(se))
 		p = task_of(se);
 
-		if (task_on_rq_migrating(p)) {
-			/*
-			 * Preserve migrating task's wait time so wait_start
-			 * time stamp can be adjusted to accumulate wait time
-			 * prior to migration.
-			 */
-			__schedstat_set(stats->wait_start, delta);
-
-			return;
-		}
-
-		trace_sched_stat_wait(p, delta);
-	}
+	__schedstat_from_sched_entity(se, &stats);
 
-	__schedstat_set(stats->wait_max,
-			max(schedstat_val(stats->wait_max), delta));
-	__schedstat_inc(stats->wait_count);
-	__schedstat_add(stats->wait_sum, delta);
-	__schedstat_set(stats->wait_start, 0);
+	__update_stats_wait_end(rq_of(cfs_rq), p, stats);
 }
 
 static inline void
-update_stats_enqueue_sleeper(struct cfs_rq *cfs_rq, struct sched_entity *se)
+update_stats_enqueue_sleeper_fair(struct cfs_rq *cfs_rq, struct sched_entity *se)
 {
 	struct sched_statistics *stats = NULL;
 	struct task_struct *p = NULL;
-	u64 sleep_start, block_start;
 
 	if (!schedstat_enabled())
 		return;
@@ -989,67 +964,14 @@ update_stats_enqueue_sleeper(struct cfs_rq *cfs_rq, struct sched_entity *se)
 
 	__schedstat_from_sched_entity(se, &stats);
 
-	sleep_start = schedstat_val(stats->sleep_start);
-	block_start = schedstat_val(stats->block_start);
-
-	if (sleep_start) {
-		u64 delta = rq_clock(rq_of(cfs_rq)) - sleep_start;
-
-		if ((s64)delta < 0)
-			delta = 0;
-
-		if (unlikely(delta > schedstat_val(stats->sleep_max)))
-			__schedstat_set(stats->sleep_max, delta);
-
-		__schedstat_set(stats->sleep_start, 0);
-		__schedstat_add(stats->sum_sleep_runtime, delta);
-
-		if (p) {
-			account_scheduler_latency(p, delta >> 10, 1);
-			trace_sched_stat_sleep(p, delta);
-		}
-	}
-	if (block_start) {
-		u64 delta = rq_clock(rq_of(cfs_rq)) - block_start;
-
-		if ((s64)delta < 0)
-			delta = 0;
-
-		if (unlikely(delta > schedstat_val(stats->block_max)))
-			__schedstat_set(stats->block_max, delta);
-
-		__schedstat_set(stats->block_start, 0);
-		__schedstat_add(stats->sum_sleep_runtime, delta);
-
-		if (p) {
-			if (p->in_iowait) {
-				__schedstat_add(stats->iowait_sum, delta);
-				__schedstat_inc(stats->iowait_count);
-				trace_sched_stat_iowait(p, delta);
-			}
-
-			trace_sched_stat_blocked(p, delta);
-
-			/*
-			 * Blocking time is in units of nanosecs, so shift by
-			 * 20 to get a milliseconds-range estimation of the
-			 * amount of time that the task spent sleeping:
-			 */
-			if (unlikely(prof_on == SLEEP_PROFILING)) {
-				profile_hits(SLEEP_PROFILING,
-						(void *)get_wchan(p),
-						delta >> 20);
-			}
-			account_scheduler_latency(p, delta >> 10, 0);
-		}
-	}
+	__update_stats_enqueue_sleeper(rq_of(cfs_rq), p, stats);
 }
 
 /*
  * Task is being enqueued - update stats:
  */
 static inline void
-update_stats_enqueue(struct cfs_rq *cfs_rq, struct sched_entity *se, int flags)
+update_stats_enqueue_fair(struct cfs_rq *cfs_rq, struct sched_entity *se, int flags)
 {
 	if (!schedstat_enabled())
 		return;
@@ -1059,14 +981,14 @@ update_stats_enqueue(struct cfs_rq *cfs_rq, struct sched_entity *se, int flags)
 	 * a dequeue/enqueue event is a NOP)
 	 */
 	if (se != cfs_rq->curr)
-		update_stats_wait_start(cfs_rq, se);
+		update_stats_wait_start_fair(cfs_rq, se);
 
 	if (flags & ENQUEUE_WAKEUP)
-		update_stats_enqueue_sleeper(cfs_rq, se);
+		update_stats_enqueue_sleeper_fair(cfs_rq, se);
 }
 
 static inline void
-update_stats_dequeue(struct cfs_rq *cfs_rq, struct sched_entity *se, int flags)
+update_stats_dequeue_fair(struct cfs_rq *cfs_rq, struct sched_entity *se, int flags)
 {
 
 	if (!schedstat_enabled())
@@ -1077,7 +999,7 @@ update_stats_dequeue(struct cfs_rq *cfs_rq, struct sched_entity *se, int flags)
 	 * waiting task:
 	 */
 	if (se != cfs_rq->curr)
-		update_stats_wait_end(cfs_rq, se);
+		update_stats_wait_end_fair(cfs_rq, se);
 
 	if ((flags & DEQUEUE_SLEEP) && entity_is_task(se)) {
 		struct task_struct *tsk = task_of(se);
@@ -4186,26 +4108,6 @@ place_entity(struct cfs_rq *cfs_rq, struct sched_entity *se, int initial)
 
 static void check_enqueue_throttle(struct cfs_rq *cfs_rq);
 
-static inline void check_schedstat_required(void)
-{
-#ifdef CONFIG_SCHEDSTATS
-	if (schedstat_enabled())
-		return;
-
-	/* Force schedstat enabled if a dependent tracepoint is active */
-	if (trace_sched_stat_wait_enabled()    ||
-			trace_sched_stat_sleep_enabled()   ||
-			trace_sched_stat_iowait_enabled()  ||
-			trace_sched_stat_blocked_enabled() ||
-			trace_sched_stat_runtime_enabled())  {
-		printk_deferred_once("Scheduler tracepoints stat_sleep, stat_iowait, "
-			     "stat_blocked and stat_runtime require the "
-			     "kernel parameter schedstats=enable or "
-			     "kernel.sched_schedstats=1\n");
-	}
-#endif
-}
-
 static inline bool cfs_bandwidth_used(void);
 
 /*
@@ -4279,7 +4181,7 @@ enqueue_entity(struct cfs_rq *cfs_rq, struct sched_entity *se, int flags)
 		place_entity(cfs_rq, se, 0);
 
 	check_schedstat_required();
-	update_stats_enqueue(cfs_rq, se, flags);
+	update_stats_enqueue_fair(cfs_rq, se, flags);
 	check_spread(cfs_rq, se);
 	if (!curr)
 		__enqueue_entity(cfs_rq, se);
@@ -4363,7 +4265,7 @@ dequeue_entity(struct cfs_rq *cfs_rq, struct sched_entity *se, int flags)
 	update_load_avg(cfs_rq, se, UPDATE_TG);
 	se_update_runnable(se);
 
-	update_stats_dequeue(cfs_rq, se, flags);
+	update_stats_dequeue_fair(cfs_rq, se, flags);
 
 	clear_buddies(cfs_rq, se);
 
@@ -4448,7 +4350,7 @@ set_next_entity(struct cfs_rq *cfs_rq, struct sched_entity *se)
 		 * a CPU. So account for the time it spent waiting on the
 		 * runqueue.
 		 */
-		update_stats_wait_end(cfs_rq, se);
+		update_stats_wait_end_fair(cfs_rq, se);
 		__dequeue_entity(cfs_rq, se);
 		update_load_avg(cfs_rq, se, UPDATE_TG);
 	}
@@ -4550,7 +4452,7 @@ static void put_prev_entity(struct cfs_rq *cfs_rq, struct sched_entity *prev)
 	check_spread(cfs_rq, prev);
 
 	if (prev->on_rq) {
-		update_stats_wait_start(cfs_rq, prev);
+		update_stats_wait_start_fair(cfs_rq, prev);
 		/* Put 'current' back into the tree. */
 		__enqueue_entity(cfs_rq, prev);
 		/* in !on_rq case, update occurred at dequeue */
diff --git a/kernel/sched/stats.c b/kernel/sched/stats.c
index 844bd9dbfbf0..1a9614c69669 100644
--- a/kernel/sched/stats.c
+++ b/kernel/sched/stats.c
@@ -5,6 +5,110 @@
 #include "sched.h"
 #include "stats.h"
 
+void __update_stats_wait_start(struct rq *rq, struct task_struct *p,
+			       struct sched_statistics *stats)
+{
+	u64 wait_start, prev_wait_start;
+
+	wait_start = rq_clock(rq);
+	prev_wait_start = schedstat_val(stats->wait_start);
+
+	if (p && likely(wait_start > prev_wait_start))
+		wait_start -= prev_wait_start;
+
+	__schedstat_set(stats->wait_start, wait_start);
+}
+
+void __update_stats_wait_end(struct rq *rq, struct task_struct *p,
+			     struct sched_statistics *stats)
+{
+	u64 delta;
+
+	delta = rq_clock(rq) - schedstat_val(stats->wait_start);
+
+	if (p) {
+		if (task_on_rq_migrating(p)) {
+			/*
+			 * Preserve migrating task's wait time so wait_start
+			 * time stamp can be adjusted to accumulate wait time
+			 * prior to migration.
+			 */
+			__schedstat_set(stats->wait_start, delta);
+
+			return;
+		}
+
+		trace_sched_stat_wait(p, delta);
+	}
+
+	__schedstat_set(stats->wait_max,
+			max(schedstat_val(stats->wait_max), delta));
+	__schedstat_inc(stats->wait_count);
+	__schedstat_add(stats->wait_sum, delta);
+	__schedstat_set(stats->wait_start, 0);
+}
+
+void __update_stats_enqueue_sleeper(struct rq *rq, struct task_struct *p,
+				    struct sched_statistics *stats)
+{
+	u64 sleep_start, block_start;
+
+	sleep_start = schedstat_val(stats->sleep_start);
+	block_start = schedstat_val(stats->block_start);
+
+	if (sleep_start) {
+		u64 delta = rq_clock(rq) - sleep_start;
+
+		if ((s64)delta < 0)
+			delta = 0;
+
+		if (unlikely(delta > schedstat_val(stats->sleep_max)))
+			__schedstat_set(stats->sleep_max, delta);
+
+		__schedstat_set(stats->sleep_start, 0);
+		__schedstat_add(stats->sum_sleep_runtime, delta);
+
+		if (p) {
+			account_scheduler_latency(p, delta >> 10, 1);
+			trace_sched_stat_sleep(p, delta);
+		}
+	}
+	if (block_start) {
+		u64 delta = rq_clock(rq) - block_start;
+
+		if ((s64)delta < 0)
+			delta = 0;
+
+		if (unlikely(delta > schedstat_val(stats->block_max)))
+			__schedstat_set(stats->block_max, delta);
+
+		__schedstat_set(stats->block_start, 0);
+		__schedstat_add(stats->sum_sleep_runtime, delta);
+
+		if (p) {
+			if (p->in_iowait) {
+				__schedstat_add(stats->iowait_sum, delta);
+				__schedstat_inc(stats->iowait_count);
+				trace_sched_stat_iowait(p, delta);
+			}
+
+			trace_sched_stat_blocked(p, delta);
+
+			/*
+			 * Blocking time is in units of nanosecs, so shift by
+			 * 20 to get a milliseconds-range estimation of the
+			 * amount of time that the task spent sleeping:
+			 */
+			if (unlikely(prof_on == SLEEP_PROFILING)) {
+				profile_hits(SLEEP_PROFILING,
+						(void *)get_wchan(p),
+						delta >> 20);
+			}
+			account_scheduler_latency(p, delta >> 10, 0);
+		}
+	}
+}
+
 /*
  * Current schedstat API version.
  *
diff --git a/kernel/sched/stats.h b/kernel/sched/stats.h
index 87242968712e..b8e3d4ee21e1 100644
--- a/kernel/sched/stats.h
+++ b/kernel/sched/stats.h
@@ -78,6 +78,33 @@ static inline int alloc_tg_schedstats(struct task_group *tg)
 	return 1;
 }
 
+void __update_stats_wait_start(struct rq *rq, struct task_struct *p,
+			       struct sched_statistics *stats);
+
+void __update_stats_wait_end(struct rq *rq, struct task_struct *p,
+			     struct sched_statistics *stats);
+void __update_stats_enqueue_sleeper(struct rq *rq, struct task_struct *p,
+				    struct sched_statistics *stats);
+
+static inline void
+check_schedstat_required(void)
+{
+	if (schedstat_enabled())
+		return;
+
+	/* Force schedstat enabled if a dependent tracepoint is active */
+	if (trace_sched_stat_wait_enabled()    ||
+		trace_sched_stat_sleep_enabled()   ||
+		trace_sched_stat_iowait_enabled()  ||
+		trace_sched_stat_blocked_enabled() ||
+		trace_sched_stat_runtime_enabled())  {
+		printk_deferred_once("Scheduler tracepoints stat_sleep, stat_iowait, "
+				     "stat_blocked and stat_runtime require the "
+				     "kernel parameter schedstats=enable or "
+				     "kernel.sched_schedstats=1\n");
+	}
+}
+
 #else /* !CONFIG_SCHEDSTATS: */
 static inline void rq_sched_info_arrive  (struct rq *rq, unsigned long long delta) { }
 static inline void rq_sched_info_dequeued(struct rq *rq, unsigned long long delta) { }
@@ -101,6 +128,11 @@ static inline int alloc_tg_schedstats(struct task_group *tg)
 	return 1;
 }
 
+# define __update_stats_wait_start(rq, p, stats)	do { } while (0)
+# define __update_stats_wait_end(rq, p, stats)		do { } while (0)
+# define __update_stats_enqueue_sleeper(rq, p, stats)	do { } while (0)
+# define check_schedstat_required()			do { } while (0)
+
 #endif /* CONFIG_SCHEDSTATS */
 
 #ifdef CONFIG_PSI
-- 
2.18.4



* [PATCH 5/6] sched, rt: support sched_stat_runtime tracepoint for RT sched class
  2020-12-01 11:54 [PATCH 0/6] sched: support schedstats for RT sched class Yafang Shao
                   ` (3 preceding siblings ...)
  2020-12-01 11:54 ` [PATCH 4/6] sched: make schedstats helpers " Yafang Shao
@ 2020-12-01 11:54 ` Yafang Shao
  2020-12-01 11:54 ` [PATCH 6/6] sched, rt: support schedstats " Yafang Shao
  5 siblings, 0 replies; 19+ messages in thread
From: Yafang Shao @ 2020-12-01 11:54 UTC (permalink / raw)
  To: mgorman, mingo, peterz, juri.lelli, vincent.guittot,
	dietmar.eggemann, rostedt, bsegall, bristot, qianjun.kernel
  Cc: linux-kernel, linux-rt-users, Yafang Shao

The runtime of an RT task is already accounted in update_curr_rt(), so
we only need to place the sched_stat_runtime tracepoint there.

One difference between fair tasks and RT tasks is that RT tasks have no
vruntime. To reuse the sched_stat_runtime tracepoint, '0' is passed as
the vruntime for RT tasks.

The output of this tracepoint for an RT task is as follows:
          stress-9748    [039] d.h.   113.519352: sched_stat_runtime: comm=stress pid=9748 runtime=997573 [ns] vruntime=0 [ns]
          stress-9748    [039] d.h.   113.520352: sched_stat_runtime: comm=stress pid=9748 runtime=997627 [ns] vruntime=0 [ns]
          stress-9748    [039] d.h.   113.521352: sched_stat_runtime: comm=stress pid=9748 runtime=998203 [ns] vruntime=0 [ns]
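
One possible way to capture such output with ftrace (assuming debugfs
is mounted at the usual location):

  # echo 1 > /sys/kernel/debug/tracing/events/sched/sched_stat_runtime/enable
  # cat /sys/kernel/debug/tracing/trace_pipe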

Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
---
 kernel/sched/rt.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c
index 2d543a270dfe..da989653b0a2 100644
--- a/kernel/sched/rt.c
+++ b/kernel/sched/rt.c
@@ -1011,6 +1011,8 @@ static void update_curr_rt(struct rq *rq)
 	schedstat_set(curr->stats.exec_max,
 		      max(curr->stats.exec_max, delta_exec));
 
+	trace_sched_stat_runtime(curr, delta_exec, 0);
+
 	curr->se.sum_exec_runtime += delta_exec;
 	account_group_exec_runtime(curr, delta_exec);
 
-- 
2.18.4



* [PATCH 6/6] sched, rt: support schedstats for RT sched class
  2020-12-01 11:54 [PATCH 0/6] sched: support schedstats for RT sched class Yafang Shao
                   ` (4 preceding siblings ...)
  2020-12-01 11:54 ` [PATCH 5/6] sched, rt: support sched_stat_runtime tracepoint for RT " Yafang Shao
@ 2020-12-01 11:54 ` Yafang Shao
  2020-12-01 13:59   ` kernel test robot
                     ` (2 more replies)
  5 siblings, 3 replies; 19+ messages in thread
From: Yafang Shao @ 2020-12-01 11:54 UTC (permalink / raw)
  To: mgorman, mingo, peterz, juri.lelli, vincent.guittot,
	dietmar.eggemann, rostedt, bsegall, bristot, qianjun.kernel
  Cc: linux-kernel, linux-rt-users, Yafang Shao

We want to measure the latency of RT tasks in our production
environment with the schedstats facility, but currently schedstats is
only supported for the fair sched class. This patch enables it for the
RT sched class as well.

After we make struct sched_statistics and its helpers independent of
the fair sched class, we can easily use the schedstats facility for the
RT sched class.

The schedstat usage in the RT sched class is similar to that in the
fair sched class, for example:
                fair                            RT
enqueue         update_stats_enqueue_fair       update_stats_enqueue_rt
dequeue         update_stats_dequeue_fair       update_stats_dequeue_rt
put_prev_task   update_stats_wait_start_fair    update_stats_wait_start_rt
set_next_task   update_stats_wait_end_fair      update_stats_wait_end_rt

The user can get the schedstats information in the same way as for the
fair sched class. For example:
		fair				RT
task show	/proc/[pid]/sched               /proc/[pid]/sched
group show	cpu.stat in cgroup		cpu.stat in cgroup

The output of an RT task's schedstats is as follows:
$ cat /proc/10461/sched
...
stats.sum_sleep_runtime                      :         37966.502936
stats.wait_start                             :             0.000000
stats.sleep_start                            :             0.000000
stats.block_start                            :        279182.986012
stats.sleep_max                              :             9.001121
stats.block_max                              :             9.292692
stats.exec_max                               :             0.090009
stats.slice_max                              :             0.000000
stats.wait_max                               :             0.005305
stats.wait_sum                               :             0.352352
stats.wait_count                             :               236173
stats.iowait_sum                             :         37875.625128
stats.iowait_count                           :               235933
stats.nr_migrations_cold                     :                    0
stats.nr_failed_migrations_affine            :                    0
stats.nr_failed_migrations_running           :                    0
stats.nr_failed_migrations_hot               :                    0
stats.nr_forced_migrations                   :                    0
stats.nr_wakeups                             :               236172
stats.nr_wakeups_sync                        :                    0
stats.nr_wakeups_migrate                     :                    2
stats.nr_wakeups_local                       :               235865
stats.nr_wakeups_remote                      :                  307
stats.nr_wakeups_affine                      :                    0
stats.nr_wakeups_affine_attempts             :                    0
stats.nr_wakeups_passive                     :                    0
stats.nr_wakeups_idle                        :                    0
...

The sched:sched_stat_{wait, sleep, iowait, blocked} tracepoints can
be used to trace RT tasks as well.
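
These events can be enabled through tracefs, for example (a sketch; the
path assumes tracefs is mounted under /sys/kernel/debug):

$ cd /sys/kernel/debug/tracing
$ for e in wait sleep iowait blocked; do
>     echo 1 > events/sched/sched_stat_$e/enable
> done
$ cat trace_pipe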

The output of these tracepoints for an RT task is as follows:

- blocked & iowait
    kworker/48:1-442     [048] d...   539.830872: sched_stat_iowait: comm=stress pid=10461 delay=158242 [ns]
    kworker/48:1-442     [048] d...   539.830872: sched_stat_blocked: comm=stress pid=10461 delay=158242 [ns]

- wait
          stress-10460   [001] dN..   813.965304: sched_stat_wait: comm=stress pid=10462 delay=99997536 [ns]
          stress-10462   [001] d.h.   813.966300: sched_stat_runtime: comm=stress pid=10462 runtime=993812 [ns] vruntime=0 [ns]
          [...]
          stress-10462   [001] d.h.   814.065300: sched_stat_runtime: comm=stress pid=10462 runtime=994484 [ns] vruntime=0 [ns]
          [ 100 sched_stat_runtime events in total for pid 10462 ]
          [ The wait delay of pid 10462 is the sum of the runtimes above ]
          stress-10462   [001] dN..   814.065307: sched_stat_wait: comm=stress pid=10460 delay=100001089 [ns]
          stress-10460   [001] d.h.   814.066299: sched_stat_runtime: comm=stress pid=10460 runtime=991934 [ns] vruntime=0 [ns]

- sleep
           sleep-15582   [041] dN..  1732.814348: sched_stat_sleep: comm=sleep.sh pid=15474 delay=1001223130 [ns]
           sleep-15584   [041] dN..  1733.815908: sched_stat_sleep: comm=sleep.sh pid=15474 delay=1001238954 [ns]
           [ sleep.sh sleeps 1 second on each iteration; see the sketch below. ]
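
For reference, a sleep.sh along these lines (a hypothetical
reconstruction, not the original script) reproduces the ~1 s delays:

#!/bin/sh
# Wake up once per second; each wakeup is reported by
# sched_stat_sleep with a delay of roughly 1e9 ns.
while true; do
	sleep 1
done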

Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
---
 kernel/sched/rt.c | 134 +++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 133 insertions(+), 1 deletion(-)

diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c
index da989653b0a2..f764c2b9070d 100644
--- a/kernel/sched/rt.c
+++ b/kernel/sched/rt.c
@@ -1271,6 +1271,121 @@ static void __delist_rt_entity(struct sched_rt_entity *rt_se, struct rt_prio_arr
 	rt_se->on_list = 0;
 }
 
+static inline void
+__schedstat_from_sched_rt_entity(struct sched_rt_entity *rt_se,
+				 struct sched_statistics **stats)
+{
+	struct task_struct *p;
+	struct task_group *tg;
+	struct rt_rq *rt_rq;
+	int cpu;
+
+	if (rt_entity_is_task(rt_se)) {
+		p = rt_task_of(rt_se);
+		*stats = &p->stats;
+	} else {
+		rt_rq = group_rt_rq(rt_se);
+		tg = rt_rq->tg;
+		cpu = cpu_of(rq_of_rt_rq(rt_rq));
+		*stats = tg->stats[cpu];
+	}
+}
+
+static inline void
+schedstat_from_sched_rt_entity(struct sched_rt_entity *rt_se,
+			       struct sched_statistics **stats)
+{
+	if (!schedstat_enabled())
+		return;
+
+	__schedstat_from_sched_rt_entity(rt_se, stats);
+}
+
+static inline void
+update_stats_wait_start_rt(struct rt_rq *rt_rq, struct sched_rt_entity *rt_se)
+{
+	struct sched_statistics *stats = NULL;
+	struct task_struct *p = NULL;
+
+	if (!schedstat_enabled())
+		return;
+
+	if (rt_entity_is_task(rt_se))
+		p = rt_task_of(rt_se);
+
+	__schedstat_from_sched_rt_entity(rt_se, &stats);
+
+	__update_stats_wait_start(rq_of_rt_rq(rt_rq), p, stats);
+}
+
+static inline void
+update_stats_enqueue_sleeper_rt(struct rt_rq *rt_rq, struct sched_rt_entity *rt_se)
+{
+	struct sched_statistics *stats = NULL;
+	struct task_struct *p = NULL;
+
+	if (!schedstat_enabled())
+		return;
+
+	if (rt_entity_is_task(rt_se))
+		p = rt_task_of(rt_se);
+
+	__schedstat_from_sched_rt_entity(rt_se, &stats);
+
+	__update_stats_enqueue_sleeper(rq_of_rt_rq(rt_rq), p, stats);
+}
+
+static inline void
+update_stats_enqueue_rt(struct rt_rq *rt_rq, struct sched_rt_entity *rt_se,
+			int flags)
+{
+	if (!schedstat_enabled())
+		return;
+
+	if (flags & ENQUEUE_WAKEUP)
+		update_stats_enqueue_sleeper_rt(rt_rq, rt_se);
+}
+
+static inline void
+update_stats_wait_end_rt(struct rt_rq *rt_rq, struct sched_rt_entity *rt_se)
+{
+	struct sched_statistics *stats = NULL;
+	struct task_struct *p = NULL;
+
+	if (!schedstat_enabled())
+		return;
+
+	if (rt_entity_is_task(rt_se))
+		p = rt_task_of(rt_se);
+
+	__schedstat_from_sched_rt_entity(rt_se, &stats);
+
+	__update_stats_wait_end(rq_of_rt_rq(rt_rq), p, stats);
+}
+
+static inline void
+update_stats_dequeue_rt(struct rt_rq *rt_rq, struct sched_rt_entity *rt_se,
+			int flags)
+{
+	struct task_struct *p = NULL;
+
+	if (!schedstat_enabled())
+		return;
+
+	if (rt_entity_is_task(rt_se))
+		p = rt_task_of(rt_se);
+
+	if ((flags & DEQUEUE_SLEEP) && p) {
+		if (p->state & TASK_INTERRUPTIBLE)
+			__schedstat_set(p->stats.sleep_start,
+					rq_clock(rq_of_rt_rq(rt_rq)));
+
+		if (p->state & TASK_UNINTERRUPTIBLE)
+			__schedstat_set(p->stats.block_start,
+					rq_clock(rq_of_rt_rq(rt_rq)));
+	}
+}
+
 static void __enqueue_rt_entity(struct sched_rt_entity *rt_se, unsigned int flags)
 {
 	struct rt_rq *rt_rq = rt_rq_of_se(rt_se);
@@ -1344,6 +1459,8 @@ static void enqueue_rt_entity(struct sched_rt_entity *rt_se, unsigned int flags)
 {
 	struct rq *rq = rq_of_rt_se(rt_se);
 
+	update_stats_enqueue_rt(rt_rq_of_se(rt_se), rt_se, flags);
+
 	dequeue_rt_stack(rt_se, flags);
 	for_each_sched_rt_entity(rt_se)
 		__enqueue_rt_entity(rt_se, flags);
@@ -1354,6 +1471,7 @@ static void dequeue_rt_entity(struct sched_rt_entity *rt_se, unsigned int flags)
 {
 	struct rq *rq = rq_of_rt_se(rt_se);
 
+	update_stats_dequeue_rt(rt_rq_of_se(rt_se), rt_se, flags);
 	dequeue_rt_stack(rt_se, flags);
 
 	for_each_sched_rt_entity(rt_se) {
@@ -1376,6 +1494,9 @@ enqueue_task_rt(struct rq *rq, struct task_struct *p, int flags)
 	if (flags & ENQUEUE_WAKEUP)
 		rt_se->timeout = 0;
 
+	check_schedstat_required();
+	update_stats_wait_start_rt(rt_rq_of_se(rt_se), rt_se);
+
 	enqueue_rt_entity(rt_se, flags);
 
 	if (!task_current(rq, p) && p->nr_cpus_allowed > 1)
@@ -1574,9 +1695,14 @@ static void check_preempt_curr_rt(struct rq *rq, struct task_struct *p, int flag
 #endif
 }
 
-static inline void set_next_task_rt(struct rq *rq, struct task_struct *p, bool first)
+void set_next_task_rt(struct rq *rq, struct task_struct *p, bool first)
 {
+	struct sched_rt_entity *rt_se = &p->rt;
+	struct rt_rq *rt_rq = &rq->rt;
+
 	p->se.exec_start = rq_clock_task(rq);
+	if (on_rt_rq(&p->rt))
+		update_stats_wait_end_rt(rt_rq, rt_se);
 
 	/* The running task is never eligible for pushing */
 	dequeue_pushable_task(rq, p);
@@ -1640,6 +1766,12 @@ static struct task_struct *pick_next_task_rt(struct rq *rq)
 
 static void put_prev_task_rt(struct rq *rq, struct task_struct *p)
 {
+	struct sched_rt_entity *rt_se = &p->rt;
+	struct rt_rq *rt_rq = &rq->rt;
+
+	if (on_rt_rq(&p->rt))
+		update_stats_wait_start_rt(rt_rq, rt_se);
+
 	update_curr_rt(rq);
 
 	update_rt_rq_load_avg(rq_clock_pelt(rq), rq, 1);
-- 
2.18.4


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [PATCH 2/6] sched, fair: use __schedstat_set() in set_next_entity()
  2020-12-01 11:54 ` [PATCH 2/6] sched, fair: use __schedstat_set() in set_next_entity() Yafang Shao
@ 2020-12-01 12:28   ` Mel Gorman
  0 siblings, 0 replies; 19+ messages in thread
From: Mel Gorman @ 2020-12-01 12:28 UTC (permalink / raw)
  To: Yafang Shao
  Cc: mingo, peterz, juri.lelli, vincent.guittot, dietmar.eggemann,
	rostedt, bsegall, bristot, qianjun.kernel, linux-kernel,
	linux-rt-users

On Tue, Dec 01, 2020 at 07:54:12PM +0800, Yafang Shao wrote:
> schedstat_enabled() has already been checked, so we can use
> __schedstat_set() directly.
> 
> Signed-off-by: Yafang Shao <laoar.shao@gmail.com>

Acked-by: Mel Gorman <mgorman@suse.de>

-- 
Mel Gorman
SUSE Labs

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 3/6] sched: make struct sched_statistics independent of fair sched class
  2020-12-01 11:54 ` [PATCH 3/6] sched: make struct sched_statistics independent of fair sched class Yafang Shao
@ 2020-12-01 12:41   ` Mel Gorman
  2020-12-02  2:06     ` Yafang Shao
  2020-12-01 13:19   ` kernel test robot
  2020-12-01 13:30   ` kernel test robot
  2 siblings, 1 reply; 19+ messages in thread
From: Mel Gorman @ 2020-12-01 12:41 UTC (permalink / raw)
  To: Yafang Shao
  Cc: mingo, peterz, juri.lelli, vincent.guittot, dietmar.eggemann,
	rostedt, bsegall, bristot, qianjun.kernel, linux-kernel,
	linux-rt-users

On Tue, Dec 01, 2020 at 07:54:13PM +0800, Yafang Shao wrote:
> If we want to use the schedstats facility for all sched classes, we
> should either move struct sched_statistics out of struct sched_entity
> or duplicate it in the other sched entity structs as well. The latter
> is obviously bad because it requires more space. So we should move it
> into a common struct which can be used by all sched classes.
> 
> The struct sched_statistics is the scheduler statistics of a task_struct
> or a task_group. So we can move it into struct task_struct and
> struct task_group to achieve the goal.
> 
> Below is the detailed explanation of the change in the structs.
> 
> - Before this patch
> 
> struct task_struct {            |-> struct sched_entity {
>     ...                         |       ...
>     struct sched_entity *se; ---|       struct sched_statistics statistics;
>     struct sched_rt_entity *rt;         ...
>     ...                                 ...
> };                                  };
> 
> struct task_group {             |--> se[0]->statistics : schedstats of CPU0
>     ...                         |
>  #ifdef CONFIG_FAIR_GROUP_SCHED |
>     struct sched_entity **se; --|--> se[1]->statistics : schedstats of CPU1
>                                 |
>  #endif                         |
>                                 |--> se[N]->statistics : schedstats of CPUn
> 
>  #ifdef CONFIG_FAIR_GROUP_SCHED
>     struct sched_rt_entity  **rt_se; (N/A)
>  #endif
>     ...
> };
> 
> The '**se' in task_group is allocated in the fair sched class, which is
> hard to reuse from other sched classes.
> 
> - After this patch
> 
> struct task_struct {
>     ...
>     struct sched_statistics statistics;
>     ...
>     struct sched_entity *se;
>     struct sched_rt_entity *rt;
>     ...
> };
> 
> struct task_group {                    |---> stats[0] : of CPU0
>     ...                                |
>     struct sched_statistics **stats; --|---> stats[1] : of CPU1
>     ...                                |
>                                        |---> stats[n] : of CPUn
>  #ifdef CONFIG_FAIR_GROUP_SCHED
>     struct sched_entity **se;
>  #endif
>  #ifdef CONFIG_RT_GROUP_SCHED
>     struct sched_rt_entity  **rt_se;
>  #endif
>     ...
> };
> 
> After the patch it is clear that both se and rt_se can easily reach the
> sched_statistics via a task_struct or a task_group.
> 
> Signed-off-by: Yafang Shao <laoar.shao@gmail.com>

I didn't see anything wrong as such, it's mostly a mechanical
conversion. The one slight caveat is the potential change in cache
location for the statistics but it's not necessarily negative. The stats
potentially move to a different cache line but it's less obvious whether
that even matters given the location is very similar.

There is increased overhead now when schedstats are *enabled* because
__schedstat_from_sched_entity() has to be called, but it appears that it is
protected by a schedstat_enabled() check. So ok, schedstats when enabled
are now a bit more expensive but they were expensive in the first place
so does it matter?

I'd have been happier if there was a comparison with schedstats enabled,
just in case the overhead is too high, but it could also do with a second
set of eyeballs.

It's somewhat tentative but

Acked-by: Mel Gorman <mgorman@suse.de>

-- 
Mel Gorman
SUSE Labs

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 4/6] sched: make schedstats helpers independent of fair sched class
  2020-12-01 11:54 ` [PATCH 4/6] sched: make schedstats helpers " Yafang Shao
@ 2020-12-01 12:46   ` Mel Gorman
  2020-12-02  2:04     ` Yafang Shao
  2020-12-01 13:58   ` kernel test robot
  1 sibling, 1 reply; 19+ messages in thread
From: Mel Gorman @ 2020-12-01 12:46 UTC (permalink / raw)
  To: Yafang Shao
  Cc: mingo, peterz, juri.lelli, vincent.guittot, dietmar.eggemann,
	rostedt, bsegall, bristot, qianjun.kernel, linux-kernel,
	linux-rt-users

On Tue, Dec 01, 2020 at 07:54:14PM +0800, Yafang Shao wrote:
> The original prototype of the schedstats helpers are
> 
> update_stats_wait_*(struct cfs_rq *cfs_rq, struct sched_entity *se)
> 
> The cfs_rq in these helpers is used to get the rq_clock, and the se is
> used to get the struct sched_statistics and the struct task_struct. In
> order to make these helpers available to all sched classes, we can pass
> the rq, sched_statistics and task_struct directly.
> 
> Then the new helpers are
> 
> update_stats_wait_*(struct rq *rq, struct task_struct *p,
> 		    struct sched_statistics *stats)
> 
> which are independent of fair sched class.
> 
> To avoid vmlinux growing too large or introducing overhead when
> !schedstat_enabled(), some new helpers called after schedstat_enabled() are also
> introduced, as suggested by Mel. These helpers are in sched/stats.c:
> 
> __update_stats_wait_*(struct rq *rq, struct task_struct *p,
> 		      struct sched_statistics *stats)
> 
> Cc: Mel Gorman <mgorman@suse.de>
> Signed-off-by: Yafang Shao <laoar.shao@gmail.com>

Think it's ok, it's mostly code shuffling. I'd have been happier if
there was evidence showing a before/after comparison of the sched stats
for something simple like "perf bench sched pipe" and a clear statement
of no functional change as well as a comparison of the vmlinux files but
I think it's ok so;

Acked-by: Mel Gorman <mgorman@suse.de>

I didn't look at the rt.c parts

-- 
Mel Gorman
SUSE Labs

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 3/6] sched: make struct sched_statistics independent of fair sched class
  2020-12-01 11:54 ` [PATCH 3/6] sched: make struct sched_statistics independent of fair sched class Yafang Shao
  2020-12-01 12:41   ` Mel Gorman
@ 2020-12-01 13:19   ` kernel test robot
  2020-12-01 13:30   ` kernel test robot
  2 siblings, 0 replies; 19+ messages in thread
From: kernel test robot @ 2020-12-01 13:19 UTC (permalink / raw)
  To: kbuild-all

[-- Attachment #1: Type: text/plain, Size: 3836 bytes --]

Hi Yafang,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on v5.10-rc6]
[cannot apply to tip/sched/core next-20201201]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Yafang-Shao/sched-support-schedstats-for-RT-sched-class/20201201-200101
base:    b65054597872ce3aefbc6a666385eabdf9e288da
config: nds32-alldefconfig (attached as .config)
compiler: nds32le-linux-gcc (GCC) 9.3.0
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # https://github.com/0day-ci/linux/commit/a2e8aa563953755a3bae609996b94b02a530641e
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review Yafang-Shao/sched-support-schedstats-for-RT-sched-class/20201201-200101
        git checkout a2e8aa563953755a3bae609996b94b02a530641e
        # save the attached .config to linux build tree
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross ARCH=nds32 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

All errors (new ones prefixed by >>):

   kernel/sched/fair.c: In function '__schedstat_from_sched_entity':
   kernel/sched/fair.c:854:14: error: 'struct cfs_rq' has no member named 'tg'
     854 |   tg = cfs_rq->tg;
         |              ^~
>> kernel/sched/fair.c:856:14: error: dereferencing pointer to incomplete type 'struct task_group'
     856 |   *stats = tg->stats[cpu];
         |              ^~
   kernel/sched/fair.c:847:6: warning: variable 'cpu' set but not used [-Wunused-but-set-variable]
     847 |  int cpu;
         |      ^~~
   kernel/sched/fair.c: At top level:
   kernel/sched/fair.c:5419:6: warning: no previous prototype for 'init_cfs_bandwidth' [-Wmissing-prototypes]
    5419 | void init_cfs_bandwidth(struct cfs_bandwidth *cfs_b) {}
         |      ^~~~~~~~~~~~~~~~~~
   kernel/sched/fair.c:11201:6: warning: no previous prototype for 'free_fair_sched_group' [-Wmissing-prototypes]
   11201 | void free_fair_sched_group(struct task_group *tg) { }
         |      ^~~~~~~~~~~~~~~~~~~~~
   kernel/sched/fair.c:11203:5: warning: no previous prototype for 'alloc_fair_sched_group' [-Wmissing-prototypes]
   11203 | int alloc_fair_sched_group(struct task_group *tg, struct task_group *parent)
         |     ^~~~~~~~~~~~~~~~~~~~~~
   kernel/sched/fair.c:11208:6: warning: no previous prototype for 'online_fair_sched_group' [-Wmissing-prototypes]
   11208 | void online_fair_sched_group(struct task_group *tg) { }
         |      ^~~~~~~~~~~~~~~~~~~~~~~
   kernel/sched/fair.c:11210:6: warning: no previous prototype for 'unregister_fair_sched_group' [-Wmissing-prototypes]
   11210 | void unregister_fair_sched_group(struct task_group *tg) { }
         |      ^~~~~~~~~~~~~~~~~~~~~~~~~~~

vim +856 kernel/sched/fair.c

   839	
   840	static inline void
   841	__schedstat_from_sched_entity(struct sched_entity *se,
   842				      struct sched_statistics **stats)
   843	{
   844		struct task_struct *p;
   845		struct task_group *tg;
   846		struct cfs_rq *cfs_rq;
   847		int cpu;
   848	
   849		if (entity_is_task(se)) {
   850			p = task_of(se);
   851			*stats = &p->stats;
   852		} else {
   853			cfs_rq = group_cfs_rq(se);
   854			tg = cfs_rq->tg;
   855			cpu = cpu_of(rq_of(cfs_rq));
 > 856			*stats = tg->stats[cpu];
   857		}
   858	}
   859	
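
nds32-alldefconfig builds without CONFIG_FAIR_GROUP_SCHED, so struct
cfs_rq has no tg member in that configuration. One possible fix (a
sketch, not the author's follow-up) is to compile the group branch out,
which also folds away the unused cpu variable:

static inline void
__schedstat_from_sched_entity(struct sched_entity *se,
			      struct sched_statistics **stats)
{
#ifdef CONFIG_FAIR_GROUP_SCHED
	if (!entity_is_task(se)) {
		struct cfs_rq *cfs_rq = group_cfs_rq(se);

		/* Group entity: use the task group's per-CPU stats. */
		*stats = cfs_rq->tg->stats[cpu_of(rq_of(cfs_rq))];
		return;
	}
#endif
	/* Task entity: the stats are embedded in the task_struct. */
	*stats = &task_of(se)->stats;
}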

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all(a)lists.01.org

[-- Attachment #2: config.gz --]
[-- Type: application/gzip, Size: 7317 bytes --]

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 3/6] sched: make struct sched_statistics independent of fair sched class
  2020-12-01 11:54 ` [PATCH 3/6] sched: make struct sched_statistics independent of fair sched class Yafang Shao
  2020-12-01 12:41   ` Mel Gorman
  2020-12-01 13:19   ` kernel test robot
@ 2020-12-01 13:30   ` kernel test robot
  2 siblings, 0 replies; 19+ messages in thread
From: kernel test robot @ 2020-12-01 13:30 UTC (permalink / raw)
  To: kbuild-all

[-- Attachment #1: Type: text/plain, Size: 3758 bytes --]

Hi Yafang,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on v5.10-rc6]
[cannot apply to tip/sched/core next-20201201]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Yafang-Shao/sched-support-schedstats-for-RT-sched-class/20201201-200101
base:    b65054597872ce3aefbc6a666385eabdf9e288da
config: i386-randconfig-m021-20201201 (attached as .config)
compiler: gcc-9 (Debian 9.3.0-15) 9.3.0
reproduce (this is a W=1 build):
        # https://github.com/0day-ci/linux/commit/a2e8aa563953755a3bae609996b94b02a530641e
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review Yafang-Shao/sched-support-schedstats-for-RT-sched-class/20201201-200101
        git checkout a2e8aa563953755a3bae609996b94b02a530641e
        # save the attached .config to linux build tree
        make W=1 ARCH=i386 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

All warnings (new ones prefixed by >>):

   In file included from kernel/sched/fair.c:24:
   kernel/sched/stats.h: In function 'free_tg_schedstats':
   kernel/sched/stats.h:50:9: error: dereferencing pointer to incomplete type 'struct task_group'
      50 |   if (tg->stats)
         |         ^~
   kernel/sched/fair.c: In function '__schedstat_from_sched_entity':
   kernel/sched/fair.c:854:14: error: 'struct cfs_rq' has no member named 'tg'
     854 |   tg = cfs_rq->tg;
         |              ^~
>> kernel/sched/fair.c:847:6: warning: variable 'cpu' set but not used [-Wunused-but-set-variable]
     847 |  int cpu;
         |      ^~~
   kernel/sched/fair.c: At top level:
   kernel/sched/fair.c:5419:6: warning: no previous prototype for 'init_cfs_bandwidth' [-Wmissing-prototypes]
    5419 | void init_cfs_bandwidth(struct cfs_bandwidth *cfs_b) {}
         |      ^~~~~~~~~~~~~~~~~~
   kernel/sched/fair.c:11201:6: warning: no previous prototype for 'free_fair_sched_group' [-Wmissing-prototypes]
   11201 | void free_fair_sched_group(struct task_group *tg) { }
         |      ^~~~~~~~~~~~~~~~~~~~~
   kernel/sched/fair.c:11203:5: warning: no previous prototype for 'alloc_fair_sched_group' [-Wmissing-prototypes]
   11203 | int alloc_fair_sched_group(struct task_group *tg, struct task_group *parent)
         |     ^~~~~~~~~~~~~~~~~~~~~~
   kernel/sched/fair.c:11208:6: warning: no previous prototype for 'online_fair_sched_group' [-Wmissing-prototypes]
   11208 | void online_fair_sched_group(struct task_group *tg) { }
         |      ^~~~~~~~~~~~~~~~~~~~~~~
   kernel/sched/fair.c:11210:6: warning: no previous prototype for 'unregister_fair_sched_group' [-Wmissing-prototypes]
   11210 | void unregister_fair_sched_group(struct task_group *tg) { }
         |      ^~~~~~~~~~~~~~~~~~~~~~~~~~~

vim +/cpu +847 kernel/sched/fair.c

   839	
   840	static inline void
   841	__schedstat_from_sched_entity(struct sched_entity *se,
   842				      struct sched_statistics **stats)
   843	{
   844		struct task_struct *p;
   845		struct task_group *tg;
   846		struct cfs_rq *cfs_rq;
 > 847		int cpu;
   848	
   849		if (entity_is_task(se)) {
   850			p = task_of(se);
   851			*stats = &p->stats;
   852		} else {
   853			cfs_rq = group_cfs_rq(se);
   854			tg = cfs_rq->tg;
   855			cpu = cpu_of(rq_of(cfs_rq));
   856			*stats = tg->stats[cpu];
   857		}
   858	}
   859	

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all(a)lists.01.org

[-- Attachment #2: config.gz --]
[-- Type: application/gzip, Size: 39476 bytes --]

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 4/6] sched: make schedstats helpers independent of fair sched class
  2020-12-01 11:54 ` [PATCH 4/6] sched: make schedstats helpers " Yafang Shao
  2020-12-01 12:46   ` Mel Gorman
@ 2020-12-01 13:58   ` kernel test robot
  1 sibling, 0 replies; 19+ messages in thread
From: kernel test robot @ 2020-12-01 13:58 UTC (permalink / raw)
  To: kbuild-all

[-- Attachment #1: Type: text/plain, Size: 4298 bytes --]

Hi Yafang,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on v5.10-rc6]
[cannot apply to tip/sched/core next-20201201]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Yafang-Shao/sched-support-schedstats-for-RT-sched-class/20201201-200101
base:    b65054597872ce3aefbc6a666385eabdf9e288da
config: m68k-randconfig-p001-20201201 (attached as .config)
compiler: m68k-linux-gcc (GCC) 9.3.0
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # https://github.com/0day-ci/linux/commit/e204b91936080185697938950f4db7466425716c
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review Yafang-Shao/sched-support-schedstats-for-RT-sched-class/20201201-200101
        git checkout e204b91936080185697938950f4db7466425716c
        # save the attached .config to linux build tree
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross ARCH=m68k 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

All warnings (new ones prefixed by >>):

   In file included from include/asm-generic/bug.h:5,
                    from arch/m68k/include/asm/bug.h:32,
                    from include/linux/bug.h:5,
                    from include/linux/thread_info.h:12,
                    from arch/m68k/include/asm/current.h:16,
                    from include/linux/sched.h:12,
                    from kernel/sched/sched.h:8,
                    from kernel/sched/fair.c:23:
   include/linux/scatterlist.h: In function 'sg_set_buf':
   arch/m68k/include/asm/page_no.h:33:50: warning: ordered comparison of pointer with null pointer [-Wextra]
      33 | #define virt_addr_valid(kaddr) (((void *)(kaddr) >= (void *)PAGE_OFFSET) && \
         |                                                  ^~
   include/linux/compiler.h:78:42: note: in definition of macro 'unlikely'
      78 | # define unlikely(x) __builtin_expect(!!(x), 0)
         |                                          ^
   include/linux/scatterlist.h:143:2: note: in expansion of macro 'BUG_ON'
     143 |  BUG_ON(!virt_addr_valid(buf));
         |  ^~~~~~
   include/linux/scatterlist.h:143:10: note: in expansion of macro 'virt_addr_valid'
     143 |  BUG_ON(!virt_addr_valid(buf));
         |          ^~~~~~~~~~~~~~~
   kernel/sched/fair.c: In function '__schedstat_from_sched_entity':
   kernel/sched/fair.c:854:14: error: 'struct cfs_rq' has no member named 'tg'
     854 |   tg = cfs_rq->tg;
         |              ^~
   kernel/sched/fair.c: In function 'update_stats_wait_start_fair':
>> kernel/sched/fair.c:923:22: warning: variable 'p' set but not used [-Wunused-but-set-variable]
     923 |  struct task_struct *p = NULL;
         |                      ^
   kernel/sched/fair.c: In function 'update_stats_wait_end_fair':
   kernel/sched/fair.c:940:22: warning: variable 'p' set but not used [-Wunused-but-set-variable]
     940 |  struct task_struct *p = NULL;
         |                      ^
   kernel/sched/fair.c: In function 'update_stats_enqueue_sleeper_fair':
   kernel/sched/fair.c:957:22: warning: variable 'p' set but not used [-Wunused-but-set-variable]
     957 |  struct task_struct *p = NULL;
         |                      ^

vim +/p +923 kernel/sched/fair.c

   918	
   919	static inline void
   920	update_stats_wait_start_fair(struct cfs_rq *cfs_rq, struct sched_entity *se)
   921	{
   922		struct sched_statistics *stats = NULL;
 > 923		struct task_struct *p = NULL;
   924	
   925		if (!schedstat_enabled())
   926			return;
   927	
   928		if (entity_is_task(se))
   929			p = task_of(se);
   930	
   931		__schedstat_from_sched_entity(se, &stats);
   932	
   933		__update_stats_wait_start(rq_of(cfs_rq), p, stats);
   934	}
   935	
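
The set-but-unused warnings for 'p' appear when CONFIG_SCHEDSTATS is
disabled, where __update_stats_wait_start() and friends presumably
compile away without evaluating their arguments. Assuming those stubs
are argument-dropping macros, one conventional fix is to make them empty
inline functions so the arguments are always consumed:

/*
 * Hypothetical !CONFIG_SCHEDSTATS stub: an empty inline function
 * evaluates (and thus "uses") its arguments, unlike a bare macro.
 */
static inline void
__update_stats_wait_start(struct rq *rq, struct task_struct *p,
			  struct sched_statistics *stats)
{
}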

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all(a)lists.01.org

[-- Attachment #2: config.gz --]
[-- Type: application/gzip, Size: 20261 bytes --]

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 6/6] sched, rt: support schedstats for RT sched class
  2020-12-01 11:54 ` [PATCH 6/6] sched, rt: support schedstats " Yafang Shao
@ 2020-12-01 13:59   ` kernel test robot
  2020-12-01 14:13   ` kernel test robot
  2020-12-01 14:30   ` kernel test robot
  2 siblings, 0 replies; 19+ messages in thread
From: kernel test robot @ 2020-12-01 13:59 UTC (permalink / raw)
  To: kbuild-all

[-- Attachment #1: Type: text/plain, Size: 4325 bytes --]

Hi Yafang,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on v5.10-rc6]
[cannot apply to tip/sched/core next-20201201]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Yafang-Shao/sched-support-schedstats-for-RT-sched-class/20201201-200101
base:    b65054597872ce3aefbc6a666385eabdf9e288da
config: nds32-alldefconfig (attached as .config)
compiler: nds32le-linux-gcc (GCC) 9.3.0
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # https://github.com/0day-ci/linux/commit/8fe6f2ed51d6372798149583be6c936c597c500e
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review Yafang-Shao/sched-support-schedstats-for-RT-sched-class/20201201-200101
        git checkout 8fe6f2ed51d6372798149583be6c936c597c500e
        # save the attached .config to linux build tree
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross ARCH=nds32 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

All errors (new ones prefixed by >>):

   kernel/sched/rt.c:253:6: warning: no previous prototype for 'free_rt_sched_group' [-Wmissing-prototypes]
     253 | void free_rt_sched_group(struct task_group *tg) { }
         |      ^~~~~~~~~~~~~~~~~~~
   kernel/sched/rt.c:255:5: warning: no previous prototype for 'alloc_rt_sched_group' [-Wmissing-prototypes]
     255 | int alloc_rt_sched_group(struct task_group *tg, struct task_group *parent)
         |     ^~~~~~~~~~~~~~~~~~~~
   kernel/sched/rt.c:668:6: warning: no previous prototype for 'sched_rt_bandwidth_account' [-Wmissing-prototypes]
     668 | bool sched_rt_bandwidth_account(struct rt_rq *rt_rq)
         |      ^~~~~~~~~~~~~~~~~~~~~~~~~~
   kernel/sched/rt.c: In function '__schedstat_from_sched_rt_entity':
   kernel/sched/rt.c:1288:13: error: 'struct rt_rq' has no member named 'tg'
    1288 |   tg = rt_rq->tg;
         |             ^~
>> kernel/sched/rt.c:1290:14: error: dereferencing pointer to incomplete type 'struct task_group'
    1290 |   *stats = tg->stats[cpu];
         |              ^~
   kernel/sched/rt.c:1281:6: warning: variable 'cpu' set but not used [-Wunused-but-set-variable]
    1281 |  int cpu;
         |      ^~~
   kernel/sched/rt.c: In function 'update_stats_wait_start_rt':
   kernel/sched/rt.c:1308:22: warning: variable 'p' set but not used [-Wunused-but-set-variable]
    1308 |  struct task_struct *p = NULL;
         |                      ^
   kernel/sched/rt.c: In function 'update_stats_enqueue_sleeper_rt':
   kernel/sched/rt.c:1325:22: warning: variable 'p' set but not used [-Wunused-but-set-variable]
    1325 |  struct task_struct *p = NULL;
         |                      ^
   kernel/sched/rt.c: In function 'update_stats_wait_end_rt':
   kernel/sched/rt.c:1353:22: warning: variable 'p' set but not used [-Wunused-but-set-variable]
    1353 |  struct task_struct *p = NULL;
         |                      ^
   kernel/sched/rt.c: At top level:
   kernel/sched/rt.c:1698:6: warning: no previous prototype for 'set_next_task_rt' [-Wmissing-prototypes]
    1698 | void set_next_task_rt(struct rq *rq, struct task_struct *p, bool first)
         |      ^~~~~~~~~~~~~~~~

vim +1290 kernel/sched/rt.c

  1273	
  1274	static inline void
  1275	__schedstat_from_sched_rt_entity(struct sched_rt_entity *rt_se,
  1276					 struct sched_statistics **stats)
  1277	{
  1278		struct task_struct *p;
  1279		struct task_group *tg;
  1280		struct rt_rq *rt_rq;
  1281		int cpu;
  1282	
  1283		if (rt_entity_is_task(rt_se)) {
  1284			p = rt_task_of(rt_se);
  1285			*stats = &p->stats;
  1286		} else {
  1287			rt_rq = group_rt_rq(rt_se);
  1288			tg = rt_rq->tg;
  1289			cpu = cpu_of(rq_of_rt_rq(rt_rq));
> 1290			*stats = tg->stats[cpu];
  1291		}
  1292	}
  1293	
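
The rt.c failure is the same pattern as in fair.c: struct rt_rq only has
a tg member with CONFIG_RT_GROUP_SCHED. A sketch of a guarded variant:

static inline void
__schedstat_from_sched_rt_entity(struct sched_rt_entity *rt_se,
				 struct sched_statistics **stats)
{
#ifdef CONFIG_RT_GROUP_SCHED
	if (!rt_entity_is_task(rt_se)) {
		struct rt_rq *rt_rq = group_rt_rq(rt_se);

		/* Group entity: use the task group's per-CPU stats. */
		*stats = rt_rq->tg->stats[cpu_of(rq_of_rt_rq(rt_rq))];
		return;
	}
#endif
	/* Task entity: the stats live in the task_struct. */
	*stats = &rt_task_of(rt_se)->stats;
}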

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all(a)lists.01.org

[-- Attachment #2: config.gz --]
[-- Type: application/gzip, Size: 7317 bytes --]

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 6/6] sched, rt: support schedstats for RT sched class
  2020-12-01 11:54 ` [PATCH 6/6] sched, rt: support schedstats " Yafang Shao
  2020-12-01 13:59   ` kernel test robot
@ 2020-12-01 14:13   ` kernel test robot
  2020-12-01 14:30   ` kernel test robot
  2 siblings, 0 replies; 19+ messages in thread
From: kernel test robot @ 2020-12-01 14:13 UTC (permalink / raw)
  To: kbuild-all

[-- Attachment #1: Type: text/plain, Size: 4772 bytes --]

Hi Yafang,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on v5.10-rc6]
[cannot apply to tip/sched/core next-20201201]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Yafang-Shao/sched-support-schedstats-for-RT-sched-class/20201201-200101
base:    b65054597872ce3aefbc6a666385eabdf9e288da
config: riscv-randconfig-r034-20201201 (attached as .config)
compiler: clang version 12.0.0 (https://github.com/llvm/llvm-project ac40a2d8f16b8a8c68fc811d67f647740e965cb8)
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # install riscv cross compiling tool for clang build
        # apt-get install binutils-riscv64-linux-gnu
        # https://github.com/0day-ci/linux/commit/8fe6f2ed51d6372798149583be6c936c597c500e
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review Yafang-Shao/sched-support-schedstats-for-RT-sched-class/20201201-200101
        git checkout 8fe6f2ed51d6372798149583be6c936c597c500e
        # save the attached .config to linux build tree
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross ARCH=riscv 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

All warnings (new ones prefixed by >>):

   kernel/sched/rt.c:253:6: warning: no previous prototype for function 'free_rt_sched_group' [-Wmissing-prototypes]
   void free_rt_sched_group(struct task_group *tg) { }
        ^
   kernel/sched/rt.c:253:1: note: declare 'static' if the function is not intended to be used outside of this translation unit
   void free_rt_sched_group(struct task_group *tg) { }
   ^
   static 
   kernel/sched/rt.c:255:5: warning: no previous prototype for function 'alloc_rt_sched_group' [-Wmissing-prototypes]
   int alloc_rt_sched_group(struct task_group *tg, struct task_group *parent)
       ^
   kernel/sched/rt.c:255:1: note: declare 'static' if the function is not intended to be used outside of this translation unit
   int alloc_rt_sched_group(struct task_group *tg, struct task_group *parent)
   ^
   static 
   kernel/sched/rt.c:668:6: warning: no previous prototype for function 'sched_rt_bandwidth_account' [-Wmissing-prototypes]
   bool sched_rt_bandwidth_account(struct rt_rq *rt_rq)
        ^
   kernel/sched/rt.c:668:1: note: declare 'static' if the function is not intended to be used outside of this translation unit
   bool sched_rt_bandwidth_account(struct rt_rq *rt_rq)
   ^
   static 
   kernel/sched/rt.c:1288:15: error: no member named 'tg' in 'struct rt_rq'
                   tg = rt_rq->tg;
                        ~~~~~  ^
   kernel/sched/rt.c:1290:14: error: incomplete definition of type 'struct task_group'
                   *stats = tg->stats[cpu];
                            ~~^
   include/linux/sched.h:65:8: note: forward declaration of 'struct task_group'
   struct task_group;
          ^
>> kernel/sched/rt.c:1698:6: warning: no previous prototype for function 'set_next_task_rt' [-Wmissing-prototypes]
   void set_next_task_rt(struct rq *rq, struct task_struct *p, bool first)
        ^
   kernel/sched/rt.c:1698:1: note: declare 'static' if the function is not intended to be used outside of this translation unit
   void set_next_task_rt(struct rq *rq, struct task_struct *p, bool first)
   ^
   static 
   4 warnings and 2 errors generated.

vim +/set_next_task_rt +1698 kernel/sched/rt.c

  1697	
> 1698	void set_next_task_rt(struct rq *rq, struct task_struct *p, bool first)
  1699	{
  1700		struct sched_rt_entity *rt_se = &p->rt;
  1701		struct rt_rq *rt_rq = &rq->rt;
  1702	
  1703		p->se.exec_start = rq_clock_task(rq);
  1704		if (on_rt_rq(&p->rt))
  1705			update_stats_wait_end_rt(rt_rq, rt_se);
  1706	
  1707		/* The running task is never eligible for pushing */
  1708		dequeue_pushable_task(rq, p);
  1709	
  1710		if (!first)
  1711			return;
  1712	
  1713		/*
  1714		 * If prev task was rt, put_prev_task() has already updated the
  1715		 * utilization. We only care of the case where we start to schedule a
  1716		 * rt task
  1717		 */
  1718		if (rq->curr->sched_class != &rt_sched_class)
  1719			update_rt_rq_load_avg(rq_clock_pelt(rq), rq, 0);
  1720	
  1721		rt_queue_push_tasks(rq);
  1722	}
  1723	
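
set_next_task_rt() is referenced only through rt_sched_class within
rt.c, so the simple fix for this warning (a sketch) is to restore the
static qualifier the patch dropped:

-void set_next_task_rt(struct rq *rq, struct task_struct *p, bool first)
+static void set_next_task_rt(struct rq *rq, struct task_struct *p, bool first)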

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all(a)lists.01.org

[-- Attachment #2: config.gz --]
[-- Type: application/gzip, Size: 35259 bytes --]

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 6/6] sched, rt: support schedstats for RT sched class
  2020-12-01 11:54 ` [PATCH 6/6] sched, rt: support schedstats " Yafang Shao
  2020-12-01 13:59   ` kernel test robot
  2020-12-01 14:13   ` kernel test robot
@ 2020-12-01 14:30   ` kernel test robot
  2020-12-02  2:07     ` Yafang Shao
  2 siblings, 1 reply; 19+ messages in thread
From: kernel test robot @ 2020-12-01 14:30 UTC (permalink / raw)
  To: kbuild-all

[-- Attachment #1: Type: text/plain, Size: 4825 bytes --]

Hi Yafang,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on v5.10-rc6]
[cannot apply to tip/sched/core next-20201201]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Yafang-Shao/sched-support-schedstats-for-RT-sched-class/20201201-200101
base:    b65054597872ce3aefbc6a666385eabdf9e288da
config: m68k-randconfig-p001-20201201 (attached as .config)
compiler: m68k-linux-gcc (GCC) 9.3.0
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # https://github.com/0day-ci/linux/commit/8fe6f2ed51d6372798149583be6c936c597c500e
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review Yafang-Shao/sched-support-schedstats-for-RT-sched-class/20201201-200101
        git checkout 8fe6f2ed51d6372798149583be6c936c597c500e
        # save the attached .config to linux build tree
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross ARCH=m68k 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

All warnings (new ones prefixed by >>):

   In file included from include/asm-generic/bug.h:5,
                    from arch/m68k/include/asm/bug.h:32,
                    from include/linux/bug.h:5,
                    from include/linux/thread_info.h:12,
                    from arch/m68k/include/asm/current.h:16,
                    from include/linux/sched.h:12,
                    from kernel/sched/sched.h:8,
                    from kernel/sched/rt.c:6:
   include/linux/scatterlist.h: In function 'sg_set_buf':
   arch/m68k/include/asm/page_no.h:33:50: warning: ordered comparison of pointer with null pointer [-Wextra]
      33 | #define virt_addr_valid(kaddr) (((void *)(kaddr) >= (void *)PAGE_OFFSET) && \
         |                                                  ^~
   include/linux/compiler.h:78:42: note: in definition of macro 'unlikely'
      78 | # define unlikely(x) __builtin_expect(!!(x), 0)
         |                                          ^
   include/linux/scatterlist.h:143:2: note: in expansion of macro 'BUG_ON'
     143 |  BUG_ON(!virt_addr_valid(buf));
         |  ^~~~~~
   include/linux/scatterlist.h:143:10: note: in expansion of macro 'virt_addr_valid'
     143 |  BUG_ON(!virt_addr_valid(buf));
         |          ^~~~~~~~~~~~~~~
   kernel/sched/rt.c: At top level:
   kernel/sched/rt.c:668:6: warning: no previous prototype for 'sched_rt_bandwidth_account' [-Wmissing-prototypes]
     668 | bool sched_rt_bandwidth_account(struct rt_rq *rt_rq)
         |      ^~~~~~~~~~~~~~~~~~~~~~~~~~
   kernel/sched/rt.c: In function '__schedstat_from_sched_rt_entity':
   kernel/sched/rt.c:1288:13: error: 'struct rt_rq' has no member named 'tg'
    1288 |   tg = rt_rq->tg;
         |             ^~
   kernel/sched/rt.c: In function 'update_stats_wait_start_rt':
>> kernel/sched/rt.c:1308:22: warning: variable 'p' set but not used [-Wunused-but-set-variable]
    1308 |  struct task_struct *p = NULL;
         |                      ^
   kernel/sched/rt.c: In function 'update_stats_enqueue_sleeper_rt':
   kernel/sched/rt.c:1325:22: warning: variable 'p' set but not used [-Wunused-but-set-variable]
    1325 |  struct task_struct *p = NULL;
         |                      ^
   kernel/sched/rt.c: In function 'update_stats_wait_end_rt':
   kernel/sched/rt.c:1353:22: warning: variable 'p' set but not used [-Wunused-but-set-variable]
    1353 |  struct task_struct *p = NULL;
         |                      ^
   kernel/sched/rt.c: At top level:
   kernel/sched/rt.c:1698:6: warning: no previous prototype for 'set_next_task_rt' [-Wmissing-prototypes]
    1698 | void set_next_task_rt(struct rq *rq, struct task_struct *p, bool first)
         |      ^~~~~~~~~~~~~~~~

vim +/p +1308 kernel/sched/rt.c

  1303	
  1304	static inline void
  1305	update_stats_wait_start_rt(struct rt_rq *rt_rq, struct sched_rt_entity *rt_se)
  1306	{
  1307		struct sched_statistics *stats = NULL;
> 1308		struct task_struct *p = NULL;
  1309	
  1310		if (!schedstat_enabled())
  1311			return;
  1312	
  1313		if (rt_entity_is_task(rt_se))
  1314			p = rt_task_of(rt_se);
  1315	
  1316		__schedstat_from_sched_rt_entity(rt_se, &stats);
  1317	
  1318		__update_stats_wait_start(rq_of_rt_rq(rt_rq), p, stats);
  1319	}
  1320	

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all(a)lists.01.org

[-- Attachment #2: config.gz --]
[-- Type: application/gzip, Size: 20261 bytes --]

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 4/6] sched: make schedstats helpers independent of fair sched class
  2020-12-01 12:46   ` Mel Gorman
@ 2020-12-02  2:04     ` Yafang Shao
  0 siblings, 0 replies; 19+ messages in thread
From: Yafang Shao @ 2020-12-02  2:04 UTC (permalink / raw)
  To: Mel Gorman
  Cc: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot,
	Dietmar Eggemann, Steven Rostedt, Benjamin Segall, bristot,
	jun qian, LKML, linux-rt-users

On Tue, Dec 1, 2020 at 8:46 PM Mel Gorman <mgorman@suse.de> wrote:
>
> On Tue, Dec 01, 2020 at 07:54:14PM +0800, Yafang Shao wrote:
> > The original prototype of the schedstats helpers are
> >
> > update_stats_wait_*(struct cfs_rq *cfs_rq, struct sched_entity *se)
> >
> > The cfs_rq in these helpers is used to get the rq_clock, and the se is
> > used to get the struct sched_statistics and the struct task_struct. In
> > order to make these helpers available to all sched classes, we can pass
> > the rq, sched_statistics and task_struct directly.
> >
> > Then the new helpers are
> >
> > update_stats_wait_*(struct rq *rq, struct task_struct *p,
> >                   struct sched_statistics *stats)
> >
> > which are independent of fair sched class.
> >
> > To avoid vmlinux growing too large or introducing overhead when
> > !schedstat_enabled(), some new helpers called after schedstat_enabled() are also
> > introduced, as suggested by Mel. These helpers are in sched/stats.c:
> >
> > __update_stats_wait_*(struct rq *rq, struct task_struct *p,
> >                     struct sched_statistics *stats)
> >
> > Cc: Mel Gorman <mgorman@suse.de>
> > Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
>
> Think it's ok, it's mostly code shuffling. I'd have been happier if
> there was evidence showing a before/after comparison of the sched stats
> for something simple like "perf bench sched pipe" and a clear statement
> of no functional change as well as a comparison of the vmlinux files but
> I think it's ok so;
>

Thanks for the review, I will do the comparison as you suggested.

> Acked-by: Mel Gorman <mgorman@suse.de>
>

Thanks!

> I didn't look at the rt.c parts
>
> --
> Mel Gorman
> SUSE Labs



-- 
Thanks
Yafang

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 3/6] sched: make struct sched_statistics independent of fair sched class
  2020-12-01 12:41   ` Mel Gorman
@ 2020-12-02  2:06     ` Yafang Shao
  0 siblings, 0 replies; 19+ messages in thread
From: Yafang Shao @ 2020-12-02  2:06 UTC (permalink / raw)
  To: Mel Gorman
  Cc: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vincent Guittot,
	Dietmar Eggemann, Steven Rostedt, Benjamin Segall, bristot,
	jun qian, LKML, linux-rt-users

On Tue, Dec 1, 2020 at 8:41 PM Mel Gorman <mgorman@suse.de> wrote:
>
> On Tue, Dec 01, 2020 at 07:54:13PM +0800, Yafang Shao wrote:
> > If we want to use the schedstats facility for all sched classes, we
> > should either move struct sched_statistics out of struct sched_entity
> > or duplicate it in the other sched entity structs as well. The latter
> > is obviously bad because it requires more space. So we should move it
> > into a common struct which can be used by all sched classes.
> >
> > The struct sched_statistics is the scheduler statistics of a task_struct
> > or a task_group. So we can move it into struct task_struct and
> > struct task_group to achieve the goal.
> >
> > Below is the detailed explanation of the change in the structs.
> >
> > - Before this patch
> >
> > struct task_struct {            |-> struct sched_entity {
> >     ...                         |       ...
> >     struct sched_entity *se; ---|       struct sched_statistics statistics;
> >     struct sched_rt_entity *rt;         ...
> >     ...                                 ...
> > };                                  };
> >
> > struct task_group {             |--> se[0]->statistics : schedstats of CPU0
> >     ...                         |
> >  #ifdef CONFIG_FAIR_GROUP_SCHED |
> >     struct sched_entity **se; --|--> se[1]->statistics : schedstats of CPU1
> >                                 |
> >  #endif                         |
> >                                 |--> se[N]->statistics : schedstats of CPUn
> >
> >  #ifdef CONFIG_FAIR_GROUP_SCHED
> >     struct sched_rt_entity  **rt_se; (N/A)
> >  #endif
> >     ...
> > };
> >
> > The '**se' in task_group is allocated in the fair sched class, which is
> > hard to reuse from other sched classes.
> >
> > - After this patch
> >
> > struct task_struct {
> >     ...
> >     struct sched_statistics statistics;
> >     ...
> >     struct sched_entity *se;
> >     struct sched_rt_entity *rt;
> >     ...
> > };
> >
> > struct task_group {                    |---> stats[0] : of CPU0
> >     ...                                |
> >     struct sched_statistics **stats; --|---> stats[1] : of CPU1
> >     ...                                |
> >                                        |---> stats[n] : of CPUn
> >  #ifdef CONFIG_FAIR_GROUP_SCHED
> >     struct sched_entity **se;
> >  #endif
> >  #ifdef CONFIG_RT_GROUP_SCHED
> >     struct sched_rt_entity  **rt_se;
> >  #endif
> >     ...
> > };
> >
> > After the patch it is clear that both se and rt_se can easily reach the
> > sched_statistics via a task_struct or a task_group.
> >
> > Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
>
> I didn't see anything wrong as such, it's mostly a mechanical
> conversion. The one slight caveat is the potential change in cache
> location for the statistics but it's not necessarily negative. The stats
> potentially move to a different cache line but it's less obvious whether
> that even matters given the location is very similar.
>
> There is increased overhead now when schedstats are *enabled* because
> __schedstat_from_sched_entity() has to be called, but it appears that it is
> protected by a schedstat_enabled() check. So ok, schedstats when enabled
> are now a bit more expensive but they were expensive in the first place
> so does it matter?
>
> I'd have been happier if there was a comparison with schedstats enabled,
> just in case the overhead is too high, but it could also do with a second
> set of eyeballs.
>

Sure, I will do the comparison. Thanks for the review again.


> It's somewhat tentative but
>
> Acked-by: Mel Gorman <mgorman@suse.de>
>
> --
> Mel Gorman
> SUSE Labs



-- 
Thanks
Yafang

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH 6/6] sched, rt: support schedstats for RT sched class
  2020-12-01 14:30   ` kernel test robot
@ 2020-12-02  2:07     ` Yafang Shao
  0 siblings, 0 replies; 19+ messages in thread
From: Yafang Shao @ 2020-12-02  2:07 UTC (permalink / raw)
  To: kbuild-all

[-- Attachment #1: Type: text/plain, Size: 5250 bytes --]

On Tue, Dec 1, 2020 at 10:30 PM kernel test robot <lkp@intel.com> wrote:
>
> Hi Yafang,
>
> Thank you for the patch! Perhaps something to improve:
>
> [auto build test WARNING on v5.10-rc6]
> [cannot apply to tip/sched/core next-20201201]
> [If your patch is applied to the wrong git tree, kindly drop us a note.
> And when submitting patch, we suggest to use '--base' as documented in
> https://git-scm.com/docs/git-format-patch]
>
> url:    https://github.com/0day-ci/linux/commits/Yafang-Shao/sched-support-schedstats-for-RT-sched-class/20201201-200101
> base:    b65054597872ce3aefbc6a666385eabdf9e288da
> config: m68k-randconfig-p001-20201201 (attached as .config)
> compiler: m68k-linux-gcc (GCC) 9.3.0
> reproduce (this is a W=1 build):
>         wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
>         chmod +x ~/bin/make.cross
>         # https://github.com/0day-ci/linux/commit/8fe6f2ed51d6372798149583be6c936c597c500e
>         git remote add linux-review https://github.com/0day-ci/linux
>         git fetch --no-tags linux-review Yafang-Shao/sched-support-schedstats-for-RT-sched-class/20201201-200101
>         git checkout 8fe6f2ed51d6372798149583be6c936c597c500e
>         # save the attached .config to linux build tree
>         COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross ARCH=m68k
>
> If you fix the issue, kindly add following tag as appropriate
> Reported-by: kernel test robot <lkp@intel.com>
>
> All warnings (new ones prefixed by >>):
>
>    In file included from include/asm-generic/bug.h:5,
>                     from arch/m68k/include/asm/bug.h:32,
>                     from include/linux/bug.h:5,
>                     from include/linux/thread_info.h:12,
>                     from arch/m68k/include/asm/current.h:16,
>                     from include/linux/sched.h:12,
>                     from kernel/sched/sched.h:8,
>                     from kernel/sched/rt.c:6:
>    include/linux/scatterlist.h: In function 'sg_set_buf':
>    arch/m68k/include/asm/page_no.h:33:50: warning: ordered comparison of pointer with null pointer [-Wextra]
>       33 | #define virt_addr_valid(kaddr) (((void *)(kaddr) >= (void *)PAGE_OFFSET) && \
>          |                                                  ^~
>    include/linux/compiler.h:78:42: note: in definition of macro 'unlikely'
>       78 | # define unlikely(x) __builtin_expect(!!(x), 0)
>          |                                          ^
>    include/linux/scatterlist.h:143:2: note: in expansion of macro 'BUG_ON'
>      143 |  BUG_ON(!virt_addr_valid(buf));
>          |  ^~~~~~
>    include/linux/scatterlist.h:143:10: note: in expansion of macro 'virt_addr_valid'
>      143 |  BUG_ON(!virt_addr_valid(buf));
>          |          ^~~~~~~~~~~~~~~
>    kernel/sched/rt.c: At top level:
>    kernel/sched/rt.c:668:6: warning: no previous prototype for 'sched_rt_bandwidth_account' [-Wmissing-prototypes]
>      668 | bool sched_rt_bandwidth_account(struct rt_rq *rt_rq)
>          |      ^~~~~~~~~~~~~~~~~~~~~~~~~~
>    kernel/sched/rt.c: In function '__schedstat_from_sched_rt_entity':
>    kernel/sched/rt.c:1288:13: error: 'struct rt_rq' has no member named 'tg'
>     1288 |   tg = rt_rq->tg;
>          |             ^~
>    kernel/sched/rt.c: In function 'update_stats_wait_start_rt':
> >> kernel/sched/rt.c:1308:22: warning: variable 'p' set but not used [-Wunused-but-set-variable]
>     1308 |  struct task_struct *p = NULL;
>          |                      ^
>    kernel/sched/rt.c: In function 'update_stats_enqueue_sleeper_rt':
>    kernel/sched/rt.c:1325:22: warning: variable 'p' set but not used [-Wunused-but-set-variable]
>     1325 |  struct task_struct *p = NULL;
>          |                      ^
>    kernel/sched/rt.c: In function 'update_stats_wait_end_rt':
>    kernel/sched/rt.c:1353:22: warning: variable 'p' set but not used [-Wunused-but-set-variable]
>     1353 |  struct task_struct *p = NULL;
>          |                      ^
>    kernel/sched/rt.c: At top level:
>    kernel/sched/rt.c:1698:6: warning: no previous prototype for 'set_next_task_rt' [-Wmissing-prototypes]
>     1698 | void set_next_task_rt(struct rq *rq, struct task_struct *p, bool first)
>          |      ^~~~~~~~~~~~~~~~
>
> vim +/p +1308 kernel/sched/rt.c
>
>   1303
>   1304  static inline void
>   1305  update_stats_wait_start_rt(struct rt_rq *rt_rq, struct sched_rt_entity *rt_se)
>   1306  {
>   1307          struct sched_statistics *stats = NULL;
> > 1308          struct task_struct *p = NULL;
>   1309
>   1310          if (!schedstat_enabled())
>   1311                  return;
>   1312
>   1313          if (rt_entity_is_task(rt_se))
>   1314                  p = rt_task_of(rt_se);
>   1315
>   1316          __schedstat_from_sched_rt_entity(rt_se, &stats);
>   1317
>   1318          __update_stats_wait_start(rq_of_rt_rq(rt_rq), p, stats);
>   1319  }
>   1320
>
> ---
> 0-DAY CI Kernel Test Service, Intel Corporation
> https://lists.01.org/hyperkitty/list/kbuild-all(a)lists.01.org


Thanks for the report. I will check these build failures.

-- 
Thanks
Yafang

^ permalink raw reply	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2020-12-02  2:07 UTC | newest]

Thread overview: 19+ messages
2020-12-01 11:54 [PATCH 0/6] sched: support schedstats for RT sched class Yafang Shao
2020-12-01 11:54 ` [PATCH 1/6] sched: don't include stats.h in sched.h Yafang Shao
2020-12-01 11:54 ` [PATCH 2/6] sched, fair: use __schedstat_set() in set_next_entity() Yafang Shao
2020-12-01 12:28   ` Mel Gorman
2020-12-01 11:54 ` [PATCH 3/6] sched: make struct sched_statistics independent of fair sched class Yafang Shao
2020-12-01 12:41   ` Mel Gorman
2020-12-02  2:06     ` Yafang Shao
2020-12-01 13:19   ` kernel test robot
2020-12-01 13:30   ` kernel test robot
2020-12-01 11:54 ` [PATCH 4/6] sched: make schedstats helpers " Yafang Shao
2020-12-01 12:46   ` Mel Gorman
2020-12-02  2:04     ` Yafang Shao
2020-12-01 13:58   ` kernel test robot
2020-12-01 11:54 ` [PATCH 5/6] sched, rt: support sched_stat_runtime tracepoint for RT " Yafang Shao
2020-12-01 11:54 ` [PATCH 6/6] sched, rt: support schedstats " Yafang Shao
2020-12-01 13:59   ` kernel test robot
2020-12-01 14:13   ` kernel test robot
2020-12-01 14:30   ` kernel test robot
2020-12-02  2:07     ` Yafang Shao
