linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [RFC PATCH] tg: count the sum wait time of a task group
@ 2018-07-02  7:29 王贇
  2018-07-02 12:03 ` Peter Zijlstra
  2018-07-03  5:42 ` [PATCH] tg: show " 王贇
  0 siblings, 2 replies; 11+ messages in thread
From: 王贇 @ 2018-07-02  7:29 UTC (permalink / raw)
  To: Ingo Molnar, Peter Zijlstra, linux-kernel

Although we can rely on cpuacct to present the cpu usage of task
groups, it is hard to tell how intensely these groups compete for
cpu resources.

Monitoring the wait time of each process could cost too much, and
there is no good way to accurately represent the contention from
that information; what we need is the wait time at the group level.

Thus we introduce the group's wait_sum, provided by the kernel, to
represent the contention between task groups: whenever a group's
cfs_rq finishes waiting, its wait time is accounted into the sum.

The cpu.stat is modified to show the new statistic, like:

   nr_periods 0
   nr_throttled 0
   throttled_time 0
   wait_sum 2035098795584

Now we can monitor the change in wait_sum to tell how much a task
group is suffering in the contention for cpu resources.

Signed-off-by: Michael Wang <yun.wang@linux.alibaba.com>
---
  kernel/sched/core.c  | 2 ++
  kernel/sched/fair.c  | 4 ++++
  kernel/sched/sched.h | 1 +
  3 files changed, 7 insertions(+)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 78d8fac..ac27b8d 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -6787,6 +6787,8 @@ static int cpu_cfs_stat_show(struct seq_file *sf, void *v)
  	seq_printf(sf, "nr_periods %d\n", cfs_b->nr_periods);
  	seq_printf(sf, "nr_throttled %d\n", cfs_b->nr_throttled);
  	seq_printf(sf, "throttled_time %llu\n", cfs_b->throttled_time);
+	if (schedstat_enabled())
+		seq_printf(sf, "wait_sum %llu\n", tg->wait_sum);

  	return 0;
  }
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 1866e64..ef82ceb 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -862,6 +862,7 @@ static void update_curr_fair(struct rq *rq)
  static inline void
  update_stats_wait_end(struct cfs_rq *cfs_rq, struct sched_entity *se)
  {
+	struct task_group *tg;
  	struct task_struct *p;
  	u64 delta;

@@ -882,6 +883,9 @@ static void update_curr_fair(struct rq *rq)
  			return;
  		}
  		trace_sched_stat_wait(p, delta);
+	} else {
+		tg = group_cfs_rq(se)->tg;
+		__schedstat_add(tg->wait_sum, delta);
  	}

  	__schedstat_set(se->statistics.wait_max,
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index 6601baf..bb9b4fb 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -358,6 +358,7 @@ struct task_group {
  	/* runqueue "owned" by this group on each CPU */
  	struct cfs_rq		**cfs_rq;
  	unsigned long		shares;
+	u64			wait_sum;

  #ifdef	CONFIG_SMP
  	/*
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [RFC PATCH] tg: count the sum wait time of a task group
  2018-07-02  7:29 [RFC PATCH] tg: count the sum wait time of a task group 王贇
@ 2018-07-02 12:03 ` Peter Zijlstra
  2018-07-03  2:10   ` 王贇
  2018-07-03  5:42 ` [PATCH] tg: show " 王贇
  1 sibling, 1 reply; 11+ messages in thread
From: Peter Zijlstra @ 2018-07-02 12:03 UTC (permalink / raw)
  To: 王贇; +Cc: Ingo Molnar, linux-kernel

On Mon, Jul 02, 2018 at 03:29:39PM +0800, 王贇 wrote:
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 1866e64..ef82ceb 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -862,6 +862,7 @@ static void update_curr_fair(struct rq *rq)
>  static inline void
>  update_stats_wait_end(struct cfs_rq *cfs_rq, struct sched_entity *se)
>  {
> +	struct task_group *tg;
>  	struct task_struct *p;
>  	u64 delta;
> 
> @@ -882,6 +883,9 @@ static void update_curr_fair(struct rq *rq)
>  			return;
>  		}
>  		trace_sched_stat_wait(p, delta);
> +	} else {
> +		tg = group_cfs_rq(se)->tg;
> +		__schedstat_add(tg->wait_sum, delta);
>  	}

You're joking right? This patch is both broken and utterly insane.

You're wanting to update an effectively global variable for every
schedule action (and it's broken because it is without any serialization
or atomics).

NAK
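
[ Editorial illustration, not part of the original mail: the lost-update
  hazard described above, as a self-contained user-space analogy.  Several
  threads bump one shared counter with plain "+=", while each also keeps a
  private accumulator that is summed at read time -- roughly the direction
  the follow-up patch takes with the existing per-cpu schedstats.  All
  names below are hypothetical. ]

#include <pthread.h>
#include <stdio.h>

#define NTHREADS 4
#define ITERS    1000000

static unsigned long long shared_sum;		/* like a global tg->wait_sum */
static unsigned long long local_sum[NTHREADS];	/* like per-cpu statistics */

static void *worker(void *arg)
{
	long id = (long)arg;
	int i;

	for (i = 0; i < ITERS; i++) {
		shared_sum += 1;	/* unserialized read-modify-write: updates get lost */
		local_sum[id] += 1;	/* private accumulator: nothing to race with */
	}
	return NULL;
}

int main(void)
{
	pthread_t t[NTHREADS];
	unsigned long long total = 0;
	long i;

	for (i = 0; i < NTHREADS; i++)
		pthread_create(&t[i], NULL, worker, (void *)i);
	for (i = 0; i < NTHREADS; i++)
		pthread_join(t[i], NULL);
	for (i = 0; i < NTHREADS; i++)
		total += local_sum[i];	/* assemble the results at read time */

	printf("shared:    %llu (expected %llu)\n",
	       shared_sum, (unsigned long long)NTHREADS * ITERS);
	printf("assembled: %llu\n", total);
	return 0;
}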

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [RFC PATCH] tg: count the sum wait time of a task group
  2018-07-02 12:03 ` Peter Zijlstra
@ 2018-07-03  2:10   ` 王贇
  0 siblings, 0 replies; 11+ messages in thread
From: 王贇 @ 2018-07-03  2:10 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: Ingo Molnar, linux-kernel

Hi, Peter

On 2018/7/2 8:03 PM, Peter Zijlstra wrote:
> On Mon, Jul 02, 2018 at 03:29:39PM +0800, 王贇 wrote:
>> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
>> index 1866e64..ef82ceb 100644
>> --- a/kernel/sched/fair.c
>> +++ b/kernel/sched/fair.c
>> @@ -862,6 +862,7 @@ static void update_curr_fair(struct rq *rq)
>>   static inline void
>>   update_stats_wait_end(struct cfs_rq *cfs_rq, struct sched_entity *se)
>>   {
>> +	struct task_group *tg;
>>   	struct task_struct *p;
>>   	u64 delta;
>>
>> @@ -882,6 +883,9 @@ static void update_curr_fair(struct rq *rq)
>>   			return;
>>   		}
>>   		trace_sched_stat_wait(p, delta);
>> +	} else {
>> +		tg = group_cfs_rq(se)->tg;
>> +		__schedstat_add(tg->wait_sum, delta);
>>   	}
> 
> You're joking right? This patch is both broken and utterly insane.
> 
> You're wanting to update an effectively global variable for every
> schedule action (and it's broken because it is without any serialization
> or atomics).

Thanks for the reply, and sorry for the thoughtless approach. I'll
rewrite the code to use a per-cpu variable, then assemble the results
at show time.

Regards,
Michael Wang


> 
> NAK
> 

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH] tg: show the sum wait time of a task group
  2018-07-02  7:29 [RFC PATCH] tg: count the sum wait time of a task group 王贇
  2018-07-02 12:03 ` Peter Zijlstra
@ 2018-07-03  5:42 ` 王贇
  2018-07-04  3:27   ` [PATCH v2] " 王贇
  1 sibling, 1 reply; 11+ messages in thread
From: 王贇 @ 2018-07-03  5:42 UTC (permalink / raw)
  To: Ingo Molnar, Peter Zijlstra, linux-kernel

Although we can rely on cpuacct to present the cpu usage of task
groups, it is hard to tell how intensely these groups compete for
cpu resources.

Monitoring the wait time of each process, or parsing sched_debug,
could cost too much, and there is no good way to accurately represent
the contention from that information; what we need is the wait time
at the group level.

Thus we introduce the group's wait_sum to represent the contention
between task groups, which is simply the sum of the wait time of the
group's cfs_rq on each cpu.

The 'cpu.stat' is modified to show the statistic, like:

   nr_periods 0
   nr_throttled 0
   throttled_time 0
   wait_sum 2035098795584

Now we can monitor the change in wait_sum to tell how much a task
group is suffering in the contention for cpu resources.

For example:
   (wait_sum - last_wait_sum) * 100 / (nr_cpu * period_ns) == X%

means the task group spent X percent of the period waiting for
the cpu.
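
[ Editorial sketch, not part of the patch: a user-space sampler that
  computes the percentage above.  The cgroup path and the group name
  "mygroup" are hypothetical; it assumes the cgroup v1 cpu controller is
  mounted at /sys/fs/cgroup/cpu and schedstats are enabled.  E.g. on a
  4-cpu machine with a 1-second window, a wait_sum delta of 400ms (4e8 ns)
  prints 10%. ]

#include <stdio.h>
#include <string.h>
#include <unistd.h>

static unsigned long long read_wait_sum(const char *path)
{
	char key[64];
	unsigned long long val = 0, ret = 0;
	FILE *f = fopen(path, "r");

	if (!f)
		return 0;
	while (fscanf(f, "%63s %llu", key, &val) == 2) {
		if (!strcmp(key, "wait_sum"))
			ret = val;
	}
	fclose(f);
	return ret;
}

int main(void)
{
	/* hypothetical group path */
	const char *path = "/sys/fs/cgroup/cpu/mygroup/cpu.stat";
	unsigned long long nr_cpu = sysconf(_SC_NPROCESSORS_ONLN);
	unsigned long long period_ns = 1000000000ULL;	/* 1 second window */
	unsigned long long last, now;

	last = read_wait_sum(path);
	sleep(1);
	now = read_wait_sum(path);

	/* (wait_sum - last_wait_sum) * 100 / (nr_cpu * period_ns) == X% */
	printf("waited for the cpu %llu%% of the period\n",
	       (now - last) * 100 / (nr_cpu * period_ns));
	return 0;
}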

Signed-off-by: Michael Wang <yun.wang@linux.alibaba.com>
---
Since RFC:
   redesigned the way to acquire wait_sum

  kernel/sched/core.c | 8 ++++++++
  1 file changed, 8 insertions(+)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 78d8fac..cbff06b 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -6781,6 +6781,8 @@ static int __cfs_schedulable(struct task_group *tg, u64 period, u64 quota)

  static int cpu_cfs_stat_show(struct seq_file *sf, void *v)
  {
+	int i;
+	u64 wait_sum = 0;
  	struct task_group *tg = css_tg(seq_css(sf));
  	struct cfs_bandwidth *cfs_b = &tg->cfs_bandwidth;

@@ -6788,6 +6790,12 @@ static int cpu_cfs_stat_show(struct seq_file *sf, void *v)
  	seq_printf(sf, "nr_throttled %d\n", cfs_b->nr_throttled);
  	seq_printf(sf, "throttled_time %llu\n", cfs_b->throttled_time);

+	if (schedstat_enabled()) {
+		for_each_possible_cpu(i)
+			wait_sum += tg->se[i]->statistics.wait_sum;
+		seq_printf(sf, "wait_sum %llu\n", wait_sum);
+	}
+
  	return 0;
  }
  #endif /* CONFIG_CFS_BANDWIDTH */
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [PATCH v2] tg: show the sum wait time of a task group
  2018-07-03  5:42 ` [PATCH] tg: show " 王贇
@ 2018-07-04  3:27   ` 王贇
  2018-07-09  9:12     ` 王贇
                       ` (4 more replies)
  0 siblings, 5 replies; 11+ messages in thread
From: 王贇 @ 2018-07-04  3:27 UTC (permalink / raw)
  To: Ingo Molnar, Peter Zijlstra, linux-kernel

Although we can rely on cpuacct to present the cpu usage of task
groups, it is hard to tell how intensely these groups compete for
cpu resources.

Monitoring the wait time of each process, or parsing sched_debug,
could cost too much, and there is no good way to accurately represent
the contention from that information; what we need is the wait time
at the group level.

Thus we introduce the group's wait_sum to represent the contention
between task groups, which is simply the sum of the wait time of the
group's cfs_rq on each cpu.

The 'cpu.stat' is modified to show the statistic, like:

   nr_periods 0
   nr_throttled 0
   throttled_time 0
   wait_sum 2035098795584

Now we can monitor the change in wait_sum to tell how much a task
group is suffering in the contention for cpu resources.

For example:
   (wait_sum - last_wait_sum) * 100 / (nr_cpu * period_ns) == X%

means the task group spent X percent of the period waiting for
the cpu.

Signed-off-by: Michael Wang <yun.wang@linux.alibaba.com>
---

Since v1:
   Use schedstat_val to avoid compile error
   Check and skip root_task_group

  kernel/sched/core.c | 8 ++++++++
  1 file changed, 8 insertions(+)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 78d8fac..80ab995 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -6781,6 +6781,8 @@ static int __cfs_schedulable(struct task_group *tg, u64 period, u64 quota)

  static int cpu_cfs_stat_show(struct seq_file *sf, void *v)
  {
+	int i;
+	u64 ws = 0;
  	struct task_group *tg = css_tg(seq_css(sf));
  	struct cfs_bandwidth *cfs_b = &tg->cfs_bandwidth;

@@ -6788,6 +6790,12 @@ static int cpu_cfs_stat_show(struct seq_file *sf, void *v)
  	seq_printf(sf, "nr_throttled %d\n", cfs_b->nr_throttled);
  	seq_printf(sf, "throttled_time %llu\n", cfs_b->throttled_time);

+	if (schedstat_enabled() && tg != &root_task_group) {
+		for_each_possible_cpu(i)
+			ws += schedstat_val(tg->se[i]->statistics.wait_sum);
+		seq_printf(sf, "wait_sum %llu\n", ws);
+	}
+
  	return 0;
  }
  #endif /* CONFIG_CFS_BANDWIDTH */
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH v2] tg: show the sum wait time of a task group
  2018-07-04  3:27   ` [PATCH v2] " 王贇
@ 2018-07-09  9:12     ` 王贇
  2018-07-17  3:28     ` 王贇
                       ` (3 subsequent siblings)
  4 siblings, 0 replies; 11+ messages in thread
From: 王贇 @ 2018-07-09  9:12 UTC (permalink / raw)
  To: Ingo Molnar, Peter Zijlstra, linux-kernel



On 2018/7/4 11:27 AM, 王贇 wrote:
> Although we can rely on cpuacct to present the cpu usage of task
> groups, it is hard to tell how intensely these groups compete for
> cpu resources.
> 
> Monitoring the wait time of each process, or parsing sched_debug,
> could cost too much, and there is no good way to accurately represent
> the contention from that information; what we need is the wait time
> at the group level.
> 
> Thus we introduce the group's wait_sum to represent the contention
> between task groups, which is simply the sum of the wait time of the
> group's cfs_rq on each cpu.
> 
> The 'cpu.stat' is modified to show the statistic, like:
> 
>    nr_periods 0
>    nr_throttled 0
>    throttled_time 0
>    wait_sum 2035098795584
> 
> Now we can monitor the change in wait_sum to tell how much a task
> group is suffering in the contention for cpu resources.
> 
> For example:
>    (wait_sum - last_wait_sum) * 100 / (nr_cpu * period_ns) == X%
> 
> means the task group spent X percent of the period waiting for
> the cpu.

Hi, Peter

What do you think about this proposal?

There are situations where tasks in some groups suffer much more
than others; it would be good to have a way to easily locate them.

Regards,
Michael Wang

> 
> Signed-off-by: Michael Wang <yun.wang@linux.alibaba.com>
> ---
> 
> Since v1:
>    Use schedstat_val to avoid compile error
>    Check and skip root_task_group
> 
>   kernel/sched/core.c | 8 ++++++++
>   1 file changed, 8 insertions(+)
> 
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index 78d8fac..80ab995 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -6781,6 +6781,8 @@ static int __cfs_schedulable(struct task_group *tg, u64 period, u64 quota)
> 
>   static int cpu_cfs_stat_show(struct seq_file *sf, void *v)
>   {
> +    int i;
> +    u64 ws = 0;
>       struct task_group *tg = css_tg(seq_css(sf));
>       struct cfs_bandwidth *cfs_b = &tg->cfs_bandwidth;
> 
> @@ -6788,6 +6790,12 @@ static int cpu_cfs_stat_show(struct seq_file *sf, void *v)
>       seq_printf(sf, "nr_throttled %d\n", cfs_b->nr_throttled);
>       seq_printf(sf, "throttled_time %llu\n", cfs_b->throttled_time);
> 
> +    if (schedstat_enabled() && tg != &root_task_group) {
> +        for_each_possible_cpu(i)
> +            ws += schedstat_val(tg->se[i]->statistics.wait_sum);
> +        seq_printf(sf, "wait_sum %llu\n", ws);
> +    }
> +
>       return 0;
>   }
>   #endif /* CONFIG_CFS_BANDWIDTH */

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v2] tg: show the sum wait time of a task group
  2018-07-04  3:27   ` [PATCH v2] " 王贇
  2018-07-09  9:12     ` 王贇
@ 2018-07-17  3:28     ` 王贇
  2018-07-23  9:31     ` Peter Zijlstra
                       ` (2 subsequent siblings)
  4 siblings, 0 replies; 11+ messages in thread
From: 王贇 @ 2018-07-17  3:28 UTC (permalink / raw)
  To: Ingo Molnar, Peter Zijlstra, linux-kernel

Hi, folks

On 2018/7/4 11:27 AM, 王贇 wrote:
> Although we can rely on cpuacct to present the cpu usage of task
> groups, it is hard to tell how intensely these groups compete for
> cpu resources.
> 
> Monitoring the wait time of each process, or parsing sched_debug,
> could cost too much, and there is no good way to accurately represent
> the contention from that information; what we need is the wait time
> at the group level.
> 
> Thus we introduce the group's wait_sum to represent the contention
> between task groups, which is simply the sum of the wait time of the
> group's cfs_rq on each cpu.
> 
> The 'cpu.stat' is modified to show the statistic, like:
> 
>    nr_periods 0
>    nr_throttled 0
>    throttled_time 0
>    wait_sum 2035098795584
> 
> Now we can monitor the change in wait_sum to tell how much a task
> group is suffering in the contention for cpu resources.
> 
> For example:
>    (wait_sum - last_wait_sum) * 100 / (nr_cpu * period_ns) == X%
> 
> means the task group spent X percent of the period waiting for
> the cpu.

Any comments please?

Regards,
Michael Wang


> 
> Signed-off-by: Michael Wang <yun.wang@linux.alibaba.com>
> ---
> 
> Since v1:
>    Use schedstat_val to avoid compile error
>    Check and skip root_task_group
> 
>   kernel/sched/core.c | 8 ++++++++
>   1 file changed, 8 insertions(+)
> 
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index 78d8fac..80ab995 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -6781,6 +6781,8 @@ static int __cfs_schedulable(struct task_group *tg, u64 period, u64 quota)
> 
>   static int cpu_cfs_stat_show(struct seq_file *sf, void *v)
>   {
> +    int i;
> +    u64 ws = 0;
>       struct task_group *tg = css_tg(seq_css(sf));
>       struct cfs_bandwidth *cfs_b = &tg->cfs_bandwidth;
> 
> @@ -6788,6 +6790,12 @@ static int cpu_cfs_stat_show(struct seq_file *sf, void *v)
>       seq_printf(sf, "nr_throttled %d\n", cfs_b->nr_throttled);
>       seq_printf(sf, "throttled_time %llu\n", cfs_b->throttled_time);
> 
> +    if (schedstat_enabled() && tg != &root_task_group) {
> +        for_each_possible_cpu(i)
> +            ws += schedstat_val(tg->se[i]->statistics.wait_sum);
> +        seq_printf(sf, "wait_sum %llu\n", ws);
> +    }
> +
>       return 0;
>   }
>   #endif /* CONFIG_CFS_BANDWIDTH */

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v2] tg: show the sum wait time of a task group
  2018-07-04  3:27   ` [PATCH v2] " 王贇
  2018-07-09  9:12     ` 王贇
  2018-07-17  3:28     ` 王贇
@ 2018-07-23  9:31     ` Peter Zijlstra
  2018-07-23 12:32       ` 王贇
  2018-07-23 13:31     ` [PATCH v3] " 王贇
  2018-07-25 14:23     ` [tip:sched/core] sched/debug: Show the sum wait time of a " tip-bot for Yun Wang
  4 siblings, 1 reply; 11+ messages in thread
From: Peter Zijlstra @ 2018-07-23  9:31 UTC (permalink / raw)
  To: 王贇; +Cc: Ingo Molnar, linux-kernel

On Wed, Jul 04, 2018 at 11:27:27AM +0800, 王贇 wrote:

> @@ -6788,6 +6790,12 @@ static int cpu_cfs_stat_show(struct seq_file *sf, void *v)
>  	seq_printf(sf, "nr_throttled %d\n", cfs_b->nr_throttled);
>  	seq_printf(sf, "throttled_time %llu\n", cfs_b->throttled_time);
> 
> +	if (schedstat_enabled() && tg != &root_task_group) {

I put the variables here.

> +		for_each_possible_cpu(i)
> +			ws += schedstat_val(tg->se[i]->statistics.wait_sum);

This doesn't quite work on 32bit archs, but I'm not sure I care enough
to be bothered about that.

> +		seq_printf(sf, "wait_sum %llu\n", ws);
> +	}
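
[ Editorial note: the 32-bit concern is presumably that the u64 schedstat
  is updated and read with plain, non-atomic accesses, so a 32-bit reader
  may observe a torn value.  Below is a simplified user-space sketch of
  the seqcount-style retry the kernel typically uses for such 64-bit
  counters (e.g. u64_stats_sync); this is not what the patch does, and
  the barriers are reduced to full fences for brevity. ]

#include <stdint.h>
#include <stdio.h>

struct stat64 {
	volatile unsigned int seq;	/* even: stable, odd: writer active */
	volatile uint64_t val;
};

/* writer side: a single writer per counter, like one cpu updating its stats */
static void stat_add(struct stat64 *s, uint64_t delta)
{
	s->seq++;			/* mark the value as in flux */
	__sync_synchronize();
	s->val += delta;
	__sync_synchronize();
	s->seq++;			/* stable again */
}

/* reader side: retry until a consistent snapshot is observed */
static uint64_t stat_read(struct stat64 *s)
{
	unsigned int start;
	uint64_t v;

	do {
		start = s->seq;
		__sync_synchronize();
		v = s->val;
		__sync_synchronize();
	} while ((start & 1) || s->seq != start);

	return v;
}

int main(void)
{
	struct stat64 s = { 0, 0 };

	stat_add(&s, 2035098795584ULL);
	printf("wait_sum %llu\n", (unsigned long long)stat_read(&s));
	return 0;
}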

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v2] tg: show the sum wait time of a task group
  2018-07-23  9:31     ` Peter Zijlstra
@ 2018-07-23 12:32       ` 王贇
  0 siblings, 0 replies; 11+ messages in thread
From: 王贇 @ 2018-07-23 12:32 UTC (permalink / raw)
  To: Peter Zijlstra; +Cc: Ingo Molnar, linux-kernel



On 2018/7/23 5:31 PM, Peter Zijlstra wrote:
> On Wed, Jul 04, 2018 at 11:27:27AM +0800, 王贇 wrote:
> 
>> @@ -6788,6 +6790,12 @@ static int cpu_cfs_stat_show(struct seq_file *sf, void *v)
>>   	seq_printf(sf, "nr_throttled %d\n", cfs_b->nr_throttled);
>>   	seq_printf(sf, "throttled_time %llu\n", cfs_b->throttled_time);
>>
>> +	if (schedstat_enabled() && tg != &root_task_group) {
> 
> I put the variables here.

Will do that in next version :-)

> 
>> +		for_each_possible_cpu(i)
>> +			ws += schedstat_val(tg->se[i]->statistics.wait_sum);
> 
> This doesn't quite work on 32bit archs, but I'm not sure I care enough
> to be bothered about that.

It could easily overflow then... hopefully they won't really care
about the group conflicts.

Regards,
Michael Wang

> 
>> +		seq_printf(sf, "wait_sum %llu\n", ws);
>> +	}

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH v3] tg: show the sum wait time of a task group
  2018-07-04  3:27   ` [PATCH v2] " 王贇
                       ` (2 preceding siblings ...)
  2018-07-23  9:31     ` Peter Zijlstra
@ 2018-07-23 13:31     ` 王贇
  2018-07-25 14:23     ` [tip:sched/core] sched/debug: Show the sum wait time of a " tip-bot for Yun Wang
  4 siblings, 0 replies; 11+ messages in thread
From: 王贇 @ 2018-07-23 13:31 UTC (permalink / raw)
  To: Ingo Molnar, Peter Zijlstra, linux-kernel

Although we can rely on cpuacct to present the cpu usage of task
groups, it is hard to tell how intensely these groups compete for
cpu resources.

Monitoring the wait time of each process, or parsing sched_debug,
could cost too much, and there is no good way to accurately represent
the contention from that information; what we need is the wait time
at the group level.

Thus we introduce the group's wait_sum to represent the contention
between task groups, which is simply the sum of the wait time of the
group's cfs_rq on each cpu.

The 'cpu.stat' is modified to show the statistic, like:

   nr_periods 0
   nr_throttled 0
   throttled_time 0
   wait_sum 2035098795584

Now we can monitor the change in wait_sum to tell how much a task
group is suffering in the contention for cpu resources.

For example:
   (wait_sum - last_wait_sum) * 100 / (nr_cpu * period_ns) == X%

means the task group spent X percent of the period waiting for
the cpu.

Signed-off-by: Michael Wang <yun.wang@linux.alibaba.com>
---

Since v2:
   Declare variables inside branch (From Peter).


  kernel/sched/core.c | 9 +++++++++
  1 file changed, 9 insertions(+)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 78d8fac..2a7bb7c 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -6788,6 +6788,15 @@ static int cpu_cfs_stat_show(struct seq_file *sf, void *v)
  	seq_printf(sf, "nr_throttled %d\n", cfs_b->nr_throttled);
  	seq_printf(sf, "throttled_time %llu\n", cfs_b->throttled_time);

+	if (schedstat_enabled() && tg != &root_task_group) {
+		int i;
+		u64 ws = 0;
+
+		for_each_possible_cpu(i)
+			ws += schedstat_val(tg->se[i]->statistics.wait_sum);
+		seq_printf(sf, "wait_sum %llu\n", ws);
+	}
+
  	return 0;
  }
  #endif /* CONFIG_CFS_BANDWIDTH */
-- 
1.8.3.1


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* [tip:sched/core] sched/debug: Show the sum wait time of a task group
  2018-07-04  3:27   ` [PATCH v2] " 王贇
                       ` (3 preceding siblings ...)
  2018-07-23 13:31     ` [PATCH v3] " 王贇
@ 2018-07-25 14:23     ` tip-bot for Yun Wang
  4 siblings, 0 replies; 11+ messages in thread
From: tip-bot for Yun Wang @ 2018-07-25 14:23 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: linux-kernel, hpa, tglx, peterz, mingo, yun.wang, torvalds

Commit-ID:  3d6c50c27bd6418dceb51642540ecfcb8ca708c2
Gitweb:     https://git.kernel.org/tip/3d6c50c27bd6418dceb51642540ecfcb8ca708c2
Author:     Yun Wang <yun.wang@linux.alibaba.com>
AuthorDate: Wed, 4 Jul 2018 11:27:27 +0800
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Wed, 25 Jul 2018 11:41:05 +0200

sched/debug: Show the sum wait time of a task group

Although we can rely on cpuacct to present the CPU usage of task
groups, it is hard to tell how intense the competition is between
these groups on CPU resources.

Monitoring the wait time of each process, or sched_debug, could be
very expensive, and there is no good way to accurately represent the
conflict with that information; we need the wait time at the group level.

Thus we introduce group's wait_sum to represent the resource conflict
between task groups, which is simply the sum of the wait time of
the group's cfs_rq.

The 'cpu.stat' is modified to show the statistic, like:

   nr_periods 0
   nr_throttled 0
   throttled_time 0
   wait_sum 2035098795584

Now we can monitor the changes of wait_sum to tell how much a
task group is suffering in the fight for CPU resources.

For example:

   (wait_sum - last_wait_sum) * 100 / (nr_cpu * period_ns) == X%

means the task group spent X percent of the period waiting for
the CPU.

Signed-off-by: Michael Wang <yun.wang@linux.alibaba.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/ff7dae3b-e5f9-7157-1caa-ff02c6b23dc1@linux.alibaba.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 kernel/sched/core.c | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index fc177c06e490..2bc391a574e6 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -6748,6 +6748,16 @@ static int cpu_cfs_stat_show(struct seq_file *sf, void *v)
 	seq_printf(sf, "nr_throttled %d\n", cfs_b->nr_throttled);
 	seq_printf(sf, "throttled_time %llu\n", cfs_b->throttled_time);
 
+	if (schedstat_enabled() && tg != &root_task_group) {
+		u64 ws = 0;
+		int i;
+
+		for_each_possible_cpu(i)
+			ws += schedstat_val(tg->se[i]->statistics.wait_sum);
+
+		seq_printf(sf, "wait_sum %llu\n", ws);
+	}
+
 	return 0;
 }
 #endif /* CONFIG_CFS_BANDWIDTH */

^ permalink raw reply related	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2018-07-25 14:23 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-07-02  7:29 [RFC PATCH] tg: count the sum wait time of a task group 王贇
2018-07-02 12:03 ` Peter Zijlstra
2018-07-03  2:10   ` 王贇
2018-07-03  5:42 ` [PATCH] tg: show " 王贇
2018-07-04  3:27   ` [PATCH v2] " 王贇
2018-07-09  9:12     ` 王贇
2018-07-17  3:28     ` 王贇
2018-07-23  9:31     ` Peter Zijlstra
2018-07-23 12:32       ` 王贇
2018-07-23 13:31     ` [PATCH v3] " 王贇
2018-07-25 14:23     ` [tip:sched/core] sched/debug: Show the sum wait time of a " tip-bot for Yun Wang

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).