* [PATCH 0/4] remove runnable_load_avg and improve group_classify
@ 2020-02-11 17:46 Vincent Guittot
From: Vincent Guittot @ 2020-02-11 17:46 UTC (permalink / raw)
  To: mingo, peterz, juri.lelli, dietmar.eggemann, rostedt, bsegall,
	mgorman, linux-kernel
  Cc: pauld, parth, valentin.schneider, Vincent Guittot

NUMA load balancing is the last remaining piece of code that uses the
runnable_load_avg of PELT to balance tasks between nodes. The normal
load_balance has replaced it with a better description of the current
state of the group of CPUs. The same policy can be applied to NUMA
balancing.
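
For the NUMA side, this is essentially the same substitution that
load_balance already made; below is a minimal sketch of the direction
(simplified, with helper names borrowed from kernel/sched/fair.c of
that era, not the exact diff of patch 2):

	/*
	 * Sketch: update_numa_stats() stops summing the runnable load
	 * and uses the same load_avg based helper as load_balance().
	 */
	static void update_numa_stats(struct numa_stats *ns, int nid)
	{
		int cpu;

		memset(ns, 0, sizeof(*ns));
		for_each_cpu(cpu, cpumask_of_node(nid)) {
			struct rq *rq = cpu_rq(cpu);

			ns->load += cpu_load(rq); /* was: cpu_runnable_load(rq) */
			ns->compute_capacity += capacity_of(cpu);
		}
	}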

Once unused, runnable_load_avg can be replaced by a simpler runnable_avg
signal that tracks the time that tasks spend waiting on the rq.
Currently, the state of a group of CPUs is defined by the number of
running tasks and the level of utilization of the rq. But the
utilization can be temporarily low after the migration of a task whereas
the rq is still overloaded with tasks. In such a case, where tasks were
competing for the rq, the runnable_avg will stay high after the
migration.
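
A back-of-envelope model of why this matters (an illustrative
user-space computation, not kernel code; the steady-state values assume
PELT has fully converged):

	#include <stdio.h>

	#define SCHED_CAPACITY_SCALE	1024

	int main(void)
	{
		int nr_tasks = 4;

		/*
		 * nr_tasks competing on one CPU: each task runs
		 * 1/nr_tasks of the time, so its util_avg converges to
		 * 1024/nr_tasks, but it is runnable (running or
		 * waiting) all the time, so its runnable_avg converges
		 * to ~1024.
		 */
		int task_util = SCHED_CAPACITY_SCALE / nr_tasks;
		int rq_util = nr_tasks * task_util;
		int rq_runnable = nr_tasks * SCHED_CAPACITY_SCALE;

		/* one task migrates away: its contribution leaves too */
		printf("util=%d runnable=%d capacity=%d\n",
		       rq_util - task_util,
		       rq_runnable - SCHED_CAPACITY_SCALE,
		       SCHED_CAPACITY_SCALE);
		return 0;
	}

Right after the migration, utilization (768) is below the CPU capacity
(1024) and suggests spare room, while runnable (3072) still reports the
contention of the 3 remaining tasks.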

Some hackbench results:

- small arm64 dual quad cores system
hackbench -l (2560/#grp) -g #grp

grp    tip/sched/core         +patchset              improvement
1       1,327(+/-10,06 %)     1,247(+/-5,45 %)       5,97 %
4       1,250(+/- 2,55 %)     1,207(+/-2,12 %)       3,42 %
8       1,189(+/- 1,47 %)     1,179(+/-1,93 %)       0,90 %
16      1,221(+/- 3,25 %)     1,219(+/-2,44 %)       0,16 %

- large arm64 2 nodes / 224 cores system
hackbench -l (256000/#grp) -g #grp

grp    tip/sched/core         +patchset              improvement
1      14,197(+/- 2,73 %)     13,917(+/- 2,19 %)     1,98 %
4       6,817(+/- 1,27 %)      6,523(+/-11,96 %)     4,31 %
16      2,930(+/- 1,07 %)      2,911(+/- 1,08 %)     0,66 %
32      2,735(+/- 1,71 %)      2,725(+/- 1,53 %)     0,37 %
64      2,702(+/- 0,32 %)      2,717(+/- 1,07 %)    -0,53 %
128     3,533(+/-14,66 %)     3,123(+/-12,47 %)     11,59 %
256     3,918(+/-19,93 %)     3,390(+/- 5,93 %)     13,47 %

The significant improvements for 128 and 256 groups should be taken with
care because of some instability across iterations without the patchset.

The table below shows the classification of sched groups during load
balance (idle, newly idle or busy lb), with the distribution according
to the number of running tasks, for:
    hackbench -l 640 -g 4 on an octo-core system

                 tip/sched/core  +patchset
state
has spare            3973        1934
        nr_running
            0        1965        1858
            1         518          56
            2         369          18
            3         270           2
            4+        851           0

fully busy            546        1018
        nr_running
            0           0           0
            1         546        1018
            2           0           0
            3           0           0
            4+          0           0

overloaded           2109        3056
        nr_running
            0           0           0
            1           0           0
            2         241         483
            3         170         348
            4+       1698        2225

total                6628        6008

Without the patchset, there is a significant number of times that a
group is classified as having spare capacity with more than 1 running
task. Although this is a valid case, it is not a state that should often
happen when 160 tasks are competing on 8 cores, as in this test. The
patchset fixes the situation by taking into account the runnable_avg,
which stays high after the migration of a task to another CPU.
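
A minimal sketch of the resulting classification check (field names
mirror struct sg_lb_stats in kernel/sched/fair.c; the exact form and
thresholds in patch 4 may differ):

	static inline bool
	group_is_overloaded_sketch(unsigned int imbalance_pct,
				   struct sg_lb_stats *sgs)
	{
		if (sgs->sum_nr_running <= sgs->group_weight)
			return false;

		if ((sgs->group_capacity * 100) <
				(sgs->group_util * imbalance_pct))
			return true;

		/* new: time spent waiting on the rq also denotes overload */
		if ((sgs->group_capacity * imbalance_pct) <
				(sgs->group_runnable * 100))
			return true;

		return false;
	}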

Vincent Guittot (4):
  sched/fair: reorder enqueue/dequeue_task_fair path
  sched/numa: replace runnable_load_avg by load_avg
  sched/fair: replace runnable load average by runnable average
  sched/fair: Take into account runnable_avg to classify group

 include/linux/sched.h |  17 ++-
 kernel/sched/core.c   |   2 -
 kernel/sched/debug.c  |  17 +--
 kernel/sched/fair.c   | 335 ++++++++++++++++++++++--------------------
 kernel/sched/pelt.c   |  45 +++---
 kernel/sched/sched.h  |  29 +++-
 6 files changed, 241 insertions(+), 204 deletions(-)

-- 
2.17.1


Thread overview: 32+ messages
2020-02-11 17:46 [PATCH 0/4] remove runnable_load_avg and improve group_classify Vincent Guittot
2020-02-11 17:46 ` [PATCH 1/4] sched/fair: reorder enqueue/dequeue_task_fair path Vincent Guittot
2020-02-12 13:20   ` Mel Gorman
2020-02-12 14:47     ` Vincent Guittot
2020-02-12 16:11       ` Mel Gorman
2020-02-11 17:46 ` [RFC 2/4] sched/numa: replace runnable_load_avg by load_avg Vincent Guittot
2020-02-12 13:37   ` Mel Gorman
2020-02-12 15:03     ` Vincent Guittot
2020-02-12 16:04       ` Mel Gorman
2020-02-12 19:49     ` Mel Gorman
2020-02-12 21:29       ` Mel Gorman
2020-02-13  8:05       ` Vincent Guittot
2020-02-13  9:24         ` Mel Gorman
     [not found]         ` <20200213131658.9600-1-hdanton@sina.com>
2020-02-13 13:46           ` Mel Gorman
2020-02-13 15:00             ` Phil Auld
2020-02-13 15:14               ` Mel Gorman
2020-02-13 16:11                 ` Vincent Guittot
2020-02-13 16:34                   ` Mel Gorman
2020-02-13 16:38                     ` Vincent Guittot
2020-02-13 17:02                       ` Mel Gorman
2020-02-13 17:15                         ` Vincent Guittot
2020-02-11 17:46 ` [RFC 3/4] sched/fair: replace runnable load average by runnable average Vincent Guittot
2020-02-12 14:30   ` Mel Gorman
2020-02-14  7:42     ` Vincent Guittot
2020-02-13 17:36   ` Peter Zijlstra
2020-02-14  7:43     ` Vincent Guittot
2020-02-11 17:46 ` [RFC 4/4] sched/fair: Take into account runnable_avg to classify group Vincent Guittot
2020-02-13 18:32   ` Valentin Schneider
2020-02-13 18:37     ` Valentin Schneider
2020-02-14  7:48       ` Vincent Guittot
2020-02-11 21:04 ` [PATCH 0/4] remove runnable_load_avg and improve group_classify Mel Gorman
2020-02-12  8:16   ` Vincent Guittot
