* [RFC v2 0/2] introduce sched-idle balance
@ 2022-04-09 13:51 Abel Wu
  2022-04-09 13:51 ` [RFC v2 1/2] sched/fair: filter out overloaded cpus in SIS Abel Wu
  2022-04-09 13:51 ` [RFC v2 2/2] sched/fair: introduce sched-idle balance Abel Wu
From: Abel Wu @ 2022-04-09 13:51 UTC
  To: Peter Zijlstra, Mel Gorman, Vincent Guittot
  Cc: joshdon, linux-kernel, Abel Wu

Overloaded runqueues are those that have more than one pullable non-idle task
on them (given that sched-idle cpus are treated as idle cpus). Idle tasks,
which are either assigned the SCHED_IDLE policy or placed in an idle cpu
cgroup, are tracked through rq->cfs.idle_h_nr_running.
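
The definition above can be modelled in a few lines of user-space Python. This
is only a sketch, not the patch itself: the CfsCounters class is hypothetical,
the h_nr_running counter is an assumption added to make the subtraction
possible, and every non-idle task is assumed to be pullable (affinity and the
currently running task are ignored).

```python
from dataclasses import dataclass


@dataclass
class CfsCounters:
    """Toy stand-in for per-runqueue CFS counters (hypothetical, not the kernel struct)."""
    h_nr_running: int       # runnable CFS tasks on this cpu (assumed counter)
    idle_h_nr_running: int  # the idle tasks among them


def nr_non_idle(cfs: CfsCounters) -> int:
    """Number of runnable tasks that are not idle tasks."""
    return cfs.h_nr_running - cfs.idle_h_nr_running


def is_unoccupied(cfs: CfsCounters) -> bool:
    """True for idle and sched-idle cpus: there is no non-idle task to run."""
    return nr_non_idle(cfs) == 0


def is_overloaded(cfs: CfsCounters) -> bool:
    """True when more than one non-idle task is queued.

    Simplification: every non-idle task is treated as pullable.
    """
    return nr_non_idle(cfs) > 1
```

Under this model a cpu running only SCHED_IDLE tasks counts as unoccupied,
and a cpu with exactly one non-idle task is neither unoccupied nor overloaded.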

It would be beneficial if the unoccupied cpus (sched-idle/idle cpus) could
start serving non-idle tasks as soon as they become available. A lot of effort
has already been put into this:

  - Task wakeup: the scheduler tries to find such cpus to make full
    use of cpu capacity. But due to scalability issues, the search
    depth is bounded by a reasonable limit. IOW it's possible that
    a task is woken up on a busy cpu while unoccupied cpus are still
    out there. Fortunately, this imbalance can be fixed by the load
    balancers.

  - Load balancing: periodic (normal/idle) and newly-idle balancing.
    The former is regulated by intervals on each sched-domain, and
    these intervals can prevent the sched-idle cpus from pulling
    non-idle tasks. The latter is triggered only when a cpu becomes
    truly idle, which is not the case for sched-idle cpus. The
    balancing can also be stopped by other constraints.
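
To illustrate the wakeup case, here is a minimal Python sketch of a
depth-bounded scan. The function name and the max_depth parameter are
hypothetical, and the kernel derives its real limit differently. The point is
only that a bounded scan can return nothing while an unoccupied cpu exists
just outside the scanned window.

```python
def bounded_idle_search(cpus, is_unoccupied, start, max_depth):
    """Scan at most max_depth cpus from index `start`, wrapping around.

    Returns the first unoccupied cpu found, or None if the scan budget
    runs out first. All names here are illustrative assumptions.
    """
    n = len(cpus)
    for i in range(min(max_depth, n)):
        cpu = cpus[(start + i) % n]
        if is_unoccupied(cpu):
            return cpu
    return None


if __name__ == "__main__":
    busy = {0, 1, 2, 3, 4, 5, 7}          # only cpu 6 is unoccupied
    unoccupied = lambda cpu: cpu not in busy
    # A shallow scan misses cpu 6, a full scan finds it.
    print(bounded_idle_search(list(range(8)), unoccupied, 0, 4))
    print(bounded_idle_search(list(range(8)), unoccupied, 0, 8))
```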

So unoccupied cpus can still end up co-existing with overloaded ones, and in
this case the sched-idle balancing will try to quickly fix the imbalance
between them to some extent. That is:

  - Record the overloaded cpus so we can know where to pull from.
    This is done in the tick to limit how often shared data is modified.

  - Filter out the overloaded cpus in SIS to improve the idle cpu
    searching efficiency. The more overloaded the system is, the
    fewer cpus we will search.

  - Quit early in periodic load balancing if the cpu becomes busy.
    This is similar to what we do in the newly-idle case, in which
    we stop balancing once we get some work to do.

  - The newly-idle balancing will try harder to pull the non-idle
    tasks if overloaded cpus exist.
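
The first two points above can be sketched as follows. This is a user-space
Python model under stated assumptions, not the code from the patches: the
overloaded set stands in for whatever shared mask the patches maintain, and
both function names are hypothetical.

```python
def record_overloaded(overloaded: set, cpu: int, nr_non_idle: int) -> None:
    """Tick-time bookkeeping: remember cpus with more than one non-idle task."""
    if nr_non_idle > 1:
        overloaded.add(cpu)
    else:
        overloaded.discard(cpu)


def sis_candidates(domain_cpus, allowed, overloaded):
    """Cpus worth scanning for an idle target.

    Overloaded cpus cannot be idle, so they are dropped up front. The more
    cpus are overloaded, the shorter the remaining scan list becomes.
    """
    return [c for c in domain_cpus if c in allowed and c not in overloaded]


if __name__ == "__main__":
    overloaded = set()
    record_overloaded(overloaded, 2, 3)   # cpu 2 has three non-idle tasks
    record_overloaded(overloaded, 5, 1)   # cpu 5 has one, so it is not overloaded
    print(sis_candidates(range(8), {0, 1, 2, 3}, overloaded))
```

The same set also tells a sched-idle cpu where to pull from, which is why it
is recorded once and shared rather than recomputed on each search.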

So the whole thing can be treated as an extension of the existing load balance
mechanisms for sched-idle cpus.

Benchmark
=========

Tests are done on an Intel(R) Xeon(R) Platinum 8260 CPU @ 2.40GHz machine with
2 NUMA nodes, each of which has 24 cores with SMT2 enabled, so 96 CPUs in total.
Tests are separated into two parts:

  - quiet: benchmarks running inside a normal cpu cgroup in a clean
    environment

  - noisy: benchmarks running inside a normal cpu cgroup, with noise
    from an idle cpu cgroup; the two cgroups are at the same level.
    The noise is produced by the perf messaging benchmark, which
    occupies ~20% of the cpu capacity on my server.

	perf bench sched messaging -g 1 -l 2000000000

All of the benchmarks are run by mmtests with the "--no-monitor --performance"
parameters, and with cpu turbo disabled.

As Mel requested, the SIS filter part is also benchmarked separately, and the
additional SIS statistics come from his patch [1].

Results
=======

vanilla:  tip sched/core 6255b48aebfd (v5.17-rc5)
filter:   vanilla + patch1
balancer: filter + patch2
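
The percentages in parentheses in the tables that follow appear consistent
with a plain relative change against the vanilla column, signed so that a
positive number means an improvement. The Python sketch below reproduces two
rows under that assumption. The function names are hypothetical, and a few
rows differ in the last digit, presumably because the tool works from
unrounded means.

```python
def gain_lower_is_better(baseline: float, value: float) -> float:
    """Relative gain in percent for time or latency rows (Amean, Lat)."""
    return (baseline - value) / baseline * 100.0


def gain_higher_is_better(baseline: float, value: float) -> float:
    """Relative gain in percent for throughput rows (Hmean)."""
    return (value - baseline) / baseline * 100.0


if __name__ == "__main__":
    # hackbench-process-pipes, quiet, 30 groups: vanilla 5.2430, filter 4.1620
    print(round(gain_lower_is_better(5.2430, 4.1620), 2))   # 20.62
    # tbench4, quiet, 1 client: vanilla 287.90, filter 290.20
    print(round(gain_higher_is_better(287.90, 290.20), 2))  # 0.8
```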

a) hackbench-process-pipes

[quiet]
                             vanilla                 filter               balancer
Amean     1        0.3077 (   0.00%)      0.3340 (  -8.56%)      0.2523 (  17.98%)
Amean     4        0.7703 (   0.00%)      0.7360 (   4.46%)      0.7220 *   6.27%*
Amean     7        0.9253 (   0.00%)      0.9320 (  -0.72%)      0.9153 (   1.08%)
Amean     12       1.2397 (   0.00%)      1.1197 *   9.68%*      1.0867 *  12.34%*
Amean     21       2.8003 (   0.00%)      2.4663 *  11.93%*      2.4490 *  12.55%*
Amean     30       5.2430 (   0.00%)      4.1620 *  20.62%*      4.2220 *  19.47%*
Amean     48       7.9023 (   0.00%)      6.7040 *  15.16%*      7.0897 *  10.28%*
Amean     79       9.6197 (   0.00%)      8.6310 *  10.28%*      8.6590 *   9.99%*
Amean     110      9.8170 (   0.00%)      9.3533 (   4.72%)      9.2813 (   5.46%)
Amean     141     11.8070 (   0.00%)     11.3003 *   4.29%*     11.4297 *   3.20%*
Amean     172     14.1017 (   0.00%)     13.3063 *   5.64%*     13.2740 *   5.87%*
Amean     203     15.9723 (   0.00%)     16.0813 (  -0.68%)     15.2627 *   4.44%*
Amean     234     18.6590 (   0.00%)     18.2387 (   2.25%)     17.4267 *   6.60%*
Amean     265     20.8473 (   0.00%)     20.4460 (   1.93%)     19.8227 *   4.92%*
Amean     296     22.5817 (   0.00%)     22.5307 (   0.23%)     21.6657 *   4.06%*

Ops SIS Search Efficiency            16.51          27.35          27.37
Ops SIS Domain Search Eff            16.35          27.08          27.05
Ops SIS Fast Success Rate             1.19           1.32           1.58
Ops SIS Success Rate                  1.96           2.60           3.06

[noisy]
                             vanilla                 filter               balancer
Amean     1        0.3627 (   0.00%)      0.2850 (  21.42%)      0.2830 (  21.97%)
Amean     4        0.7290 (   0.00%)      0.7313 (  -0.32%)      0.7467 (  -2.42%)
Amean     7        0.9353 (   0.00%)      0.9443 (  -0.96%)      0.9107 *   2.64%*
Amean     12       1.1973 (   0.00%)      1.2283 (  -2.59%)      1.1013 *   8.02%*
Amean     21       2.5100 (   0.00%)      2.4190 (   3.63%)      2.3003 *   8.35%*
Amean     30       4.7437 (   0.00%)      3.9367 *  17.01%*      3.7473 *  21.00%*
Amean     48       7.4943 (   0.00%)      6.9470 *   7.30%*      7.0430 (   6.02%)
Amean     79       9.4737 (   0.00%)      8.6923 *   8.25%*      8.8930 *   6.13%*
Amean     110     10.7420 (   0.00%)      9.5363 *  11.22%*      9.2847 *  13.57%*
Amean     141     12.2293 (   0.00%)     11.0513 *   9.63%*     10.9750 *  10.26%*
Amean     172     14.0277 (   0.00%)     13.7407 (   2.05%)     12.9350 *   7.79%*
Amean     203     16.6930 (   0.00%)     15.6677 *   6.14%*     15.1910 *   9.00%*
Amean     234     18.3360 (   0.00%)     17.6750 (   3.60%)     17.1403 *   6.52%*
Amean     265     20.8383 (   0.00%)     19.9793 (   4.12%)     19.4780 *   6.53%*
Amean     296     23.3080 (   0.00%)     21.7693 *   6.60%*     21.3567 *   8.37%*

Ops SIS Search Efficiency            16.53          27.23          27.53
Ops SIS Domain Search Eff            16.27          26.81          27.07
Ops SIS Fast Success Rate             1.87           2.10           2.29
Ops SIS Success Rate                  2.93           3.81           4.01

b) hackbench-process-sockets

[quiet]
                             vanilla                 filter               balancer
Amean     1        0.5213 (   0.00%)      0.5283 (  -1.34%)      0.5210 (   0.06%)
Amean     4        1.4733 (   0.00%)      1.4757 (  -0.16%)      1.4577 *   1.06%*
Amean     7        2.4620 (   0.00%)      2.5083 *  -1.88%*      2.5113 *  -2.00%*
Amean     12       4.1283 (   0.00%)      4.2143 *  -2.08%*      4.1967 *  -1.66%*
Amean     21       7.0153 (   0.00%)      7.1620 *  -2.09%*      7.0977 *  -1.17%*
Amean     30       9.8900 (   0.00%)     10.0380 *  -1.50%*      9.9303 *  -0.41%*
Amean     48      15.6753 (   0.00%)     16.0283 *  -2.25%*     15.8213 *  -0.93%*
Amean     79      26.3443 (   0.00%)     26.7147 *  -1.41%*     26.3397 (   0.02%)
Amean     110     36.5437 (   0.00%)     37.3277 *  -2.15%*     36.6197 *  -0.21%*
Amean     141     46.5327 (   0.00%)     47.4803 *  -2.04%*     46.8620 *  -0.71%*
Amean     172     56.5907 (   0.00%)     58.0840 *  -2.64%*     56.9503 *  -0.64%*
Amean     203     66.8573 (   0.00%)     68.3780 *  -2.27%*     67.2330 *  -0.56%*
Amean     234     77.2470 (   0.00%)     78.8317 *  -2.05%*     77.6773 *  -0.56%*
Amean     265     87.5577 (   0.00%)     89.3343 *  -2.03%*     87.3617 *   0.22%*
Amean     296     97.6160 (   0.00%)     99.6320 *  -2.07%*     97.6450 (  -0.03%)

Ops SIS Search Efficiency            16.50          27.17          27.19
Ops SIS Domain Search Eff            16.32          26.88          26.84
Ops SIS Fast Success Rate             1.32           1.44           1.75
Ops SIS Success Rate                  2.06           2.74           3.34

[noisy]
                             vanilla                 filter               balancer
Amean     1        0.6120 (   0.00%)      0.6760 * -10.46%*      0.6037 (   1.36%)
Amean     4        1.5867 (   0.00%)      1.6540 *  -4.24%*      1.5120 *   4.71%*
Amean     7        2.5940 (   0.00%)      2.6820 *  -3.39%*      2.5047 *   3.44%*
Amean     12       4.3407 (   0.00%)      4.4680 *  -2.93%*      4.1513 *   4.36%*
Amean     21       7.3083 (   0.00%)      7.5073 *  -2.72%*      6.8467 *   6.32%*
Amean     30       9.9750 (   0.00%)     10.4920 *  -5.18%*      9.7220 *   2.54%*
Amean     48      15.9123 (   0.00%)     16.5143 *  -3.78%*     15.2683 *   4.05%*
Amean     79      26.2180 (   0.00%)     27.2497 *  -3.93%*     25.1087 *   4.23%*
Amean     110     36.8237 (   0.00%)     38.8303 *  -5.45%*     35.8823 *   2.56%*
Amean     141     47.3357 (   0.00%)     49.6817 *  -4.96%*     45.5723 *   3.73%*
Amean     172     57.4477 (   0.00%)     60.8553 *  -5.93%*     55.5380 *   3.32%*
Amean     203     67.6290 (   0.00%)     71.8117 *  -6.18%*     65.6033 *   3.00%*
Amean     234     77.8347 (   0.00%)     82.9577 *  -6.58%*     75.8713 *   2.52%*
Amean     265     88.4680 (   0.00%)     94.2737 *  -6.56%*     85.8547 *   2.95%*
Amean     296     99.2210 (   0.00%)    105.9357 *  -6.77%*     95.8777 *   3.37%*

Ops SIS Search Efficiency            16.51          27.21          27.62
Ops SIS Domain Search Eff            16.22          26.74          27.07
Ops SIS Fast Success Rate             2.13           2.38           2.73
Ops SIS Success Rate                  3.21           4.20           4.66

c) hackbench-thread-pipes

[quiet]
                             vanilla                 filter               balancer
Amean     1        0.2770 (   0.00%)      0.2783 (  -0.48%)      0.2777 (  -0.24%)
Amean     4        0.7707 (   0.00%)      0.7770 (  -0.82%)      0.7687 (   0.26%)
Amean     7        0.9400 (   0.00%)      0.9500 (  -1.06%)      0.9230 (   1.81%)
Amean     12       1.4740 (   0.00%)      1.4447 (   1.99%)      1.4213 (   3.57%)
Amean     21       3.8517 (   0.00%)      3.5223 *   8.55%*      3.3837 *  12.15%*
Amean     30       6.7057 (   0.00%)      5.9243 *  11.65%*      5.8200 *  13.21%*
Amean     48       8.9877 (   0.00%)      8.3357 *   7.25%*      8.0573 *  10.35%*
Amean     79      10.3807 (   0.00%)      9.6767 *   6.78%*      9.6947 *   6.61%*
Amean     110     11.1830 (   0.00%)     10.5263 *   5.87%*     10.5247 (   5.89%)
Amean     141     12.9987 (   0.00%)     12.6463 (   2.71%)     12.6697 (   2.53%)
Amean     172     15.2327 (   0.00%)     15.6350 (  -2.64%)     14.6007 (   4.15%)
Amean     203     17.7090 (   0.00%)     17.4287 (   1.58%)     16.9330 (   4.38%)
Amean     234     19.4380 (   0.00%)     19.6747 (  -1.22%)     19.5393 (  -0.52%)
Amean     265     24.2407 (   0.00%)     22.7170 (   6.29%)     21.4700 *  11.43%*
Amean     296     26.5937 (   0.00%)     26.4057 (   0.71%)     24.2627 *   8.77%*

Ops SIS Search Efficiency            16.54          27.49          27.60
Ops SIS Domain Search Eff            16.34          27.18          27.23
Ops SIS Fast Success Rate             1.41           1.52           1.84
Ops SIS Success Rate                  2.21           2.87           3.46

[noisy]
                             vanilla                 filter               balancer
Amean     1        0.3097 (   0.00%)      0.3373 *  -8.93%*      0.3140 (  -1.40%)
Amean     4        0.7730 (   0.00%)      0.7870 (  -1.81%)      0.7500 *   2.98%*
Amean     7        0.9580 (   0.00%)      0.9520 (   0.63%)      0.9270 (   3.24%)
Amean     12       1.4840 (   0.00%)      1.4103 (   4.96%)      1.3970 *   5.86%*
Amean     21       3.4623 (   0.00%)      3.1507 *   9.00%*      3.1517 *   8.97%*
Amean     30       6.1033 (   0.00%)      5.6037 (   8.19%)      5.7150 (   6.36%)
Amean     48       8.9833 (   0.00%)      8.6097 *   4.16%*      8.5367 *   4.97%*
Amean     79      11.0237 (   0.00%)      9.5840 *  13.06%*      9.7860 *  11.23%*
Amean     110     12.4213 (   0.00%)     10.9570 *  11.79%*     10.5110 *  15.38%*
Amean     141     13.4703 (   0.00%)     12.5320 *   6.97%*     12.4137 (   7.84%)
Amean     172     17.0973 (   0.00%)     15.6843 *   8.26%*     14.6183 *  14.50%*
Amean     203     18.8867 (   0.00%)     17.3487 *   8.14%*     17.8260 *   5.62%*
Amean     234     22.0430 (   0.00%)     19.8977 *   9.73%*     19.6240 *  10.97%*
Amean     265     23.9877 (   0.00%)     21.9163 *   8.63%*     22.5933 (   5.81%)
Amean     296     27.1667 (   0.00%)     25.2857 (   6.92%)     23.8423 *  12.24%*

Ops SIS Search Efficiency            16.57          27.57          28.04
Ops SIS Domain Search Eff            16.29          27.11          27.50
Ops SIS Fast Success Rate             2.06           2.29           2.67
Ops SIS Success Rate                  3.11           4.04           4.47

d) hackbench-thread-sockets

[quiet]
                             vanilla                 filter               balancer
Amean     1        0.5773 (   0.00%)      0.5767 (   0.12%)      0.5723 (   0.87%)
Amean     4        1.5083 (   0.00%)      1.5117 (  -0.22%)      1.5027 (   0.38%)
Amean     7        2.5453 (   0.00%)      2.5890 *  -1.72%*      2.5823 *  -1.45%*
Amean     12       4.2763 (   0.00%)      4.3357 *  -1.39%*      4.3203 *  -1.03%*
Amean     21       7.2050 (   0.00%)      7.3777 *  -2.40%*      7.2923 *  -1.21%*
Amean     30      10.1203 (   0.00%)     10.3367 *  -2.14%*     10.2107 *  -0.89%*
Amean     48      16.0403 (   0.00%)     16.3427 *  -1.88%*     16.1080 (  -0.42%)
Amean     79      27.0260 (   0.00%)     27.2193 (  -0.72%)     26.7280 *   1.10%*
Amean     110     37.4073 (   0.00%)     38.1427 *  -1.97%*     37.7580 *  -0.94%*
Amean     141     47.7927 (   0.00%)     48.7607 *  -2.03%*     48.5797 *  -1.65%*
Amean     172     58.1860 (   0.00%)     59.5697 *  -2.38%*     58.7377 *  -0.95%*
Amean     203     68.6033 (   0.00%)     70.6163 *  -2.93%*     69.0957 *  -0.72%*
Amean     234     79.2923 (   0.00%)     81.1143 *  -2.30%*     79.9310 *  -0.81%*
Amean     265     89.6240 (   0.00%)     91.8750 *  -2.51%*     90.4663 *  -0.94%*
Amean     296    100.2680 (   0.00%)    102.9560 *  -2.68%*    101.0817 *  -0.81%*

Ops SIS Search Efficiency            16.58          25.34          24.62
Ops SIS Domain Search Eff            16.12          24.59          23.81
Ops SIS Fast Success Rate             3.30           3.91           4.33
Ops SIS Success Rate                  3.99           5.77           7.09

[noisy]
                             vanilla                 filter               balancer
Amean     1        0.6607 (   0.00%)      0.7033 *  -6.46%*      0.6727 (  -1.82%)
Amean     4        1.6270 (   0.00%)      1.6507 *  -1.45%*      1.5457 *   5.00%*
Amean     7        2.6850 (   0.00%)      2.7483 *  -2.36%*      2.5850 *   3.72%*
Amean     12       4.5273 (   0.00%)      4.6250 *  -2.16%*      4.2457 *   6.22%*
Amean     21       7.5403 (   0.00%)      7.6453 (  -1.39%)      7.1340 *   5.39%*
Amean     30      10.4227 (   0.00%)     10.7350 *  -3.00%*      9.9497 *   4.54%*
Amean     48      16.2257 (   0.00%)     16.9840 *  -4.67%*     15.7340 *   3.03%*
Amean     79      27.2820 (   0.00%)     27.9947 *  -2.61%*     25.9023 *   5.06%*
Amean     110     37.9413 (   0.00%)     40.0053 *  -5.44%*     36.9113 *   2.71%*
Amean     141     48.3913 (   0.00%)     51.3303 *  -6.07%*     47.0660 *   2.74%*
Amean     172     58.9597 (   0.00%)     62.8973 *  -6.68%*     57.1193 *   3.12%*
Amean     203     70.1857 (   0.00%)     74.3620 *  -5.95%*     68.0957 *   2.98%*
Amean     234     80.2250 (   0.00%)     86.1143 *  -7.34%*     78.4873 *   2.17%*
Amean     265     91.2950 (   0.00%)     97.7753 *  -7.10%*     88.9163 *   2.61%*
Amean     296    102.1407 (   0.00%)    109.6700 *  -7.37%*    100.2663 *   1.84%*

Ops SIS Search Efficiency            16.57          24.79          25.86
Ops SIS Domain Search Eff            15.80          23.54          24.40
Ops SIS Fast Success Rate             5.55           6.59           7.48
Ops SIS Success Rate                  7.20          10.39          11.38

e) schbench

[quiet]
                                   vanilla                 filter               balancer
Lat 50.0th-qrtle-1         5.00 (   0.00%)        5.00 (   0.00%)        5.00 (   0.00%)
Lat 75.0th-qrtle-1         5.00 (   0.00%)        5.00 (   0.00%)        5.00 (   0.00%)
Lat 90.0th-qrtle-1         5.00 (   0.00%)        5.00 (   0.00%)        5.00 (   0.00%)
Lat 95.0th-qrtle-1         6.00 (   0.00%)        6.00 (   0.00%)        5.00 (  16.67%)
Lat 99.0th-qrtle-1         6.00 (   0.00%)        7.00 ( -16.67%)        6.00 (   0.00%)
Lat 99.5th-qrtle-1         7.00 (   0.00%)        8.00 ( -14.29%)        6.00 (  14.29%)
Lat 99.9th-qrtle-1         8.00 (   0.00%)       12.00 ( -50.00%)        6.00 (  25.00%)
Lat 50.0th-qrtle-2         6.00 (   0.00%)        6.00 (   0.00%)        6.00 (   0.00%)
Lat 75.0th-qrtle-2         6.00 (   0.00%)        6.00 (   0.00%)        7.00 ( -16.67%)
Lat 90.0th-qrtle-2         7.00 (   0.00%)        7.00 (   0.00%)        7.00 (   0.00%)
Lat 95.0th-qrtle-2         7.00 (   0.00%)        7.00 (   0.00%)        7.00 (   0.00%)
Lat 99.0th-qrtle-2         8.00 (   0.00%)        8.00 (   0.00%)        8.00 (   0.00%)
Lat 99.5th-qrtle-2         9.00 (   0.00%)        8.00 (  11.11%)        9.00 (   0.00%)
Lat 99.9th-qrtle-2         9.00 (   0.00%)        9.00 (   0.00%)        9.00 (   0.00%)
Lat 50.0th-qrtle-4         9.00 (   0.00%)        8.00 (  11.11%)        8.00 (  11.11%)
Lat 75.0th-qrtle-4        10.00 (   0.00%)       10.00 (   0.00%)       10.00 (   0.00%)
Lat 90.0th-qrtle-4        11.00 (   0.00%)       11.00 (   0.00%)       11.00 (   0.00%)
Lat 95.0th-qrtle-4        12.00 (   0.00%)       12.00 (   0.00%)       11.00 (   8.33%)
Lat 99.0th-qrtle-4        13.00 (   0.00%)       13.00 (   0.00%)       13.00 (   0.00%)
Lat 99.5th-qrtle-4        14.00 (   0.00%)       14.00 (   0.00%)       14.00 (   0.00%)
Lat 99.9th-qrtle-4        16.00 (   0.00%)       16.00 (   0.00%)       16.00 (   0.00%)
Lat 50.0th-qrtle-8        13.00 (   0.00%)       12.00 (   7.69%)       12.00 (   7.69%)
Lat 75.0th-qrtle-8        16.00 (   0.00%)       15.00 (   6.25%)       16.00 (   0.00%)
Lat 90.0th-qrtle-8        18.00 (   0.00%)       17.00 (   5.56%)       18.00 (   0.00%)
Lat 95.0th-qrtle-8        19.00 (   0.00%)       18.00 (   5.26%)       18.00 (   5.26%)
Lat 99.0th-qrtle-8        23.00 (   0.00%)       21.00 (   8.70%)       20.00 (  13.04%)
Lat 99.5th-qrtle-8        24.00 (   0.00%)       23.00 (   4.17%)       22.00 (   8.33%)
Lat 99.9th-qrtle-8        29.00 (   0.00%)       25.00 (  13.79%)       26.00 (  10.34%)
Lat 50.0th-qrtle-16       20.00 (   0.00%)       21.00 (  -5.00%)       20.00 (   0.00%)
Lat 75.0th-qrtle-16       27.00 (   0.00%)       28.00 (  -3.70%)       27.00 (   0.00%)
Lat 90.0th-qrtle-16       32.00 (   0.00%)       33.00 (  -3.12%)       31.00 (   3.12%)
Lat 95.0th-qrtle-16       33.00 (   0.00%)       35.00 (  -6.06%)       33.00 (   0.00%)
Lat 99.0th-qrtle-16       38.00 (   0.00%)       40.00 (  -5.26%)       38.00 (   0.00%)
Lat 99.5th-qrtle-16       40.00 (   0.00%)       42.00 (  -5.00%)       41.00 (  -2.50%)
Lat 99.9th-qrtle-16       43.00 (   0.00%)       49.00 ( -13.95%)       50.00 ( -16.28%)
Lat 50.0th-qrtle-32       38.00 (   0.00%)       37.00 (   2.63%)       36.00 (   5.26%)
Lat 75.0th-qrtle-32       55.00 (   0.00%)       54.00 (   1.82%)       53.00 (   3.64%)
Lat 90.0th-qrtle-32       65.00 (   0.00%)       64.00 (   1.54%)       62.00 (   4.62%)
Lat 95.0th-qrtle-32       69.00 (   0.00%)       68.00 (   1.45%)       67.00 (   2.90%)
Lat 99.0th-qrtle-32       80.00 (   0.00%)       80.00 (   0.00%)       76.00 (   5.00%)
Lat 99.5th-qrtle-32       85.00 (   0.00%)       90.00 (  -5.88%)       82.00 (   3.53%)
Lat 99.9th-qrtle-32       93.00 (   0.00%)      135.00 ( -45.16%)       90.00 (   3.23%)
Lat 50.0th-qrtle-47       55.00 (   0.00%)       55.00 (   0.00%)       53.00 (   3.64%)
Lat 75.0th-qrtle-47       81.00 (   0.00%)       81.00 (   0.00%)       77.00 (   4.94%)
Lat 90.0th-qrtle-47       97.00 (   0.00%)       97.00 (   0.00%)       92.00 (   5.15%)
Lat 95.0th-qrtle-47      104.00 (   0.00%)      103.00 (   0.96%)       99.00 (   4.81%)
Lat 99.0th-qrtle-47      120.00 (   0.00%)      120.00 (   0.00%)      119.00 (   0.83%)
Lat 99.5th-qrtle-47      131.00 (   0.00%)      133.00 (  -1.53%)      127.00 (   3.05%)
Lat 99.9th-qrtle-47      161.00 (   0.00%)      163.00 (  -1.24%)      165.00 (  -2.48%)

Ops SIS Search Efficiency            83.44          77.01          84.31
Ops SIS Domain Search Eff             4.56           4.11           4.47
Ops SIS Fast Success Rate            99.05          98.72          99.13
Ops SIS Success Rate                 99.65          99.54          99.77

[noisy]
                                   vanilla                 filter               balancer
Lat 50.0th-qrtle-1         7.00 (   0.00%)        8.00 ( -14.29%)        9.00 ( -28.57%)
Lat 75.0th-qrtle-1         8.00 (   0.00%)        9.00 ( -12.50%)       10.00 ( -25.00%)
Lat 90.0th-qrtle-1         9.00 (   0.00%)       10.00 ( -11.11%)       11.00 ( -22.22%)
Lat 95.0th-qrtle-1         9.00 (   0.00%)       11.00 ( -22.22%)       12.00 ( -33.33%)
Lat 99.0th-qrtle-1        11.00 (   0.00%)       13.00 ( -18.18%)       15.00 ( -36.36%)
Lat 99.5th-qrtle-1        13.00 (   0.00%)       14.00 (  -7.69%)       18.00 ( -38.46%)
Lat 99.9th-qrtle-1        13.00 (   0.00%)       14.00 (  -7.69%)       18.00 ( -38.46%)
Lat 50.0th-qrtle-2         9.00 (   0.00%)        9.00 (   0.00%)        9.00 (   0.00%)
Lat 75.0th-qrtle-2        11.00 (   0.00%)       10.00 (   9.09%)       11.00 (   0.00%)
Lat 90.0th-qrtle-2        12.00 (   0.00%)       11.00 (   8.33%)       12.00 (   0.00%)
Lat 95.0th-qrtle-2        13.00 (   0.00%)       12.00 (   7.69%)       14.00 (  -7.69%)
Lat 99.0th-qrtle-2        15.00 (   0.00%)       15.00 (   0.00%)       15.00 (   0.00%)
Lat 99.5th-qrtle-2        15.00 (   0.00%)       17.00 ( -13.33%)       16.00 (  -6.67%)
Lat 99.9th-qrtle-2        17.00 (   0.00%)       19.00 ( -11.76%)       21.00 ( -23.53%)
Lat 50.0th-qrtle-4        12.00 (   0.00%)       12.00 (   0.00%)       12.00 (   0.00%)
Lat 75.0th-qrtle-4        14.00 (   0.00%)       15.00 (  -7.14%)       14.00 (   0.00%)
Lat 90.0th-qrtle-4        16.00 (   0.00%)       17.00 (  -6.25%)       16.00 (   0.00%)
Lat 95.0th-qrtle-4        17.00 (   0.00%)       18.00 (  -5.88%)       16.00 (   5.88%)
Lat 99.0th-qrtle-4        20.00 (   0.00%)       20.00 (   0.00%)       19.00 (   5.00%)
Lat 99.5th-qrtle-4        21.00 (   0.00%)       20.00 (   4.76%)       21.00 (   0.00%)
Lat 99.9th-qrtle-4        26.00 (   0.00%)       21.00 (  19.23%)       22.00 (  15.38%)
Lat 50.0th-qrtle-8        17.00 (   0.00%)       16.00 (   5.88%)       17.00 (   0.00%)
Lat 75.0th-qrtle-8        22.00 (   0.00%)       21.00 (   4.55%)       21.00 (   4.55%)
Lat 90.0th-qrtle-8        26.00 (   0.00%)       24.00 (   7.69%)       25.00 (   3.85%)
Lat 95.0th-qrtle-8        28.00 (   0.00%)       25.00 (  10.71%)       26.00 (   7.14%)
Lat 99.0th-qrtle-8        32.00 (   0.00%)       29.00 (   9.38%)       29.00 (   9.38%)
Lat 99.5th-qrtle-8        34.00 (   0.00%)       31.00 (   8.82%)       31.00 (   8.82%)
Lat 99.9th-qrtle-8        42.00 (   0.00%)       34.00 (  19.05%)       35.00 (  16.67%)
Lat 50.0th-qrtle-16       29.00 (   0.00%)       30.00 (  -3.45%)       27.00 (   6.90%)
Lat 75.0th-qrtle-16       40.00 (   0.00%)       41.00 (  -2.50%)       37.00 (   7.50%)
Lat 90.0th-qrtle-16       46.00 (   0.00%)       49.00 (  -6.52%)       43.00 (   6.52%)
Lat 95.0th-qrtle-16       49.00 (   0.00%)       53.00 (  -8.16%)       46.00 (   6.12%)
Lat 99.0th-qrtle-16       55.00 (   0.00%)       59.00 (  -7.27%)       52.00 (   5.45%)
Lat 99.5th-qrtle-16       57.00 (   0.00%)       62.00 (  -8.77%)       55.00 (   3.51%)
Lat 99.9th-qrtle-16       63.00 (   0.00%)       84.00 ( -33.33%)       62.00 (   1.59%)
Lat 50.0th-qrtle-32       48.00 (   0.00%)       49.00 (  -2.08%)       49.00 (  -2.08%)
Lat 75.0th-qrtle-32       69.00 (   0.00%)       70.00 (  -1.45%)       70.00 (  -1.45%)
Lat 90.0th-qrtle-32       83.00 (   0.00%)       87.00 (  -4.82%)       85.00 (  -2.41%)
Lat 95.0th-qrtle-32       90.00 (   0.00%)       96.00 (  -6.67%)       91.00 (  -1.11%)
Lat 99.0th-qrtle-32      102.00 (   0.00%)      110.00 (  -7.84%)      104.00 (  -1.96%)
Lat 99.5th-qrtle-32      107.00 (   0.00%)      115.00 (  -7.48%)      109.00 (  -1.87%)
Lat 99.9th-qrtle-32      112.00 (   0.00%)      151.00 ( -34.82%)      118.00 (  -5.36%)
Lat 50.0th-qrtle-47       64.00 (   0.00%)       64.00 (   0.00%)       66.00 (  -3.12%)
Lat 75.0th-qrtle-47       93.00 (   0.00%)       92.00 (   1.08%)       96.00 (  -3.23%)
Lat 90.0th-qrtle-47      113.00 (   0.00%)      110.00 (   2.65%)      116.00 (  -2.65%)
Lat 95.0th-qrtle-47      126.00 (   0.00%)      122.00 (   3.17%)      126.00 (   0.00%)
Lat 99.0th-qrtle-47      159.00 (   0.00%)      137.00 (  13.84%)      143.00 (  10.06%)
Lat 99.5th-qrtle-47      231.00 (   0.00%)      144.00 (  37.66%)      152.00 (  34.20%)
Lat 99.9th-qrtle-47     9136.00 (   0.00%)      181.00 (  98.02%)     1190.00 (  86.97%)

Ops SIS Search Efficiency            94.62          94.47          94.79
Ops SIS Domain Search Eff            15.30          14.91          15.31
Ops SIS Fast Success Rate            98.97          98.97          99.01
Ops SIS Success Rate                 99.98          99.98          99.99

f) tbench4 Throughput

[quiet]
                             vanilla                 filter               balancer
Hmean     1        287.90 (   0.00%)      290.20 *   0.80%*      296.20 *   2.88%*
Hmean     2        582.58 (   0.00%)      586.96 *   0.75%*      599.12 *   2.84%*
Hmean     4       1151.83 (   0.00%)     1165.32 *   1.17%*     1181.72 *   2.60%*
Hmean     8       2317.84 (   0.00%)     2311.67 *  -0.27%*     2344.42 *   1.15%*
Hmean     16      4530.15 (   0.00%)     4555.04 *   0.55%*     4561.38 *   0.69%*
Hmean     32      7643.04 (   0.00%)     7644.20 (   0.02%)     7707.31 *   0.84%*
Hmean     64      9310.48 (   0.00%)     9664.41 *   3.80%*     9615.94 *   3.28%*
Hmean     128    21837.26 (   0.00%)    13628.90 * -37.59%*    15996.13 * -26.75%*
Hmean     256    20789.62 (   0.00%)    22550.77 *   8.47%*    22776.78 *   9.56%*
Hmean     384    19329.85 (   0.00%)    19786.13 *   2.36%*    17499.33 *  -9.47%*

Ops SIS Search Efficiency            22.09          23.39          23.54
Ops SIS Domain Search Eff            14.64          14.78          14.96
Ops SIS Fast Success Rate            39.51          43.21          42.87
Ops SIS Success Rate                 45.19          57.62          54.69

[noisy]
                             vanilla                 filter               balancer
Hmean     1        275.05 (   0.00%)      275.66 *   0.22%*      276.99 *   0.70%*
Hmean     2        543.31 (   0.00%)      548.55 *   0.97%*      549.71 *   1.18%*
Hmean     4       1077.41 (   0.00%)     1082.60 *   0.48%*     1090.07 *   1.18%*
Hmean     8       2119.68 (   0.00%)     2140.60 *   0.99%*     2133.47 *   0.65%*
Hmean     16      3914.25 (   0.00%)     3938.84 *   0.63%*     3914.42 (   0.00%)
Hmean     32      6574.06 (   0.00%)     6650.66 *   1.17%*     6622.36 *   0.73%*
Hmean     64      8757.89 (   0.00%)     9047.06 *   3.30%*     8987.00 *   2.62%*
Hmean     128    20533.22 (   0.00%)    15573.19 * -24.16%*    20746.83 *   1.04%*
Hmean     256    20194.51 (   0.00%)    18961.24 *  -6.11%*    20115.34 *  -0.39%*
Hmean     384    17552.64 (   0.00%)    17949.46 *   2.26%*    19796.18 *  12.78%*

Ops SIS Search Efficiency            22.30          25.21          28.33
Ops SIS Domain Search Eff            14.87          16.69          19.60
Ops SIS Fast Success Rate            39.15          40.58          38.35
Ops SIS Success Rate                 43.88          49.88          43.08

g) netperf-udp

[quiet]
                                    vanilla                 filter               balancer
Hmean     send-64         184.86 (   0.00%)      182.76 *  -1.14%*      185.78 (   0.50%)
Hmean     send-128        368.11 (   0.00%)      362.97 *  -1.40%*      371.56 (   0.94%)
Hmean     send-256        730.95 (   0.00%)      717.07 *  -1.90%*      728.71 (  -0.31%)
Hmean     send-1024      2804.94 (   0.00%)     2782.30 *  -0.81%*     2825.97 (   0.75%)
Hmean     send-2048      5355.49 (   0.00%)     5228.70 *  -2.37%*     5370.87 (   0.29%)
Hmean     send-3312      8235.83 (   0.00%)     8247.11 (   0.14%)     8298.53 (   0.76%)
Hmean     send-4096      9916.04 (   0.00%)    10012.30 *   0.97%*    10086.82 *   1.72%*
Hmean     send-8192     16743.15 (   0.00%)    16847.61 (   0.62%)    16856.65 (   0.68%)
Hmean     send-16384    26512.04 (   0.00%)    26512.69 (   0.00%)    26537.69 (   0.10%)
Hmean     recv-64         184.86 (   0.00%)      182.76 *  -1.14%*      185.78 (   0.50%)
Hmean     recv-128        368.11 (   0.00%)      362.97 *  -1.40%*      371.56 (   0.94%)
Hmean     recv-256        730.95 (   0.00%)      717.07 *  -1.90%*      728.71 (  -0.31%)
Hmean     recv-1024      2804.94 (   0.00%)     2782.30 *  -0.81%*     2825.97 (   0.75%)
Hmean     recv-2048      5355.49 (   0.00%)     5228.70 *  -2.37%*     5370.87 (   0.29%)
Hmean     recv-3312      8235.83 (   0.00%)     8247.11 (   0.14%)     8298.53 (   0.76%)
Hmean     recv-4096      9916.04 (   0.00%)    10012.30 *   0.97%*    10086.78 *   1.72%*
Hmean     recv-8192     16743.10 (   0.00%)    16847.59 (   0.62%)    16856.55 (   0.68%)
Hmean     recv-16384    26512.04 (   0.00%)    26512.68 (   0.00%)    26537.69 (   0.10%)

Ops SIS Search Efficiency           100.00         100.00         100.00
Ops SIS Domain Search Eff            20.00          23.09          24.32
Ops SIS Fast Success Rate           100.00         100.00         100.00
Ops SIS Success Rate                100.00         100.00         100.00

[noisy]
                                    vanilla                 filter               balancer
Hmean     send-64         180.48 (   0.00%)      181.99 (   0.84%)      182.52 (   1.13%)
Hmean     send-128        350.18 (   0.00%)      360.14 *   2.85%*      367.83 *   5.04%*
Hmean     send-256        708.12 (   0.00%)      707.57 (  -0.08%)      723.46 *   2.17%*
Hmean     send-1024      2752.72 (   0.00%)     2757.50 (   0.17%)     2781.04 (   1.03%)
Hmean     send-2048      5218.99 (   0.00%)     5127.64 (  -1.75%)     5332.42 *   2.17%*
Hmean     send-3312      8037.54 (   0.00%)     8054.75 (   0.21%)     8179.42 (   1.77%)
Hmean     send-4096      9834.51 (   0.00%)     9782.38 (  -0.53%)     9901.92 (   0.69%)
Hmean     send-8192     15947.03 (   0.00%)    16072.71 (   0.79%)    16700.60 *   4.73%*
Hmean     send-16384    25479.72 (   0.00%)    24922.78 (  -2.19%)    25751.27 (   1.07%)
Hmean     recv-64         180.45 (   0.00%)      181.97 (   0.84%)      182.49 (   1.13%)
Hmean     recv-128        350.10 (   0.00%)      360.06 *   2.84%*      367.78 *   5.05%*
Hmean     recv-256        707.73 (   0.00%)      707.27 (  -0.06%)      723.00 *   2.16%*
Hmean     recv-1024      2750.10 (   0.00%)     2755.09 (   0.18%)     2778.95 (   1.05%)
Hmean     recv-2048      5213.28 (   0.00%)     5121.07 (  -1.77%)     5326.50 *   2.17%*
Hmean     recv-3312      8026.50 (   0.00%)     8045.41 (   0.24%)     8172.19 (   1.82%)
Hmean     recv-4096      9821.89 (   0.00%)     9769.07 (  -0.54%)     9889.91 (   0.69%)
Hmean     recv-8192     15922.95 (   0.00%)    16052.20 (   0.81%)    16679.50 *   4.75%*
Hmean     recv-16384    25441.27 (   0.00%)    24876.79 (  -2.22%)    25706.80 (   1.04%)

Ops SIS Search Efficiency            98.46          98.50          98.41
Ops SIS Domain Search Eff            24.87          24.95          24.91
Ops SIS Fast Success Rate            99.48          99.50          99.46
Ops SIS Success Rate                100.00         100.00         100.00

h) netperf-tcp

[quiet]
                               vanilla                 filter               balancer
Hmean     64         839.51 (   0.00%)      831.51 (  -0.95%)      869.34 *   3.55%*
Hmean     128       1629.70 (   0.00%)     1635.25 (   0.34%)     1692.19 *   3.83%*
Hmean     256       3046.92 (   0.00%)     3019.34 *  -0.91%*     3106.67 *   1.96%*
Hmean     1024     10164.80 (   0.00%)    10023.82 (  -1.39%)    10317.10 *   1.50%*
Hmean     2048     16942.50 (   0.00%)    16519.73 *  -2.50%*    17435.42 *   2.91%*
Hmean     3312     21379.01 (   0.00%)    21065.52 *  -1.47%*    21642.12 *   1.23%*
Hmean     4096     23452.47 (   0.00%)    23539.17 (   0.37%)    23987.73 *   2.28%*
Hmean     8192     29084.04 (   0.00%)    28763.16 *  -1.10%*    29710.51 *   2.15%*
Hmean     16384    33512.05 (   0.00%)    33124.21 *  -1.16%*    34055.83 *   1.62%*

Ops SIS Search Efficiency           100.00         100.00         100.00
Ops SIS Domain Search Eff            18.08          21.82          25.44
Ops SIS Fast Success Rate           100.00         100.00         100.00
Ops SIS Success Rate                100.00         100.00         100.00

[noisy]
                             vanilla                 filter               balancer
Hmean     64         868.21 (   0.00%)      833.12 *  -4.04%*      805.06 *  -7.27%*
Hmean     128       1657.75 (   0.00%)     1608.11 *  -2.99%*     1554.89 *  -6.20%*
Hmean     256       3093.27 (   0.00%)     2953.87 *  -4.51%*     2950.83 *  -4.60%*
Hmean     1024     10168.94 (   0.00%)     9589.55 *  -5.70%*     9712.72 *  -4.49%*
Hmean     2048     16539.60 (   0.00%)    16031.40 *  -3.07%*    15902.98 *  -3.85%*
Hmean     3312     20710.95 (   0.00%)    20174.85 *  -2.59%*    20144.92 *  -2.73%*
Hmean     4096     22728.08 (   0.00%)    22633.01 (  -0.42%)    22419.51 (  -1.36%)
Hmean     8192     28017.31 (   0.00%)    27761.16 (  -0.91%)    27667.10 (  -1.25%)
Hmean     16384    32314.91 (   0.00%)    32084.40 (  -0.71%)    32090.03 (  -0.70%)

Ops SIS Search Efficiency            97.95          98.04          98.01
Ops SIS Domain Search Eff            24.72          24.73          24.71
Ops SIS Fast Success Rate            99.31          99.34          99.33
Ops SIS Success Rate                100.00         100.00         100.00


Conclusion
==========

The results do not show a win across the board, but the balancer did outperform
vanilla in many of the benchmarks, in both quiet and noisy environments. The SIS
filter makes SIS more efficient as expected, and the balancer does even better
by making the overloaded cpu mask more accurate.
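
To illustrate why filtering overloaded cpus can help a depth-bounded search,
here is a minimal userspace sketch in C. It is not the code from the patches:
the 64-bit mask type and the names sis_candidates() and scan_idle_cpu() are
hypothetical, and the simple linear scan is an assumption that ignores how the
kernel actually orders its search.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical userspace model: one bit per cpu, at most 64 cpus. */
typedef uint64_t cpumask64_t;

/*
 * Drop cpus marked overloaded from the set of wakeup candidates.
 * 'domain' is the set of cpus in the search domain and 'allowed'
 * is the task's affinity mask.
 */
static cpumask64_t sis_candidates(cpumask64_t domain, cpumask64_t allowed,
                                  cpumask64_t overloaded)
{
    return domain & allowed & ~overloaded;
}

/*
 * Probe at most 'depth' candidate cpus in ascending order and return
 * the first one that is idle, or -1 if the budget runs out first.
 */
static int scan_idle_cpu(cpumask64_t candidates, cpumask64_t idle, int depth)
{
    assert(depth >= 0);

    for (int cpu = 0; cpu < 64 && depth > 0; cpu++) {
        if (!(candidates & (1ULL << cpu)))
            continue;       /* filtered out: costs no search depth */
        depth--;            /* each probed candidate uses one unit */
        if (idle & (1ULL << cpu))
            return cpu;
    }
    return -1;
}
```

With cpus 0-2 overloaded, cpu 3 idle and a depth of 2, the unfiltered scan
spends both probes on cpus 0 and 1 and finds nothing, while the filtered scan
returns cpu 3 on its first probe.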

The only obvious regression is netperf-tcp in the noisy environment, and I
haven't figured out why yet. More interestingly, the netperf/tbench results on
another machine (used in my patch v1) showed a suspicious 50%~90% improvement
with this patch series. This might be worth digging into.
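
For reference, the percentages in the tables above look consistent with a plain
relative change against the vanilla column. The sketch below shows that
arithmetic; pct_change() is a hypothetical helper, and whether the benchmark
tooling computes its column exactly this way is an assumption.

```c
#include <assert.h>

/*
 * Relative change of 'value' against 'baseline', in percent.
 * For throughput numbers a negative result means a regression.
 */
static double pct_change(double baseline, double value)
{
    assert(baseline != 0.0);
    return (value - baseline) / baseline * 100.0;
}
```

For the 64-byte row of the noisy netperf-tcp table, pct_change(868.21, 805.06)
evaluates to about -7.27, matching the balancer column, and
pct_change(868.21, 833.12) to about -4.04, matching the filter column.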

Comments and tests are appreciated!

---

v2:
  - several optimizations on sched-idle balancing
  - ignore asym topos in can_migrate_task
  - add more benchmarks including SIS efficiency
  - re-organize patch as suggested by Mel

v1 can be found at [2].

[1] https://lore.kernel.org/lkml/20210726102247.21437-2-mgorman@techsingularity.net/
[2] https://lore.kernel.org/lkml/20220217154403.6497-5-wuyun.abel@bytedance.com/

Abel Wu (2):
  sched/fair: filter out overloaded cpus in SIS
  sched/fair: introduce sched-idle balance

 include/linux/sched/idle.h     |   1 +
 include/linux/sched/topology.h |  12 +++
 kernel/sched/core.c            |   2 +
 kernel/sched/fair.c            | 210 +++++++++++++++++++++++++++++++++++++++--
 kernel/sched/sched.h           |   8 ++
 kernel/sched/topology.c        |   4 +-
 6 files changed, 228 insertions(+), 9 deletions(-)

-- 
2.11.0



end of thread, other threads:[~2022-04-27 13:17 UTC | newest]

Thread overview: 14+ messages
2022-04-09 13:51 [RFC v2 0/2] introduece sched-idle balance Abel Wu
2022-04-09 13:51 ` [RFC v2 1/2] sched/fair: filter out overloaded cpus in SIS Abel Wu
2022-04-12  1:23   ` Josh Don
2022-04-12 17:55     ` Abel Wu
2022-04-13 23:49       ` Josh Don
2022-04-14 15:36         ` Abel Wu
2022-04-15 23:21           ` Josh Don
2022-04-25  7:02   ` [sched/fair] 6b433275e3: stress-ng.sock.ops_per_sec 16.2% improvement kernel test robot
2022-04-09 13:51 ` [RFC v2 2/2] sched/fair: introduce sched-idle balance Abel Wu
2022-04-12  1:59   ` Josh Don
2022-04-12 17:56     ` Abel Wu
2022-04-14  0:08       ` Josh Don
2022-04-14 15:38         ` Abel Wu
2022-04-27 13:15   ` [sched/fair] ae44f2177f: reaim.jobs_per_min 2.3% improvement kernel test robot
