linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Question about sched_prio_to_weight values
@ 2019-10-06 22:32 Francesco Poli
  2019-10-07  9:13 ` Valentin Schneider
  0 siblings, 1 reply; 5+ messages in thread
From: Francesco Poli @ 2019-10-06 22:32 UTC (permalink / raw)
  To: linux-kernel

[-- Attachment #1: Type: text/plain, Size: 4003 bytes --]

Hello,
I am trying to understand something about the internals of the
completely fair scheduler (CFS) used in the Linux kernel.

I have a question about the mapping between nice levels and weights,
I hope someone can shed some light on those numbers.
Please note that I am not subscribed to the LKML: could you please Cc
me on replies? Thanks for your understanding.


As far as I can see, the mapping is hard-coded in the
sched_prio_to_weight array, which is defined in kernel/sched/core.c
The [latest code] defines the following values:

  const int sched_prio_to_weight[40] = {
   /* -20 */     88761,     71755,     56483,     46273,     36291,
   /* -15 */     29154,     23254,     18705,     14949,     11916,
   /* -10 */      9548,      7620,      6100,      4904,      3906,
   /*  -5 */      3121,      2501,      1991,      1586,      1277,
   /*   0 */      1024,       820,       655,       526,       423,
   /*   5 */       335,       272,       215,       172,       137,
   /*  10 */       110,        87,        70,        56,        45,
   /*  15 */        36,        29,        23,        18,        15,
  };

but I found the same values in much more ancient versions
(the values probably date back to the first patch that introduced
the CFS in the kernel).

[latest code]: <https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/kernel/sched/core.c?h=v5.4-rc1>

Now, the comments in the code suggest that each value is obtained by
dividing the previous one by 1.25: I found [some] [documentation] that
seems to confirm that the "weight is roughly equivalent to
1024/(1.25)^(nice)"

[some]: <https://blog.shichao.io/2015/07/22/relationships_among_nice_priority_and_weight_in_linux_kernel.html>
[documentation]: <www.iaiai.org/journals/index.php/IEE/article/download/9/19>

I tried to see whether I could re-obtain those numbers by rounding (to
integers) the values computed with that formula, but the result does not
look quite right.
By using calc(1), I get:

  ; for (nice = -20; nice < 20; ++nice) { 
  ;;     w = 1024/(1.25)^(nice);
  ;;     printf('%3d %7d (%.2f)\n', nice, round(w), w);
  ;; }
  -20   88818 (~88817.84)
  -19   71054 (~71054.27)
  -18   56843 (~56843.42)
  -17   45475 (~45474.74)
  -16   36380 (~36379.79)
  -15   29104 (~29103.83)
  -14   23283 (~23283.06)
  -13   18626 (~18626.45)
  -12   14901 (~14901.16)
  -11   11921 (~11920.93)
  -10    9537 (~9536.74)
   -9    7629 (~7629.39)
   -8    6104 (~6103.52)
   -7    4883 (~4882.81)
   -6    3906 (3906.25)
   -5    3125 (3125)
   -4    2500 (2500)
   -3    2000 (2000)
   -2    1600 (1600)
   -1    1280 (1280)
    0    1024 (1024)
    1     819 (819.2)
    2     655 (655.36)
    3     524 (~524.29)
    4     419 (~419.43)
    5     336 (~335.54)
    6     268 (~268.44)
    7     215 (~214.75)
    8     172 (~171.80)
    9     137 (~137.44)
   10     110 (~109.95)
   11      88 (~87.96)
   12      70 (~70.37)
   13      56 (~56.29)
   14      45 (~45.04)
   15      36 (~36.03)
   16      29 (~28.82)
   17      23 (~23.06)
   18      18 (~18.45)
   19      15 (~14.76)

where some numbers correspond, while others differ from those found
in kernel/sched/core.c ...

I tried to tweak the 1.25 factor: 1.2499599 makes me get
weight == 88761 for nice == -20, but other numbers still differ.
A non-linear curve fitting with grace(1) leads to a factor of 1.25018,
which still produces values that fail to completely match those found
in kernel/sched/core.c ...

I searched the web and the mailing list archives, but I failed to find
an answer to this question. Could someone please explain me how those
numbers were picked?

Thanks for your time!


-- 
 http://www.inventati.org/frx/
 There's not a second to spare! To the laboratory!
..................................................... Francesco Poli .
 GnuPG key fpr == CA01 1147 9CD2 EFDF FB82  3925 3E1C 27E1 1F69 BFFE

[-- Attachment #2: Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: Question about sched_prio_to_weight values
  2019-10-06 22:32 Question about sched_prio_to_weight values Francesco Poli
@ 2019-10-07  9:13 ` Valentin Schneider
  2019-10-07 20:41   ` Francesco Poli
  0 siblings, 1 reply; 5+ messages in thread
From: Valentin Schneider @ 2019-10-07  9:13 UTC (permalink / raw)
  To: Francesco Poli, linux-kernel

Hi Francesco,

On 06/10/2019 23:32, Francesco Poli wrote:
[...]
> I searched the web and the mailing list archives, but I failed to find
> an answer to this question. Could someone please explain me how those
> numbers were picked?
> 

Following the blame rabbit hole I found this:

254753dc321e ("sched: make the multiplication table more accurate")
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=254753dc321ea2b753ca9bc58ac329557a20efac

which sounds like it would explain some small deltas if compared to a 
formula that is set in stone.

Hope that helps.

> Thanks for your time!
> 
> 

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: Question about sched_prio_to_weight values
  2019-10-07  9:13 ` Valentin Schneider
@ 2019-10-07 20:41   ` Francesco Poli
  2019-10-10 17:28     ` Valentin Schneider
  0 siblings, 1 reply; 5+ messages in thread
From: Francesco Poli @ 2019-10-07 20:41 UTC (permalink / raw)
  To: linux-kernel

[-- Attachment #1: Type: text/plain, Size: 5244 bytes --]

On Mon, 7 Oct 2019 10:13:23 +0100 Valentin Schneider wrote:

[...]
> Following the blame rabbit hole I found this:
> 
> 254753dc321e ("sched: make the multiplication table more accurate")
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=254753dc321ea2b753ca9bc58ac329557a20efac
> 
> which sounds like it would explain some small deltas if compared to a 
> formula that is set in stone.
> 
> Hope that helps.

It helps a lot, thank you so much for your kind reply!

I am now able to reproduce the numbers (almost), with the following
calc(1) script:

  $ cat prio_to_weight.cal 
  #!/usr/bin/calc -f
  
  oldtilde = config("tilde", "off")
  
  
  printf('nice       mult *   inv_mult   error\n')
  printf('------------------------------------------\n')
  for (nice = -20; nice < 20; ++nice) {
      w = 1024/(1.25)^(nice);
      weight = round(w);
      invmul = round(2^32 / weight);
      error = weight*invmul / 2^32 - 1;
      printf('%3d:%11d *%11d%15.10f\n', nice, weight, invmul, error);
  }
  
  printf('\nnice       mult *   inv_mult   error\n')
  printf('------------------------------------------\n')
  for (nice = -20; nice < 20; ++nice) {
      w = 1024/(1.25)^(nice);
      for (delta = 0; delta < 1000; ++delta) {
          weight = round(w) - delta;
          invmul = round(2^32 / weight);
          error = weight*invmul / 2^32 - 1;
          if (abs(error) < 1e-8) { break; }
          weight = round(w) + delta;
          invmul = round(2^32 / weight);
          error = weight*invmul / 2^32 - 1;
          if (abs(error) < 1e-8) { break; }
      }
      printf('%3d:%11d *%11d%15.10f\n', nice, weight, invmul, error);
  }


After running the script:

  $ ./prio_to_weight.cal | tail -n 42 | tee prio_to_weight.out
  nice       mult *   inv_mult   error
  ------------------------------------------
  -20:      88761 *      48388  -0.0000000065
  -19:      71755 *      59856  -0.0000000037
  -18:      56483 *      76040   0.0000000056
  -17:      46273 *      92818   0.0000000042
  -16:      36291 *     118348  -0.0000000065
  -15:      29154 *     147320  -0.0000000037
  -14:      23254 *     184698  -0.0000000009
  -13:      18705 *     229616  -0.0000000037
  -12:      14949 *     287308  -0.0000000009
  -11:      11916 *     360437  -0.0000000009
  -10:       9548 *     449829  -0.0000000009
   -9:       7620 *     563644  -0.0000000037
   -8:       6100 *     704093   0.0000000009
   -7:       4904 *     875809   0.0000000093
   -6:       3906 *    1099582  -0.0000000009
   -5:       3121 *    1376151  -0.0000000058
   -4:       2501 *    1717300   0.0000000009
   -3:       1991 *    2157191  -0.0000000035
   -2:       1586 *    2708050   0.0000000009
   -1:       1277 *    3363326   0.0000000014
    0:       1024 *    4194304              0
    1:        820 *    5237765   0.0000000009
    2:        655 *    6557202   0.0000000033
    3:        526 *    8165337  -0.0000000079
    4:        423 *   10153587   0.0000000012
    5:        335 *   12820798   0.0000000079
    6:        272 *   15790321   0.0000000037
    7:        215 *   19976592  -0.0000000037
    8:        172 *   24970740  -0.0000000037
    9:        137 *   31350126  -0.0000000079
   10:        110 *   39045157  -0.0000000061
   11:         88 *   48806447   0.0000000093
   12:         70 *   61356676   0.0000000056
   13:         56 *   76695845   0.0000000056
   14:         45 *   95443718   0.0000000033
   15:         36 *  119304647  -0.0000000009
   16:         29 *  148102321   0.0000000030
   17:         23 *  186737709   0.0000000026
   18:         18 *  238609294  -0.0000000009
   19:         15 *  286331153  -0.0000000002

I get an output which is almost identical to the one copied from
the commit message (target.out):

  $ diff target.out prio_to_weight.out
  34c34
  <  11:         87 *   49367440  -0.0000000037
  ---
  >  11:         88 *   48806447   0.0000000093
  36,37c36,37
  <  13:         56 *   76695844  -0.0000000075
  <  14:         45 *   95443717  -0.0000000072
  ---
  >  13:         56 *   76695845   0.0000000056
  >  14:         45 *   95443718   0.0000000033
  39,40c39,40
  <  16:         29 *  148102320  -0.0000000037
  <  17:         23 *  186737708  -0.0000000028
  ---
  >  16:         29 *  148102321   0.0000000030
  >  17:         23 *  186737709   0.0000000026

The differences are probably due to the different precision
of the computations: I don't know the precision of those originally
carried out by Ingo Molnar (single precision? double?), but calc(1)
is an arbitrary precision calculator and, by default, performs
calculations with epsilon = 1e-20 !

Please note that, except for the first one, all the differing
values obtained with the calc(1) script have slightly better
errors than the ones found in kernel/sched/core.c ...


Thanks again for the clarification!
Bye.

-- 
 http://www.inventati.org/frx/
 There's not a second to spare! To the laboratory!
..................................................... Francesco Poli .
 GnuPG key fpr == CA01 1147 9CD2 EFDF FB82  3925 3E1C 27E1 1F69 BFFE

[-- Attachment #2: Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: Question about sched_prio_to_weight values
  2019-10-07 20:41   ` Francesco Poli
@ 2019-10-10 17:28     ` Valentin Schneider
  2019-10-10 22:51       ` Francesco Poli
  0 siblings, 1 reply; 5+ messages in thread
From: Valentin Schneider @ 2019-10-10 17:28 UTC (permalink / raw)
  To: Francesco Poli, linux-kernel

On 07/10/2019 21:41, Francesco Poli wrote:
> The differences are probably due to the different precision
> of the computations: I don't know the precision of those originally
> carried out by Ingo Molnar (single precision? double?), but calc(1)
> is an arbitrary precision calculator and, by default, performs
> calculations with epsilon = 1e-20 !
> 
> Please note that, except for the first one, all the differing
> values obtained with the calc(1) script have slightly better
> errors than the ones found in kernel/sched/core.c ...
> 

As always patches are welcome, but I don't know how much there is to gain
from a tiny error correction in those factors.

Out of curiosity, what led you to stare at those numbers?

> 
> Thanks again for the clarification!
> Bye.
> 

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: Question about sched_prio_to_weight values
  2019-10-10 17:28     ` Valentin Schneider
@ 2019-10-10 22:51       ` Francesco Poli
  0 siblings, 0 replies; 5+ messages in thread
From: Francesco Poli @ 2019-10-10 22:51 UTC (permalink / raw)
  To: linux-kernel

[-- Attachment #1: Type: text/plain, Size: 1732 bytes --]

On Thu, 10 Oct 2019 18:28:36 +0100 Valentin Schneider wrote:

> On 07/10/2019 21:41, Francesco Poli wrote:
> > The differences are probably due to the different precision
> > of the computations: I don't know the precision of those originally
> > carried out by Ingo Molnar (single precision? double?), but calc(1)
> > is an arbitrary precision calculator and, by default, performs
> > calculations with epsilon = 1e-20 !
> > 
> > Please note that, except for the first one, all the differing
> > values obtained with the calc(1) script have slightly better
> > errors than the ones found in kernel/sched/core.c ...
> > 
> 
> As always patches are welcome, but I don't know how much there is to gain
> from a tiny error correction in those factors.

I can of course prepare a patch (a trivial adjustment of some of those
numbers), if there's interest about it, but I'll leave to you kernel
hackers to decide whether the modification may be worth doing (I am no
scheduler expert, I don't even know exactly how to test the scheduler
and assess whether a given patch is beneficial or not)...

> 
> Out of curiosity, what led you to stare at those numbers?

While reading a chapter of a book on operating systems, I encountered
the nice-level-to-weight mapping, as defined in the CFS: I became
obsessively curious and wanted by any means to understand how those
numbers were decided. That's why I was trying to reproduce them with
some criterion which could make sense.


-- 
 http://www.inventati.org/frx/
 There's not a second to spare! To the laboratory!
..................................................... Francesco Poli .
 GnuPG key fpr == CA01 1147 9CD2 EFDF FB82  3925 3E1C 27E1 1F69 BFFE

[-- Attachment #2: Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2019-10-10 23:03 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-10-06 22:32 Question about sched_prio_to_weight values Francesco Poli
2019-10-07  9:13 ` Valentin Schneider
2019-10-07 20:41   ` Francesco Poli
2019-10-10 17:28     ` Valentin Schneider
2019-10-10 22:51       ` Francesco Poli

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).