linux-kernel.vger.kernel.org archive mirror
* RFC: documentation of the autogroup feature
@ 2016-11-22 15:59 Michael Kerrisk (man-pages)
  2016-11-23 10:33 ` [patch] sched/autogroup: Fix 64bit kernel nice adjustment Mike Galbraith
  2016-11-23 11:39 ` RFC: documentation of the autogroup feature Mike Galbraith
  0 siblings, 2 replies; 38+ messages in thread
From: Michael Kerrisk (man-pages) @ 2016-11-22 15:59 UTC (permalink / raw)
  To: Mike Galbraith
  Cc: mtk.manpages, Peter Zijlstra, Ingo Molnar, linux-man, lkml,
	Thomas Gleixner

Hello Mike and others,

The autogroup feature that you added in 2.6.38 remains poorly 
documented, so I took a stab at adding some text to the sched(7) 
manual page. There are still a few pieces to be fixed, and you 
may also see some other pieces that should be added. Could I 
ask you to take a look at the text below?

Cheers,

Michael

   The autogroup feature
       Since Linux 2.6.38, the kernel  provides  a  feature  known  as
       autogrouping  to improve interactive desktop performance in the
       face of multiprocess CPU-intensive workloads such  as  building
       the Linux kernel with large numbers of parallel build processes
       (i.e., the make(1) -j flag).

       This feature operates in conjunction with the CFS scheduler and
       requires  a  kernel  that is configured with CONFIG_SCHED_AUTO‐
       GROUP.  On a running system, this feature is  enabled  or  dis‐
       abled  via the file /proc/sys/kernel/sched_autogroup_enabled; a
       value of 0 disables the feature, while a value of 1 enables it.
       The  default  value  in  this  file is 1, unless the kernel was
       booted with the noautogroup parameter.

       When  autogrouping  is  enabled,  processes  are  automatically
       placed  into  "task groups" for the purposes of scheduling.  In
       the current implementation, a new task group is created when  a
       new  session is created via setsid(2), as happens, for example,
       when a new terminal window is created.  A task group  is  auto‐
       matically  destroyed  when the last process in the group termi‐
       nates.



       ┌─────────────────────────────────────────────────────┐
       │FIXME                                                │
       ├─────────────────────────────────────────────────────┤
       │The following is a little vague. Does it need to  be │
       │made more precise?                                   │
       └─────────────────────────────────────────────────────┘
       The CFS scheduler employs an algorithm that distributes the CPU
       across task groups.  As a result of this  algorithm,  the  pro‐
       cesses  in task groups that contain multiple CPU-intensive pro‐
       cesses are in effect disfavored by the scheduler.

       A process's autogroup (task group) membership can be viewed via
       the file /proc/[pid]/autogroup:

           $ cat /proc/1/autogroup
           /autogroup-1 nice 0

       This  file  can  also be used to modify the CPU bandwidth allo‐
       cated to a task group.  This is done by writing a number in the
       "nice"  range  to  the file to set the task group's nice value.
       The allowed range is from +19 (low priority) to -20 (high  pri‐
       ority).   Note that all values in this range cause a task group
       to be further disfavored by the scheduler, with  -20  resulting
       in  the  scheduler  mildly  disfavoring  the  task group and +19
       greatly disfavoring it.


       ┌─────────────────────────────────────────────────────┐
       │FIXME                                                │
       ├─────────────────────────────────────────────────────┤
       │Regarding the previous paragraph...  My tests  indi‐ │
       │cate  that writing *any* value to the autogroup file │
       │causes the task group to get a lower priority.  This │
       │somewhat surprised me, since I assumed (based on the │
       │parallel with the process nice(2) value) that  nega‐ │
       │tive  values  might  boost the task group's priority │
       │above a task group whose autogroup file had not been │
       │touched.                                             │
       │                                                     │
       │Is this the expected behavior? I presume it is...    │
       │                                                     │
       │But  then there's a small surprise in the interface. │
       │Suppose that the value 0 is written to the autogroup │
       │file, then this results in the task group being sig‐ │
       │nificantly disfavored. But, the nice  value  *shown* │
       │in  the  autogroup  file  will be the same as if the │
       │file had not been modified. So, the user has no  way │
       │of discovering the difference. That seems odd.  Am I │
       │missing something?                                   │
       └─────────────────────────────────────────────────────┘



       ┌─────────────────────────────────────────────────────┐
       │FIXME                                                │
       ├─────────────────────────────────────────────────────┤
       │Is the following correct? Does the statement need to │
       │be  more  precise? (E.g., in precisely which circum‐ │
       │stances does the use of cgroups override autogroup?) │
       └─────────────────────────────────────────────────────┘
       The use of the cgroups(7) CPU controller overrides  the  effect
       of autogrouping.


       ┌─────────────────────────────────────────────────────┐
       │FIXME                                                │
       ├─────────────────────────────────────────────────────┤
       │What  needs to be said about autogroup and real-time │
       │tasks?                                               │
       └─────────────────────────────────────────────────────┘


-- 
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/

^ permalink raw reply	[flat|nested] 38+ messages in thread

* [patch] sched/autogroup: Fix 64bit kernel nice adjustment
  2016-11-22 15:59 RFC: documentation of the autogroup feature Michael Kerrisk (man-pages)
@ 2016-11-23 10:33 ` Mike Galbraith
  2016-11-23 13:47   ` Michael Kerrisk (man-pages)
  2016-11-24  6:24   ` [tip:sched/urgent] sched/autogroup: Fix 64-bit kernel nice level adjustment tip-bot for Mike Galbraith
  2016-11-23 11:39 ` RFC: documentation of the autogroup feature Mike Galbraith
  1 sibling, 2 replies; 38+ messages in thread
From: Mike Galbraith @ 2016-11-23 10:33 UTC (permalink / raw)
  To: Michael Kerrisk (man-pages)
  Cc: Peter Zijlstra, Ingo Molnar, linux-man, lkml, Thomas Gleixner

On Tue, 2016-11-22 at 16:59 +0100, Michael Kerrisk (man-pages) wrote:

>        ┌─────────────────────────────────────────────────────┐
>        │FIXME                                                │
>        ├─────────────────────────────────────────────────────┤
>        │Regarding the previous paragraph...  My tests  indi‐ │
>        │cate  that writing *any* value to the autogroup file │
>        │causes the task group to get a lower priority.  This │

Because autogroup didn't call the then meaningless scale_load()...


Autogroup nice level adjustment has been broken ever since load
resolution was increased for 64bit kernels.  Use scale_load() to
scale group weight.

Signed-off-by: Mike Galbraith <umgwanakikbuti@gmail.com>
Reported-by: Michael Kerrisk <mtk.manpages@gmail.com>
Cc: stable@vger.kernel.org
---
 kernel/sched/auto_group.c |    4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

--- a/kernel/sched/auto_group.c
+++ b/kernel/sched/auto_group.c
@@ -192,6 +192,7 @@ int proc_sched_autogroup_set_nice(struct
 {
 	static unsigned long next = INITIAL_JIFFIES;
 	struct autogroup *ag;
+	unsigned long shares;
 	int err;
 
 	if (nice < MIN_NICE || nice > MAX_NICE)
@@ -210,9 +211,10 @@ int proc_sched_autogroup_set_nice(struct
 
 	next = HZ / 10 + jiffies;
 	ag = autogroup_task_get(p);
+	shares = scale_load(sched_prio_to_weight[nice + 20]);
 
 	down_write(&ag->lock);
-	err = sched_group_set_shares(ag->tg, sched_prio_to_weight[nice + 20]);
+	err = sched_group_set_shares(ag->tg, shares);
 	if (!err)
 		ag->nice = nice;
 	up_write(&ag->lock);


* Re: RFC: documentation of the autogroup feature
  2016-11-22 15:59 RFC: documentation of the autogroup feature Michael Kerrisk (man-pages)
  2016-11-23 10:33 ` [patch] sched/autogroup: Fix 64bit kernel nice adjustment Mike Galbraith
@ 2016-11-23 11:39 ` Mike Galbraith
  2016-11-23 13:54   ` Michael Kerrisk (man-pages)
  1 sibling, 1 reply; 38+ messages in thread
From: Mike Galbraith @ 2016-11-23 11:39 UTC (permalink / raw)
  To: Michael Kerrisk (man-pages)
  Cc: Peter Zijlstra, Ingo Molnar, linux-man, lkml, Thomas Gleixner

On Tue, 2016-11-22 at 16:59 +0100, Michael Kerrisk (man-pages) wrote:

>        ┌─────────────────────────────────────────────────────┐
>        │FIXME                                                │
>        ├─────────────────────────────────────────────────────┤
>        │The following is a little vague. Does it need to  be │
>        │made more precise?                                   │
>        └─────────────────────────────────────────────────────┘
>        The CFS scheduler employs an algorithm that distributes the CPU
>        across task groups.  As a result of this  algorithm,  the  pro‐
>        cesses  in task groups that contain multiple CPU-intensive pro‐
>        cesses are in effect disfavored by the scheduler.

Mmmm, they're actually equalized (modulo smp fairness goop), but I see
what you mean.

>        A process's autogroup (task group) membership can be viewed via
>        the file /proc/[pid]/autogroup:
> 
>            $ cat /proc/1/autogroup
>            /autogroup-1 nice 0
> 
>        This  file  can  also be used to modify the CPU bandwidth allo‐
>        cated to a task group.  This is done by writing a number in the
>        "nice"  range  to  the file to set the task group's nice value.
>        The allowed range is from +19 (low priority) to -20 (high  pri‐
>        ority).   Note that all values in this range cause a task group
>        to be further disfavored by the scheduler, with  -20  resulting
>        in  the  scheduler  mildly  disfavoring  the  task group and +19
>        greatly disfavoring it.

Group nice levels exactly work the same as task nice levels, ie
negative nice increases share, positive nice decreases it relative to
the default nice 0.

>        ┌─────────────────────────────────────────────────────┐
>        │FIXME                                                │
>        ├─────────────────────────────────────────────────────┤
>        │Regarding the previous paragraph...  My tests  indi‐ │
>        │cate  that writing *any* value to the autogroup file │
>        │causes the task group to get a lower priority.

(patchlet.. I'd prefer to whack the knob, but like the on/off switch,
it may be in use, so I guess we're stuck with it)

>        ┌─────────────────────────────────────────────────────┐
>        │FIXME                                                │
>        ├─────────────────────────────────────────────────────┤
>        │Is the following correct? Does the statement need to │
>        │be  more  precise? (E.g., in precisely which circum‐ │
>        │stances does the use of cgroups override autogroup?) │
>        └─────────────────────────────────────────────────────┘
>        The use of the cgroups(7) CPU controller overrides  the  effect
>        of autogrouping.

Correct, autogroup defers to cgroups.  Perhaps mention that moving a
task back to the root task group will result in the autogroup again
taking effect.

>        ┌─────────────────────────────────────────────────────┐
>        │FIXME                                                │
>        ├─────────────────────────────────────────────────────┤
>        │What  needs to be said about autogroup and real-time │
>        │tasks?                                               │
>        └─────────────────────────────────────────────────────┘

That it does not group realtime tasks, they are auto-deflected to the
root task group.

	-Mike


* Re: [patch] sched/autogroup: Fix 64bit kernel nice adjustment
  2016-11-23 10:33 ` [patch] sched/autogroup: Fix 64bit kernel nice adjustment Mike Galbraith
@ 2016-11-23 13:47   ` Michael Kerrisk (man-pages)
  2016-11-23 14:12     ` Mike Galbraith
  2016-11-24  6:24   ` [tip:sched/urgent] sched/autogroup: Fix 64-bit kernel nice level adjustment tip-bot for Mike Galbraith
  1 sibling, 1 reply; 38+ messages in thread
From: Michael Kerrisk (man-pages) @ 2016-11-23 13:47 UTC (permalink / raw)
  To: Mike Galbraith
  Cc: mtk.manpages, Peter Zijlstra, Ingo Molnar, linux-man, lkml,
	Thomas Gleixner

Hello Mike,

On 11/23/2016 11:33 AM, Mike Galbraith wrote:
> On Tue, 2016-11-22 at 16:59 +0100, Michael Kerrisk (man-pages) wrote:
> 
>>        ┌─────────────────────────────────────────────────────┐
>>        │FIXME                                                │
>>        ├─────────────────────────────────────────────────────┤
>>        │Regarding the previous paragraph...  My tests  indi‐ │
>>        │cate  that writing *any* value to the autogroup file │
>>        │causes the task group to get a lower priority.  This │
> 
> Because autogroup didn't call the then meaningless scale_load()...

So, does that mean that this buglet kicked in starting (only) in 
Linux 4.7 with commit 2159197d66770ec01f75c93fb11dc66df81fd45b?

> Autogroup nice level adjustment has been broken ever since load
> resolution was increased for 64bit kernels.  Use scale_load() to
> scale group weight.

Tested-by: Michael Kerrisk <mtk.manpages@gmail.com>

Applied and tested against 4.9-rc6 on an Intel i7 (4 cores).
Test setup:

Terminal window 1: running 40 CPU burner jobs
Terminal window 2: running 40 CPU burner jobs
Terminal window 3: running 1 CPU burner job

Demonstrated that:
* Writing "0" to the autogroup file for TW1 now causes no change
  to the rate at which the processes on that terminal consume CPU.
* Writing -20 to the autogroup file for TW1 caused those processes
  to get the lion's share of CPU while TW2 and TW3 got a tiny amount.
* Writing -20 to the autogroup files for TW1 and TW3 allowed the
  process on TW3 to get as much CPU as it was getting when
  the autogroup nice values for both terminals were 0.
   
Thanks,

Michael

> Signed-off-by: Mike Galbraith <umgwanakikbuti@gmail.com>
> Reported-by: Michael Kerrisk <mtk.manpages@gmail.com>
> Cc: stable@vger.kernel.org
> ---
>  kernel/sched/auto_group.c |    4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> --- a/kernel/sched/auto_group.c
> +++ b/kernel/sched/auto_group.c
> @@ -192,6 +192,7 @@ int proc_sched_autogroup_set_nice(struct
>  {
>  	static unsigned long next = INITIAL_JIFFIES;
>  	struct autogroup *ag;
> +	unsigned long shares;
>  	int err;
>  
>  	if (nice < MIN_NICE || nice > MAX_NICE)
> @@ -210,9 +211,10 @@ int proc_sched_autogroup_set_nice(struct
>  
>  	next = HZ / 10 + jiffies;
>  	ag = autogroup_task_get(p);
> +	shares = scale_load(sched_prio_to_weight[nice + 20]);
>  
>  	down_write(&ag->lock);
> -	err = sched_group_set_shares(ag->tg, sched_prio_to_weight[nice + 20]);
> +	err = sched_group_set_shares(ag->tg, shares);
>  	if (!err)
>  		ag->nice = nice;
>  	up_write(&ag->lock);
> 


-- 
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/


* Re: RFC: documentation of the autogroup feature
  2016-11-23 11:39 ` RFC: documentation of the autogroup feature Mike Galbraith
@ 2016-11-23 13:54   ` Michael Kerrisk (man-pages)
  2016-11-23 15:33     ` Mike Galbraith
  0 siblings, 1 reply; 38+ messages in thread
From: Michael Kerrisk (man-pages) @ 2016-11-23 13:54 UTC (permalink / raw)
  To: Mike Galbraith
  Cc: mtk.manpages, Peter Zijlstra, Ingo Molnar, linux-man, lkml,
	Thomas Gleixner

Hi Mike,

First off, I better say that I'm not at all intimate with the details
of the scheduler, so bear with me...

On 11/23/2016 12:39 PM, Mike Galbraith wrote:
> On Tue, 2016-11-22 at 16:59 +0100, Michael Kerrisk (man-pages) wrote:
> 
>>        ┌─────────────────────────────────────────────────────┐
>>        │FIXME                                                │
>>        ├─────────────────────────────────────────────────────┤
>>        │The following is a little vague. Does it need to  be │
>>        │made more precise?                                   │
>>        └─────────────────────────────────────────────────────┘
>>        The CFS scheduler employs an algorithm that distributes the CPU
>>        across task groups.  As a result of this  algorithm,  the  pro‐
>>        cesses  in task groups that contain multiple CPU-intensive pro‐
>>        cesses are in effect disfavored by the scheduler.
> 
> Mmmm, they're actually equalized (modulo smp fairness goop), but I see
> what you mean.

I couldn't quite grok that sentence. My problem is resolving "they".
Do you mean: "the CPU scheduler equalizes the distribution of
CPU cycles across task groups"?

> 
>>        A process's autogroup (task group) membership can be viewed via
>>        the file /proc/[pid]/autogroup:
>>
>>            $ cat /proc/1/autogroup
>>            /autogroup-1 nice 0
>>
>>        This  file  can  also be used to modify the CPU bandwidth allo‐
>>        cated to a task group.  This is done by writing a number in the
>>        "nice"  range  to  the file to set the task group's nice value.
>>        The allowed range is from +19 (low priority) to -20 (high  pri‐
>>        ority).   Note that all values in this range cause a task group
>>        to be further disfavored by the scheduler, with  -20  resulting
>>        in  the  scheduler  mildly  disfavoring  the  task group and +19
>>        greatly disfavoring it.
> 
> Group nice levels exactly work the same as task nice levels, ie
> negative nice increases share, positive nice decreases it relative to
> the default nice 0.

Yes, got it now.

>>        ┌─────────────────────────────────────────────────────┐
>>        │FIXME                                                │
>>        ├─────────────────────────────────────────────────────┤
>>        │Regarding the previous paragraph...  My tests  indi‐ │
>>        │cate  that writing *any* value to the autogroup file │
>>        │causes the task group to get a lower priority.
> 
> (patchlet.. 

Writing documentation finds bugs. Who knew? ;-)

> I'd prefer to whack the knob, but like the on/off switch,
> it may be in use, so I guess we're stuck with it)
> 
>>        ┌─────────────────────────────────────────────────────┐
>>        │FIXME                                                │
>>        ├─────────────────────────────────────────────────────┤
>>        │Is the following correct? Does the statement need to │
>>        │be  more  precise? (E.g., in precisely which circum‐ │
>>        │stances does the use of cgroups override autogroup?) │
>>        └─────────────────────────────────────────────────────┘
>>        The use of the cgroups(7) CPU controller overrides  the  effect
>>        of autogrouping.
> 
> Correct, autogroup defers to cgroups.  Perhaps mention that moving a
> task back to the root task group will result in the autogroup again
> taking effect.

In what circumstances does a process get moved back to the root 
task group? 

Actually, can you define for me what the root task group is, and 
why it exists? That may be worth some words in this man page.

>>        ┌─────────────────────────────────────────────────────┐
>>        │FIXME                                                │
>>        ├─────────────────────────────────────────────────────┤
>>        │What  needs to be said about autogroup and real-time │
>>        │tasks?                                               │
>>        └─────────────────────────────────────────────────────┘
> 
> That it does not group realtime tasks, they are auto-deflected to the
> root task group.

Okay. Thanks.

Cheers,

Michael


-- 
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/


* Re: [patch] sched/autogroup: Fix 64bit kernel nice adjustment
  2016-11-23 13:47   ` Michael Kerrisk (man-pages)
@ 2016-11-23 14:12     ` Mike Galbraith
  2016-11-23 14:20       ` Michael Kerrisk (man-pages)
  0 siblings, 1 reply; 38+ messages in thread
From: Mike Galbraith @ 2016-11-23 14:12 UTC (permalink / raw)
  To: Michael Kerrisk (man-pages)
  Cc: Peter Zijlstra, Ingo Molnar, linux-man, lkml, Thomas Gleixner

On Wed, 2016-11-23 at 14:47 +0100, Michael Kerrisk (man-pages) wrote:
> Hello Mike,
> 
> On 11/23/2016 11:33 AM, Mike Galbraith wrote:
> > On Tue, 2016-11-22 at 16:59 +0100, Michael Kerrisk (man-pages)
> > wrote:
> > 
> > >        ┌─────────────────────────────────────────────────────┐
> > >        │FIXME                                                │
> > >        ├─────────────────────────────────────────────────────┤
> > >        │Regarding the previous paragraph...  My tests  indi‐ │
> > >        │cate  that writing *any* value to the autogroup file │
> > >        │causes the task group to get a lower priority.  This │
> > 
> > Because autogroup didn't call the then meaningless scale_load()...
> 
> So, does that mean that this buglet kicked in starting (only) in 
> Linux 4.7 with commit 2159197d66770ec01f75c93fb11dc66df81fd45b?

Yeah, that gave it teeth.

	-Mike


* Re: [patch] sched/autogroup: Fix 64bit kernel nice adjustment
  2016-11-23 14:12     ` Mike Galbraith
@ 2016-11-23 14:20       ` Michael Kerrisk (man-pages)
  2016-11-23 15:55         ` Mike Galbraith
  0 siblings, 1 reply; 38+ messages in thread
From: Michael Kerrisk (man-pages) @ 2016-11-23 14:20 UTC (permalink / raw)
  To: Mike Galbraith
  Cc: mtk.manpages, Peter Zijlstra, Ingo Molnar, linux-man, lkml,
	Thomas Gleixner

On 11/23/2016 03:12 PM, Mike Galbraith wrote:
> On Wed, 2016-11-23 at 14:47 +0100, Michael Kerrisk (man-pages) wrote:
>> Hello Mike,
>>
>> On 11/23/2016 11:33 AM, Mike Galbraith wrote:
>>> On Tue, 2016-11-22 at 16:59 +0100, Michael Kerrisk (man-pages)
>>> wrote:
>>>
>>>>        ┌─────────────────────────────────────────────────────┐
>>>>        │FIXME                                                │
>>>>        ├─────────────────────────────────────────────────────┤
>>>>        │Regarding the previous paragraph...  My tests  indi‐ │
>>>>        │cate  that writing *any* value to the autogroup file │
>>>>        │causes the task group to get a lower priority.  This │
>>>
>>> Because autogroup didn't call the then meaningless scale_load()...
>>
>> So, does that mean that this buglet kicked in starting (only) in 
>> Linux 4.7 with commit 2159197d66770ec01f75c93fb11dc66df81fd45b?
> 
> Yeah, that gave it teeth.

Thanks for the confirmation. Are you aiming to see the fix 
merged for 4.9, or will this wait for 4.10?

Cheers,

Michael



-- 
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/


* Re: RFC: documentation of the autogroup feature
  2016-11-23 13:54   ` Michael Kerrisk (man-pages)
@ 2016-11-23 15:33     ` Mike Galbraith
  2016-11-23 16:04       ` Michael Kerrisk (man-pages)
                         ` (2 more replies)
  0 siblings, 3 replies; 38+ messages in thread
From: Mike Galbraith @ 2016-11-23 15:33 UTC (permalink / raw)
  To: Michael Kerrisk (man-pages)
  Cc: Peter Zijlstra, Ingo Molnar, linux-man, lkml, Thomas Gleixner

On Wed, 2016-11-23 at 14:54 +0100, Michael Kerrisk (man-pages) wrote:
> Hi Mike,
> 
> First off, I better say that I'm not at all intimate with the details
> of the scheduler, so bear with me...
> 
> On 11/23/2016 12:39 PM, Mike Galbraith wrote:
> > On Tue, 2016-11-22 at 16:59 +0100, Michael Kerrisk (man-pages) wrote:
> > 
> > >        ┌─────────────────────────────────────────────────────┐
> > >        │FIXME                                                │
> > >        ├─────────────────────────────────────────────────────┤
> > >        │The following is a little vague. Does it need to  be │
> > >        │made more precise?                                   │
> > >        └─────────────────────────────────────────────────────┘
> > >        The CFS scheduler employs an algorithm that distributes the CPU
> > >        across task groups.  As a result of this  algorithm,  the  pro‐
> > >        cesses  in task groups that contain multiple CPU-intensive pro‐
> > >        cesses are in effect disfavored by the scheduler.
> > 
> > Mmmm, they're actually equalized (modulo smp fairness goop), but I see
> > what you mean.
> 
> I couldn't quite grok that sentence. My problem is resolving "they".
> Do you mean: "the CPU scheduler equalizes the distribution of
> CPU cycles across task groups"?

Sort of.  "They" are scheduler entities, runqueue (group) or task.  The
scheduler equalizes entity vruntimes.
 
> > >        │FIXME                                                │
> > >        ├─────────────────────────────────────────────────────┤
> > >        │Is the following correct? Does the statement need to │
> > >        │be  more  precise? (E.g., in precisely which circum‐ │
> > >        │stances does the use of cgroups override autogroup?) │
> > >        └─────────────────────────────────────────────────────┘
> > >        The use of the cgroups(7) CPU controller overrides  the  effect
> > >        of autogrouping.
> > 
> > Correct, autogroup defers to cgroups.  Perhaps mention that moving a
> > task back to the root task group will result in the autogroup again
> > taking effect.
> 
> In what circumstances does a process get moved back to the root 
> task group?

Userspace actions, tool or human fingers.
 
 
> Actually, can you define for me what the root task group is, and 
> why it exists? That may be worth some words in this man page.

I don't think we need group scheduling details, there's plenty of
documentation elsewhere for those who want theory.  Autogroup is for
those who don't want to have to care (which is also why it should have
never grown a nice knob).

	-Mike


* Re: [patch] sched/autogroup: Fix 64bit kernel nice adjustment
  2016-11-23 14:20       ` Michael Kerrisk (man-pages)
@ 2016-11-23 15:55         ` Mike Galbraith
  0 siblings, 0 replies; 38+ messages in thread
From: Mike Galbraith @ 2016-11-23 15:55 UTC (permalink / raw)
  To: Michael Kerrisk (man-pages)
  Cc: Peter Zijlstra, Ingo Molnar, linux-man, lkml, Thomas Gleixner

On Wed, 2016-11-23 at 15:20 +0100, Michael Kerrisk (man-pages) wrote:

> Thanks for the confirmation. Are you aiming to see the fix 
> merged for 4.9, or will this wait for 4.10?

Dunno, that's up to Peter/Ingo.  It's unlikely that anyone other than
us two will notice a thing either way :) 

	-Mike


* Re: RFC: documentation of the autogroup feature
  2016-11-23 15:33     ` Mike Galbraith
@ 2016-11-23 16:04       ` Michael Kerrisk (man-pages)
  2016-11-23 17:11         ` Mike Galbraith
  2016-11-23 16:05       ` RFC: documentation of the autogroup feature Michael Kerrisk (man-pages)
  2016-11-27 21:13       ` Michael Kerrisk (man-pages)
  2 siblings, 1 reply; 38+ messages in thread
From: Michael Kerrisk (man-pages) @ 2016-11-23 16:04 UTC (permalink / raw)
  To: Mike Galbraith
  Cc: mtk.manpages, Peter Zijlstra, Ingo Molnar, linux-man, lkml,
	Thomas Gleixner

Hi Mike,

On 11/23/2016 04:33 PM, Mike Galbraith wrote:
> On Wed, 2016-11-23 at 14:54 +0100, Michael Kerrisk (man-pages) wrote:
>> Hi Mike,
>>
>> First off, I better say that I'm not at all intimate with the details
>> of the scheduler, so bear with me...
>>
>> On 11/23/2016 12:39 PM, Mike Galbraith wrote:
>>> On Tue, 2016-11-22 at 16:59 +0100, Michael Kerrisk (man-pages) wrote:
>>>
>>>>        ┌─────────────────────────────────────────────────────┐
>>>>        │FIXME                                                │
>>>>        ├─────────────────────────────────────────────────────┤
>>>>        │The following is a little vague. Does it need to  be │
>>>>        │made more precise?                                   │
>>>>        └─────────────────────────────────────────────────────┘
>>>>        The CFS scheduler employs an algorithm that distributes the CPU
>>>>        across task groups.  As a result of this  algorithm,  the  pro‐
>>>>        cesses  in task groups that contain multiple CPU-intensive pro‐
>>>>        cesses are in effect disfavored by the scheduler.
>>>
>>> Mmmm, they're actually equalized (modulo smp fairness goop), but I see
>>> what you mean.
>>
>> I couldn't quite grok that sentence. My problem is resolving "they".
>> Do you mean: "the CPU scheduler equalizes the distribution of
>> CPU cycles across task groups"?
> 
> Sort of.  "They" are scheduler entities, runqueue (group) or task.  The
> scheduler equalizes entity vruntimes.

Okay -- I'll see if I can come up with some wording there.

>  
>>>>        │FIXME                                                │
>>>>        ├─────────────────────────────────────────────────────┤
>>>>        │Is the following correct? Does the statement need to │
>>>>        │be  more  precise? (E.g., in precisely which circum‐ │
>>>>        │stances does the use of cgroups override autogroup?) │
>>>>        └─────────────────────────────────────────────────────┘
>>>>        The use of the cgroups(7) CPU controller overrides  the  effect
>>>>        of autogrouping.
>>>
>>> Correct, autogroup defers to cgroups.  Perhaps mention that moving a
>>> task back to the root task group will result in the autogroup again
>>> taking effect.
>>
>> In what circumstances does a process get moved back to the root 
>> task group?
> 
> Userspace actions, tool or human fingers.

Could you say a little more please. What Kernel-user-space 
APIs/system calls/etc. cause this to happen?

>> Actually, can you define for me what the root task group is, and 
>> why it exists? That may be worth some words in this man page.
> 
> I don't think we need group scheduling details, there's plenty of
> documentation elsewhere for those who want theory.  

Well, you suggested above 

    Perhaps mention that moving a task back to the root task
    group will result in the autogroup again taking effect.

So, that inevitably would lead me and the reader of the man page
to ask: what's the root task group?

> Autogroup is for
> those who don't want to have to care (which is also why it should have
> never grown a nice knob).

Yes, I understand that much :-).

Cheers,

Michael


-- 
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/


* Re: RFC: documentation of the autogroup feature
  2016-11-23 15:33     ` Mike Galbraith
  2016-11-23 16:04       ` Michael Kerrisk (man-pages)
@ 2016-11-23 16:05       ` Michael Kerrisk (man-pages)
  2016-11-23 17:19         ` Mike Galbraith
  2016-11-27 21:13       ` Michael Kerrisk (man-pages)
  2 siblings, 1 reply; 38+ messages in thread
From: Michael Kerrisk (man-pages) @ 2016-11-23 16:05 UTC (permalink / raw)
  To: Mike Galbraith
  Cc: mtk.manpages, Peter Zijlstra, Ingo Molnar, linux-man, lkml,
	Thomas Gleixner

> I don't think we need group scheduling details, there's plenty of
> documentation elsewhere for those who want theory.  

Actually, which documentation were you referring to here?

Cheers,

Michael


-- 
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/


* Re: RFC: documentation of the autogroup feature
  2016-11-23 16:04       ` Michael Kerrisk (man-pages)
@ 2016-11-23 17:11         ` Mike Galbraith
  2016-11-24 21:41           ` RFC: documentation of the autogroup feature [v2] Michael Kerrisk (man-pages)
  0 siblings, 1 reply; 38+ messages in thread
From: Mike Galbraith @ 2016-11-23 17:11 UTC (permalink / raw)
  To: Michael Kerrisk (man-pages)
  Cc: Peter Zijlstra, Ingo Molnar, linux-man, lkml, Thomas Gleixner

On Wed, 2016-11-23 at 17:04 +0100, Michael Kerrisk (man-pages) wrote:

> > > In what circumstances does a process get moved back to the root 
> > > task group?
> > 
> > Userspace actions, tool or human fingers.
> 
> Could you say a little more please. What Kernel-user-space 
> APIs/system calls/etc. cause this to happen?

Well, the system call would be write(), scribbling in the cgroups VFS
interface... not all that helpful without ever more technical detail.

> > > Actually, can you define for me what the root task group is, and 
> > > why it exists? That may be worth some words in this man page.
> > 
> > I don't think we need group scheduling details, there's plenty of
> > documentation elsewhere for those who want theory.  
> 
> Well, you suggested above 
> 
>     Perhaps mention that moving a task back to the root task
>     group will result in the autogroup again taking effect.

Dang, Evolution doesn't have an unsend button :)

	-Mike


* Re: RFC: documentation of the autogroup feature
  2016-11-23 16:05       ` RFC: documentation of the autogroup feature Michael Kerrisk (man-pages)
@ 2016-11-23 17:19         ` Mike Galbraith
  2016-11-23 22:12           ` Michael Kerrisk (man-pages)
  0 siblings, 1 reply; 38+ messages in thread
From: Mike Galbraith @ 2016-11-23 17:19 UTC (permalink / raw)
  To: Michael Kerrisk (man-pages)
  Cc: Peter Zijlstra, Ingo Molnar, linux-man, lkml, Thomas Gleixner

On Wed, 2016-11-23 at 17:05 +0100, Michael Kerrisk (man-pages) wrote:
> > I don't think we need group scheduling details, there's plenty of
> > documentation elsewhere for those who want theory.  
> 
> Actually, which documentation were you referring to here?

Documentation/scheduler/*


* Re: RFC: documentation of the autogroup feature
  2016-11-23 17:19         ` Mike Galbraith
@ 2016-11-23 22:12           ` Michael Kerrisk (man-pages)
  0 siblings, 0 replies; 38+ messages in thread
From: Michael Kerrisk (man-pages) @ 2016-11-23 22:12 UTC (permalink / raw)
  To: Mike Galbraith
  Cc: mtk.manpages, Peter Zijlstra, Ingo Molnar, linux-man, lkml,
	Thomas Gleixner

On 11/23/2016 06:19 PM, Mike Galbraith wrote:
> On Wed, 2016-11-23 at 17:05 +0100, Michael Kerrisk (man-pages) wrote:
>>> I don't think we need group scheduling details, there's plenty of
>>> documentation elsewhere for those who want theory.  
>>
>> Actually, which documentation were you referring to here?
> 
> Documentation/scheduler/*

I think there's a lot less information in there than you think...
Certainly, I can't get any big picture from reading those docs.

Cheers

Michael


-- 
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/


* [tip:sched/urgent] sched/autogroup: Fix 64-bit kernel nice level adjustment
  2016-11-23 10:33 ` [patch] sched/autogroup: Fix 64bit kernel nice adjustment Mike Galbraith
  2016-11-23 13:47   ` Michael Kerrisk (man-pages)
@ 2016-11-24  6:24   ` tip-bot for Mike Galbraith
  1 sibling, 0 replies; 38+ messages in thread
From: tip-bot for Mike Galbraith @ 2016-11-24  6:24 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: efault, linux-kernel, peterz, mtk.manpages, umgwanakikbuti,
	torvalds, tglx, a.p.zijlstra, hpa, linux-man, mingo

Commit-ID:  83929cce95251cc77e5659bf493bd424ae0e7a67
Gitweb:     http://git.kernel.org/tip/83929cce95251cc77e5659bf493bd424ae0e7a67
Author:     Mike Galbraith <efault@gmx.de>
AuthorDate: Wed, 23 Nov 2016 11:33:37 +0100
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Thu, 24 Nov 2016 05:45:02 +0100

sched/autogroup: Fix 64-bit kernel nice level adjustment

Michael Kerrisk reported:

> Regarding the previous paragraph...  My tests indicate
> that writing *any* value to the autogroup [nice priority level]
> file causes the task group to get a lower priority.

Because autogroup didn't call the then meaningless scale_load()...

Autogroup nice level adjustment has been broken ever since load
resolution was increased for 64-bit kernels.  Use scale_load() to
scale group weight.

Michael Kerrisk tested this patch to fix the problem:

> Applied and tested against 4.9-rc6 on an Intel i7 (4 cores).
> Test setup:
>
> Terminal window 1: running 40 CPU burner jobs
> Terminal window 2: running 40 CPU burner jobs
> Terminal window 3: running  1 CPU burner job
>
> Demonstrated that:
> * Writing "0" to the autogroup file for TW1 now causes no change
>   to the rate at which the processes on the terminals consume CPU.
> * Writing -20 to the autogroup file for TW1 caused those processes
>   to get the lion's share of CPU while TW2 and TW3 get a tiny amount.
> * Writing -20 to the autogroup files for TW1 and TW3 allowed the
>   process on TW3 to get as much CPU as it was getting when
>   the autogroup nice values for both terminals were 0.

Reported-by: Michael Kerrisk <mtk.manpages@gmail.com>
Tested-by: Michael Kerrisk <mtk.manpages@gmail.com>
Signed-off-by: Mike Galbraith <umgwanakikbuti@gmail.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-man <linux-man@vger.kernel.org>
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/1479897217.4306.6.camel@gmx.de
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 kernel/sched/auto_group.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/kernel/sched/auto_group.c b/kernel/sched/auto_group.c
index f1c8fd5..da39489 100644
--- a/kernel/sched/auto_group.c
+++ b/kernel/sched/auto_group.c
@@ -212,6 +212,7 @@ int proc_sched_autogroup_set_nice(struct task_struct *p, int nice)
 {
 	static unsigned long next = INITIAL_JIFFIES;
 	struct autogroup *ag;
+	unsigned long shares;
 	int err;
 
 	if (nice < MIN_NICE || nice > MAX_NICE)
@@ -230,9 +231,10 @@ int proc_sched_autogroup_set_nice(struct task_struct *p, int nice)
 
 	next = HZ / 10 + jiffies;
 	ag = autogroup_task_get(p);
+	shares = scale_load(sched_prio_to_weight[nice + 20]);
 
 	down_write(&ag->lock);
-	err = sched_group_set_shares(ag->tg, sched_prio_to_weight[nice + 20]);
+	err = sched_group_set_shares(ag->tg, shares);
 	if (!err)
 		ag->nice = nice;
 	up_write(&ag->lock);


* RFC: documentation of the autogroup feature [v2]
  2016-11-23 17:11         ` Mike Galbraith
@ 2016-11-24 21:41           ` Michael Kerrisk (man-pages)
  2016-11-25 12:52             ` Afzal Mohammed
  2016-11-25 13:02             ` Mike Galbraith
  0 siblings, 2 replies; 38+ messages in thread
From: Michael Kerrisk (man-pages) @ 2016-11-24 21:41 UTC (permalink / raw)
  To: Mike Galbraith
  Cc: mtk.manpages, Peter Zijlstra, Ingo Molnar, linux-man, lkml,
	Thomas Gleixner

Hi Mike,

I reworked the text on autogroups, and in the process learned
something/have another question. Could you tell me if anything 
in the below needs fixing/improving, and also let me know about
the FIXME?

Thanks,

Michael

   The autogroup feature
       Since Linux 2.6.38, the kernel  provides  a  feature  known  as
       autogrouping  to improve interactive desktop performance in the
       face of multiprocess, CPU-intensive workloads such as  building
       the Linux kernel with large numbers of parallel build processes
       (i.e., the make(1) -j flag).

       This feature operates in conjunction with the CFS scheduler and
       requires  a  kernel  that is configured with CONFIG_SCHED_AUTO‐
       GROUP.  On a running system, this feature is  enabled  or  dis‐
       abled  via the file /proc/sys/kernel/sched_autogroup_enabled; a
       value of 0 disables the feature, while a value of 1 enables it.
       The  default  value  in  this  file is 1, unless the kernel was
       booted with the noautogroup parameter.

       A new autogroup is created when a new session is created via
       setsid(2); this happens, for example, when a new terminal
       window is started.  A new process created by fork(2)
       inherits  its  parent's autogroup membership.  Thus, all of the
       processes in a session are members of the same  autogroup.   An
       autogroup  is  automatically destroyed when the last process in
       the group terminates.

       When autogrouping is enabled, all of the members  of  an  auto‐
       group  are  placed  in  the same kernel scheduler "task group".
       The CFS scheduler employs an algorithm that equalizes the  dis‐
       tribution  of  CPU  cycles across task groups.  The benefits of
       this for interactive desktop performance can be  described  via
       the following example.

       Suppose  that  there  are two autogroups competing for the same
       CPU.  The first group contains ten CPU-bound processes  from  a
       kernel build started with make -j10.  The other contains a sin‐
       gle CPU-bound process: a video player.   The  effect  of  auto‐
       grouping  is  that the two groups will each receive half of the
       CPU cycles.  That is, the video player will receive 50% of  the
       CPU  cycles,  rather  just 9% of the cycles, which would likely
       lead to degraded video playback.  Or to put things another way:
       an  autogroup  that  contains  a large number of CPU-bound pro‐
       cesses does not end up overwhelming the CPU at the  expense  of
       the other jobs on the system.

       A process's autogroup (task group) membership can be viewed via
       the file /proc/[pid]/autogroup:

           $ cat /proc/1/autogroup
           /autogroup-1 nice 0

       This file can also be used to modify the  CPU  bandwidth  allo‐
       cated to an autogroup.  This is done by writing a number in the
       "nice" range to the file to set  the  autogroup's  nice  value.
       The  allowed range is from +19 (low priority) to -20 (high pri‐
       ority), and the setting has the same effect as modifying the
       nice level via setpriority(2).  (For a discussion of the nice
       value, see getpriority(2).)


       ┌─────────────────────────────────────────────────────┐
       │FIXME                                                │
       ├─────────────────────────────────────────────────────┤
       │How do the nice value of  a  process  and  the  nice │
       │value of an autogroup interact? Which has priority?  │
       │                                                     │
       │It  *appears*  that the autogroup nice value is used │
       │for CPU distribution between task groups,  and  that │
       │the  process nice value has no effect there.  (I.e., │
       │suppose two  autogroups  each  contain  a  CPU-bound │
       │process,  with  one  process  having nice==0 and the │
       │other having nice==19.  It appears  that  they  each │
       │get  50%  of  the CPU.)  It appears that the process │
       │nice value has effect only with respect to  schedul‐ │
       │ing  relative to other processes in the *same* auto‐ │
       │group.  Is this correct?                             │
       └─────────────────────────────────────────────────────┘

       The use of the cgroups(7) CPU controller overrides  the  effect
       of autogrouping.

       The autogroup feature does not group processes that are sched‐
       uled under the real-time and deadline policies.  Those  processes
       are scheduled according to the rules described earlier.


-- 
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/


* Re: RFC: documentation of the autogroup feature [v2]
  2016-11-24 21:41           ` RFC: documentation of the autogroup feature [v2] Michael Kerrisk (man-pages)
@ 2016-11-25 12:52             ` Afzal Mohammed
  2016-11-25 13:04               ` Michael Kerrisk (man-pages)
  2016-11-25 13:02             ` Mike Galbraith
  1 sibling, 1 reply; 38+ messages in thread
From: Afzal Mohammed @ 2016-11-25 12:52 UTC (permalink / raw)
  To: Michael Kerrisk (man-pages)
  Cc: Mike Galbraith, Peter Zijlstra, Ingo Molnar, linux-man, lkml,
	Thomas Gleixner

Hi,

On Thu, Nov 24, 2016 at 10:41:29PM +0100, Michael Kerrisk (man-pages) wrote:

>        Suppose  that  there  are two autogroups competing for the same
>        CPU.  The first group contains ten CPU-bound processes  from  a
>        kernel build started with make -j10.  The other contains a sin‐
>        gle CPU-bound process: a video player.   The  effect  of  auto‐
>        grouping  is  that the two groups will each receive half of the
>        CPU cycles.  That is, the video player will receive 50% of  the
>        CPU  cycles,  rather  just 9% of the cycles, which would likely
                            ^^^^
                            than ?

Regards
afzal

>        lead to degraded video playback.  Or to put things another way:
>        an  autogroup  that  contains  a large number of CPU-bound pro‐
>        cesses does not end up overwhelming the CPU at the  expense  of
>        the other jobs on the system.


* Re: RFC: documentation of the autogroup feature [v2]
  2016-11-24 21:41           ` RFC: documentation of the autogroup feature [v2] Michael Kerrisk (man-pages)
  2016-11-25 12:52             ` Afzal Mohammed
@ 2016-11-25 13:02             ` Mike Galbraith
  2016-11-25 15:04               ` Michael Kerrisk (man-pages)
  1 sibling, 1 reply; 38+ messages in thread
From: Mike Galbraith @ 2016-11-25 13:02 UTC (permalink / raw)
  To: Michael Kerrisk (man-pages)
  Cc: Peter Zijlstra, Ingo Molnar, linux-man, lkml, Thomas Gleixner

On Thu, 2016-11-24 at 22:41 +0100, Michael Kerrisk (man-pages) wrote:

>        Suppose  that  there  are two autogroups competing for the same
>        CPU.  The first group contains ten CPU-bound processes  from  a
>        kernel build started with make -j10.  The other contains a sin‐
>        gle CPU-bound process: a video player.   The  effect  of  auto‐
>        grouping  is  that the two groups will each receive half of the
>        CPU cycles.  That is, the video player will receive 50% of  the
>        CPU  cycles,  rather  just 9% of the cycles, which would likely
>        lead to degraded video playback.  Or to put things another way:
>        an  autogroup  that  contains  a large number of CPU-bound pro‐
>        cesses does not end up overwhelming the CPU at the  expense  of
>        the other jobs on the system.

I'd say something more wishy-washy here, like cycles are distributed
fairly across groups and leave it at that, as your detailed example is
incorrect due to SMP fairness (which I don't like much because [very
unlikely] worst case scenario renders a box sized group incapable of
utilizing more than a single CPU total).  For example, if a group of
NR_CPUS size competes with a singleton, load balancing will try to give
the singleton a full CPU of its very own.  If groups intersect for
whatever reason on say my quad lappy, distribution is 80/20 in favor of
the singleton.

>        ┌─────────────────────────────────────────────────────┐
>        │FIXME                                                │
>        ├─────────────────────────────────────────────────────┤
>        │How do the nice value of  a  process  and  the  nice │
>        │value of an autogroup interact? Which has priority?  │
>        │                                                     │
>        │It  *appears*  that the autogroup nice value is used │
>        │for CPU distribution between task groups,  and  that │
>        │the  process nice value has no effect there.  (I.e., │
>        │suppose two  autogroups  each  contain  a  CPU-bound │
>        │process,  with  one  process  having nice==0 and the │
>        │other having nice==19.  It appears  that  they  each │
>        │get  50%  of  the CPU.)  It appears that the process │
>        │nice value has effect only with respect to  schedul‐ │
>        │ing  relative to other processes in the *same* auto‐ │
>        │group.  Is this correct?                             │
>        └─────────────────────────────────────────────────────┘

Yup, entity nice level affects distribution among peer entities.

	-Mike


* Re: RFC: documentation of the autogroup feature [v2]
  2016-11-25 12:52             ` Afzal Mohammed
@ 2016-11-25 13:04               ` Michael Kerrisk (man-pages)
  0 siblings, 0 replies; 38+ messages in thread
From: Michael Kerrisk (man-pages) @ 2016-11-25 13:04 UTC (permalink / raw)
  To: Afzal Mohammed
  Cc: mtk.manpages, Mike Galbraith, Peter Zijlstra, Ingo Molnar,
	linux-man, lkml, Thomas Gleixner

On 11/25/2016 01:52 PM, Afzal Mohammed wrote:
> Hi,
> 
> On Thu, Nov 24, 2016 at 10:41:29PM +0100, Michael Kerrisk (man-pages) wrote:
> 
>>        Suppose  that  there  are two autogroups competing for the same
>>        CPU.  The first group contains ten CPU-bound processes  from  a
>>        kernel build started with make -j10.  The other contains a sin‐
>>        gle CPU-bound process: a video player.   The  effect  of  auto‐
>>        grouping  is  that the two groups will each receive half of the
>>        CPU cycles.  That is, the video player will receive 50% of  the
>>        CPU  cycles,  rather  just 9% of the cycles, which would likely
>                             ^^^^
>                             than ?
> 
> Regards
> afzal

Thanks, Afzal. Fixed!

Cheers,

Michael

> 
>>        lead to degraded video playback.  Or to put things another way:
>>        an  autogroup  that  contains  a large number of CPU-bound pro‐
>>        cesses does not end up overwhelming the CPU at the  expense  of
>>        the other jobs on the system.
> 


-- 
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/


* Re: RFC: documentation of the autogroup feature [v2]
  2016-11-25 13:02             ` Mike Galbraith
@ 2016-11-25 15:04               ` Michael Kerrisk (man-pages)
  2016-11-25 15:48                 ` Michael Kerrisk (man-pages)
                                   ` (2 more replies)
  0 siblings, 3 replies; 38+ messages in thread
From: Michael Kerrisk (man-pages) @ 2016-11-25 15:04 UTC (permalink / raw)
  To: Mike Galbraith
  Cc: mtk.manpages, Peter Zijlstra, Ingo Molnar, linux-man, lkml,
	Thomas Gleixner

Hi Mike,

On 11/25/2016 02:02 PM, Mike Galbraith wrote:
> On Thu, 2016-11-24 at 22:41 +0100, Michael Kerrisk (man-pages) wrote:
> 
>>        Suppose  that  there  are two autogroups competing for the same
>>        CPU.  The first group contains ten CPU-bound processes  from  a
>>        kernel build started with make -j10.  The other contains a sin‐
>>        gle CPU-bound process: a video player.   The  effect  of  auto‐
>>        grouping  is  that the two groups will each receive half of the
>>        CPU cycles.  That is, the video player will receive 50% of  the
>>        CPU  cycles,  rather  just 9% of the cycles, which would likely
>>        lead to degraded video playback.  Or to put things another way:
>>        an  autogroup  that  contains  a large number of CPU-bound pro‐
>>        cesses does not end up overwhelming the CPU at the  expense  of
>>        the other jobs on the system.
> 
> I'd say something more wishy-washy here, like cycles are distributed
> fairly across groups and leave it at that, 

I see where you want to go, but the problem is that the word "fair"
will invoke different interpretations for different people. So, I
think one does need to be a little more concrete.

> as your detailed example is
> incorrect due to SMP fairness 

Well, I was trying to exclude SMP from the discussion by saying
"competing for the same CPU". Here I was meaning that we involve
taskset(1) to confine everyone to the same CPU. Then, I think
my example is correct. (I did some light testing before writing
that text.) But I guess my meaning wasn't clear enough, and
it is a slightly contrived scenario anyway. I'll add some words
to clarify my example, and also add something to say that the
situation is more complex on an SMP system. Something like
the following:

       Suppose that there are two autogroups competing for the  same  CPU
       (i.e., presume either a single CPU system or the use of taskset(1)
       to confine all the processes to the same CPU on  an  SMP  system).
       The  first  group  contains  ten CPU-bound processes from a kernel
       build started with make -j10.  The other contains  a  single  CPU-
       bound process: a video player.  The effect of autogrouping is that
       the two groups will each receive half of the CPU cycles.  That is,
       the  video  player will receive 50% of the CPU cycles, rather than
       just 9% of the cycles, which would likely lead to  degraded  video
       playback.  The situation on an SMP system is more complex, but the
       general effect is the same: the scheduler distributes  CPU  cycles
       across  task  groups  such that an autogroup that contains a large
       number of CPU-bound processes does not end up hogging  CPU  cycles
       at the expense of the other jobs on the system.

> (which I don't like much because [very
> unlikely] worst case scenario renders a box sized group incapable of
> utilizing more that a single CPU total).  For example, if a group of
> NR_CPUS size competes with a singleton, load balancing will try to give
> the singleton a full CPU of its very own.  If groups intersect for
> whatever reason on say my quad lappy, distribution is 80/20 in favor of
> the singleton.

Thanks for the additional info. Good for educating me, but I think
you'll agree it's more than we need for the man page.

>>        ┌─────────────────────────────────────────────────────┐
>>        │FIXME                                                │
>>        ├─────────────────────────────────────────────────────┤
>>        │How do the nice value of  a  process  and  the  nice │
>>        │value of an autogroup interact? Which has priority?  │
>>        │                                                     │
>>        │It  *appears*  that the autogroup nice value is used │
>>        │for CPU distribution between task groups,  and  that │
>>        │the  process nice value has no effect there.  (I.e., │
>>        │suppose two  autogroups  each  contain  a  CPU-bound │
>>        │process,  with  one  process  having nice==0 and the │
>>        │other having nice==19.  It appears  that  they  each │
>>        │get  50%  of  the CPU.)  It appears that the process │
>>        │nice value has effect only with respect to  schedul‐ │
>>        │ing  relative to other processes in the *same* auto‐ │
>>        │group.  Is this correct?                             │
>>        └─────────────────────────────────────────────────────┘
> 
> Yup, entity nice level affects distribution among peer entities.

Huh! I only just learned about this via my experiments while
investigating autogroups. 

How long have things been like this? Always? (I don't think
so.) Since the arrival of CFS? Since the arrival of
autogrouping? (I'm guessing not.) Since some other point?
(When?)

It seems to me that this renders the traditional process
nice pretty much useless. (I bet I'm not the only one who'd 
be surprised by the current behavior.)

Cheers,

Michael


-- 
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/


* Re: RFC: documentation of the autogroup feature [v2]
  2016-11-25 15:04               ` Michael Kerrisk (man-pages)
@ 2016-11-25 15:48                 ` Michael Kerrisk (man-pages)
  2016-11-25 15:51                 ` Mike Galbraith
  2016-11-25 16:04                 ` Peter Zijlstra
  2 siblings, 0 replies; 38+ messages in thread
From: Michael Kerrisk (man-pages) @ 2016-11-25 15:48 UTC (permalink / raw)
  To: Mike Galbraith
  Cc: mtk.manpages, Peter Zijlstra, Ingo Molnar, linux-man, lkml,
	Thomas Gleixner

On 11/25/2016 04:04 PM, Michael Kerrisk (man-pages) wrote:
> Hi Mike,
> 
> On 11/25/2016 02:02 PM, Mike Galbraith wrote:
>>>        ┌─────────────────────────────────────────────────────┐
>>>        │FIXME                                                │
>>>        ├─────────────────────────────────────────────────────┤
>>>        │How do the nice value of  a  process  and  the  nice │
>>>        │value of an autogroup interact? Which has priority?  │
>>>        │                                                     │
>>>        │It  *appears*  that the autogroup nice value is used │
>>>        │for CPU distribution between task groups,  and  that │
>>>        │the  process nice value has no effect there.  (I.e., │
>>>        │suppose two  autogroups  each  contain  a  CPU-bound │
>>>        │process,  with  one  process  having nice==0 and the │
>>>        │other having nice==19.  It appears  that  they  each │
>>>        │get  50%  of  the CPU.)  It appears that the process │
>>>        │nice value has effect only with respect to  schedul‐ │
>>>        │ing  relative to other processes in the *same* auto‐ │
>>>        │group.  Is this correct?                             │
>>>        └─────────────────────────────────────────────────────┘
>>
>> Yup, entity nice level affects distribution among peer entities.
> 
> Huh! I only just learned about this via my experiments while
> investigating autogroups. 
> 
> How long have things been like this? Always? (I don't think
> so.) Since the arrival of CFS? Since the arrival of
> autogrouping? (I'm guessing not.) Since some other point?
> (When?)

Okay, things changed sometime after 2.6.31, at least.
(Just tested on an old box.) So, presumably with the arrival
of either CFS or autogrouping? Next comment certainly applies:

> It seems to me that this renders the traditional process
> nice pretty much useless. (I bet I'm not the only one who'd 
> be surprised by the current behavior.)

Cheers,

Michael


-- 
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/


* Re: RFC: documentation of the autogroup feature [v2]
  2016-11-25 15:04               ` Michael Kerrisk (man-pages)
  2016-11-25 15:48                 ` Michael Kerrisk (man-pages)
@ 2016-11-25 15:51                 ` Mike Galbraith
  2016-11-25 16:08                   ` Michael Kerrisk (man-pages)
  2016-11-25 16:04                 ` Peter Zijlstra
  2 siblings, 1 reply; 38+ messages in thread
From: Mike Galbraith @ 2016-11-25 15:51 UTC (permalink / raw)
  To: Michael Kerrisk (man-pages)
  Cc: Peter Zijlstra, Ingo Molnar, linux-man, lkml, Thomas Gleixner

On Fri, 2016-11-25 at 16:04 +0100, Michael Kerrisk (man-pages) wrote:

> > >        ┌─────────────────────────────────────────────────────┐
> > >        │FIXME                                                │
> > >        ├─────────────────────────────────────────────────────┤
> > >        │How do the nice value of  a  process  and  the  nice │
> > >        │value of an autogroup interact? Which has priority?  │
> > >        │                                                     │
> > >        │It  *appears*  that the autogroup nice value is used │
> > >        │for CPU distribution between task groups,  and  that │
> > >        │the  process nice value has no effect there.  (I.e., │
> > >        │suppose two  autogroups  each  contain  a  CPU-bound │
> > >        │process,  with  one  process  having nice==0 and the │
> > >        │other having nice==19.  It appears  that  they  each │
> > >        │get  50%  of  the CPU.)  It appears that the process │
> > >        │nice value has effect only with respect to  schedul‐ │
> > >        │ing  relative to other processes in the *same* auto‐ │
> > >        │group.  Is this correct?                             │
> > >        └─────────────────────────────────────────────────────┘
> > 
> > Yup, entity nice level affects distribution among peer entities.
> 
> Huh! I only just learned about this via my experiments while
> investigating autogroups. 
> 
> How long have things been like this? Always? (I don't think
> so.) Since the arrival of CFS? Since the arrival of
> autogrouping? (I'm guessing not.) Since some other point?
> (When?)

Always.  Before CFS there just were no non-peers :)

> It seems to me that this renders the traditional process
> nice pretty much useless. (I bet I'm not the only one who'd 
> be surprised by the current behavior.)

Yup, group scheduling is not a single-edged sword; those don't exist.
Box-wide nice loss is not the only thing that can bite you: fairness,
whether group- or task-oriented, cuts both ways.

	-Mike


* Re: RFC: documentation of the autogroup feature [v2]
  2016-11-25 15:04               ` Michael Kerrisk (man-pages)
  2016-11-25 15:48                 ` Michael Kerrisk (man-pages)
  2016-11-25 15:51                 ` Mike Galbraith
@ 2016-11-25 16:04                 ` Peter Zijlstra
  2016-11-25 16:13                   ` Peter Zijlstra
  2016-11-25 16:33                   ` Michael Kerrisk (man-pages)
  2 siblings, 2 replies; 38+ messages in thread
From: Peter Zijlstra @ 2016-11-25 16:04 UTC (permalink / raw)
  To: Michael Kerrisk (man-pages)
  Cc: Mike Galbraith, Ingo Molnar, linux-man, lkml, Thomas Gleixner

On Fri, Nov 25, 2016 at 04:04:25PM +0100, Michael Kerrisk (man-pages) wrote:
> >>        ┌─────────────────────────────────────────────────────┐
> >>        │FIXME                                                │
> >>        ├─────────────────────────────────────────────────────┤
> >>        │How do the nice value of  a  process  and  the  nice │
> >>        │value of an autogroup interact? Which has priority?  │
> >>        │                                                     │
> >>        │It  *appears*  that the autogroup nice value is used │
> >>        │for CPU distribution between task groups,  and  that │
> >>        │the  process nice value has no effect there.  (I.e., │
> >>        │suppose two  autogroups  each  contain  a  CPU-bound │
> >>        │process,  with  one  process  having nice==0 and the │
> >>        │other having nice==19.  It appears  that  they  each │
> >>        │get  50%  of  the CPU.)  It appears that the process │
> >>        │nice value has effect only with respect to  schedul‐ │
> >>        │ing  relative to other processes in the *same* auto‐ │
> >>        │group.  Is this correct?                             │
> >>        └─────────────────────────────────────────────────────┘
> > 
> > Yup, entity nice level affects distribution among peer entities.
> 
> Huh! I only just learned about this via my experiments while
> investigating autogroups. 
> 
> How long have things been like this? Always? (I don't think
> so.) Since the arrival of CFS? Since the arrival of
> autogrouping? (I'm guessing not.) Since some other point?
> (When?)

Ever since cfs-cgroup, this is a fundamental design point of cgroups,
and has therefore always been the case for autogroups (as that is
nothing more than an application of the cgroup code).

> It seems to me that this renders the traditional process
> nice pretty much useless. (I bet I'm not the only one who'd 
> be surprised by the current behavior.)

It's really rather fundamental to how the whole hierarchical thing
works.

CFS is a weighted fair queueing scheduler; this means each entity
receives:

               w_i
  dt_i = dt --------
	    \Sum w_j


		CPU
	  ______/ \______
	 /    |     |	 \
        A     B     C     D


So if each entity {A,B,C,D} has equal weight, then they will receive
equal time. Explicitly, for C you get:


                      w_C
  dt_C = dt -----------------------
            (w_A + w_B + w_C + w_D)


Extending this to a hierarchy, we get:


		CPU
	  ______/ \______
	 /    |     |	 \
        A     B     C     D
	           / \
		  E   F

Where C becomes a 'server' for entities {E,F}. The weight of C does not
depend on its child entities. This way the time of {E,F} becomes a
straight product of their ratio with C. That is; the whole thing
becomes, where l denotes the level in the hierarchy and i an
entity on that level:

                 l      w_g,i
  dt_l,i = dt \Prod  ----------
                g=0  \Sum w_g,j


Or more concretely, for E:

                      w_E
  dt_1,E = dt_0,C -----------
                  (w_E + w_F)

                        w_C               w_E
         = dt ----------------------- -----------
              (w_A + w_B + w_C + w_D) (w_E + w_F)


And this 'trivially' extends to SMP, with the tricky bit being that the
sums over all entities end up being machine wide, instead of per CPU,
which is a real and royal pain for performance.


Note that this property, where the weight of the server entity is
independent of its child entities, is a desired feature. Without that
it would be impossible to control the relative weights of groups, and
that is the sole parameter of the WFQ model.

It is also why Linus so likes autogroups, each session competes equally
amongst one another.
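[Editor's note: the hierarchical formula above can be checked numerically.
A minimal sketch in plain Python (not kernel code; the tree layout and the
1024 default weight are illustrative):]

```python
# Hierarchical weighted fair queueing: a node's share of CPU time is its
# parent's share multiplied by w_i / Sum(w_j) over its siblings, exactly
# as in the dt_l,i product formula above.

def shares(entities, parent_share=1.0):
    """entities: list of (name, weight, children); returns {leaf: share}."""
    total = sum(w for _, w, _ in entities)
    out = {}
    for name, w, children in entities:
        s = parent_share * w / total
        if children:
            out.update(shares(children, s))   # group ('server') entity: recurse
        else:
            out[name] = s                     # leaf entity (a task)
    return out

tree = [
    ("A", 1024, []), ("B", 1024, []),
    ("C", 1024, [("E", 1024, []), ("F", 1024, [])]),  # C 'serves' E and F
    ("D", 1024, []),
]
print(shares(tree))
# A, B, D each get 1/4; E and F split C's 1/4 into 1/8 each -- note that
# C's weight, and hence its 1/4 share, is independent of its children.
```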

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: RFC: documentation of the autogroup feature [v2]
  2016-11-25 15:51                 ` Mike Galbraith
@ 2016-11-25 16:08                   ` Michael Kerrisk (man-pages)
  2016-11-25 16:18                     ` Peter Zijlstra
  0 siblings, 1 reply; 38+ messages in thread
From: Michael Kerrisk (man-pages) @ 2016-11-25 16:08 UTC (permalink / raw)
  To: Mike Galbraith
  Cc: mtk.manpages, Peter Zijlstra, Ingo Molnar, linux-man, lkml,
	Thomas Gleixner

On 11/25/2016 04:51 PM, Mike Galbraith wrote:
> On Fri, 2016-11-25 at 16:04 +0100, Michael Kerrisk (man-pages) wrote:
> 
>>>>        ┌─────────────────────────────────────────────────────┐
>>>>        │FIXME                                                │
>>>>        ├─────────────────────────────────────────────────────┤
>>>>        │How do the nice value of  a  process  and  the  nice │
>>>>        │value of an autogroup interact? Which has priority?  │
>>>>        │                                                     │
>>>>        │It  *appears*  that the autogroup nice value is used │
>>>>        │for CPU distribution between task groups,  and  that │
>>>>        │the  process nice value has no effect there.  (I.e., │
>>>>        │suppose two  autogroups  each  contain  a  CPU-bound │
>>>>        │process,  with  one  process  having nice==0 and the │
>>>>        │other having nice==19.  It appears  that  they  each │
>>>>        │get  50%  of  the CPU.)  It appears that the process │
>>>>        │nice value has effect only with respect to  schedul‐ │
>>>>        │ing  relative to other processes in the *same* auto‐ │
>>>>        │group.  Is this correct?                             │
>>>>        └─────────────────────────────────────────────────────┘
>>>
>>> Yup, entity nice level affects distribution among peer entities.
>>
>> Huh! I only just learned about this via my experiments while
>> investigating autogroups. 
>>
>> How long have things been like this? Always? (I don't think
>> so.) Since the arrival of CFS? Since the arrival of
>> autogrouping? (I'm guessing not.) Since some other point?
>> (When?)
> 
> Always.  Before CFS there just were no non-peers :)

Well that's one way of looking at it. So, the change 
that I'm talking about came in 2.6.23 with CFS then?

>> It seems to me that this renders the traditional process
>> nice pretty much useless. (I bet I'm not the only one who'd 
>> be surprised by the current behavior.)
> 
> Yup, group scheduling is not a single edged sword, those don't exist. 
>  Box wide nice loss is not the only thing that can bite you, fairness,
> whether group or task oriented cuts both ways.

Understood. But again I'll say, I bet a lot of old-time users
(and maybe many newer) would be surprised by the fact that 
nice(1) / setpriority(2) have effectively been rendered no-ops
in many use cases. At the very least, it'd have been nice
if someone had sent a man pages patch or at least a note...

Cheers,

Michael



-- 
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: RFC: documentation of the autogroup feature [v2]
  2016-11-25 16:04                 ` Peter Zijlstra
@ 2016-11-25 16:13                   ` Peter Zijlstra
  2016-11-25 16:33                   ` Michael Kerrisk (man-pages)
  1 sibling, 0 replies; 38+ messages in thread
From: Peter Zijlstra @ 2016-11-25 16:13 UTC (permalink / raw)
  To: Michael Kerrisk (man-pages)
  Cc: Mike Galbraith, Ingo Molnar, linux-man, lkml, Thomas Gleixner

On Fri, Nov 25, 2016 at 05:04:56PM +0100, Peter Zijlstra wrote:
> That is; the whole thing
> becomes, where l denotes the level in the hierarchy and i an
> entity on that level:
> 
>                  l      w_g,i
>   dt_l,i = dt \Prod  ----------
>                 g=0  \Sum w_g,j
> 
> 
> Or more concretely, for E:
> 
>                       w_E
>   dt_1,E = dt_0,C -----------
>                   (w_E + w_F)
> 
>                         w_C               w_E
>          = dt ----------------------- -----------
>               (w_A + w_B + w_C + w_D) (w_E + w_F)
> 

And this also immediately shows one of the 'problems' with it. Since we
don't have floating point in the kernel, these fractions are evaluated
with fixed-point arithmetic. Traditionally (and on 32-bit) we used 10-bit
fixed point; recently we switched to 20-bit for 64-bit machines.

That change is what bit you on the nice testing.

But it also means that once we run out of fractional bits things go
wobbly. The fractions, as per the above, increase the deeper the group
hierarchy goes but are also affected by the number of CPUs in the system
(not immediately represented in that equation).

Not to mention that many scheduler operations become O(depth) in cost,
which also hurts. An obvious example is task selection: we pick a
runnable entity at each level until the resulting entity has no
further children (in other words, is a task).
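[Editor's note: the underflow Peter describes is easy to see with a toy
fixed-point calculation; this is plain Python, not the kernel's actual
arithmetic, and the 1/8-per-level fraction is an arbitrary example:]

```python
# Each hierarchy level multiplies the parent's share by a fraction < 1.
# Once the running product drops below one fixed-point unit, it
# truncates to zero and the entity effectively gets no weight at all.

def chained_share(levels, frac_num, frac_den, shift):
    """Share after `levels` multiplications by frac_num/frac_den,
    using fixed-point arithmetic with `shift` fractional bits."""
    one = 1 << shift
    share = one
    frac = one * frac_num // frac_den
    for _ in range(levels):
        share = (share * frac) >> shift
    return share

# Five nested groups, each receiving 1/8 of its parent's share:
print(chained_share(5, 1, 8, 10))   # 10-bit fixed point: 0 (underflow)
print(chained_share(5, 1, 8, 20))   # 20-bit fixed point: 32 (still nonzero)
```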

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: RFC: documentation of the autogroup feature [v2]
  2016-11-25 16:08                   ` Michael Kerrisk (man-pages)
@ 2016-11-25 16:18                     ` Peter Zijlstra
  2016-11-25 16:34                       ` Michael Kerrisk (man-pages)
  0 siblings, 1 reply; 38+ messages in thread
From: Peter Zijlstra @ 2016-11-25 16:18 UTC (permalink / raw)
  To: Michael Kerrisk (man-pages)
  Cc: Mike Galbraith, Ingo Molnar, linux-man, lkml, Thomas Gleixner

On Fri, Nov 25, 2016 at 05:08:44PM +0100, Michael Kerrisk (man-pages) wrote:
> On 11/25/2016 04:51 PM, Mike Galbraith wrote:
> Well that's one way of looking at it. So, the change 
> that I'm talking about came in 2.6.23 with CFS then?

cfs-cgroup landed later I think, and it was fairly wobbly in the first
few releases (as per usual, I'd say, for major features).

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: RFC: documentation of the autogroup feature [v2]
  2016-11-25 16:04                 ` Peter Zijlstra
  2016-11-25 16:13                   ` Peter Zijlstra
@ 2016-11-25 16:33                   ` Michael Kerrisk (man-pages)
  2016-11-25 22:48                     ` Peter Zijlstra
  1 sibling, 1 reply; 38+ messages in thread
From: Michael Kerrisk (man-pages) @ 2016-11-25 16:33 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: mtk.manpages, Mike Galbraith, Ingo Molnar, linux-man, lkml,
	Thomas Gleixner

Hi Peter,

On 11/25/2016 05:04 PM, Peter Zijlstra wrote:
> On Fri, Nov 25, 2016 at 04:04:25PM +0100, Michael Kerrisk (man-pages) wrote:
>>>>        ┌─────────────────────────────────────────────────────┐
>>>>        │FIXME                                                │
>>>>        ├─────────────────────────────────────────────────────┤
>>>>        │How do the nice value of  a  process  and  the  nice │
>>>>        │value of an autogroup interact? Which has priority?  │
>>>>        │                                                     │
>>>>        │It  *appears*  that the autogroup nice value is used │
>>>>        │for CPU distribution between task groups,  and  that │
>>>>        │the  process nice value has no effect there.  (I.e., │
>>>>        │suppose two  autogroups  each  contain  a  CPU-bound │
>>>>        │process,  with  one  process  having nice==0 and the │
>>>>        │other having nice==19.  It appears  that  they  each │
>>>>        │get  50%  of  the CPU.)  It appears that the process │
>>>>        │nice value has effect only with respect to  schedul‐ │
>>>>        │ing  relative to other processes in the *same* auto‐ │
>>>>        │group.  Is this correct?                             │
>>>>        └─────────────────────────────────────────────────────┘
>>>
>>> Yup, entity nice level affects distribution among peer entities.
>>
>> Huh! I only just learned about this via my experiments while
>> investigating autogroups. 
>>
>> How long have things been like this? Always? (I don't think
>> so.) Since the arrival of CFS? Since the arrival of
>> autogrouping? (I'm guessing not.) Since some other point?
>> (When?)
> 
> Ever since cfs-cgroup, 

Okay. That still leaves the question open, though.

> this is a fundamental design point of cgroups,
> and has therefore always been the case for autogroups (as that is
> nothing more than an application of the cgroup code).

Understood. 

>> It seems to me that this renders the traditional process
>> nice pretty much useless. (I bet I'm not the only one who'd 
>> be surprised by the current behavior.)
> 
> It's really rather fundamental to how the whole hierarchical thing
> works.
> 
> CFS is a weighted fair queueing scheduler; this means each entity
> receives:
> 
>                w_i
>   dt_i = dt --------
> 	    \Sum w_j
> 
> 
> 		CPU
> 	  ______/ \______
> 	 /    |     |	 \
>       A     B     C     D
> 
> 
> So if each entity {A,B,C,D} has equal weight, then they will receive
> equal time. Explicitly, for C you get:
> 
> 
>                       w_C
>   dt_C = dt -----------------------
>             (w_A + w_B + w_C + w_D)
> 
> 
> Extending this to a hierarchy, we get:
> 
> 
> 		CPU
> 	  ______/ \______
> 	 /    |     |	 \
>       A     B     C     D
> 	           / \
> 		  E   F
> 
> Where C becomes a 'server' for entities {E,F}. The weight of C does not
> depend on its child entities. This way the time of {E,F} becomes a
> straight product of their ratio with C. That is; the whole thing
> becomes, where l denotes the level in the hierarchy and i an
> entity on that level:
> 
>                  l      w_g,i
>   dt_l,i = dt \Prod  ----------
>                 g=0  \Sum w_g,j
> 
> 
> Or more concretely, for E:
> 
>                       w_E
>   dt_1,E = dt_0,C -----------
>                   (w_E + w_F)
> 
>                         w_C               w_E
>          = dt ----------------------- -----------
>               (w_A + w_B + w_C + w_D) (w_E + w_F)
> 
> 
> And this 'trivially' extends to SMP, with the tricky bit being that the
> sums over all entities end up being machine wide, instead of per CPU,
> which is a real and royal pain for performance.

Okay -- you're really quite the ASCII artist. And somehow,
I think you needed to compose the mail in LaTeX. But thanks
for the detail. It's helpful, for me at least.

> Note that this property, where the weight of the server entity is
> independent from its child entities is a desired feature. Without that
> it would be impossible to control the relative weights of groups, and
> that is the sole parameter of the WFQ model.
> 
> It is also why Linus so likes autogroups, each session competes equally
> amongst one another.

I get it. But the behavior changes for the process nice value are
undocumented, and they should be documented. I understand
what the behavior change was, but not yet when it happened.

Cheers,

Michael

-- 
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: RFC: documentation of the autogroup feature [v2]
  2016-11-25 16:18                     ` Peter Zijlstra
@ 2016-11-25 16:34                       ` Michael Kerrisk (man-pages)
  2016-11-25 20:54                         ` Michael Kerrisk (man-pages)
  0 siblings, 1 reply; 38+ messages in thread
From: Michael Kerrisk (man-pages) @ 2016-11-25 16:34 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: mtk.manpages, Mike Galbraith, Ingo Molnar, linux-man, lkml,
	Thomas Gleixner

On 11/25/2016 05:18 PM, Peter Zijlstra wrote:
> On Fri, Nov 25, 2016 at 05:08:44PM +0100, Michael Kerrisk (man-pages) wrote:
>> On 11/25/2016 04:51 PM, Mike Galbraith wrote:
>> Well that's one way of looking at it. So, the change 
>> that I'm talking about came in 2.6.23 with CFS then?
> 
> cfs-cgroup landed later I think, and it was fairly wobbly in the first
> few release (as per usual I'd say for major features).

So I've been searching git logs and elsewhere, but haven't yet
found a likely commit (or commits). Any clues as to what I should
be looking for? I'd like this info because, while documenting the
changes, I'd also like to document when they occurred.

Cheers,

Michael



-- 
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: RFC: documentation of the autogroup feature [v2]
  2016-11-25 16:34                       ` Michael Kerrisk (man-pages)
@ 2016-11-25 20:54                         ` Michael Kerrisk (man-pages)
  2016-11-25 21:49                           ` Peter Zijlstra
  0 siblings, 1 reply; 38+ messages in thread
From: Michael Kerrisk (man-pages) @ 2016-11-25 20:54 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: mtk.manpages, Mike Galbraith, Ingo Molnar, linux-man, lkml,
	Thomas Gleixner

Hi Peter,

On 11/25/2016 05:34 PM, Michael Kerrisk (man-pages) wrote:
> On 11/25/2016 05:18 PM, Peter Zijlstra wrote:
>> On Fri, Nov 25, 2016 at 05:08:44PM +0100, Michael Kerrisk (man-pages) wrote:
>>> On 11/25/2016 04:51 PM, Mike Galbraith wrote:
>>> Well that's one way of looking at it. So, the change 
>>> that I'm talking about came in 2.6.23 with CFS then?
>>
>> cfs-cgroup landed later I think, and it was fairly wobbly in the first
>> few release (as per usual I'd say for major features).
> 
> So I've been searching git logs and elsewhere, but didn't yet
> find a likely commit(s). Any clues what I should be looking for.
> I'd like this info, because while documenting the changes, I'd
> also like to document when they occurred.

So, part of what I was struggling with was what you meant by cfs-cgroup.
Do you mean the CFS bandwidth control features added in Linux 3.2?

Cheers,

Michael


-- 
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: RFC: documentation of the autogroup feature [v2]
  2016-11-25 20:54                         ` Michael Kerrisk (man-pages)
@ 2016-11-25 21:49                           ` Peter Zijlstra
  2016-11-29  7:43                             ` Michael Kerrisk (man-pages)
  0 siblings, 1 reply; 38+ messages in thread
From: Peter Zijlstra @ 2016-11-25 21:49 UTC (permalink / raw)
  To: Michael Kerrisk (man-pages)
  Cc: Mike Galbraith, Ingo Molnar, linux-man, lkml, Thomas Gleixner

On Fri, Nov 25, 2016 at 09:54:05PM +0100, Michael Kerrisk (man-pages) wrote:
> So, part of what I was struggling with was what you meant by cfs-cgroup.
> Do you mean the CFS bandwidth control features added in Linux 3.2?

Nope, /me digs around for a bit... around here I suppose:

 68318b8e0b61 ("Hook up group scheduler with control groups")

68318b8e0b61 v2.6.24-rc1~151

But I really have no idea what that looked like.

In any case, for autogroup the behaviour has always been this way;
autogroups came quite late.

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: RFC: documentation of the autogroup feature [v2]
  2016-11-25 16:33                   ` Michael Kerrisk (man-pages)
@ 2016-11-25 22:48                     ` Peter Zijlstra
  0 siblings, 0 replies; 38+ messages in thread
From: Peter Zijlstra @ 2016-11-25 22:48 UTC (permalink / raw)
  To: Michael Kerrisk (man-pages)
  Cc: Mike Galbraith, Ingo Molnar, linux-man, lkml, Thomas Gleixner

On Fri, Nov 25, 2016 at 05:33:23PM +0100, Michael Kerrisk (man-pages) wrote:

> Okay -- you're really quite the ASCII artist. And somehow,
> I think you needed to compose the mail in LaTeX. But thanks
> for the detail. It's helpful, for me at least.

Hehe, it's been a while since I did LaTeX, so I'd probably make a mess of
it :-) Glad my ramblings made sense.

> > Note that this property, where the weight of the server entity is
> > independent from its child entities is a desired feature. Without that
> > it would be impossible to control the relative weights of groups, and
> > that is the sole parameter of the WFQ model.
> > 
> > It is also why Linus so likes autogroups, each session competes equally
> > amongst one another.
> 
> I get it. But, the behavior changes for the process nice value are
> undocumented, and they should be documented. I understand
> what the behavior change was. But not yet when.

Well, it's all undocumented -- I suppose you're about to go fix that :-)

But think of it differently, think of the group as a container, then the
behaviour inside the container is exactly as expected.

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: RFC: documentation of the autogroup feature
  2016-11-23 15:33     ` Mike Galbraith
  2016-11-23 16:04       ` Michael Kerrisk (man-pages)
  2016-11-23 16:05       ` RFC: documentation of the autogroup feature Michael Kerrisk (man-pages)
@ 2016-11-27 21:13       ` Michael Kerrisk (man-pages)
  2016-11-28  1:46         ` Mike Galbraith
  2 siblings, 1 reply; 38+ messages in thread
From: Michael Kerrisk (man-pages) @ 2016-11-27 21:13 UTC (permalink / raw)
  To: Mike Galbraith
  Cc: mtk.manpages, Peter Zijlstra, Ingo Molnar, linux-man, lkml,
	Thomas Gleixner

Hi Mike,

On 11/23/2016 04:33 PM, Mike Galbraith wrote:
> On Wed, 2016-11-23 at 14:54 +0100, Michael Kerrisk (man-pages) wrote:
>> Hi Mike,

[...]

>> Actually, can you define for me what the root task group is, and 
>> why it exists? That may be worth some words in this man page.
> 
> I don't think we need group scheduling details, there's plenty of
> documentation elsewhere for those who want theory.  Autogroup is for
> those who don't want to have to care (which is also why it should have
> never grown nice knob).

Actually, the more I think about this, the more I think we *do*
need a few details on group scheduling. Otherwise, it's difficult
to explain to the user why nice(1) no longer works as traditionally
expected.

Here's my attempt to define the root task group:

       *  If autogrouping is disabled, then all processes in the root CPU
          cgroup form a scheduling group (sometimes called the "root task
          group").

Can you improve on this?

Cheers,

Michael

-- 
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: RFC: documentation of the autogroup feature
  2016-11-27 21:13       ` Michael Kerrisk (man-pages)
@ 2016-11-28  1:46         ` Mike Galbraith
       [not found]           ` <1127218a-dd9b-71a8-845d-3a83969632fc@gmail.com>
  0 siblings, 1 reply; 38+ messages in thread
From: Mike Galbraith @ 2016-11-28  1:46 UTC (permalink / raw)
  To: Michael Kerrisk (man-pages)
  Cc: Peter Zijlstra, Ingo Molnar, linux-man, lkml, Thomas Gleixner

On Sun, 2016-11-27 at 22:13 +0100, Michael Kerrisk (man-pages) wrote:

> Here's my attempt to define the root task group:
> 
>        *  If autogrouping is disabled, then all processes in the root CPU
>           cgroup form a scheduling group (sometimes called the "root task
>           group").
> 
> Can you improve on this?

A task group is a set of percpu runqueues.  The root task group is the
top level set in a hierarchy of such sets when group scheduling is
enabled, or the only set when group scheduling is not enabled.  The
autogroup hierarchy has a depth of one, i.e., all autogroups are peers
whose common parent is the root task group.

	-Mike

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: RFC: documentation of the autogroup feature [v2]
  2016-11-25 21:49                           ` Peter Zijlstra
@ 2016-11-29  7:43                             ` Michael Kerrisk (man-pages)
  2016-11-29 11:46                               ` Peter Zijlstra
  0 siblings, 1 reply; 38+ messages in thread
From: Michael Kerrisk (man-pages) @ 2016-11-29  7:43 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: mtk.manpages, Mike Galbraith, Ingo Molnar, linux-man, lkml,
	Thomas Gleixner

Hi Peter,

On 11/25/2016 10:49 PM, Peter Zijlstra wrote:
> On Fri, Nov 25, 2016 at 09:54:05PM +0100, Michael Kerrisk (man-pages) wrote:
>> So, part of what I was struggling with was what you meant by cfs-cgroup.
>> Do you mean the CFS bandwidth control features added in Linux 3.2?
> 
> Nope, /me digs around for a bit... around here I suppose:
> 
>  68318b8e0b61 ("Hook up group scheduler with control groups")

Thanks. The pieces are starting to fall into place now.

> 68318b8e0b61 v2.6.24-rc1~151
> 
> But I really have no idea what that looked like.
> 
> In any case, for the case of autogroup, the behaviour has always been,
> autogroups came quite late.

This ("the behavior has always been") isn't quite true. Yes, group
scheduling has been around since Linux 2.6.24, but in terms of the
semantics of the thread nice value, there was no visible change
then, *unless* explicit action was taken to create cgroups.

The arrival of autogroups in Linux 2.6.38 was different. 
With this feature enabled (which is the default), task
groups were implicitly created *without the user needing to
do anything*. Thus, [two terminal windows] == [two task groups],
and in those two terminal windows, nice(1) on a CPU-bound
command in one terminal did nothing in terms of improving
CPU access for a CPU-bound task running in the other terminal
window.

Put more succinctly: in Linux 2.6.38, autogrouping broke nice(1)
for many use cases.
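[Editor's note: the effect can be quantified with the CFS nice-to-weight
values (1024 for nice 0, 15 for nice 19, from the kernel's
sched_prio_to_weight table); the two-scenario arithmetic below is an
illustrative sketch, not kernel code:]

```python
# CFS weights for nice 0 and nice 19 (kernel sched_prio_to_weight table).
W_NICE_0, W_NICE_19 = 1024, 15

def share(own, *others):
    """A task's (or group's) fraction of CPU among its peers."""
    return own / (own + sum(others))

# Both tasks in the SAME task group: nice works as traditionally expected.
print(f"same group: nice-0 gets {share(W_NICE_0, W_NICE_19):.1%}, "
      f"nice-19 gets {share(W_NICE_19, W_NICE_0):.1%}")
# -> roughly 98.6% vs 1.4%

# Tasks in two SEPARATE autogroups of equal weight: the groups split the
# CPU 50/50, and each task is alone in its group, so its nice value has
# no effect at the inter-group level.
print(f"separate autogroups: each group gets {share(1024, 1024):.1%}")
```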

Once I came to that simple summary it was easy to find multiple
reports of problems from users:

http://serverfault.com/questions/405092/nice-level-not-working-on-linux
http://superuser.com/questions/805599/nice-has-no-effect-in-linux-unless-the-same-shell-is-used
https://www.reddit.com/r/linux/comments/1c4jew/nice_has_no_effect/
http://stackoverflow.com/questions/10342470/process-niceness-priority-setting-has-no-effect-on-linux

Someone else quickly pointed out to me another such report:

https://bbs.archlinux.org/viewtopic.php?id=149553

And when I quickly surveyed a few more or less savvy Linux users
in one room, most understood what nice does, but none of them knew
about the behavior change wrought by autogroup.

I haven't looked at all of the mails in the old threads that 
discussed the implementation of this feature, but so far none of
those that I saw mentioned this behavior change. It's unfortunate
that it never even got documented.

Cheers,

Michael

-- 
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: RFC: documentation of the autogroup feature
       [not found]           ` <1127218a-dd9b-71a8-845d-3a83969632fc@gmail.com>
@ 2016-11-29  9:10             ` Michael Kerrisk (man-pages)
  2016-11-29 13:46               ` Mike Galbraith
  0 siblings, 1 reply; 38+ messages in thread
From: Michael Kerrisk (man-pages) @ 2016-11-29  9:10 UTC (permalink / raw)
  To: Mike Galbraith
  Cc: Michael Kerrisk, Peter Zijlstra, Ingo Molnar, linux-man, lkml,
	Thomas Gleixner

[Resending because of bounces from the lists. (Somehow my mailer
messed up the MIME labeling)]

Hi Mike,

On 11/28/2016 02:46 AM, Mike Galbraith wrote:
> On Sun, 2016-11-27 at 22:13 +0100, Michael Kerrisk (man-pages) wrote:
>
>> Here's my attempt to define the root task group:
>>
>>        *  If autogrouping is disabled, then all processes in the root CPU
>>           cgroup form a scheduling group (sometimes called the "root task
>>           group").
>>
>> Can you improve on this?

The below is helpful, but...

> A task group is a set of percpu runqueues.

The explanation needs really to be in terms of what user-space
understands and sees. "Runqueues" are a kernel scheduler implementation
detail.

> The root task group is the
> top level set in a hierarchy of such sets when group scheduling is
> enabled, or the only set when group scheduling is not enabled.  The
> autogroup hierarchy has a depth of one, i.e., all autogroups are peers
> whose common parent is the root task group.

Let's try and go further. How's this:

       When scheduling non-real-time  processes  (i.e.,  those  scheduled
       under  the SCHED_OTHER, SCHED_BATCH, and SCHED_IDLE policies), the
       CFS scheduler employs a technique known as "group scheduling",  if
       the  kernel was configured with the CONFIG_FAIR_GROUP_SCHED option
       (which is typical).

       Under group scheduling, threads are scheduled  in  "task  groups".
       Task  groups  have  a  hierarchical relationship, rooted under the
       initial task group on the system, known as the "root task  group".
       Task groups are formed in the following circumstances:

       *  All of the threads in a CPU cgroup form a task group.  The par‐
          ent of this task group is the task group of  the  corresponding
          parent cgroup.

       *  If  autogrouping  is  enabled, then all of the threads that are
          (implicitly) placed in an autogroup (i.e., the same session, as
          created by setsid(2)) form a task group.  Each new autogroup is
          thus a separate task group.  The root task group is the  parent
          of all such autogroups.

       *  If  autogrouping  is enabled, then the root task group consists
          of all processes in the root CPU cgroup that were not otherwise
          implicitly placed into a new autogroup.

       *  If  autogrouping is disabled, then the root task group consists
          of all processes in the root CPU cgroup.

       *  If group scheduling was disabled (i.e., the kernel was  config‐
          ured  without  CONFIG_FAIR_GROUP_SCHED),  then  all of the pro‐
          cesses on the system are notionally placed  in  a  single  task
          group.

       [To be followed by a discussion of the nice value and task groups]

?

Cheers,

Michael


-- 
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: RFC: documentation of the autogroup feature [v2]
  2016-11-29  7:43                             ` Michael Kerrisk (man-pages)
@ 2016-11-29 11:46                               ` Peter Zijlstra
  2016-11-29 13:44                                 ` Michael Kerrisk (man-pages)
  0 siblings, 1 reply; 38+ messages in thread
From: Peter Zijlstra @ 2016-11-29 11:46 UTC (permalink / raw)
  To: Michael Kerrisk (man-pages)
  Cc: Mike Galbraith, Ingo Molnar, linux-man, lkml, Thomas Gleixner

On Tue, Nov 29, 2016 at 08:43:33AM +0100, Michael Kerrisk (man-pages) wrote:
> > 
> > In any case, for autogroup the behaviour has always been this way;
> > autogroups came quite late.
> 
> This ("the behavior has always been") isn't quite true. Yes, group
> scheduling has been around since Linux 2.6.24, but in terms of the
> semantics of the thread nice value, there was no visible change
> then, *unless* explicit action was taken to create cgroups.
> 
> The arrival of autogroups in Linux 2.6.38 was different. 
> With this feature enabled (which is the default), task

I don't think the SCHED_AUTOGROUP symbol is default y; most distros
may have enabled it by default, but that's not something I can help.

> groups were implicitly created *without the user needing to
> do anything*. Thus, [two terminal windows] == [two task groups]
> and in those two terminal windows, nice(1) on a CPU-bound
> command in one terminal did nothing in terms of improving
> CPU access for a CPU-bound task running in the other terminal
> window.
> 
> Put more succinctly: in Linux 2.6.38, autogrouping broke nice(1)
> for many use cases.
> 
> Once I came to that simple summary it was easy to find multiple
> reports of problems from users:
> 
> http://serverfault.com/questions/405092/nice-level-not-working-on-linux
> http://superuser.com/questions/805599/nice-has-no-effect-in-linux-unless-the-same-shell-is-used
> https://www.reddit.com/r/linux/comments/1c4jew/nice_has_no_effect/
> http://stackoverflow.com/questions/10342470/process-niceness-priority-setting-has-no-effect-on-linux
> 
> Someone else quickly pointed out to me another such report:
> 
> https://bbs.archlinux.org/viewtopic.php?id=149553

Well, none of that ever got back to me, so again, nothing I could do
about that.

> And when I quickly surveyed a few more or less savvy Linux users
> in one room, most understood what nice does, but none of them knew
> about the behavior change wrought by autogroup.
> 
> I haven't looked at all of the mails in the old threads that 
> discussed the implementation of this feature, but so far none of
> those that I saw mentioned this behavior change. It's unfortunate
> that it never even got documented.

Well, when we added the feature, people (most notably Linus) understood
what cgroups did. So no surprises for any of us.

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: RFC: documentation of the autogroup feature [v2]
  2016-11-29 11:46                               ` Peter Zijlstra
@ 2016-11-29 13:44                                 ` Michael Kerrisk (man-pages)
  0 siblings, 0 replies; 38+ messages in thread
From: Michael Kerrisk (man-pages) @ 2016-11-29 13:44 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Mike Galbraith, Ingo Molnar, linux-man, lkml, Thomas Gleixner

Hi Peter,

On 29 November 2016 at 12:46, Peter Zijlstra <peterz@infradead.org> wrote:
> On Tue, Nov 29, 2016 at 08:43:33AM +0100, Michael Kerrisk (man-pages) wrote:
>> >
>> > In any case, for the case of autogroup, the behaviour has always been,
>> > autogroups came quite late.
>>
>> This ("the behavior has always been") isn't quite true. Yes, group
>> scheduling has been around since Linux 2.6.24, but in terms of the
>> semantics of the thread nice value, there was no visible change
>> then, *unless* explicit action was taken to create cgroups.
>>
>> The arrival of autogroups in Linux 2.6.38 was different.
>> With this feature enabled (which is the default), task
>
> I don't think the SCHED_AUTOGROUP symbol is default y; most distros
> might have enabled it by default, but that's not something I can help.

Actually, it looks to me like it is the default. But that isn't really
the point. Even if the default were off, it's the way of things that
distros will generally turn such things on by default, because some
users want them. That's a repeated and to-be-expected pattern.

>> groups were implicitly created *without the user needing to
>> do anything*. Thus, [two terminal windows] == [two task groups]
>> and in those two terminal windows, nice(1) on a CPU-bound
>> command in one terminal did nothing in terms of improving
>> CPU access for CPU-bound tasks running in the other terminal
>> window.
>>
>> Put more succinctly: in Linux 2.6.38, autogrouping broke nice(1)
>> for many use cases.
>>
>> Once I came to that simple summary it was easy to find multiple
>> reports of problems from users:
>>
>> http://serverfault.com/questions/405092/nice-level-not-working-on-linux
>> http://superuser.com/questions/805599/nice-has-no-effect-in-linux-unless-the-same-shell-is-used
>> https://www.reddit.com/r/linux/comments/1c4jew/nice_has_no_effect/
>> http://stackoverflow.com/questions/10342470/process-niceness-priority-setting-has-no-effect-on-linux
>>
>> Someone else quickly pointed out to me another such report:
>>
>> https://bbs.archlinux.org/viewtopic.php?id=149553
>
> Well, none of that ever got back to me, so again, nothing I could do
> about that.

I understand. It's just unfortunate that (as far as I can see) the
implications were not fully considered before the change was made. Such
consideration often springs out of writing comprehensive
documentation, I find ;-).

>> And when I quickly surveyed a few more or less savvy Linux users
>> in one room, most understood what nice does, but none of them knew
>> about the behavior change wrought by autogroup.
>>
>> I haven't looked at all of the mails in the old threads that
>> discussed the implementation of this feature, but so far none of
>> those that I saw mentioned this behavior change. It's unfortunate
>> that it never even got documented.
>
> Well, when we added the feature people (most notably Linus) understood
> what cgroups did. So no surprises for any of us.

Sure, but cgroups are different. They require explicit action by the
user (creating cgroups) to see the behavior.

With autogroups, the change kicks in on the desktop without the user
needing to do anything, and changes desktop behavior in a way that was
unexpected.
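The trigger for that implicit grouping is session creation. A minimal
sketch (Python, using the raw setsid(2)/getsid(2) wrappers) showing a
child process becoming its own session leader, which is the point at
which the kernel would place it in a new autogroup:

```python
import os
import subprocess
import sys

# A freshly spawned child is not a session leader, so setsid(2)
# succeeds and makes it the leader of a brand-new session (and, with
# autogrouping enabled, a member of a brand-new task group).
child = "import os; os.setsid(); print(os.getsid(0) == os.getpid())"
out = subprocess.run([sys.executable, "-c", child],
                     capture_output=True, text=True)
print(out.stdout.strip())   # "True": the child now leads its own session
```

This is exactly what a terminal emulator does for each window, which is
why two windows end up in two separate task groups.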

Cheers,

Michael


-- 
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/


* Re: RFC: documentation of the autogroup feature
  2016-11-29  9:10             ` Michael Kerrisk (man-pages)
@ 2016-11-29 13:46               ` Mike Galbraith
  0 siblings, 0 replies; 38+ messages in thread
From: Mike Galbraith @ 2016-11-29 13:46 UTC (permalink / raw)
  To: mtk.manpages
  Cc: Peter Zijlstra, Ingo Molnar, linux-man, lkml, Thomas Gleixner

On Tue, 2016-11-29 at 10:10 +0100, Michael Kerrisk (man-pages) wrote:
> Let's try and go further. How's this:
> 
>        When scheduling non-real-time  processes  (i.e.,  those  scheduled
>        under  the SCHED_OTHER, SCHED_BATCH, and SCHED_IDLE policies), the
>        CFS scheduler employs a technique known as "group scheduling",  if
>        the  kernel was configured with the CONFIG_FAIR_GROUP_SCHED option
>        (which is typical).
> 
>        Under group scheduling, threads are scheduled  in  "task  groups".
>        Task  groups  have  a  hierarchical relationship, rooted under the
>        initial task group on the system, known as the "root task  group".
>        Task groups are formed in the following circumstances:
> 
>        *  All of the threads in a CPU cgroup form a task group.  The par‐
>           ent of this task group is the task group of  the  corresponding
>           parent cgroup.
> 
>        *  If  autogrouping  is  enabled, then all of the threads that are
>           (implicitly) placed in an autogroup (i.e., the same session, as
>           created by setsid(2)) form a task group.  Each new autogroup is
>           thus a separate task group.  The root task group is the  parent
>           of all such autogroups.
> 
>        *  If  autogrouping  is enabled, then the root task group consists
>           of all processes in the root CPU cgroup that were not otherwise
>           implicitly placed into a new autogroup.
> 
>        *  If  autogrouping is disabled, then the root task group consists
>           of all processes in the root CPU cgroup.
> 
>        *  If group scheduling was disabled (i.e., the kernel was  config‐
>           ured  without  CONFIG_FAIR_GROUP_SCHED),  then  all of the pro‐
>           cesses on the system are notionally placed  in  a  single  task
>           group.
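The placement described in those bullet points can be inspected at run
time. A sketch (file names per proc(5); /proc/[pid]/autogroup exists
only on kernels built with CONFIG_SCHED_AUTOGROUP, so the code guards
for its absence):

```python
from pathlib import Path

def task_group_info(pid="self"):
    """Return a process's autogroup and cgroup placement, where the
    kernel exposes them; values are None when a file is absent."""
    info = {}
    for name in ("autogroup", "cgroup"):
        p = Path("/proc") / str(pid) / name
        # autogroup reads like "/autogroup-123 nice 0"; cgroup lists
        # the explicit cgroup hierarchy placement.
        info[name] = p.read_text().strip() if p.exists() else None
    return info

print(task_group_info())
```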

Notionally works for me.

	-Mike


end of thread, other threads:[~2016-11-29 13:48 UTC | newest]

Thread overview: 38+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-11-22 15:59 RFC: documentation of the autogroup feature Michael Kerrisk (man-pages)
2016-11-23 10:33 ` [patch] sched/autogroup: Fix 64bit kernel nice adjustment Mike Galbraith
2016-11-23 13:47   ` Michael Kerrisk (man-pages)
2016-11-23 14:12     ` Mike Galbraith
2016-11-23 14:20       ` Michael Kerrisk (man-pages)
2016-11-23 15:55         ` Mike Galbraith
2016-11-24  6:24   ` [tip:sched/urgent] sched/autogroup: Fix 64-bit kernel nice level adjustment tip-bot for Mike Galbraith
2016-11-23 11:39 ` RFC: documentation of the autogroup feature Mike Galbraith
2016-11-23 13:54   ` Michael Kerrisk (man-pages)
2016-11-23 15:33     ` Mike Galbraith
2016-11-23 16:04       ` Michael Kerrisk (man-pages)
2016-11-23 17:11         ` Mike Galbraith
2016-11-24 21:41           ` RFC: documentation of the autogroup feature [v2] Michael Kerrisk (man-pages)
2016-11-25 12:52             ` Afzal Mohammed
2016-11-25 13:04               ` Michael Kerrisk (man-pages)
2016-11-25 13:02             ` Mike Galbraith
2016-11-25 15:04               ` Michael Kerrisk (man-pages)
2016-11-25 15:48                 ` Michael Kerrisk (man-pages)
2016-11-25 15:51                 ` Mike Galbraith
2016-11-25 16:08                   ` Michael Kerrisk (man-pages)
2016-11-25 16:18                     ` Peter Zijlstra
2016-11-25 16:34                       ` Michael Kerrisk (man-pages)
2016-11-25 20:54                         ` Michael Kerrisk (man-pages)
2016-11-25 21:49                           ` Peter Zijlstra
2016-11-29  7:43                             ` Michael Kerrisk (man-pages)
2016-11-29 11:46                               ` Peter Zijlstra
2016-11-29 13:44                                 ` Michael Kerrisk (man-pages)
2016-11-25 16:04                 ` Peter Zijlstra
2016-11-25 16:13                   ` Peter Zijlstra
2016-11-25 16:33                   ` Michael Kerrisk (man-pages)
2016-11-25 22:48                     ` Peter Zijlstra
2016-11-23 16:05       ` RFC: documentation of the autogroup feature Michael Kerrisk (man-pages)
2016-11-23 17:19         ` Mike Galbraith
2016-11-23 22:12           ` Michael Kerrisk (man-pages)
2016-11-27 21:13       ` Michael Kerrisk (man-pages)
2016-11-28  1:46         ` Mike Galbraith
     [not found]           ` <1127218a-dd9b-71a8-845d-3a83969632fc@gmail.com>
2016-11-29  9:10             ` Michael Kerrisk (man-pages)
2016-11-29 13:46               ` Mike Galbraith
