linux-man.vger.kernel.org archive mirror
* CLONE_INTO_CGROUP documentation?
@ 2020-04-08 11:12 Michael Kerrisk (man-pages)
  2020-04-10 10:41 ` [PATCH] clone.2: Document CLONE_INTO_CGROUP Christian Brauner
  2020-05-18 17:55 ` [PATCH v2] " Christian Brauner
  0 siblings, 2 replies; 11+ messages in thread
From: Michael Kerrisk (man-pages) @ 2020-04-08 11:12 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Oleg Nesterov, Linux API, linux-man, lkml, Tejun Heo,
	open list:CONTROL GROUP (CGROUP)

Hi Christian,

I see that CLONE_INTO_CGROUP has been merged. Would you be able to
send a man-pages patch documenting this, please?

Thanks,

Michael


--
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/


* [PATCH] clone.2: Document CLONE_INTO_CGROUP
  2020-04-08 11:12 CLONE_INTO_CGROUP documentation? Michael Kerrisk (man-pages)
@ 2020-04-10 10:41 ` Christian Brauner
  2020-04-10 20:18   ` Michael Kerrisk (man-pages)
  2020-05-18 17:55 ` [PATCH v2] " Christian Brauner
  1 sibling, 1 reply; 11+ messages in thread
From: Christian Brauner @ 2020-04-10 10:41 UTC (permalink / raw)
  To: mtk.manpages
  Cc: cgroups, christian.brauner, linux-api, linux-kernel, linux-man, oleg, tj

From: Christian Brauner <christian.brauner@ubuntu.com>

Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
---
 man2/clone.2 | 20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)

diff --git a/man2/clone.2 b/man2/clone.2
index 39cec4c86..8d9aa9f99 100644
--- a/man2/clone.2
+++ b/man2/clone.2
@@ -197,6 +197,7 @@ struct clone_args {
     u64 tls;          /* Location of new TLS */
     u64 set_tid;      /* Pointer to a \fIpid_t\fP array */
     u64 set_tid_size; /* Number of elements in \fIset_tid\fP */
+    u64 cgroup;       /* Target cgroup file descriptor for the child process */
 };
 .EE
 .in
@@ -448,6 +449,25 @@ Specifying this flag together with
 .B CLONE_SIGHAND
 is nonsensical and disallowed.
 .TP
+.BR CLONE_INTO_CGROUP " (since Linux 5.7)"
+.\" commit ef2c41cf38a7559bbf91af42d5b6a4429db8fc68
+By default, the child process will belong to the same cgroup as its parent.
+If this flag is specified the child process will be created in a
+different cgroup than its parent.
+
+When using
+.RB clone3 ()
+the target cgroup can be specified by setting the
+.I cl_args.cgroup
+member to the file descriptor of the target cgroup. The cgroup file
+descriptor must refer to a cgroup in a cgroup v2 hierarchy
+(see
+.BR cgroup (2)).
+
+Note that all usual cgroup v2 process migration restrictions apply. See
+.BR cgroup (2)
+for detailed information.
+.TP
 .BR CLONE_DETACHED " (historical)"
 For a while (during the Linux 2.5 development series)
 .\" added in 2.5.32; removed in 2.6.0-test4

base-commit: ff5de6ecc4338f4b62c3459c99bd1a3a75ee2808
-- 
2.26.0



* Re: [PATCH] clone.2: Document CLONE_INTO_CGROUP
  2020-04-10 10:41 ` [PATCH] clone.2: Document CLONE_INTO_CGROUP Christian Brauner
@ 2020-04-10 20:18   ` Michael Kerrisk (man-pages)
  2020-04-21 14:30     ` Michael Kerrisk (man-pages)
  0 siblings, 1 reply; 11+ messages in thread
From: Michael Kerrisk (man-pages) @ 2020-04-10 20:18 UTC (permalink / raw)
  To: Christian Brauner
  Cc: mtk.manpages, cgroups, christian.brauner, linux-api,
	linux-kernel, linux-man, oleg, tj

Hi Christian,

Thank you for writing this!

On 4/10/20 12:41 PM, Christian Brauner wrote:
> From: Christian Brauner <christian.brauner@ubuntu.com>
> 
> Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
> ---
>  man2/clone.2 | 20 ++++++++++++++++++++
>  1 file changed, 20 insertions(+)
> 
> diff --git a/man2/clone.2 b/man2/clone.2
> index 39cec4c86..8d9aa9f99 100644
> --- a/man2/clone.2
> +++ b/man2/clone.2
> @@ -197,6 +197,7 @@ struct clone_args {
>      u64 tls;          /* Location of new TLS */
>      u64 set_tid;      /* Pointer to a \fIpid_t\fP array */
>      u64 set_tid_size; /* Number of elements in \fIset_tid\fP */
> +    u64 cgroup;       /* Target cgroup file descriptor for the child process */
>  };
>  .EE
>  .in
> @@ -448,6 +449,25 @@ Specifying this flag together with
>  .B CLONE_SIGHAND
>  is nonsensical and disallowed.
>  .TP
> +.BR CLONE_INTO_CGROUP " (since Linux 5.7)"
> +.\" commit ef2c41cf38a7559bbf91af42d5b6a4429db8fc68
> +By default, the child process will belong to the same cgroup as its parent.

s/belong to/be placed in/

s/cgroup/version 2 cgroup/

> +If this flag is specified the child process will be created in a
> +different cgroup than its parent.
> +
> +When using
> +.RB clone3 ()
> +the target cgroup can be specified by setting the
> +.I cl_args.cgroup
> +member to the file descriptor of the target cgroup. The cgroup file

We need to say something about how this file descriptor is
obtained. Is it by opening a directory in the v2 cgroup hierarchy?
With what flags? O_RDONLY? or is O_PATH also possible? Yes, these
are some rhetorical questions (I read your nice commit message);
these things need to be explicit in the manual page though.

Also, your commit message mentions a nice list of use cases.
I think it would be well worth capturing those in a paragraph
in the manual page text.

> +descriptor must refer to a cgroup in a cgroup v2 hierarchy
> +(see
> +.BR cgroup (2)).

s/cgroup/cgroups/
s/2/7/

> +
> +Note that all usual cgroup v2 process migration restrictions apply. See
> +.BR cgroup (2)

s/cgroup/cgroups/
s/2/7/

Thanks,

Michael

-- 
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/


* Re: [PATCH] clone.2: Document CLONE_INTO_CGROUP
  2020-04-10 20:18   ` Michael Kerrisk (man-pages)
@ 2020-04-21 14:30     ` Michael Kerrisk (man-pages)
  2020-04-23 10:14       ` Christian Brauner
  0 siblings, 1 reply; 11+ messages in thread
From: Michael Kerrisk (man-pages) @ 2020-04-21 14:30 UTC (permalink / raw)
  To: Christian Brauner
  Cc: open list:CONTROL GROUP (CGROUP),
	Christian Brauner, Linux API, lkml, linux-man, Oleg Nesterov,
	Tejun Heo

Hi Christian,

Ping!

Cheers,

Michael




-- 
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/


* Re: [PATCH] clone.2: Document CLONE_INTO_CGROUP
  2020-04-21 14:30     ` Michael Kerrisk (man-pages)
@ 2020-04-23 10:14       ` Christian Brauner
  2020-05-15 11:41         ` Michael Kerrisk (man-pages)
  0 siblings, 1 reply; 11+ messages in thread
From: Christian Brauner @ 2020-04-23 10:14 UTC (permalink / raw)
  To: Michael Kerrisk (man-pages)
  Cc: Christian Brauner, open list:CONTROL GROUP (CGROUP),
	Linux API, lkml, linux-man, Oleg Nesterov, Tejun Heo

On Tue, Apr 21, 2020 at 04:30:46PM +0200, Michael Kerrisk (man-pages) wrote:
> Hi Christian,
> 
> Ping!

Will likely take a few days until I can get around to preparing a second
version. Sorry for the delay!

Christian


* Re: [PATCH] clone.2: Document CLONE_INTO_CGROUP
  2020-04-23 10:14       ` Christian Brauner
@ 2020-05-15 11:41         ` Michael Kerrisk (man-pages)
  2020-05-15 11:59           ` Christian Brauner
  0 siblings, 1 reply; 11+ messages in thread
From: Michael Kerrisk (man-pages) @ 2020-05-15 11:41 UTC (permalink / raw)
  To: Christian Brauner
  Cc: Christian Brauner, open list:CONTROL GROUP (CGROUP),
	Linux API, lkml, linux-man, Oleg Nesterov, Tejun Heo

Hello Christian,

Ping!

Thanks,

Michael

On Thu, 23 Apr 2020 at 12:14, Christian Brauner
<christian.brauner@ubuntu.com> wrote:
>
> On Tue, Apr 21, 2020 at 04:30:46PM +0200, Michael Kerrisk (man-pages) wrote:
> > Hi Christian,
> >
> > Ping!
>
> Will likely take a few days until I can get around to preparing a second
> version. Sorry for the delay!
>
> Christian



-- 
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/


* Re: [PATCH] clone.2: Document CLONE_INTO_CGROUP
  2020-05-15 11:41         ` Michael Kerrisk (man-pages)
@ 2020-05-15 11:59           ` Christian Brauner
  0 siblings, 0 replies; 11+ messages in thread
From: Christian Brauner @ 2020-05-15 11:59 UTC (permalink / raw)
  To: Michael Kerrisk (man-pages)
  Cc: Christian Brauner, open list:CONTROL GROUP (CGROUP),
	Linux API, lkml, linux-man, Oleg Nesterov, Tejun Heo

On Fri, May 15, 2020 at 01:41:46PM +0200, Michael Kerrisk (man-pages) wrote:
> Hello Christian,
> 
> Ping!

Yes, I just thought of this when I saw your mail to Aleksa fly by. ;)
Christian


* [PATCH v2] clone.2: Document CLONE_INTO_CGROUP
  2020-04-08 11:12 CLONE_INTO_CGROUP documentation? Michael Kerrisk (man-pages)
  2020-04-10 10:41 ` [PATCH] clone.2: Document CLONE_INTO_CGROUP Christian Brauner
@ 2020-05-18 17:55 ` Christian Brauner
  2020-05-19 13:36   ` Michael Kerrisk (man-pages)
  1 sibling, 1 reply; 11+ messages in thread
From: Christian Brauner @ 2020-05-18 17:55 UTC (permalink / raw)
  To: mtk.manpages
  Cc: cgroups, christian.brauner, linux-api, linux-kernel, linux-man, oleg, tj

From: Christian Brauner <christian.brauner@ubuntu.com>

Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
---
/* v2 */
- Michael Kerrisk (man-pages) <mtk.manpages@gmail.com>:
  - Fix various typos and add examples and how to specify the file
    descriptor.
---
 man2/clone.2 | 43 +++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 43 insertions(+)

diff --git a/man2/clone.2 b/man2/clone.2
index 8b70b78a4..33594ddc5 100644
--- a/man2/clone.2
+++ b/man2/clone.2
@@ -197,6 +197,7 @@ struct clone_args {
     u64 tls;          /* Location of new TLS */
     u64 set_tid;      /* Pointer to a \fIpid_t\fP array */
     u64 set_tid_size; /* Number of elements in \fIset_tid\fP */
+    u64 cgroup;       /* Target cgroup file descriptor for the child process */
 };
 .EE
 .in
@@ -448,6 +449,48 @@ Specifying this flag together with
 .B CLONE_SIGHAND
 is nonsensical and disallowed.
 .TP
+.BR CLONE_INTO_CGROUP " (since Linux 5.7)"
+.\" commit ef2c41cf38a7559bbf91af42d5b6a4429db8fc68
+By default, the child process will be placed in the same version 2
+cgroup as its parent.
+If this flag is specified the child process will be created in a
+different cgroup than its parent.
+Note, that
+.BR CLONE_INTO_CGROUP
+is limited to version 2 cgroups. To use this feature, callers
+need to raise
+.BR CLONE_INTO_CGROUP
+in
+.I cl_args.flags
+and pass a directory file descriptor (see the
+.BR O_DIRECTORY
+flag for the
+.BR open (2)
+syscall) in the
+.I cl_args.cgroup.
+The caller may also pass an
+.BR O_PATH
+(see
+.BR open (2))
+file descriptor for the target cgroup.
+Note, that all usual version 2 cgroup migration restrictions (see
+.BR cgroups (7)
+for details) apply.
+
+Spawning a process into a cgroup different from the parent's cgroup
+makes it possible for a service manager to directly spawn new
+services into dedicated cgroups. This allows eliminating accounting
+jitter which would be caused by the new process living in the
+parent's cgroup for a short amount of time before being
+moved into the target cgroup. This flag also allows the creation of
+frozen child process by spawning them into a frozen cgroup (see
+.BR cgroups (7)
+for a description of the freezer feature in version 2 cgroups).
+For threaded applications or even thread implementations which
+make use of cgroups to limit individual threads it is possible to
+establish a fixed cgroup layout before spawning each thread
+directly into its target cgroup.
+.TP
 .BR CLONE_DETACHED " (historical)"
 For a while (during the Linux 2.5 development series)
 .\" added in 2.5.32; removed in 2.6.0-test4

base-commit: aa02339ca45030711b42a1af12e3ee3405c1c5c7
-- 
2.26.2



* Re: [PATCH v2] clone.2: Document CLONE_INTO_CGROUP
  2020-05-18 17:55 ` [PATCH v2] " Christian Brauner
@ 2020-05-19 13:36   ` Michael Kerrisk (man-pages)
  2020-05-19 13:51     ` Christian Brauner
  0 siblings, 1 reply; 11+ messages in thread
From: Michael Kerrisk (man-pages) @ 2020-05-19 13:36 UTC (permalink / raw)
  To: Christian Brauner
  Cc: mtk.manpages, cgroups, christian.brauner, linux-api,
	linux-kernel, linux-man, oleg, tj

Hello Christian,

Thanks for this patch!

On 5/18/20 7:55 PM, Christian Brauner wrote:
> From: Christian Brauner <christian.brauner@ubuntu.com>
> 
> Signed-off-by: Christian Brauner <christian.brauner@ubuntu.com>
> ---
> /* v2 */
> - Michael Kerrisk (man-pages) <mtk.manpages@gmail.com>:
>   - Fix various typos and add examples and how to specify the file
>     descriptor.
> ---
>  man2/clone.2 | 43 +++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 43 insertions(+)
> 
> diff --git a/man2/clone.2 b/man2/clone.2
> index 8b70b78a4..33594ddc5 100644
> --- a/man2/clone.2
> +++ b/man2/clone.2
> @@ -197,6 +197,7 @@ struct clone_args {
>      u64 tls;          /* Location of new TLS */
>      u64 set_tid;      /* Pointer to a \fIpid_t\fP array */
>      u64 set_tid_size; /* Number of elements in \fIset_tid\fP */
> +    u64 cgroup;       /* Target cgroup file descriptor for the child process */
>  };
>  .EE
>  .in
> @@ -448,6 +449,48 @@ Specifying this flag together with
>  .B CLONE_SIGHAND
>  is nonsensical and disallowed.
>  .TP
> +.BR CLONE_INTO_CGROUP " (since Linux 5.7)"
> +.\" commit ef2c41cf38a7559bbf91af42d5b6a4429db8fc68
> +By default, the child process will be placed in the same version 2
> +cgroup as its parent.
> +If this flag is specified the child process will be created in a
> +different cgroup than its parent.
> +Note, that
> +.BR CLONE_INTO_CGROUP
> +is limited to version 2 cgroups. To use this feature, callers
> +need to raise
> +.BR CLONE_INTO_CGROUP
> +in
> +.I cl_args.flags
> +and pass a directory file descriptor (see the
> +.BR O_DIRECTORY
> +flag for the
> +.BR open (2)
> +syscall) in the

I think the mention of O_DIRECTORY here is a bit misleading. That
flag does not need to be used. O_RDONLY or O_PATH suffices; I 
reworded somewhat.

> +.I cl_args.cgroup.
> +The caller may also pass an
> +.BR O_PATH
> +(see
> +.BR open (2))
> +file descriptor for the target cgroup.
> +Note, that all usual version 2 cgroup migration restrictions (see
> +.BR cgroups (7)
> +for details) apply.

Here I presume you mean things like the "no internal processes"
rule and the restriction around putting a process into a
"domain invalid" cgroup, right? I reworded a few things and added
a couple of cases in ERRORS.

> +
> +Spawning a process into a cgroup different from the parent's cgroup
> +makes it possible for a service manager to directly spawn new
> +services into dedicated cgroups. This allows eliminating accounting
> +jitter which would be caused by the new process living in the
> +parent's cgroup for a short amount of time before being
> +moved into the target cgroup. This flag also allows the creation of
> +frozen child process by spawning them into a frozen cgroup (see
> +.BR cgroups (7)
> +for a description of the freezer feature in version 2 cgroups).
> +For threaded applications or even thread implementations which
> +make use of cgroups to limit individual threads it is possible to
> +establish a fixed cgroup layout before spawning each thread
> +directly into its target cgroup.

Thanks for these use cases; that's great!

So, I did some fairly heavy editing, which resulted in the
following (the sum of the diffs is shown at the end of this
mail):

       CLONE_INTO_CGROUP (since Linux 5.7)
              By default, a child process is placed in the same version 2
              cgroup  as  its  parent.   The CLONE_INTO_CGROUP allows the
              child process to  be  created  in  a  different  version  2
              cgroup.   (Note  that CLONE_INTO_CGROUP has effect only for
              version 2 cgroups.)

              In order to place the child process in a different  cgroup,
              the caller specifies CLONE_INTO_CGROUP in cl_args.flags and
              passes a file descriptor that refers to a version 2  cgroup
              in  the cl_args.cgroup field.  (This file descriptor can be
              obtained by opening a cgroup v2 directory file using either
              the  O_RDONLY  or  the  O_PATH flag.)  Note that all of the
              usual restrictions (described in cgroups(7)) on  placing  a
              process into a version 2 cgroup apply.

              Spawning  a  process  into a cgroup different from the par‐
              ent's cgroup makes it possible for  a  service  manager  to
              directly  spawn  new services into dedicated cgroups.  This
              eliminates the accounting jitter that would  be  caused  if
              the  child  process was first created in the same cgroup as
              the parent and then moved  into  the  target  cgroup.   The
              CLONE_INTO_CGROUP  flag  also allows the creation of frozen
              child processes by spawning  them  into  a  frozen  cgroup.
              (See  cgroups(7)  for  a  description  of  the freezer con‐
              troller.)  For threaded applications (or even thread imple‐
              mentations  which  make  use of cgroups to limit individual
              threads), it is possible to establish a fixed cgroup layout
              before  spawning  each  thread  directly  into  its  target
              cgroup.

ERRORS
       EBUSY (clone3() only)
              CLONE_INTO_CGROUP  was  specified in cl_args.flags, but the
              file descriptor specified in  cl_args.cgroup  refers  to  a
              version 2 cgroup in which a domain controller is enabled.

       EOPNOTSUPP (clone3() only)
              CLONE_INTO_CGROUP  was  specified in cl_args.flags, but the
              file descriptor specified in  cl_args.cgroup  refers  to  a
              version 2 cgroup that is in the domain invalid state.
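
As an illustration of the description above, here is a minimal C sketch
of the call sequence, not a definitive implementation. It assumes a
kernel of at least Linux 5.7 and a mounted cgroup v2 hierarchy. The
names cg_clone_args, cg_clone_args_init() and spawn_into_cgroup() are
hypothetical. The struct is a local copy of the kernel's struct
clone_args layout, used only so that the sketch does not depend on the
installed header version; real code would normally use the definition
from <linux/sched.h>. The fallback syscall number 435 is an assumption
that holds on common architectures such as x86-64 and arm64.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <signal.h>
#include <stdint.h>
#include <string.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

#ifndef CLONE_INTO_CGROUP
#define CLONE_INTO_CGROUP 0x200000000ULL /* value used by the kernel headers */
#endif

#ifndef SYS_clone3
#define SYS_clone3 435 /* assumption: number on x86-64, arm64 and similar */
#endif

/* Hypothetical local mirror of struct clone_args as of Linux 5.7. */
struct cg_clone_args {
    uint64_t flags;
    uint64_t pidfd;
    uint64_t child_tid;
    uint64_t parent_tid;
    uint64_t exit_signal;
    uint64_t stack;
    uint64_t stack_size;
    uint64_t tls;
    uint64_t set_tid;
    uint64_t set_tid_size;
    uint64_t cgroup;       /* file descriptor of the target cgroup */
};

/* Fill in the arguments for a fork-like clone3() into the given cgroup. */
void cg_clone_args_init(struct cg_clone_args *args, int cgroup_fd)
{
    assert(args != NULL && cgroup_fd >= 0);
    memset(args, 0, sizeof(*args));
    args->flags = CLONE_INTO_CGROUP;
    args->exit_signal = SIGCHLD;   /* parent is notified as with fork(2) */
    args->cgroup = (uint64_t)cgroup_fd;
}

/*
 * Open the cgroup v2 directory and create a child directly inside it.
 * Returns 0 in the child, the child's PID in the parent, or -1 on error
 * with errno set.
 */
pid_t spawn_into_cgroup(const char *cgroup_dir)
{
    struct cg_clone_args args;
    int fd = open(cgroup_dir, O_RDONLY | O_CLOEXEC); /* O_PATH also works */

    if (fd < 0)
        return -1;

    cg_clone_args_init(&args, fd);
    pid_t pid = (pid_t)syscall(SYS_clone3, &args, sizeof(args));

    /* The descriptor is only needed for the call itself. */
    int saved_errno = errno;
    close(fd);
    errno = saved_errno;
    return pid;
}
```

A caller would use it roughly as
pid = spawn_into_cgroup("/sys/fs/cgroup/myservice") (the path is
hypothetical), treating 0 as "running in the child", a positive value
as the child's PID in the parent, and -1 as failure with errno set to
one of the errors described above.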

Look okay to you?

Thanks,

Michael


diff --git a/man2/clone.2 b/man2/clone.2
index 8b70b78a4..a4ce0d412 100644
--- a/man2/clone.2
+++ b/man2/clone.2
@@ -195,8 +195,12 @@ struct clone_args {
     u64 stack;        /* Pointer to lowest byte of stack */
     u64 stack_size;   /* Size of stack */
     u64 tls;          /* Location of new TLS */
-    u64 set_tid;      /* Pointer to a \fIpid_t\fP array */
-    u64 set_tid_size; /* Number of elements in \fIset_tid\fP */
+    u64 set_tid;      /* Pointer to a \fIpid_t\fP array
+                         (since Linux 5.5) */
+    u64 set_tid_size; /* Number of elements in \fIset_tid\fP
+                         (since Linux 5.5) */
+    u64 cgroup;       /* File descriptor for target cgroup
+                         of child (since Linux 5.7) */
 };
 .EE
 .in
@@ -266,6 +270,7 @@ stack	stack
 tls	tls	See CLONE_SETTLS
 \fP---\fP	set_tid	See below for details
 \fP---\fP	set_tid_size
+\fP---\fP	cgroup	See CLONE_INTO_CGROUP
 .TE
 .RE
 .\"
@@ -448,6 +453,54 @@ Specifying this flag together with
 .B CLONE_SIGHAND
 is nonsensical and disallowed.
 .TP
+.BR CLONE_INTO_CGROUP " (since Linux 5.7)"
+.\" commit ef2c41cf38a7559bbf91af42d5b6a4429db8fc68
+By default, a child process is placed in the same version 2
+cgroup as its parent.
+The
+.B CLONE_INTO_CGROUP
+allows the child process to be created in a different version 2 cgroup.
+(Note that
+.BR CLONE_INTO_CGROUP
+has effect only for version 2 cgroups.)
+.IP
+In order to place the child process in a different cgroup,
+the caller specifies
+.BR CLONE_INTO_CGROUP
+in
+.I cl_args.flags
+and passes a file descriptor that refers to a version 2 cgroup in the
+.I cl_args.cgroup
+field.
+(This file descriptor can be obtained by opening a cgroup v2 directory file
+using either the
+.B O_RDONLY
+or the
+.B O_PATH
+flag.)
+Note that all of the usual restrictions (described in
+.BR cgroups (7))
+on placing a process into a version 2 cgroup apply.
+.IP
+Spawning a process into a cgroup different from the parent's cgroup
+makes it possible for a service manager to directly spawn new
+services into dedicated cgroups.
+This eliminates the accounting
+jitter that would be caused if the child process was first created in the
+same cgroup as the parent and then
+moved into the target cgroup.
+The
+.BR CLONE_INTO_CGROUP
+flag also allows the creation of
+frozen child processes by spawning them into a frozen cgroup.
+(See
+.BR cgroups (7)
+for a description of the freezer controller.)
+For threaded applications (or even thread implementations which
+make use of cgroups to limit individual threads), it is possible to
+establish a fixed cgroup layout before spawning each thread
+directly into its target cgroup.
+.TP
 .BR CLONE_DETACHED " (historical)"
 For a while (during the Linux 2.5 development series)
 .\" added in 2.5.32; removed in 2.6.0-test4
@@ -1304,6 +1357,14 @@ will be set appropriately.
 Too many processes are already running; see
 .BR fork (2).
 .TP
+.BR EBUSY " (" clone3 "() only)"
+.B CLONE_INTO_CGROUP
+was specified in
+.IR cl_args.flags ,
+but the file descriptor specified in
+.IR cl_args.cgroup
+refers to a version 2 cgroup in which a domain controller is enabled.
+.TP
 .BR EEXIST " (" clone3 "() only)"
 One (or more) of the PIDs specified in
 .I set_tid
@@ -1546,6 +1607,16 @@ to be exceeded.
 For further details, see
 .BR namespaces (7).
 .TP
+.BR EOPNOTSUPP " (" clone3 "() only)"
+.B CLONE_INTO_CGROUP
+was specified in
+.IR cl_args.flags ,
+but the file descriptor specified in
+.IR cl_args.cgroup
+refers to a version 2 cgroup that is in the
+.IR "domain invalid"
+state.
+.TP
 .B EPERM
 .BR CLONE_NEWCGROUP ,
 .BR CLONE_NEWIPC ,



-- 
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/


* Re: [PATCH v2] clone.2: Document CLONE_INTO_CGROUP
  2020-05-19 13:36   ` Michael Kerrisk (man-pages)
@ 2020-05-19 13:51     ` Christian Brauner
  2020-05-19 19:44       ` Michael Kerrisk (man-pages)
  0 siblings, 1 reply; 11+ messages in thread
From: Christian Brauner @ 2020-05-19 13:51 UTC (permalink / raw)
  To: Michael Kerrisk (man-pages)
  Cc: Christian Brauner, cgroups, linux-api, linux-kernel, linux-man, oleg, tj

On Tue, May 19, 2020 at 03:36:28PM +0200, Michael Kerrisk (man-pages) wrote:
> Hello Christian,
> 
> Thanks for this patch!

Thanks for making it palatable. :)

> So, I did some fairly heavy editing, which resulted in the
> following (the sum of the diffs is shown at the end of this
> mail):
> 
>        CLONE_INTO_CGROUP (since Linux 5.7)
>               By default, a child process is placed in the same version 2
>               cgroup  as  its  parent.   The CLONE_INTO_CGROUP allows the

Not a native speaker, but is this missing a noun like "flag"?
"The CLONE_INTO_CGROUP {flag,feature} allows the [...]"?

>               child process to  be  created  in  a  different  version  2
>               cgroup.   (Note  that CLONE_INTO_CGROUP has effect only for
>               version 2 cgroups.)
> 
>               In order to place the child process in a different  cgroup,
>               the caller specifies CLONE_INTO_CGROUP in cl_args.flags and
>               passes a file descriptor that refers to a version 2  cgroup
>               in  the cl_args.cgroup field.  (This file descriptor can be
>               obtained by opening a cgroup v2 directory file using either

Should this just be "opening a cgroup v2 directory" and not "directory
file"? Feels redundant.

>               the  O_RDONLY  or  the  O_PATH flag.)  Note that all of the
>               usual restrictions (described in cgroups(7)) on  placing  a
>               process into a version 2 cgroup apply.
> 
>               Spawning  a  process  into a cgroup different from the par‐
>               ent's cgroup makes it possible for  a  service  manager  to
>               directly  spawn  new services into dedicated cgroups.  This
>               eliminates the accounting jitter that would  be  caused  if
>               the  child  process was first created in the same cgroup as
>               the parent and then moved  into  the  target  cgroup.   The

I forgot to mention that spawning directly into a target cgroup is also
more efficient than moving the process after creation. The specific reason
is mentioned in the commit message: the write lock of the semaphore need
not be taken, in contrast to when the process is moved afterwards. That
implementation detail is not that interesting, but it might be
interesting to know that it provides performance benefits in general.

>               CLONE_INTO_CGROUP  flag  also allows the creation of frozen
>               child processes by spawning  them  into  a  frozen  cgroup.
>               (See  cgroups(7)  for  a  description  of  the freezer con‐
>               troller.)  For threaded applications (or even thread imple‐
>               mentations  which  make  use of cgroups to limit individual
>               threads), it is possible to establish a fixed cgroup layout
>               before  spawning  each  thread  directly  into  its  target
>               cgroup.
> 
> ERRORS
>        EBUSY (clone3() only)
>               CLONE_INTO_CGROUP  was  specified in cl_args.flags, but the
>               file descriptor specified in  cl_args.cgroup  refers  to  a
>               version 2 cgroup in which a domain controller is enabled.
> 
>        EOPNOTSUPP (clone3() only)
>               CLONE_INTO_CGROUP  was  specified in cl_args.flags, but the
>               file descriptor specified in  cl_args.cgroup  refers  to  a
>               version 2 cgroup that is in the domain invalid state.

Ah, good catch with the errnos.

> 
> Look okay to you?

Yep, looks great!
Thanks!
Christian


* Re: [PATCH v2] clone.2: Document CLONE_INTO_CGROUP
  2020-05-19 13:51     ` Christian Brauner
@ 2020-05-19 19:44       ` Michael Kerrisk (man-pages)
  0 siblings, 0 replies; 11+ messages in thread
From: Michael Kerrisk (man-pages) @ 2020-05-19 19:44 UTC (permalink / raw)
  To: Christian Brauner
  Cc: mtk.manpages, Christian Brauner, cgroups, linux-api,
	linux-kernel, linux-man, oleg, tj

On 5/19/20 3:51 PM, Christian Brauner wrote:
> On Tue, May 19, 2020 at 03:36:28PM +0200, Michael Kerrisk (man-pages) wrote:
>> On 5/18/20 7:55 PM, Christian Brauner wrote:
>>> From: Christian Brauner <christian.brauner@ubuntu.com>
>>> +
>>> +Spawning a process into a cgroup different from the parent's cgroup
>>> +makes it possible for a service manager to directly spawn new
>>> +services into dedicated cgroups. This allows eliminating accounting
>>> +jitter which would be caused by the new process living in the
>>> +parent's cgroup for a short amount of time before being
>>> +moved into the target cgroup. This flag also allows the creation of
>>> +frozen child process by spawning them into a frozen cgroup (see
>>> +.BR cgroups (7)
>>> +for a description of the freezer feature in version 2 cgroups).
>>> +For threaded applications or even thread implementations which
>>> +make use of cgroups to limit individual threads it is possible to
>>> +establish a fixed cgroup layout before spawning each thread
>>> +directly into its target cgroup.
>>
>> Thanks for these use cases; that's great!
>>
>> So, I did some fairly heavy editing, which resulted in the
>> following (the sum of the diffs is shown at the end of this
>> mail):
>>
>>        CLONE_INTO_CGROUP (since Linux 5.7)
>>               By default, a child process is placed in the same version 2
>>               cgroup  as  its  parent.   The CLONE_INTO_CGROUP allows the
> 
> Not a native speaker, but is this missing a noun like "flag"?
> "The CLONE_INTO_CGROUP {flag,feature} allows the [...]"?

Yes, "flag" was missing. Thanks.

>>               child process to  be  created  in  a  different  version  2
>>               cgroup.   (Note  that CLONE_INTO_CGROUP has effect only for
>>               version 2 cgroups.)
>>
>>               In order to place the child process in a different  cgroup,
>>               the caller specifies CLONE_INTO_CGROUP in cl_args.flags and
>>               passes a file descriptor that refers to a version 2  cgroup
>>               in  the cl_args.cgroup field.  (This file descriptor can be
>>               obtained by opening a cgroup v2 directory file using either
> 
> Should this just be "opening a cgroup v2 directory" and not "directory
> file"? Feels redundant.

Yes, better. Changed.
 
>>               the  O_RDONLY  or  the  O_PATH flag.)  Note that all of the
>>               usual restrictions (described in cgroups(7)) on  placing  a
>>               process into a version 2 cgroup apply.
>>
>>               Spawning  a  process  into a cgroup different from the par‐
>>               ent's cgroup makes it possible for  a  service  manager  to
>>               directly  spawn  new services into dedicated cgroups.  This
>>               eliminates the accounting jitter that would  be  caused  if
>>               the  child  process was first created in the same cgroup as
>>               the parent and then moved  into  the  target  cgroup.   The
> 
> I forgot to mention that spawning a process directly into a target
> cgroup is also more efficient than moving it after creation. The specific
> reason is given in the commit message: the write lock of the semaphore
> need not be taken, in contrast to when the process is moved afterwards.
> That implementation detail is not that interesting, but it might be
> useful to know that it provides performance benefits in general.

Thanks. I added this sentence:

    Furthermore, spawning the child process directly into a 
    target cgroup is significantly cheaper than moving the child 
    process into the target cgroup after it has been created.
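
For readers following along, the usage described in the quoted text
(CLONE_INTO_CGROUP in cl_args.flags, a cgroup v2 directory file descriptor
in cl_args.cgroup) might look roughly like the sketch below. It is only an
illustration under stated assumptions, not code from the patch or the
manual page: the struct name cgroup_clone_args and the helpers
make_cgroup_args() and spawn_into_cgroup() are hypothetical, the struct is
a local mirror of the kernel's struct clone_args so that the example does
not depend on recent headers, and a Linux 5.7 or later kernel with a
writable cgroup v2 hierarchy is assumed for the call to succeed.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <signal.h>
#include <stdint.h>
#include <string.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Fallback for older headers; same value as in the kernel UAPI header. */
#ifndef CLONE_INTO_CGROUP
#define CLONE_INTO_CGROUP 0x200000000ULL
#endif

/* Hypothetical local mirror of struct clone_args, including the
 * cgroup field added together with CLONE_INTO_CGROUP. */
struct cgroup_clone_args {
    uint64_t flags;
    uint64_t pidfd;
    uint64_t child_tid;
    uint64_t parent_tid;
    uint64_t exit_signal;
    uint64_t stack;
    uint64_t stack_size;
    uint64_t tls;
    uint64_t set_tid;
    uint64_t set_tid_size;
    uint64_t cgroup;
};
static_assert(sizeof(struct cgroup_clone_args) == 88,
              "layout should match the 88-byte clone_args with cgroup");

/* Fill in the arguments for a fork()-like clone3() into the cgroup
 * referred to by cgroup_fd. */
static struct cgroup_clone_args make_cgroup_args(int cgroup_fd)
{
    struct cgroup_clone_args args;

    memset(&args, 0, sizeof(args));
    args.flags = CLONE_INTO_CGROUP;
    args.exit_signal = SIGCHLD;
    args.cgroup = (uint64_t)cgroup_fd;
    return args;
}

/* Open a cgroup v2 directory and create a child directly inside it.
 * Returns the child's PID in the parent, 0 in the child, -1 on error. */
static pid_t spawn_into_cgroup(const char *cgroup_dir)
{
    int fd = open(cgroup_dir, O_RDONLY | O_DIRECTORY | O_CLOEXEC);
    if (fd < 0)
        return -1;

#ifdef SYS_clone3
    struct cgroup_clone_args args = make_cgroup_args(fd);
    pid_t pid = (pid_t)syscall(SYS_clone3, &args, sizeof(args));
#else
    pid_t pid = -1;         /* headers too old to know clone3() */
    errno = ENOSYS;
#endif

    int saved_errno = errno;
    close(fd);              /* runs in both parent and child */
    errno = saved_errno;
    return pid;
}
```

A caller would treat the return value like that of fork(), for example
pid = spawn_into_cgroup("/sys/fs/cgroup/example"), where the path is a
made-up placeholder. On failure, errno would be expected to carry the
errors discussed in this thread (such as EBUSY), in addition to the usual
open() and clone3() errors.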

>> Look okay to you?
> 
> Yep, looks great!

Good!

Thanks for the review.

Cheers,

Michael


-- 
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Linux/UNIX System Programming Training: http://man7.org/training/


Thread overview: 11+ messages
2020-04-08 11:12 CLONE_INTO_CGROUP documentation? Michael Kerrisk (man-pages)
2020-04-10 10:41 ` [PATCH] clone.2: Document CLONE_INTO_CGROUP Christian Brauner
2020-04-10 20:18   ` Michael Kerrisk (man-pages)
2020-04-21 14:30     ` Michael Kerrisk (man-pages)
2020-04-23 10:14       ` Christian Brauner
2020-05-15 11:41         ` Michael Kerrisk (man-pages)
2020-05-15 11:59           ` Christian Brauner
2020-05-18 17:55 ` [PATCH v2] " Christian Brauner
2020-05-19 13:36   ` Michael Kerrisk (man-pages)
2020-05-19 13:51     ` Christian Brauner
2020-05-19 19:44       ` Michael Kerrisk (man-pages)
