From: Adrian Reber <areber@redhat.com>
To: mtk.manpages@gmail.com, Christian Brauner <christian.brauner@ubuntu.com>
Cc: linux-man@vger.kernel.org
Subject: [PATCH v2] clone.2: added clone3() set_tid information
Date: Mon, 2 Dec 2019 15:27:40 +0100
Message-ID: <20191202142740.59402-1-areber@redhat.com>

Signed-off-by: Adrian Reber <areber@redhat.com>
---
v2: applied changes from review (Michael and Christian)
---
 man2/clone.2 | 95 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 95 insertions(+)

diff --git a/man2/clone.2 b/man2/clone.2
index 076b9258e..c1691dd78 100644
--- a/man2/clone.2
+++ b/man2/clone.2
@@ -195,6 +195,8 @@ struct clone_args {
     u64 stack;        /* Pointer to lowest byte of stack */
     u64 stack_size;   /* Size of stack */
     u64 tls;          /* Location of new TLS */
+    u64 set_tid;      /* Pointer to a \fIpid_t\fP array */
+    u64 set_tid_size; /* Number of elements in \fIset_tid\fP */
 };
 .EE
 .in
@@ -262,6 +264,8 @@ flags & 0xff	exit_signal
 stack	stack
 \fP---\fP	stack_size
 tls	tls	See CLONE_SETTLS
+\fP---\fP	set_tid	See below for details
+\fP---\fP	set_tid_size
 .TE
 .RE
 .\"
@@ -285,6 +289,76 @@ options when waiting for the child with
 If no signal (i.e., zero) is specified, then the parent process is not signaled
 when the child terminates.
 .\"
+.SS The set_tid array
+.PP
+By default, the kernel chooses the next sequential PID for the new
+process in each of the PID namespaces where it is present.
+When creating a process with
+.BR clone3 (),
+the
+.I set_tid
+array can be used to select specific PIDs for the process in some
+or all of the PID namespaces where it is present.
+If the PID of the newly created process should only be set for the current
+PID namespace or in the newly created PID namespace (if
+.I flags
+contains
+.BR CLONE_NEWPID )
+then the first element in the
+.I set_tid
+array has to be the desired PID and
+.I set_tid_size
+needs to be 1.
+.PP
+If the PID of the newly created process should have a certain value in
+multiple PID namespaces the
+.I set_tid
+array can have multiple entries. The first entry defines the PID in the most
+deeply nested PID namespace and all following entries contain the PID of the
+corresponding parent PID namespace. The number of PID namespaces in which a PID
+should be set is defined by
+.I set_tid_size
+which cannot be larger than the number of currently nested PID namespaces.
+.PP
+To create a process with the following PIDs in a PID namespace hierarchy:
+.RS
+.TS
+lb lb
+l l .
+PID NS level	Requested PID
+0 (host)	31496
+1	42
+2	7
+.TE
+.RE
+.PP
+Set the array to:
+.PP
+.EX
+    set_tid[0] = 7;
+    set_tid[1] = 42;
+    set_tid[2] = 31496;
+    set_tid_size = 3;
+.EE
+.PP
+If only the PIDs in the two innermost PID namespaces
+need to be specified, set the array to:
+.PP
+.EX
+    set_tid[0] = 7;
+    set_tid[1] = 42;
+    set_tid_size = 2;
+.EE
+.PP
+The PID in the PID namespaces outside the two innermost PID namespaces
+will be selected the same way as any other PID is selected.
+.PP
+The
+.I set_tid
+feature requires
+.RB CAP_SYS_ADMIN
+in all owning user namespaces of the target PID namespaces.
+.\"
 .SS The flags mask
 .PP
 Both
@@ -1201,6 +1275,11 @@ will be set appropriately.
 Too many processes are already running; see
 .BR fork (2).
 .TP
+.BR EEXIST " (" clone3 "() only)"
+One or more of the PIDs specified in
+.I set_tid
+already exists in the corresponding PID namespace.
+.TP
 .B EINVAL
 .B CLONE_SIGHAND
 was specified in the
@@ -1379,6 +1458,15 @@ in the
 .I flags
 mask.
 .TP
+.BR EINVAL " (" clone3 "() only)"
+.I set_tid_size
+larger than current number of nested PID namespaces.
+.TP
+.BR EINVAL " (" clone3 "() only)"
+If one of the PIDs specified in
+.I set_tid
+was an invalid PID.
+.TP
 .B ENOMEM
 Cannot allocate sufficient memory to allocate a task structure for the
 child, or to copy those parts of the caller's context that need to be
@@ -1450,6 +1538,13 @@ mask and the caller is in a chroot environment
 (i.e., the caller's root directory does not match the root directory
 of the mount namespace in which it resides).
 .TP
+.BR EPERM " (" clone3 "() only)"
+.I set_tid_size
+was greater than zero, and the caller lacks the
+.B CAP_SYS_ADMIN
+capability in one or more of the user namespaces that own the
+corresponding PID namespaces.
+.TP
 .BR ERESTARTNOINTR " (since Linux 2.6.17)"
 .\" commit 4a2c7a7837da1b91468e50426066d988050e4d56
 System call was interrupted by a signal and will be restarted.

base-commit: daf57a6ae0d9662cadde3bd750e14253036f6fde
prerequisite-patch-id: 517c3fcf393b318a0711f33a93c75a65feca17ca
-- 
2.23.0
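For readers who want to try the set_tid ordering described in the patch, here is a minimal C sketch. It is not part of the patch. The struct name clone_args_set_tid and the helpers order_set_tid() and clone3_with_pids() are hypothetical names made up for illustration; the struct is a local mirror of the clone_args layout shown in the patch so that the sketch does not depend on the installed kernel headers. The fallback syscall number 435 is an assumption that holds on x86-64 and most other architectures.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <signal.h>
#include <stdint.h>
#include <string.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

#ifndef SYS_clone3
#define SYS_clone3 435 /* assumption: x86-64 and most other architectures */
#endif

/* Hypothetical local mirror of struct clone_args, including the two
 * fields documented by the patch. */
struct clone_args_set_tid {
    uint64_t flags;
    uint64_t pidfd;
    uint64_t child_tid;
    uint64_t parent_tid;
    uint64_t exit_signal;
    uint64_t stack;
    uint64_t stack_size;
    uint64_t tls;
    uint64_t set_tid;      /* Pointer to a pid_t array */
    uint64_t set_tid_size; /* Number of elements in set_tid */
};

/* Ten 64-bit fields: the size the kernel expects for this layout. */
static_assert(sizeof(struct clone_args_set_tid) == 80,
              "unexpected clone_args size");

/* Hypothetical helper: by_level[] lists the requested PIDs from level 0
 * (host) down to the innermost PID namespace. Copy the n innermost
 * entries into out[] in set_tid order (innermost first).
 * Returns 0 on success, -1 if n exceeds the number of levels. */
int order_set_tid(const pid_t *by_level, size_t levels, size_t n, pid_t *out)
{
    if (n > levels)
        return -1;
    for (size_t i = 0; i < n; i++)
        out[i] = by_level[levels - 1 - i];
    return 0;
}

/* Hypothetical wrapper: fork-like clone3() call that requests the PIDs
 * in set_tid[0..n-1]. Returns the child's PID in the parent, 0 in the
 * child, or -1 with errno set (EPERM, EEXIST and EINVAL are the
 * set_tid-specific errors listed in the patch). */
long clone3_with_pids(pid_t *set_tid, size_t n)
{
    struct clone_args_set_tid args;

    memset(&args, 0, sizeof(args));
    args.exit_signal = SIGCHLD;
    args.set_tid = (uint64_t)(uintptr_t)set_tid;
    args.set_tid_size = n;
    return syscall(SYS_clone3, &args, sizeof(args));
}
```

With the table from the patch, order_set_tid() applied to {31496, 42, 7} with n = 3 yields {7, 42, 31496}, and with n = 2 it yields {7, 42}. Passing that array to clone3_with_pids() is expected to fail with EPERM unless the caller holds CAP_SYS_ADMIN in the owning user namespaces.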