From: Mike Christie <michael.christie@oracle.com>
To: geert@linux-m68k.org, vverma@digitalocean.com, hdanton@sina.com,
	hch@infradead.org, stefanha@redhat.com, jasowang@redhat.com,
	mst@redhat.com, sgarzare@redhat.com,
	virtualization@lists.linux-foundation.org,
	christian.brauner@ubuntu.com, axboe@kernel.dk,
	linux-kernel@vger.kernel.org
Cc: Christoph Hellwig <hch@lst.de>
Subject: [PATCH V6 06/10] fork: add helpers to clone a process for kernel use
Date: Mon, 29 Nov 2021 13:47:03 -0600
Message-ID: <20211129194707.5863-7-michael.christie@oracle.com>
In-Reply-To: <20211129194707.5863-1-michael.christie@oracle.com>

The vhost layer is creating kthreads to execute IO and management
operations. These threads need to share a mm with a userspace thread,
inherit cgroups, and we would like to have the thread accounted for
under the userspace thread's rlimit nproc value so a user can't
overwhelm the system with threads when creating VMs.

We have helpers for cgroups and mm but not for the rlimit nproc and in
the future we will probably want helpers for things like namespaces. For
those two items and to allow future sharing/inheritance, this patch adds
two helpers, user_worker_create and user_worker_start, that allow callers
to create threads that copy or inherit the caller's attributes like mm,
cgroups, namespaces, etc, and are accounted for under the caller's rlimit
nproc value, similar to if the caller did a clone() in userspace. However,
instead of returning to userspace the thread is usable in the kernel for
modules like vhost or layers like io_uring.

[added flag validation code from Christian Brauner's SIG_IGN patch]
Signed-off-by: Mike Christie <michael.christie@oracle.com>
Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
---
 include/linux/sched/task.h |  5 +++
 kernel/fork.c              | 72 ++++++++++++++++++++++++++++++++++++++
 2 files changed, 77 insertions(+)

diff --git a/include/linux/sched/task.h b/include/linux/sched/task.h
index f8a658700075..ecb21c0d95ce 100644
--- a/include/linux/sched/task.h
+++ b/include/linux/sched/task.h
@@ -95,6 +95,11 @@ struct mm_struct *copy_init_mm(void);
 extern pid_t kernel_thread(int (*fn)(void *), void *arg, unsigned long flags);
 extern long kernel_wait4(pid_t, int __user *, int, struct rusage *);
 int kernel_wait(pid_t pid, int *stat);
+struct task_struct *user_worker_create(int (*fn)(void *), void *arg, int node,
+				       unsigned long clone_flags,
+				       u32 worker_flags);
+__printf(2, 3)
+void user_worker_start(struct task_struct *tsk, const char namefmt[], ...);

 extern void free_task(struct task_struct *tsk);

diff --git a/kernel/fork.c b/kernel/fork.c
index c9152596a285..e72239ae1e08 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -2543,6 +2543,78 @@ struct task_struct *create_io_thread(int (*fn)(void *), void *arg, int node)
 	return copy_process(NULL, 0, node, &args);
 }

+static bool user_worker_flags_valid(struct kernel_clone_args *kargs)
+{
+	/* Verify that no unknown flags are passed along. */
+	if (kargs->worker_flags & ~(USER_WORKER_IO | USER_WORKER |
+				    USER_WORKER_NO_FILES | USER_WORKER_SIG_IGN))
+		return false;
+
+	/*
+	 * If we're ignoring all signals don't allow sharing struct sighand and
+	 * don't bother clearing signal handlers.
+	 */
+	if ((kargs->flags & (CLONE_SIGHAND | CLONE_CLEAR_SIGHAND)) &&
+	    (kargs->worker_flags & USER_WORKER_SIG_IGN))
+		return false;
+
+	return true;
+}
+
+/**
+ * user_worker_create - create a copy of a process to be used by the kernel
+ * @fn: function for the new thread to run
+ * @arg: data to be passed to fn
+ * @node: numa node to allocate task from
+ * @clone_flags: CLONE flags
+ * @worker_flags: USER_WORKER flags
+ *
+ * This returns a created task, or an error pointer. The returned task is
+ * inactive, and the caller must fire it up through user_worker_start(). If
+ * this is a PF_IO_WORKER all signals but KILL and STOP are blocked.
+ */
+struct task_struct *user_worker_create(int (*fn)(void *), void *arg, int node,
+				       unsigned long clone_flags,
+				       u32 worker_flags)
+{
+	struct kernel_clone_args args = {
+		.flags		= ((lower_32_bits(clone_flags) | CLONE_VM |
+				   CLONE_UNTRACED) & ~CSIGNAL),
+		.exit_signal	= (lower_32_bits(clone_flags) & CSIGNAL),
+		.stack		= (unsigned long)fn,
+		.stack_size	= (unsigned long)arg,
+		.worker_flags	= USER_WORKER | worker_flags,
+	};
+
+	if (!user_worker_flags_valid(&args))
+		return ERR_PTR(-EINVAL);
+
+	return copy_process(NULL, 0, node, &args);
+}
+EXPORT_SYMBOL_GPL(user_worker_create);
+
+/**
+ * user_worker_start - Start a task created with user_worker_create
+ * @tsk: task to wake up
+ * @namefmt: printf-style format string for the thread name
+ * @arg: arguments for @namefmt
+ */
+void user_worker_start(struct task_struct *tsk, const char namefmt[], ...)
+{
+	char name[TASK_COMM_LEN];
+	va_list args;
+
+	WARN_ON(!(tsk->flags & PF_USER_WORKER));
+
+	va_start(args, namefmt);
+	vsnprintf(name, sizeof(name), namefmt, args);
+	set_task_comm(tsk, name);
+	va_end(args);
+
+	wake_up_new_task(tsk);
+}
+EXPORT_SYMBOL_GPL(user_worker_start);
+
 /*
  * Ok, this is the main fork-routine.
  *
-- 
2.25.1

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization
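
For readers who want to poke at the flag handling outside the kernel, here
is a rough userspace sketch. It is not part of the patch. It mirrors how
user_worker_create() splits clone_flags into .flags and .exit_signal and
how user_worker_flags_valid() rejects bad combinations. The CLONE_* and
CSIGNAL values are the ones from include/uapi/linux/sched.h. Everything
prefixed sketch_/SKETCH_ is hypothetical: the real USER_WORKER_* bit values
come from earlier patches in this series and are assumed here, and the -1
return stands in for ERR_PTR(-EINVAL).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Values as in include/uapi/linux/sched.h. */
#define CSIGNAL			0x000000ffULL
#define CLONE_VM		0x00000100ULL
#define CLONE_SIGHAND		0x00000800ULL
#define CLONE_UNTRACED		0x00800000ULL
#define CLONE_CLEAR_SIGHAND	0x100000000ULL

/* Assumed bit assignments, for illustration only. */
#define SKETCH_USER_WORKER		(1U << 0)
#define SKETCH_USER_WORKER_IO		(1U << 1)
#define SKETCH_USER_WORKER_NO_FILES	(1U << 2)
#define SKETCH_USER_WORKER_SIG_IGN	(1U << 3)
#define SKETCH_USER_WORKER_ALL \
	(SKETCH_USER_WORKER | SKETCH_USER_WORKER_IO | \
	 SKETCH_USER_WORKER_NO_FILES | SKETCH_USER_WORKER_SIG_IGN)

/* Stand-in for the three kernel_clone_args fields the checks look at. */
struct sketch_clone_args {
	uint64_t flags;
	int exit_signal;
	uint32_t worker_flags;
};

/* Same two checks as user_worker_flags_valid() in the patch. */
bool sketch_flags_valid(const struct sketch_clone_args *kargs)
{
	/* Reject worker flags outside the known set. */
	if (kargs->worker_flags & ~SKETCH_USER_WORKER_ALL)
		return false;

	/* Ignoring all signals excludes sharing or clearing sighand. */
	if ((kargs->flags & (CLONE_SIGHAND | CLONE_CLEAR_SIGHAND)) &&
	    (kargs->worker_flags & SKETCH_USER_WORKER_SIG_IGN))
		return false;

	return true;
}

/*
 * Same field setup as the args initializer in user_worker_create().
 * Returns 0 if the flags are accepted, -1 where the patch would
 * return ERR_PTR(-EINVAL).
 */
int sketch_prepare_args(uint64_t clone_flags, uint32_t worker_flags,
			struct sketch_clone_args *out)
{
	/* Equivalent of lower_32_bits(clone_flags). */
	uint64_t low = clone_flags & 0xffffffffULL;

	/* The low byte is the exit signal, the rest are CLONE_* flags. */
	out->flags = (low | CLONE_VM | CLONE_UNTRACED) & ~CSIGNAL;
	out->exit_signal = (int)(low & CSIGNAL);
	out->worker_flags = SKETCH_USER_WORKER | worker_flags;

	return sketch_flags_valid(out) ? 0 : -1;
}

/* Small usage demo: a plain worker versus a rejected combination. */
void sketch_demo(void)
{
	struct sketch_clone_args a;

	/* No extra flags: CLONE_VM and CLONE_UNTRACED are always added. */
	assert(sketch_prepare_args(0, 0, &a) == 0);
	assert(a.flags == (CLONE_VM | CLONE_UNTRACED));

	/* CLONE_SIGHAND together with SIG_IGN is refused. */
	assert(sketch_prepare_args(CLONE_SIGHAND,
				   SKETCH_USER_WORKER_SIG_IGN, &a) == -1);
}
```

In the sketch, as in the patch, the caller never passes USER_WORKER itself;
it is ORed in unconditionally before validation, so a caller passing 0 for
worker_flags still gets a task marked as a user worker.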