From: Mike Christie <michael.christie@oracle.com>
To: stefanha@redhat.com, jasowang@redhat.com, mst@redhat.com, sgarzare@redhat.com, virtualization@lists.linux-foundation.org, christian.brauner@ubuntu.com, axboe@kernel.dk, linux-kernel@vger.kernel.org
Subject: [PATCH 0/8] Use copy_process/create_io_thread in vhost layer
Date: Thu, 16 Sep 2021 16:20:43 -0500
Message-ID: <20210916212051.6918-1-michael.christie@oracle.com>

The following patches were made over Linus's tree and also apply over Jens's 5.15 io_uring branch and Michael's vhost branch. The patchset allows the vhost layer to do a copy_process on the thread that does the VHOST_SET_OWNER ioctl, like how io_uring does a copy_process against its userspace app (Jens, the patches make create_io_thread more generic, so that's why you are cc'd). This allows the vhost layer's worker threads to inherit cgroups, namespaces, address space, etc., and the worker thread will also be accounted for against that owner/parent process's RLIMIT_NPROC limit.

Here is a more detailed problem description:

Qemu will create vhost devices in the kernel which perform network, SCSI, etc. IO and management operations from worker threads created by the kthread API. Because the kthread API does a copy_process on the kthreadd thread, the vhost layer has to use kthread_use_mm to access the Qemu thread's memory and cgroup_attach_task_all to add itself to the Qemu thread's cgroups.

The problem with this approach is that we then have to add new functions/args/functionality for everything we want to inherit. I started doing that here:

https://lkml.org/lkml/2021/6/23/1233

for the RLIMIT_NPROC check, but it seems it might be easier to just inherit everything from the beginning, because I'd need to do something like that patch several times. For example, the current approach does not support cgroups v2, so commands like virsh emulatorpin do not work. The qemu process can go over its RLIMIT_NPROC. And for future vhost interfaces where we export the vhost thread pid, we will want the namespace info.
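As a rough userspace analogy for the "inherit everything from the beginning" argument above (this is not code from the patchset), the sketch below shows that a child created with fork(), which the kernel services through copy_process, sees its parent's resource limits without any extra per-attribute plumbing. The function name child_inherits_rlimit is hypothetical, and RLIMIT_NOFILE stands in for RLIMIT_NPROC only so the demo does not interfere with process creation.

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/resource.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper, for illustration only.
 * Sets the soft RLIMIT_NOFILE limit in the parent, forks, and has the
 * child report whether it observes the same soft limit.
 * Returns 1 if the child inherited the limit, 0 if not, -1 on error.
 */
int child_inherits_rlimit(rlim_t soft)
{
	struct rlimit saved, lim;
	pid_t pid;
	int status;

	if (getrlimit(RLIMIT_NOFILE, &saved) != 0)
		return -1;

	/* The soft limit may not exceed the hard limit, so clamp it. */
	if (soft > saved.rlim_max)
		soft = saved.rlim_max;

	lim = saved;
	lim.rlim_cur = soft;
	if (setrlimit(RLIMIT_NOFILE, &lim) != 0)
		return -1;

	pid = fork();
	if (pid < 0) {
		setrlimit(RLIMIT_NOFILE, &saved);
		return -1;
	}

	if (pid == 0) {
		/* Child: nothing was passed explicitly, just read the limit. */
		struct rlimit seen;

		if (getrlimit(RLIMIT_NOFILE, &seen) != 0)
			_exit(2);
		_exit(seen.rlim_cur == soft ? 0 : 1);
	}

	/* Parent: restore the original limit, then collect the verdict. */
	setrlimit(RLIMIT_NOFILE, &saved);
	if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
		return -1;

	switch (WEXITSTATUS(status)) {
	case 0:
		return 1;
	case 1:
		return 0;
	default:
		return -1;
	}
}
```

Calling child_inherits_rlimit(64) from a small main() should return 1 on a typical Linux system. A kthread, by contrast, is copied from kthreadd, so it starts with kthreadd's attributes rather than those of the Qemu thread that did the VHOST_SET_OWNER ioctl.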
Thread overview: 30+ messages

2021-09-16 21:20 Mike Christie [this message]
2021-09-16 21:20 ` [PATCH 0/8] Use copy_process/create_io_thread in vhost layer Mike Christie
2021-09-16 21:20 ` [PATCH 1/8] fork: add helper to clone a process Mike Christie
2021-09-16 21:20 ` Mike Christie
2021-09-17 6:00 ` Christoph Hellwig
2021-09-17 6:00 ` Christoph Hellwig
2021-09-17 7:44 ` Christian Brauner
2021-09-17 8:01 ` Christoph Hellwig
2021-09-17 8:01 ` Christoph Hellwig
2021-09-17 8:43 ` Christian Brauner
2021-09-17 8:48 ` Christian Brauner
2021-09-16 21:20 ` [PATCH 2/8] signal: Export ignore_signals Mike Christie
2021-09-16 21:20 ` Mike Christie
2021-09-16 21:20 ` [PATCH 3/8] fork: add option to not clone or dup files Mike Christie
2021-09-16 21:20 ` Mike Christie
2021-09-17 8:54 ` Christian Brauner
2021-09-16 21:20 ` [PATCH 4/8] fork: move PF_IO_WORKER's kernel frame setup to new flag Mike Christie
2021-09-16 21:20 ` Mike Christie
2021-09-16 21:20 ` [PATCH 5/8] io_uring: switch to kernel_copy_process Mike Christie
2021-09-16 21:20 ` Mike Christie
2021-09-16 21:20 ` [PATCH 6/8] vhost: move worker thread fields to new struct Mike Christie
2021-09-16 21:20 ` Mike Christie
2021-09-16 21:20 ` [PATCH 7/8] vhost: use kernel_copy_process to check RLIMITs and inherit cgroups Mike Christie
2021-09-16 21:20 ` Mike Christie
2021-09-19 8:24 ` Hillf Danton
2021-09-20 20:47 ` kernel test robot
2021-09-20 20:47 ` kernel test robot
2021-09-20 20:47 ` kernel test robot
2021-09-16 21:20 ` [PATCH 8/8] vhost: remove cgroup code Mike Christie
2021-09-16 21:20 ` Mike Christie