* [PATCH] KVM:x86: Let kvm-pit thread inherit the cgroups of the calling process
@ 2022-01-02  8:22 Jietao Xiao
  2022-01-06 12:36 ` Like Xu
From: Jietao Xiao @ 2022-01-02  8:22 UTC (permalink / raw)
  To: pbonzini, seanjc, vkuznets, wanpengli, jmattson, joro
  Cc: kvm, linux-kernel, Jietao Xiao

Qemu-kvm creates several kernel threads for each VM, including
kvm-nx-lpage-re, vhost, and so on. These threads properly inherit
the cgroups of the calling process, so they are easy to attach to
the VMM process's cgroups.

Kubernetes has a Pod Overhead feature for accounting for the resources
consumed by the Pod infrastructure (e.g. the overhead brought by
qemu-kvm), and sandboxed container runtimes usually create a sandbox or
sandbox overhead cgroup for this feature. By simply adding the runtime
or the VMM process to the sandbox's cgroup, the vhost and
kvm-nx-lpage-re threads successfully attach to the sandbox's cgroup,
but the kvm-pit thread cannot. Besides, in some scenarios the kvm-pit
thread can bring some CPU overhead, so it is better to let kvm-pit
inherit the cgroups of the calling userspace process.

By queuing the cgroup-attach work as the first work item after the
kvm-pit worker thread is created, the worker thread successfully
attaches to the calling process's cgroups.

Signed-off-by: Jietao Xiao <shawtao1125@gmail.com>
---
 arch/x86/kvm/i8254.c | 32 ++++++++++++++++++++++++++++++++
 1 file changed, 32 insertions(+)

diff --git a/arch/x86/kvm/i8254.c b/arch/x86/kvm/i8254.c
index 0b65a764ed3a..c8dcfd6a9ed4 100644
--- a/arch/x86/kvm/i8254.c
+++ b/arch/x86/kvm/i8254.c
@@ -34,6 +34,7 @@
 
 #include <linux/kvm_host.h>
 #include <linux/slab.h>
+#include <linux/cgroup.h>
 
 #include "ioapic.h"
 #include "irq.h"
@@ -647,6 +648,32 @@ static void pit_mask_notifer(struct kvm_irq_mask_notifier *kimn, bool mask)
 		kvm_pit_reset_reinject(pit);
 }
 
+struct pit_attach_cgroups_struct {
+	struct kthread_work work;
+	struct task_struct *owner;
+	int ret;
+};
+
+static void pit_attach_cgroups_work(struct kthread_work *work)
+{
+	struct pit_attach_cgroups_struct *attach;
+
+	attach = container_of(work, struct pit_attach_cgroups_struct, work);
+	attach->ret = cgroup_attach_task_all(attach->owner, current);
+}
+
+
+static int pit_attach_cgroups(struct kvm_pit *pit)
+{
+	struct pit_attach_cgroups_struct attach;
+
+	attach.owner = current;
+	kthread_init_work(&attach.work, pit_attach_cgroups_work);
+	kthread_queue_work(pit->worker, &attach.work);
+	kthread_flush_work(&attach.work);
+	return attach.ret;
+}
+
 static const struct kvm_io_device_ops pit_dev_ops = {
 	.read     = pit_ioport_read,
 	.write    = pit_ioport_write,
@@ -683,6 +710,10 @@ struct kvm_pit *kvm_create_pit(struct kvm *kvm, u32 flags)
 	if (IS_ERR(pit->worker))
 		goto fail_kthread;
 
+	ret = pit_attach_cgroups(pit);
+	if (ret < 0)
+		goto fail_attach_cgroups;
+
 	kthread_init_work(&pit->expired, pit_do_work);
 
 	pit->kvm = kvm;
@@ -723,6 +754,7 @@ struct kvm_pit *kvm_create_pit(struct kvm *kvm, u32 flags)
 fail_register_pit:
 	mutex_unlock(&kvm->slots_lock);
 	kvm_pit_set_reinject(pit, false);
+fail_attach_cgroups:
 	kthread_destroy_worker(pit->worker);
 fail_kthread:
 	kvm_free_irq_source_id(kvm, pit->irq_source_id);
-- 
2.20.1



* Re: [PATCH] KVM:x86: Let kvm-pit thread inherit the cgroups of the calling process
  2022-01-02  8:22 [PATCH] KVM:x86: Let kvm-pit thread inherit the cgroups of the calling process Jietao Xiao
@ 2022-01-06 12:36 ` Like Xu
From: Like Xu @ 2022-01-06 12:36 UTC (permalink / raw)
  To: Paolo Bonzini - Distinguished Engineer (kernel-recipes.org), Jietao Xiao
  Cc: kvm, linux-kernel, seanjc, vkuznets, wanpengli, jmattson, joro

On 2/1/2022 4:22 pm, Jietao Xiao wrote:
> Qemu-kvm creates several kernel threads for each VM, including
> kvm-nx-lpage-re, vhost, and so on. These threads properly inherit
> the cgroups of the calling process, so they are easy to attach to
> the VMM process's cgroups.
> 
> Kubernetes has a Pod Overhead feature for accounting for the resources
> consumed by the Pod infrastructure (e.g. the overhead brought by
> qemu-kvm), and sandboxed container runtimes usually create a sandbox
> or sandbox overhead cgroup for this feature. By simply adding the
> runtime or the VMM process to the sandbox's cgroup, the vhost and
> kvm-nx-lpage-re threads successfully attach to the sandbox's cgroup,
> but the kvm-pit thread cannot.

Emm, this does seem to be true for the kvm-pit kthread.

> Besides, in some scenarios the kvm-pit thread can bring some CPU
> overhead, so it is better to let kvm-pit inherit the cgroups of the
> calling userspace process.

As a side note, there is roughly 3% overhead in the Firecracker scenario.

> 
> By queuing the cgroup-attach work as the first work item after the
> kvm-pit worker thread is created, the worker thread successfully
> attaches to the calling process's cgroups.
> 
> Signed-off-by: Jietao Xiao <shawtao1125@gmail.com>
> ---
>   arch/x86/kvm/i8254.c | 32 ++++++++++++++++++++++++++++++++
>   1 file changed, 32 insertions(+)
> 
> diff --git a/arch/x86/kvm/i8254.c b/arch/x86/kvm/i8254.c
> index 0b65a764ed3a..c8dcfd6a9ed4 100644
> --- a/arch/x86/kvm/i8254.c
> +++ b/arch/x86/kvm/i8254.c
> @@ -34,6 +34,7 @@
>   
>   #include <linux/kvm_host.h>
>   #include <linux/slab.h>
> +#include <linux/cgroup.h>
>   
>   #include "ioapic.h"
>   #include "irq.h"
> @@ -647,6 +648,32 @@ static void pit_mask_notifer(struct kvm_irq_mask_notifier *kimn, bool mask)
>   		kvm_pit_reset_reinject(pit);
>   }
>   
> +struct pit_attach_cgroups_struct {
> +	struct kthread_work work;
> +	struct task_struct *owner;
> +	int ret;
> +};
> +
> +static void pit_attach_cgroups_work(struct kthread_work *work)
> +{
> +	struct pit_attach_cgroups_struct *attach;
> +
> +	attach = container_of(work, struct pit_attach_cgroups_struct, work);
> +	attach->ret = cgroup_attach_task_all(attach->owner, current);

This cgroup_v1 interface is also called by vhost_attach_cgroups_work(),
as well as by kvm_vm_worker_thread() in the KVM context.

Open-coding the same attach work in each place gets a bit redundant as
the number of such kthreads increases.
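
For reference, the vhost side looks roughly like this (paraphrasing, not
a verbatim copy of drivers/vhost/vhost.c):

struct vhost_attach_cgroups_struct {
	struct vhost_work work;
	struct task_struct *owner;
	int ret;
};

static void vhost_attach_cgroups_work(struct vhost_work *work)
{
	struct vhost_attach_cgroups_struct *s;

	s = container_of(work, struct vhost_attach_cgroups_struct, work);
	/* Runs on the vhost worker, so current is the worker thread. */
	s->ret = cgroup_attach_task_all(s->owner, current);
}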

> +}
> +
> +
> +static int pit_attach_cgroups(struct kvm_pit *pit)
> +{
> +	struct pit_attach_cgroups_struct attach;
> +
> +	attach.owner = current;
> +	kthread_init_work(&attach.work, pit_attach_cgroups_work);
> +	kthread_queue_work(pit->worker, &attach.work);
> +	kthread_flush_work(&attach.work);
> +	return attach.ret;
> +}
> +
>   static const struct kvm_io_device_ops pit_dev_ops = {
>   	.read     = pit_ioport_read,
>   	.write    = pit_ioport_write,
> @@ -683,6 +710,10 @@ struct kvm_pit *kvm_create_pit(struct kvm *kvm, u32 flags)
>   	if (IS_ERR(pit->worker))
>   		goto fail_kthread;

I wonder if we could unify the kthread creation path for both vhost and
kvm-pit, so that all kthreads working on a VM's behalf could share the
cgroup_attach_task_all() code and other common bits like set_user_nice().
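
Just to sketch the idea (the helper name and signature below are made up
for illustration, not an existing KVM API):

/*
 * Illustrative only: create a kthread_worker and move it into the
 * caller's cgroups, so kthread_worker users like i8254 would not
 * have to open-code the one-shot attach work themselves.
 */
struct worker_attach_work {
	struct kthread_work work;
	struct task_struct *owner;
	int ret;
};

static void worker_attach_fn(struct kthread_work *work)
{
	struct worker_attach_work *w =
		container_of(work, struct worker_attach_work, work);

	/* Runs in the new worker's context, so current is the worker. */
	w->ret = cgroup_attach_task_all(w->owner, current);
}

static struct kthread_worker *kvm_create_worker_in_cgroups(const char *name)
{
	struct worker_attach_work attach = { .owner = current };
	struct kthread_worker *worker;

	worker = kthread_create_worker(0, "%s", name);
	if (IS_ERR(worker))
		return worker;

	kthread_init_work(&attach.work, worker_attach_fn);
	kthread_queue_work(worker, &attach.work);
	kthread_flush_work(&attach.work);
	if (attach.ret < 0) {
		kthread_destroy_worker(worker);
		return ERR_PTR(attach.ret);
	}

	/* Extra policy such as set_user_nice(worker->task, ...) fits here. */
	return worker;
}

vhost drives its own vhost_work queue rather than a kthread_worker, so it
may not fit this directly, but the cgroup-attach part is the same.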

>   
> +	ret = pit_attach_cgroups(pit);
> +	if (ret < 0)
> +		goto fail_attach_cgroups;
> +
>   	kthread_init_work(&pit->expired, pit_do_work);
>   
>   	pit->kvm = kvm;
> @@ -723,6 +754,7 @@ struct kvm_pit *kvm_create_pit(struct kvm *kvm, u32 flags)
>   fail_register_pit:
>   	mutex_unlock(&kvm->slots_lock);
>   	kvm_pit_set_reinject(pit, false);
> +fail_attach_cgroups:
>   	kthread_destroy_worker(pit->worker);

If the attach fails, could we keep the PIT at least alive and functional?
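
e.g., as an untested sketch of the non-fatal variant:

	/*
	 * Sketch: treat a cgroup-attach failure as non-fatal.  The PIT
	 * still works, the worker just stays in the cgroups it inherited
	 * from kthreadd, so warn instead of tearing the PIT down.
	 */
	ret = pit_attach_cgroups(pit);
	if (ret < 0)
		pr_warn_ratelimited("kvm: pit: failed to attach to cgroups (%d)\n",
				    ret);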

>   fail_kthread:
>   	kvm_free_irq_source_id(kvm, pit->irq_source_id);
