virtualization.lists.linux-foundation.org archive mirror
 help / color / mirror / Atom feed
From: "Michael S. Tsirkin" <mst@redhat.com>
To: Mike Christie <michael.christie@oracle.com>
Cc: brauner@kernel.org, konrad.wilk@oracle.com,
	linux-kernel@vger.kernel.org,
	virtualization@lists.linux-foundation.org, hch@infradead.org,
	ebiederm@xmission.com, stefanha@redhat.com,
	torvalds@linux-foundation.org
Subject: Re: [PATCH v11 8/8] vhost: use vhost_tasks for worker threads
Date: Sun, 13 Aug 2023 15:01:24 -0400	[thread overview]
Message-ID: <20230813145936-mutt-send-email-mst@kernel.org> (raw)
In-Reply-To: <b2b02526-913d-42a9-9d23-59badf5b96db@oracle.com>

On Fri, Aug 11, 2023 at 01:51:36PM -0500, Mike Christie wrote:
> On 8/10/23 1:57 PM, Michael S. Tsirkin wrote:
> > On Sat, Jul 22, 2023 at 11:03:29PM -0500, michael.christie@oracle.com wrote:
> >> On 7/20/23 8:06 AM, Michael S. Tsirkin wrote:
> >>> On Thu, Feb 02, 2023 at 05:25:17PM -0600, Mike Christie wrote:
> >>>> For vhost workers we use the kthread API which inherit's its values from
> >>>> and checks against the kthreadd thread. This results in the wrong RLIMITs
> >>>> being checked, so while tools like libvirt try to control the number of
> >>>> threads based on the nproc rlimit setting we can end up creating more
> >>>> threads than the user wanted.
> >>>>
> >>>> This patch has us use the vhost_task helpers which will inherit its
> >>>> values/checks from the thread that owns the device similar to if we did
> >>>> a clone in userspace. The vhost threads will now be counted in the nproc
> >>>> rlimits. And we get features like cgroups and mm sharing automatically,
> >>>> so we can remove those calls.
> >>>>
> >>>> Signed-off-by: Mike Christie <michael.christie@oracle.com>
> >>>> Acked-by: Michael S. Tsirkin <mst@redhat.com>
> >>>
> >>>
> >>> Hi Mike,
> >>> So this seems to have caused a measureable regression in networking
> >>> performance (about 30%). Take a look here, and there's a zip file
> >>> with detailed measuraments attached:
> >>>
> >>> https://bugzilla.redhat.com/show_bug.cgi?id=2222603
> >>>
> >>>
> >>> Could you take a look please?
> >>> You can also ask reporter questions there assuming you
> >>> have or can create a (free) account.
> >>>
> >>
> >> Sorry for the late reply. I just got home from vacation.
> >>
> >> The account creation link seems to be down. I keep getting a
> >> "unable to establish SMTP connection to bz-exim-prod port 25 " error.
> >>
> >> Can you give me Quan's email?
> >>
> >> I think I can replicate the problem. I just need some extra info from Quan:
> >>
> >> 1. Just double check that they are using RHEL 9 on the host running the VMs.
> >> 2. The kernel config
> >> 3. Any tuning that was done. Is tuned running in guest and/or host running the
> >> VMs and what profile is being used in each.
> >> 4. Number of vCPUs and virtqueues being used.
> >> 5. Can they dump the contents of:
> >>
> >> /sys/kernel/debug/sched
> >>
> >> and
> >>
> >> sysctl  -a
> >>
> >> on the host running the VMs.
> >>
> >> 6. With the 6.4 kernel, can they also run a quick test and tell me if they set
> >> the scheduler to batch:
> >>
> >> ps -T -o comm,pid,tid $QEMU_THREAD
> >>
> >> then for each vhost thread do:
> >>
> >> chrt -b -p 0 $VHOST_THREAD
> >>
> >> Does that end up increasing perf? When I do this I see throughput go up by
> >> around 50% vs 6.3 when sessions was 16 or more (16 was the number of vCPUs
> >> and virtqueues per net device in the VM). Note that I'm not saying that is a fix.
> >> It's just a difference I noticed when running some other tests.
> > 
> > 
> > Mike I'm unsure what to do at this point. Regressions are not nice
> > but if the kernel is released with the new userspace api we won't
> > be able to revert. So what's the plan?
> > 
> 
> I'm sort of stumped. I still can't replicate the problem out of the box. 6.3 and
> 6.4 perform the same for me. I've tried your setup and settings and with different
> combos of using things like tuned and irqbalance.
> 
> I can sort of force the issue. In 6.4, the vhost thread inherits it's settings
> from the parent thread. In 6.3, the vhost thread inherits from kthreadd and we
> would then reset the sched settings. So in 6.4 if I just tune the parent differently
> I can cause different performance. If we want the 6.3 behavior we can do the patch
> below.
> 
> However, I don't think you guys are hitting this because you are just running
> qemu from the normal shell and were not doing anything fancy with the sched
> settings.
> 
> 
> diff --git a/kernel/vhost_task.c b/kernel/vhost_task.c
> index da35e5b7f047..f2c2638d1106 100644
> --- a/kernel/vhost_task.c
> +++ b/kernel/vhost_task.c
> @@ -2,6 +2,7 @@
>  /*
>   * Copyright (C) 2021 Oracle Corporation
>   */
> +#include <uapi/linux/sched/types.h>
>  #include <linux/slab.h>
>  #include <linux/completion.h>
>  #include <linux/sched/task.h>
> @@ -22,9 +23,16 @@ struct vhost_task {
>  
>  static int vhost_task_fn(void *data)
>  {
> +	static const struct sched_param param = { .sched_priority = 0 };
>  	struct vhost_task *vtsk = data;
>  	bool dead = false;
>  
> +	/*
> +	 * Don't inherit the parent's sched info, so we maintain compat from
> +	 * when we used kthreads and it reset this info.
> +	 */
> +	sched_setscheduler_nocheck(current, SCHED_NORMAL, &param);
> +
>  	for (;;) {
>  		bool did_work;
>  
> 
> 

yes seems unlikely, still, attach this to bugzilla so it can be
tested?

and, what will help you debug? any traces to enable?

Also wasn't there another issue with a non standard config?
Maybe if we fix that it will by chance fix this one too?

> 
> 

_______________________________________________
Virtualization mailing list
Virtualization@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/virtualization

  reply	other threads:[~2023-08-13 19:01 UTC|newest]

Thread overview: 42+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-02-02 23:25 [PATCH v11 0/8] Use copy_process in vhost layer Mike Christie
2023-02-02 23:25 ` [PATCH v11 1/8] fork: Make IO worker options flag based Mike Christie
2023-02-03  0:14   ` Linus Torvalds
2023-02-02 23:25 ` [PATCH v11 2/8] fork/vm: Move common PF_IO_WORKER behavior to new flag Mike Christie
2023-02-02 23:25 ` [PATCH v11 3/8] fork: add USER_WORKER flag to not dup/clone files Mike Christie
2023-02-03  0:16   ` Linus Torvalds
2023-02-02 23:25 ` [PATCH v11 4/8] fork: Add USER_WORKER flag to ignore signals Mike Christie
2023-02-03  0:19   ` Linus Torvalds
2023-02-05 16:06     ` Mike Christie
2023-02-02 23:25 ` [PATCH v11 5/8] fork: allow kernel code to call copy_process Mike Christie
2023-02-02 23:25 ` [PATCH v11 6/8] vhost_task: Allow vhost layer to use copy_process Mike Christie
2023-02-03  0:43   ` Linus Torvalds
2023-02-02 23:25 ` [PATCH v11 7/8] vhost: move worker thread fields to new struct Mike Christie
2023-02-02 23:25 ` [PATCH v11 8/8] vhost: use vhost_tasks for worker threads Mike Christie
     [not found]   ` <aba6cca4-e66c-768f-375c-b38c8ba5e8a8@6wind.com>
2023-05-05 18:22     ` Linus Torvalds
2023-05-05 22:37       ` Mike Christie
2023-05-06  1:53         ` Linus Torvalds
2023-05-13 12:39         ` Thorsten Leemhuis
2023-05-13 15:08           ` Linus Torvalds
     [not found]             ` <20230515-vollrausch-liebgeworden-2765f3ca3540@brauner>
2023-05-15 15:44               ` Linus Torvalds
2023-05-15 15:52                 ` Jens Axboe
2023-05-15 15:54                   ` Linus Torvalds
2023-05-15 17:23                     ` Linus Torvalds
2023-05-15 15:56                   ` Linus Torvalds
2023-05-15 22:23                 ` Mike Christie
2023-05-15 22:54                   ` Linus Torvalds
2023-05-16  3:53                     ` Mike Christie
2023-05-16 13:18                       ` Oleg Nesterov
2023-05-16 13:40                       ` Oleg Nesterov
2023-05-16 15:56                     ` Eric W. Biederman
2023-05-16 18:37                       ` Oleg Nesterov
2023-05-16 20:12                         ` Eric W. Biederman
2023-05-17 17:09                           ` Oleg Nesterov
2023-05-17 18:22                             ` Mike Christie
     [not found]                   ` <20230516-weltmeere-backofen-27f12ae2c9e0@brauner>
2023-05-16 16:24                     ` Mike Christie
2023-07-20 13:06   ` Michael S. Tsirkin
2023-07-23  4:03     ` michael.christie
2023-07-23  9:31       ` Michael S. Tsirkin
2023-08-10 18:57       ` Michael S. Tsirkin
2023-08-11 18:51         ` Mike Christie
2023-08-13 19:01           ` Michael S. Tsirkin [this message]
2023-08-14  3:13             ` michael.christie

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20230813145936-mutt-send-email-mst@kernel.org \
    --to=mst@redhat.com \
    --cc=brauner@kernel.org \
    --cc=ebiederm@xmission.com \
    --cc=hch@infradead.org \
    --cc=konrad.wilk@oracle.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=michael.christie@oracle.com \
    --cc=stefanha@redhat.com \
    --cc=torvalds@linux-foundation.org \
    --cc=virtualization@lists.linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).