From: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
To: Marcelo Tosatti <mtosatti@redhat.com>
Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [patch 2/3] KVM: x86: KVM_HC_RT_PRIO hypercall (host-side)
Date: Thu, 21 Sep 2017 09:32:12 -0400
Message-ID: <20170921133212.GN26248@char.us.oracle.com>
In-Reply-To: <20170921114039.364395490@redhat.com>

On Thu, Sep 21, 2017 at 08:38:37AM -0300, Marcelo Tosatti wrote:
> When executing guest vcpu-0 with FIFO:1 priority, which is necessary to 
> deal with the following situation:
> 
> VCPU-0 (housekeeping VCPU)		VCPU-1 (realtime VCPU)
> 
> raw_spin_lock(A)
> interrupted, schedule task T-1		raw_spin_lock(A) (spin)
> 
> raw_spin_unlock(A)
> 
> Certain operations must interrupt guest vcpu-0 (see trace below).
> 
> To fix this issue, only change guest vcpu-0 to FIFO priority
> on spinlock critical sections (see patch).
> 
> Hang trace
> ==========
> 
> Without FIFO priority:
> 
> qemu-kvm-6705  [002] ....1.. 767785.648964: kvm_exit: reason IO_INSTRUCTION rip 0xe8fe info 1f00039 0
> qemu-kvm-6705  [002] ....1.. 767785.648965: kvm_exit: reason IO_INSTRUCTION rip 0xe911 info 3f60008 0
> qemu-kvm-6705  [002] ....1.. 767785.648968: kvm_exit: reason IO_INSTRUCTION rip 0x8984 info 608000b 0
> qemu-kvm-6705  [002] ....1.. 767785.648971: kvm_exit: reason IO_INSTRUCTION rip 0xb313 info 1f70008 0
> qemu-kvm-6705  [002] ....1.. 767785.648974: kvm_exit: reason IO_INSTRUCTION rip 0xb514 info 3f60000 0
> qemu-kvm-6705  [002] ....1.. 767785.648977: kvm_exit: reason PENDING_INTERRUPT rip 0x8052 info 0 0
> qemu-kvm-6705  [002] ....1.. 767785.648980: kvm_exit: reason IO_INSTRUCTION rip 0xeee6 info 200040 0
> qemu-kvm-6705  [002] ....1.. 767785.648999: kvm_exit: reason EPT_MISCONFIG rip 0x2120 info 0 0
> 
> With FIFO priority:
> 
> qemu-kvm-7636  [002] ....1.. 768218.205065: kvm_exit: reason IO_INSTRUCTION rip 0xb313 info 1f70008 0
> qemu-kvm-7636  [002] ....1.. 768218.205068: kvm_exit: reason IO_INSTRUCTION rip 0x8984 info 608000b 0
> qemu-kvm-7636  [002] ....1.. 768218.205071: kvm_exit: reason IO_INSTRUCTION rip 0xb313 info 1f70008 0
> qemu-kvm-7636  [002] ....1.. 768218.205074: kvm_exit: reason IO_INSTRUCTION rip 0x8984 info 608000b 0
> qemu-kvm-7636  [002] ....1.. 768218.205077: kvm_exit: reason IO_INSTRUCTION rip 0xb313 info 1f70008 0
> ..
> 
> Performance numbers (kernel compilation with make -j2)
> ======================================================
> 
> With hypercall: 4:40.  (make -j2)
> Without hypercall: 3:38.  (make -j2)
> 
> Note for NFV workloads spinlock performance is not relevant
> since DPDK should not enter the kernel (and housekeeping vcpu
> performance is far from a key factor).
> 
> Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
> 
> ---
>  Documentation/virtual/kvm/hypercalls.txt |   22 +++++++++++++++
>  arch/x86/kvm/x86.c                       |   43 +++++++++++++++++++++++++++++++
>  include/uapi/linux/kvm_para.h            |    2 +
>  3 files changed, 67 insertions(+)
> 
> Index: kvm.fifopriohc-submit/Documentation/virtual/kvm/hypercalls.txt
> ===================================================================
> --- kvm.fifopriohc-submit.orig/Documentation/virtual/kvm/hypercalls.txt
> +++ kvm.fifopriohc-submit/Documentation/virtual/kvm/hypercalls.txt
> @@ -121,3 +121,25 @@ compute the CLOCK_REALTIME for its clock
>  
>  Returns KVM_EOPNOTSUPP if the host does not use TSC clocksource,
>  or if clock type is different than KVM_CLOCK_PAIRING_WALLCLOCK.
> +
> +6. KVM_HC_RT_PRIO
> +------------------------
> +Architecture: x86
> +Status: active
> +Purpose: Hypercall used to change qemu vcpu process -RT priority.

So the guest can change the scheduling decisions at the host level?
And the host HAS to follow it? There is no policy override for the
host to say - nah, not going to do it?

Also wouldn't the guest want to always be at SCHED_FIFO? [I am thinking
of a guest admin who wants all the CPU resources he can get]


> +
> +Usage: Having a pCPU share a FIFO:1 vcpu and a QEMU emulator thread
> +can be problematic: especially if the vcpu busy-spins on memory waiting
> +for the QEMU emulator thread to write to, which leads to a hang

.. Is the QEMU emulator writing to the spinlock memory?

> +(because the FIFO:1 vcpu is never descheduled in favor of the QEMU
> +emulator thread).
> +So this hypercall is supposed to be called by the guest when
> +the OS knows it's not going to busy-spin on memory that's
> +written by the emulator thread as above.
> +
> +a0: bit 0 contains enable bit, if 0 indicates that SCHED_OTHER
> +priority should be set for vcpu, if 1 indicates SCHED_FIFO
> +priority (the actual value for FIFO priority is decided
> +by the host).
> +
> +
> Index: kvm.fifopriohc-submit/include/uapi/linux/kvm_para.h
> ===================================================================
> --- kvm.fifopriohc-submit.orig/include/uapi/linux/kvm_para.h
> +++ kvm.fifopriohc-submit/include/uapi/linux/kvm_para.h
> @@ -15,6 +15,7 @@
>  #define KVM_E2BIG		E2BIG
>  #define KVM_EPERM		EPERM
>  #define KVM_EOPNOTSUPP		95
> +#define KVM_EINVAL		EINVAL
>  
>  #define KVM_HC_VAPIC_POLL_IRQ		1
>  #define KVM_HC_MMU_OP			2
> @@ -25,6 +26,7 @@
>  #define KVM_HC_MIPS_EXIT_VM		7
>  #define KVM_HC_MIPS_CONSOLE_OUTPUT	8
>  #define KVM_HC_CLOCK_PAIRING		9
> +#define KVM_HC_RT_PRIO			10
>  
>  /*
>   * hypercalls use architecture specific
> Index: kvm.fifopriohc-submit/arch/x86/kvm/x86.c
> ===================================================================
> --- kvm.fifopriohc-submit.orig/arch/x86/kvm/x86.c
> +++ kvm.fifopriohc-submit/arch/x86/kvm/x86.c
> @@ -66,6 +66,8 @@
>  #include <asm/pvclock.h>
>  #include <asm/div64.h>
>  #include <asm/irq_remapping.h>
> +#include <uapi/linux/sched/types.h>
> +#include <uapi/linux/sched.h>
>  
>  #define CREATE_TRACE_POINTS
>  #include "trace.h"
> @@ -6261,6 +6263,44 @@ void kvm_vcpu_deactivate_apicv(struct kv
>  	kvm_x86_ops->refresh_apicv_exec_ctrl(vcpu);
>  }
>  
> +static int convert_to_kvm_errcode(int error)
> +{
> +	switch (error) {
> +	case -EPERM:
> +		return -KVM_EPERM;
> +	case -EINVAL:
> +	default:
> +		return -KVM_EINVAL;
> +	}
> +}
> +
> +int kvm_pv_rt_prio(struct kvm_vcpu *vcpu, unsigned long a0)
> +{
> +	int ret;
> +	bool enable;
> +	struct sched_param param;
> +
> +	memset(&param, 0, sizeof(struct sched_param));
> +	param.sched_priority = vcpu->arch.rt_sched_priority;
> +
> +	enable = a0 & 0x1;
> +
> +	if (vcpu->arch.enable_rt_prio_hc == false)
> +		return -KVM_EPERM;
> +
> +	if (enable) {
> +		ret = sched_setscheduler(current, SCHED_FIFO, &param);
> +	} else {
> +		param.sched_priority = 0;
> +		ret = sched_setscheduler(current, SCHED_NORMAL, &param);
> +	}
> +
> +	if (ret)
> +		ret = convert_to_kvm_errcode(ret);
> +
> +	return ret;
> +}
> +
>  int kvm_emulate_hypercall(struct kvm_vcpu *vcpu)
>  {
>  	unsigned long nr, a0, a1, a2, a3, ret;
> @@ -6306,6 +6346,9 @@ int kvm_emulate_hypercall(struct kvm_vcp
>  		ret = kvm_pv_clock_pairing(vcpu, a0, a1);
>  		break;
>  #endif
> +	case KVM_HC_RT_PRIO:
> +		ret = kvm_pv_rt_prio(vcpu, a0);
> +		break;
>  	default:
>  		ret = -KVM_ENOSYS;
>  		break;
> 
> 
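
To make the documented a0 semantics easier to discuss, here is a minimal
userspace sketch of the decision the quoted kvm_pv_rt_prio() appears to
make. It is not the patch's code: decode_rt_prio(), struct rt_prio_request
and the hc_enabled/host_fifo_prio parameters are hypothetical stand-ins for
vcpu->arch.enable_rt_prio_hc and vcpu->arch.rt_sched_priority, and the
sketch only computes the requested policy instead of calling
sched_setscheduler().

```c
#include <assert.h>
#include <errno.h>
#include <sched.h>
#include <stdbool.h>
#include <stddef.h>

/* Mirrors the quoted kvm_para.h hunk: these alias the errno values. */
#define KVM_EPERM  EPERM
#define KVM_EINVAL EINVAL

/* Hypothetical container for the decoded request (not in the patch). */
struct rt_prio_request {
	int policy;	/* SCHED_FIFO or SCHED_OTHER */
	int priority;	/* host-chosen for FIFO, 0 otherwise */
};

/* Same mapping as the quoted convert_to_kvm_errcode(). */
static int convert_to_kvm_errcode(int error)
{
	switch (error) {
	case -EPERM:
		return -KVM_EPERM;
	case -EINVAL:
	default:
		return -KVM_EINVAL;
	}
}

/*
 * Decode a0 as documented: bit 0 set requests SCHED_FIFO at the
 * host-chosen priority, bit 0 clear requests SCHED_OTHER at priority 0.
 * Returns 0 on success, -KVM_EPERM if the host has not enabled the
 * hypercall for this vcpu.
 */
static int decode_rt_prio(unsigned long a0, bool hc_enabled,
			  int host_fifo_prio, struct rt_prio_request *out)
{
	assert(out != NULL);

	if (!hc_enabled)
		return -KVM_EPERM;

	if (a0 & 0x1) {
		out->policy = SCHED_FIFO;
		out->priority = host_fifo_prio;
	} else {
		out->policy = SCHED_OTHER;
		out->priority = 0;
	}
	return 0;
}
```

As in the quoted code, only bit 0 of a0 is examined and the guest never
supplies the priority value. The only host-side veto visible in this hunk
is the per-vcpu enable flag, which is what the policy question above is
about.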

