From: Jon Kohler <jon@nutanix.com>
To: unlisted-recipients:; (no To-header on input)
Cc: Jon Kohler <jon@nutanix.com>,
	Bijan Mottahedeh <bijan.mottahedeh@nutanix.com>,
	Raphael Norwitz <raphael.norwitz@nutanix.com>,
	Junaid Shahid <junaids@google.com>,
	Paolo Bonzini <pbonzini@redhat.com>,
	Sean Christopherson <seanjc@google.com>,
	Vitaly Kuznetsov <vkuznets@redhat.com>,
	Wanpeng Li <wanpengli@tencent.com>,
	Jim Mattson <jmattson@google.com>, Joerg Roedel <joro@8bytes.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	Ingo Molnar <mingo@redhat.com>, Borislav Petkov <bp@alien8.de>,
	x86@kernel.org, "H. Peter Anvin" <hpa@zytor.com>,
	kvm@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: [PATCH] kvm: x86: move srcu lock out of kvm_vcpu_check_block
Date: Wed, 28 Apr 2021 13:38:19 -0400
Message-ID: <20210428173820.13051-1-jon@nutanix.com>

When using halt polling on x86, srcu_read_lock + unlock overhead [1] is
high in a bursty workload, showing as ~8% of samples in a 60-sec flame
graph.

kvm_vcpu_block calls kvm_vcpu_check_block for both halt polling and
normal blocking. kvm_vcpu_check_block takes the SRCU read lock on
kvm->srcu. This was added in 50c28f21d0 to support fast CR3 and was
questioned [2] at the time, but it was merged in a form that takes the
lock even for non-nested guests. The lock only appears to be needed in
nested situations, where we will eventually call kvm_vcpu_running and
vmx_check_nested_events. That check is hidden behind is_guest_mode()
and therefore does not seem to apply to non-nested workloads.

To improve performance, move the kvm->srcu lock logic from
kvm_vcpu_check_block to kvm_vcpu_running and wrap it directly around
check_events. Also add a hint that lets callers tell kvm_vcpu_running
whether or not to acquire srcu, which is useful in situations where
the lock may already be held. With this in place, we see roughly 5%
lower average latency in an internal benchmark [3] and no more impact
from this lock on non-nested workloads.

[1] perf top output in heavy workload
Overhead  Shared Object  Symbol
   9.24%  [kernel]       [k] __srcu_read_lock
   7.48%  [kernel]       [k] __srcu_read_unlock

[2] Locking originally discussed here
https://patchwork.kernel.org/project/kvm/patch/20180612225244.71856-9-junaids@google.com/

[3] Internal benchmark details
Fixed-rate 100 GBytes/second 1MB random read IO run against the
internal in-memory read cache of Nutanix AOS, 16 threads on a 22
vCPU CentOS 7.9 VM. Before: ~120us avg latency, After: ~113us.

Fixes: 50c28f21d0 ("kvm: x86: Use fast CR3 switch for nested VMX")
Signed-off-by: Jon Kohler <jon@nutanix.com>
Reviewed-by: Bijan Mottahedeh <bijan.mottahedeh@nutanix.com>
Reviewed-by: Raphael Norwitz <raphael.norwitz@nutanix.com>
Cc: Junaid Shahid <junaids@google.com>
---
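Note for readers (not part of the commit message): the conditional
locking described above can be modelled in userspace with a minimal
sketch. Everything prefixed mock_ is hypothetical and exists only for
illustration. The reader counter merely stands in for kvm->srcu and is
not the kernel SRCU API. The sketch assumes that check_events must run
inside a read-side section, and that the lock is only touched when the
vCPU is in guest mode.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for an SRCU domain: it only counts readers. */
struct mock_srcu {
	int readers;	/* read-side sections currently open */
	int lock_calls;	/* total number of mock_read_lock() calls */
};

static int mock_read_lock(struct mock_srcu *s)
{
	s->readers++;
	s->lock_calls++;
	return 0;	/* real SRCU returns an index that unlock needs */
}

static void mock_read_unlock(struct mock_srcu *s, int idx)
{
	(void)idx;
	s->readers--;
}

struct mock_vcpu {
	struct mock_srcu *srcu;
	bool guest_mode;	/* models is_guest_mode() */
	bool runnable;		/* models the mp_state/apf checks */
};

/* Models check_events: it must run inside a read-side section. */
static void mock_check_events(struct mock_vcpu *v)
{
	assert(v->srcu->readers > 0);
}

/*
 * Same shape as the patched kvm_vcpu_running(): the lock is taken only
 * for nested guests, and only when the caller does not hold it already.
 */
static bool mock_vcpu_running(struct mock_vcpu *v, bool acquire_srcu)
{
	if (v->guest_mode) {
		if (acquire_srcu) {
			int idx = mock_read_lock(v->srcu);

			mock_check_events(v);
			mock_read_unlock(v->srcu, idx);
		} else {
			mock_check_events(v);
		}
	}

	return v->runnable;
}
```

In this model a non-nested vCPU never touches the lock, which is the
saving the patch is after. A caller that already holds the read lock
(as vcpu_run does) passes false, and a caller that does not (as
kvm_arch_vcpu_runnable does) passes true.
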
 arch/x86/kvm/x86.c  | 24 +++++++++++++++++++-----
 virt/kvm/kvm_main.c | 21 +++++++--------------
 2 files changed, 26 insertions(+), 19 deletions(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index efc7a82ab140..354f690cc982 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -9273,10 +9273,24 @@ static inline int vcpu_block(struct kvm *kvm, struct kvm_vcpu *vcpu)
 	return 1;
 }

-static inline bool kvm_vcpu_running(struct kvm_vcpu *vcpu)
+static inline bool kvm_vcpu_running(struct kvm_vcpu *vcpu, bool acquire_srcu)
 {
-	if (is_guest_mode(vcpu))
-		kvm_x86_ops.nested_ops->check_events(vcpu);
+	if (is_guest_mode(vcpu)) {
+		if (acquire_srcu) {
+			/*
+			 * We need to lock because check_events could call
+			 * nested_vmx_vmexit() which might need to resolve a
+			 * valid memslot. We will have this lock only when
+			 * called from vcpu_run but not when called from
+			 * kvm_vcpu_check_block > kvm_arch_vcpu_runnable.
+			 */
+			int idx = srcu_read_lock(&vcpu->kvm->srcu);
+			kvm_x86_ops.nested_ops->check_events(vcpu);
+			srcu_read_unlock(&vcpu->kvm->srcu, idx);
+		} else {
+			kvm_x86_ops.nested_ops->check_events(vcpu);
+		}
+	}

 	return (vcpu->arch.mp_state == KVM_MP_STATE_RUNNABLE &&
 		!vcpu->arch.apf.halted);
@@ -9291,7 +9305,7 @@ static int vcpu_run(struct kvm_vcpu *vcpu)
 	vcpu->arch.l1tf_flush_l1d = true;

 	for (;;) {
-		if (kvm_vcpu_running(vcpu)) {
+		if (kvm_vcpu_running(vcpu, false)) {
 			r = vcpu_enter_guest(vcpu);
 		} else {
 			r = vcpu_block(kvm, vcpu);
@@ -10999,7 +11013,7 @@ static inline bool kvm_vcpu_has_events(struct kvm_vcpu *vcpu)

 int kvm_arch_vcpu_runnable(struct kvm_vcpu *vcpu)
 {
-	return kvm_vcpu_running(vcpu) || kvm_vcpu_has_events(vcpu);
+	return kvm_vcpu_running(vcpu, true) || kvm_vcpu_has_events(vcpu);
 }

 bool kvm_arch_dy_runnable(struct kvm_vcpu *vcpu)
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 383df23514b9..05e29aed35b5 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -2783,22 +2783,15 @@ static void shrink_halt_poll_ns(struct kvm_vcpu *vcpu)

 static int kvm_vcpu_check_block(struct kvm_vcpu *vcpu)
 {
-	int ret = -EINTR;
-	int idx = srcu_read_lock(&vcpu->kvm->srcu);
-
 	if (kvm_arch_vcpu_runnable(vcpu)) {
 		kvm_make_request(KVM_REQ_UNHALT, vcpu);
-		goto out;
+		return -EINTR;
 	}
-	if (kvm_cpu_has_pending_timer(vcpu))
-		goto out;
-	if (signal_pending(current))
-		goto out;

-	ret = 0;
-out:
-	srcu_read_unlock(&vcpu->kvm->srcu, idx);
-	return ret;
+	if (kvm_cpu_has_pending_timer(vcpu) || signal_pending(current))
+		return -EINTR;
+
+	return 0;
 }

 static inline void
--
2.24.3 (Apple Git-128)


