From: Waiman Long <longman@redhat.com>
To: Jeremy Fitzhardinge <jeremy@goop.org>,
Chris Wright <chrisw@sous-sol.org>,
Alok Kataria <akataria@vmware.com>,
Rusty Russell <rusty@rustcorp.com.au>,
Peter Zijlstra <peterz@infradead.org>,
Ingo Molnar <mingo@redhat.com>,
Thomas Gleixner <tglx@linutronix.de>,
"H. Peter Anvin" <hpa@zytor.com>
Cc: linux-arch@vger.kernel.org, "Juergen Gross" <jgross@suse.com>,
kvm@vger.kernel.org, "Radim Krčmář" <rkrcmar@redhat.com>,
"Pan Xinhui" <xinhui.pan@linux.vnet.ibm.com>,
x86@kernel.org, linux-kernel@vger.kernel.org,
virtualization@lists.linux-foundation.org,
"Waiman Long" <longman@redhat.com>,
"Paolo Bonzini" <pbonzini@redhat.com>,
xen-devel@lists.xenproject.org,
"Boris Ostrovsky" <boris.ostrovsky@oracle.com>
Subject: [PATCH v4 2/2] x86/kvm: Provide optimized version of vcpu_is_preempted() for x86-64
Date: Wed, 15 Feb 2017 16:37:50 -0500
Message-ID: <1487194670-6319-3-git-send-email-longman__23450.211963109$1487194747$gmane$org@redhat.com>
In-Reply-To: <1487194670-6319-1-git-send-email-longman@redhat.com>
When running a fio sequential write test on an XFS ramdisk in a KVM
guest on a 2-socket x86-64 system, perf reported the following %CPU
times:

  69.75%   0.59%  fio  [k] down_write
  69.15%   0.01%  fio  [k] call_rwsem_down_write_failed
  67.12%   1.12%  fio  [k] rwsem_down_write_failed
  63.48%  52.77%  fio  [k] osq_lock
   9.46%   7.88%  fio  [k] __raw_callee_save___kvm_vcpu_is_preempt
   3.93%   3.93%  fio  [k] __kvm_vcpu_is_preempted

Making vcpu_is_preempted() a callee-save function has a relatively
high cost on x86-64, primarily because saving and restoring registers
(8 of them) to and from the stack touches at least one more cacheline
of data, and because it adds one more level of function call.

To reduce this performance overhead, an optimized assembly version
of the __raw_callee_save___kvm_vcpu_is_preempted() function is
provided for x86-64.
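
For readers who prefer C, the fixed-offset load performed by the assembly version can be modeled in userspace. This is only a sketch under stated assumptions: the struct layout is assumed to mirror kvm_steal_time at the time of this patch, and every `model_*` / `MODEL_*` name is hypothetical and does not exist in the kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed layout, modeled on struct kvm_steal_time (not the kernel header). */
struct model_steal_time {
	uint64_t steal;
	uint32_t version;
	uint32_t flags;
	uint8_t  preempted;
	uint8_t  u8_pad[3];
	uint32_t pad[11];
};

#define PREEMPTED_OFFSET 16
#define MODEL_NR_CPUS    4

/* Same role as the BUILD_BUG_ON() in the patch: fail the build if the
 * hard-coded offset or the one-byte field size ever stops being true. */
static_assert(offsetof(struct model_steal_time, preempted) == PREEMPTED_OFFSET,
	      "preempted must sit at byte offset 16");
static_assert(sizeof(((struct model_steal_time *)0)->preempted) == 1,
	      "preempted must be a single byte");

/* Hypothetical stand-in for the per-CPU area: one copy per CPU. */
static struct model_steal_time model_steal_time[MODEL_NR_CPUS];

/* Hypothetical stand-in for __per_cpu_offset[]: byte offset of each copy. */
static const size_t model_per_cpu_offset[MODEL_NR_CPUS] = {
	0 * sizeof(struct model_steal_time),
	1 * sizeof(struct model_steal_time),
	2 * sizeof(struct model_steal_time),
	3 * sizeof(struct model_steal_time),
};

/* Model of: movq __per_cpu_offset(,%rdi,8), %rax
 *           cmpb $0, 16+steal_time(%rax)
 *           setne %al */
static bool model_vcpu_is_preempted(long cpu)
{
	const unsigned char *base = (const unsigned char *)model_steal_time;

	assert(cpu >= 0 && cpu < MODEL_NR_CPUS);
	return base[model_per_cpu_offset[cpu] + PREEMPTED_OFFSET] != 0;
}
```

The model makes the design trade-off visible: the function body touches no scratch registers beyond the one holding the result, which is why no register save/restore is needed, but it depends on the field offset staying fixed, hence the build-time checks.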
With this patch applied on a KVM guest on a 2-socket 16-core 32-thread
system with 16 parallel jobs (8 on each socket), the aggregate
bandwidth of the fio test on an XFS ramdisk was as follows:

  I/O Type      w/o patch     with patch
  --------      ---------     ----------
  random read   8141.2 MB/s   8497.1 MB/s
  seq read      8229.4 MB/s   8304.2 MB/s
  random write  1675.5 MB/s   1701.5 MB/s
  seq write     1681.3 MB/s   1699.9 MB/s

The patch gives a modest increase in aggregate bandwidth, roughly 1%
to 4% depending on the I/O type.
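
The relative gains can be recomputed from the bandwidth figures above with a few lines of C. This is just an arithmetic sketch; the struct and function names are made up for illustration. The results work out to about 4.4% (random read), 0.9% (seq read), 1.6% (random write) and 1.1% (seq write).

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* One row of the fio bandwidth table, in MB/s. */
struct fio_result {
	const char *io_type;
	double without_patch;
	double with_patch;
};

/* Figures copied from the table in this commit message. */
static const struct fio_result fio_results[] = {
	{ "random read",  8141.2, 8497.1 },
	{ "seq read",     8229.4, 8304.2 },
	{ "random write", 1675.5, 1701.5 },
	{ "seq write",    1681.3, 1699.9 },
};

/* Relative change from 'before' to 'after', in percent. */
static double percent_gain(double before, double after)
{
	assert(before > 0.0);
	return (after - before) / before * 100.0;
}

/* Print one line per I/O type with its relative gain. */
static void print_gains(void)
{
	size_t n = sizeof(fio_results) / sizeof(fio_results[0]);

	for (size_t i = 0; i < n; i++)
		printf("%-12s %+.1f%%\n", fio_results[i].io_type,
		       percent_gain(fio_results[i].without_patch,
				    fio_results[i].with_patch));
}
```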
With the patch applied, the perf data became:

  70.78%   0.58%  fio  [k] down_write
  70.20%   0.01%  fio  [k] call_rwsem_down_write_failed
  69.70%   1.17%  fio  [k] rwsem_down_write_failed
  59.91%  55.42%  fio  [k] osq_lock
  10.14%  10.14%  fio  [k] __kvm_vcpu_is_preempted

The assembly code was verified with a test kernel module that compared
the output of the C __kvm_vcpu_is_preempted() with that of the assembly
__raw_callee_save___kvm_vcpu_is_preempted() and confirmed that they
matched.
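
The test module itself is not part of this posting. As a rough userspace analogue of that kind of cross-check, and not the module that was actually used, the sketch below compares a field-access reference against a fixed-offset read over every CPU slot and every possible byte value. The struct layout is an assumption modeled on kvm_steal_time, and all names here are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed layout, modeled on struct kvm_steal_time (not the kernel header). */
struct model_steal_time {
	uint64_t steal;
	uint32_t version;
	uint32_t flags;
	uint8_t  preempted;
	uint8_t  u8_pad[3];
	uint32_t pad[11];
};

#define PREEMPTED_OFFSET 16
#define MODEL_NR_CPUS    8

static_assert(offsetof(struct model_steal_time, preempted) == PREEMPTED_OFFSET,
	      "fixed offset must match the struct layout");

static struct model_steal_time model_steal_time[MODEL_NR_CPUS];

/* Reference: plain field access, in the style of the C function. */
static bool ref_is_preempted(long cpu)
{
	return !!model_steal_time[cpu].preempted;
}

/* Candidate: one-byte read at a fixed offset, in the style of the asm. */
static bool fast_is_preempted(long cpu)
{
	unsigned char byte;

	memcpy(&byte,
	       (const unsigned char *)&model_steal_time[cpu] + PREEMPTED_OFFSET,
	       sizeof(byte));
	return byte != 0;
}

/* Try all 256 byte values on every CPU slot; return the mismatch count. */
static unsigned int count_mismatches(void)
{
	unsigned int mismatches = 0;

	for (long cpu = 0; cpu < MODEL_NR_CPUS; cpu++) {
		for (unsigned int val = 0; val < 256; val++) {
			model_steal_time[cpu].preempted = (uint8_t)val;
			if (ref_is_preempted(cpu) != fast_is_preempted(cpu))
				mismatches++;
		}
		model_steal_time[cpu].preempted = 0; /* leave state clean */
	}
	return mismatches;
}
```

A return value of zero from count_mismatches() means the two readers agree on every input the model can represent. A real in-kernel check would additionally have to exercise the actual per-CPU addressing, which this model does not cover.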
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Waiman Long <longman@redhat.com>
---
arch/x86/kernel/kvm.c | 30 ++++++++++++++++++++++++++++++
1 file changed, 30 insertions(+)
diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
index 85ed343..e423435 100644
--- a/arch/x86/kernel/kvm.c
+++ b/arch/x86/kernel/kvm.c
@@ -589,6 +589,7 @@ static void kvm_wait(u8 *ptr, u8 val)
local_irq_restore(flags);
}
+#ifdef CONFIG_X86_32
__visible bool __kvm_vcpu_is_preempted(long cpu)
{
struct kvm_steal_time *src = &per_cpu(steal_time, cpu);
@@ -597,11 +598,40 @@ __visible bool __kvm_vcpu_is_preempted(long cpu)
}
PV_CALLEE_SAVE_REGS_THUNK(__kvm_vcpu_is_preempted);
+#else
+
+extern bool __raw_callee_save___kvm_vcpu_is_preempted(long);
+
+/*
+ * Hand-optimized version for x86-64 to avoid saving and restoring eight
+ * 64-bit registers to/from the stack. It is assumed that the preempted value
+ * is at an offset of 16 from the beginning of the kvm_steal_time structure,
+ * which is verified by the BUILD_BUG_ON() macro below.
+ */
+#define PREEMPTED_OFFSET 16
+asm(
+".pushsection .text;"
+".global __raw_callee_save___kvm_vcpu_is_preempted;"
+".type __raw_callee_save___kvm_vcpu_is_preempted, @function;"
+"__raw_callee_save___kvm_vcpu_is_preempted:"
+"movq __per_cpu_offset(,%rdi,8), %rax;"
+"cmpb $0, " __stringify(PREEMPTED_OFFSET) "+steal_time(%rax);"
+"setne %al;"
+"ret;"
+".popsection");
+
+#endif
+
/*
* Setup pv_lock_ops to exploit KVM_FEATURE_PV_UNHALT if present.
*/
void __init kvm_spinlock_init(void)
{
+#ifdef CONFIG_X86_64
+ BUILD_BUG_ON((offsetof(struct kvm_steal_time, preempted)
+ != PREEMPTED_OFFSET) || (sizeof(steal_time.preempted) != 1));
+#endif
+
if (!kvm_para_available())
return;
/* Does host kernel support KVM_FEATURE_PV_UNHALT? */
--
1.8.3.1