From: riel@redhat.com
To: linux-kernel@vger.kernel.org
Cc: x86@kernel.org, tglx@linutronix.de, pbonzini@redhat.com,
	mingo@redhat.com, luto@kernel.org, hpa@zytor.com,
	dave.hansen@linux.intel.com, bp@suse.de
Subject: [PATCH RFC 4/5] x86,fpu: lazily skip FPU restore when still loaded
Date: Sat,  1 Oct 2016 16:31:34 -0400
Message-ID: <1475353895-22175-5-git-send-email-riel@redhat.com>
In-Reply-To: <1475353895-22175-1-git-send-email-riel@redhat.com>

From: Rik van Riel <riel@redhat.com>

When nothing else has touched the CPU's FPU registers since this
context's state was last loaded there, the register contents are
still valid and we can lazily skip the restore.

Intel has a number of clever optimizations to reduce the FPU
restore overhead, but those optimizations do not work across
the guest/host virtualization boundary, and even on bare metal
it should be faster to skip the restore entirely.

This code is still BROKEN. I am not yet sure why.

Signed-off-by: Rik van Riel <riel@redhat.com>
---
 arch/x86/include/asm/fpu/internal.h | 13 +++++++++++++
 arch/x86/kernel/process.c           |  3 +++
 arch/x86/kvm/x86.c                  |  8 +++++++-
 3 files changed, 23 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/fpu/internal.h b/arch/x86/include/asm/fpu/internal.h
index b5accb35e434..f69960e9aea1 100644
--- a/arch/x86/include/asm/fpu/internal.h
+++ b/arch/x86/include/asm/fpu/internal.h
@@ -575,6 +575,19 @@ static inline void fpregs_deactivate(struct fpu *fpu)
 }
 
 /*
+ * Check whether an FPU's register set is still loaded in the CPU.
+ */
+static inline bool fpu_lazy_skip_restore(struct fpu *fpu)
+{
+	bool still_loaded = (fpu->fpstate_active &&
+			     fpu->last_cpu == raw_smp_processor_id() &&
+			     __this_cpu_read(fpu_fpregs_owner_ctx) == fpu);
+
+	fpu->fpregs_active = still_loaded;
+	return still_loaded;
+}
+
+/*
  * FPU state switching for scheduling.
  *
  * This is a three-stage process:
diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
index 087413be39cf..6b72415e400f 100644
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -208,6 +208,9 @@ void switch_fpu_return(void)
 		  (use_eager_fpu() || fpu->counter > 5);
 
 	if (preload) {
+		if (fpu_lazy_skip_restore(fpu))
+			return;
+
 		prefetch(&fpu->state);
 		fpu->counter++;
 		__fpregs_activate(fpu);
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 55c82d066d3a..16ebcd12edf7 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -7346,7 +7346,12 @@ void kvm_load_guest_fpu(struct kvm_vcpu *vcpu)
 
 	vcpu->guest_fpu_loaded = 1;
 	__kernel_fpu_begin(fpu);
-	__copy_kernel_to_fpregs(&fpu->state);
+
+	if (!fpu_lazy_skip_restore(fpu)) {
+		fpu->last_cpu = raw_smp_processor_id();
+		__copy_kernel_to_fpregs(&fpu->state);
+	}
+
 	trace_kvm_fpu(1);
 }
 
@@ -7358,6 +7363,7 @@ void kvm_put_guest_fpu(struct kvm_vcpu *vcpu)
 	}
 
 	vcpu->guest_fpu_loaded = 0;
+	vcpu->arch.guest_fpu.fpregs_active = 0;
 	copy_fpregs_to_fpstate(&vcpu->arch.guest_fpu);
 	__kernel_fpu_end();
 	++vcpu->stat.fpu_reload;
-- 
2.7.4
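
For readers who want to experiment with the still-loaded check outside
the kernel, here is a minimal userspace sketch of the three-part test
that fpu_lazy_skip_restore() performs. It is not kernel code: struct
fpu_model, fpregs_owner[], load_regs() and run_demo() are hypothetical
stand-ins for struct fpu, the per-CPU fpu_fpregs_owner_ctx pointer, and
the real register load, and the CPU number is passed in explicitly
instead of coming from raw_smp_processor_id(). It models only the
bookkeeping and says nothing about why the patch itself is still broken.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_CPUS 2

/* Hypothetical stand-in for the fields of struct fpu used by the check. */
struct fpu_model {
	bool fpstate_active;	/* context has valid saved FPU state */
	bool fpregs_active;	/* registers currently hold this context */
	int last_cpu;		/* CPU the registers were last loaded on */
};

/* Stand-in for the per-CPU fpu_fpregs_owner_ctx pointer. */
static struct fpu_model *fpregs_owner[NR_CPUS];

/* Same three conditions as the patch, with the CPU passed explicitly. */
static bool lazy_skip_restore(struct fpu_model *fpu, int cpu)
{
	bool still_loaded = fpu->fpstate_active &&
			    fpu->last_cpu == cpu &&
			    fpregs_owner[cpu] == fpu;

	fpu->fpregs_active = still_loaded;
	return still_loaded;
}

/* Model a real register load: record ownership on this CPU. */
static void load_regs(struct fpu_model *fpu, int cpu)
{
	fpu->last_cpu = cpu;
	fpu->fpregs_active = true;
	fpregs_owner[cpu] = fpu;
}

/* Walk through a short sequence and count how many restores are skipped. */
static int run_demo(void)
{
	struct fpu_model a = { .fpstate_active = true, .last_cpu = -1 };
	struct fpu_model b = { .fpstate_active = true, .last_cpu = -1 };
	int skipped = 0;

	load_regs(&a, 0);
	skipped += lazy_skip_restore(&a, 0);	/* untouched: skip */

	load_regs(&b, 0);			/* b takes over CPU 0 */
	skipped += lazy_skip_restore(&a, 0);	/* owner changed: restore */

	load_regs(&a, 0);
	skipped += lazy_skip_restore(&a, 1);	/* migrated: restore */

	return skipped;
}
```

In this model only the first check skips the restore. Either a change
of owner on the same CPU or a migration to another CPU forces a full
reload, which is the behaviour the three conditions in the patch are
meant to guarantee.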
