From: "Liang, Kan" <kan.liang@linux.intel.com>
To: Dave Hansen <dave.hansen@intel.com>,
Peter Zijlstra <peterz@infradead.org>
Cc: mingo@redhat.com, acme@kernel.org, tglx@linutronix.de,
bp@alien8.de, x86@kernel.org, linux-kernel@vger.kernel.org,
mark.rutland@arm.com, alexander.shishkin@linux.intel.com,
jolsa@redhat.com, namhyung@kernel.org, yu-cheng.yu@intel.com,
bigeasy@linutronix.de, gorcunov@gmail.com, hpa@zytor.com,
alexey.budankov@linux.intel.com, eranian@google.com,
ak@linux.intel.com, like.xu@linux.intel.com,
yao.jin@linux.intel.com
Subject: Re: [PATCH 17/21] x86/fpu: Use proper mask to replace full instruction mask
Date: Mon, 22 Jun 2020 14:46:53 -0400
Message-ID: <ca901df8-5765-9483-1898-a27efb5b87a2@linux.intel.com>
In-Reply-To: <5223f714-87eb-947e-e65c-886431cc7655@intel.com>
On 6/22/2020 2:05 PM, Dave Hansen wrote:
> On 6/22/20 10:47 AM, Liang, Kan wrote:
>>> I'm wondering if we should just take these copy_*regs_to_*() functions
>>> and uninline them. Yeah, they are basically wrapping one instruction,
>>> but it might literally be the most heavyweight instruction in the
>>> whole ISA.
>> Thanks for the suggestions, but I'm not sure if I follow these methods.
>>
>> I don't think simply removing the "inline" keyword for the
>> copy_xregs_to_kernel() functions would help here.
>> Do you mean exporting the copy_*regs_to_*()?
> The thing that worries me here is exporting "internal" FPU state like
> xfeatures_mask_all. I'm much happier exporting a function with a much
> more defined purpose.
>
> So, yes, I'm suggesting exporting the functions, *not* the data structures.
>
I think we should just export copy_fpregs_to_fpstate(), as below,
because:
- KVM invokes this function directly; copy_xregs_to_kernel() is only
  invoked indirectly through it. I think we should export the function
  that other modules actually call.
- copy_fpregs_to_fpstate() is a bigger function with many checks, so
  uninlining it should not impact performance.
- It is also a function, so exporting it is safer than exporting the
  "internal" FPU state. Nothing outside the FPU code can change that
  state, intentionally or unintentionally.
diff --git a/arch/x86/include/asm/fpu/internal.h b/arch/x86/include/asm/fpu/internal.h
index 0388c792..d3724dc 100644
--- a/arch/x86/include/asm/fpu/internal.h
+++ b/arch/x86/include/asm/fpu/internal.h
@@ -411,43 +411,7 @@ static inline int copy_kernel_to_xregs_err(struct xregs_state *xstate, u64 mask)
 	return err;
 }
 
-/*
- * These must be called with preempt disabled. Returns
- * 'true' if the FPU state is still intact and we can
- * keep registers active.
- *
- * The legacy FNSAVE instruction cleared all FPU state
- * unconditionally, so registers are essentially destroyed.
- * Modern FPU state can be kept in registers, if there are
- * no pending FP exceptions.
- */
-static inline int copy_fpregs_to_fpstate(struct fpu *fpu)
-{
-	if (likely(use_xsave())) {
-		copy_xregs_to_kernel(&fpu->state.xsave);
-
-		/*
-		 * AVX512 state is tracked here because its use is
-		 * known to slow the max clock speed of the core.
-		 */
-		if (fpu->state.xsave.header.xfeatures & XFEATURE_MASK_AVX512)
-			fpu->avx512_timestamp = jiffies;
-		return 1;
-	}
-
-	if (likely(use_fxsr())) {
-		copy_fxregs_to_kernel(fpu);
-		return 1;
-	}
-
-	/*
-	 * Legacy FPU register saving, FNSAVE always clears FPU registers,
-	 * so we have to mark them inactive:
-	 */
-	asm volatile("fnsave %[fp]; fwait" : [fp] "=m" (fpu->state.fsave));
-
-	return 0;
-}
+extern int copy_fpregs_to_fpstate(struct fpu *fpu);
 
 static inline void __copy_kernel_to_fpregs(union fpregs_state *fpstate, u64 mask)
 {
diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
index 06c8189..1bb7532 100644
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -82,6 +82,45 @@ bool irq_fpu_usable(void)
 }
 EXPORT_SYMBOL(irq_fpu_usable);
 
+/*
+ * These must be called with preempt disabled. Returns
+ * 'true' if the FPU state is still intact and we can
+ * keep registers active.
+ *
+ * The legacy FNSAVE instruction cleared all FPU state
+ * unconditionally, so registers are essentially destroyed.
+ * Modern FPU state can be kept in registers, if there are
+ * no pending FP exceptions.
+ */
+int copy_fpregs_to_fpstate(struct fpu *fpu)
+{
+	if (likely(use_xsave())) {
+		copy_xregs_to_kernel(&fpu->state.xsave);
+
+		/*
+		 * AVX512 state is tracked here because its use is
+		 * known to slow the max clock speed of the core.
+		 */
+		if (fpu->state.xsave.header.xfeatures & XFEATURE_MASK_AVX512)
+			fpu->avx512_timestamp = jiffies;
+		return 1;
+	}
+
+	if (likely(use_fxsr())) {
+		copy_fxregs_to_kernel(fpu);
+		return 1;
+	}
+
+	/*
+	 * Legacy FPU register saving, FNSAVE always clears FPU registers,
+	 * so we have to mark them inactive:
+	 */
+	asm volatile("fnsave %[fp]; fwait" : [fp] "=m" (fpu->state.fsave));
+
+	return 0;
+}
+EXPORT_SYMBOL(copy_fpregs_to_fpstate);
+
 void kernel_fpu_begin(void)
 {
 	preempt_disable();
diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
index 9c0541d..ca20029 100644
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -58,7 +58,6 @@ static short xsave_cpuid_features[] __initdata = {
  * XSAVE buffer, both supervisor and user xstates.
  */
 u64 xfeatures_mask_all __read_mostly;
-EXPORT_SYMBOL_GPL(xfeatures_mask_all);
 
 static unsigned int xstate_offsets[XFEATURE_MAX] = { [ 0 ... XFEATURE_MAX - 1] = -1};
 static unsigned int xstate_sizes[XFEATURE_MAX] = { [ 0 ... XFEATURE_MAX - 1] = -1};
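
The "export the function, not the data" argument can also be shown
outside the kernel. The following is only a userspace sketch under
stated assumptions, not kernel code: internal_feature_mask,
struct save_area and save_registers() are hypothetical names standing in
for xfeatures_mask_all, struct fpu and copy_fpregs_to_fpstate(), and
C internal linkage ('static') stands in for the absence of an
EXPORT_SYMBOL.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for internal state such as xfeatures_mask_all.
 * 'static' gives it internal linkage, so no other translation unit can
 * read or modify it.
 */
static uint64_t internal_feature_mask = 0x7;

/* Hypothetical save area, standing in for struct fpu. */
struct save_area {
	uint64_t saved_features;
	bool valid;
};

/*
 * The only symbol with external linkage. Callers get one narrowly
 * defined operation; the mask it consults stays private to this file.
 * Returns 1 on success, 0 if 'area' is a null pointer.
 */
int save_registers(struct save_area *area)
{
	if (!area)
		return 0;

	area->saved_features = internal_feature_mask;
	area->valid = true;
	return 1;
}
```

A caller in another file can only declare and call save_registers(); it
has no way to name, and therefore no way to corrupt,
internal_feature_mask.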