From: "Liang, Kan" <kan.liang@linux.intel.com>
To: Dave Hansen <dave.hansen@intel.com>,
	Peter Zijlstra <peterz@infradead.org>
Cc: mingo@redhat.com, acme@kernel.org, tglx@linutronix.de,
	bp@alien8.de, x86@kernel.org, linux-kernel@vger.kernel.org,
	mark.rutland@arm.com, alexander.shishkin@linux.intel.com,
	jolsa@redhat.com, namhyung@kernel.org, yu-cheng.yu@intel.com,
	bigeasy@linutronix.de, gorcunov@gmail.com, hpa@zytor.com,
	alexey.budankov@linux.intel.com, eranian@google.com,
	ak@linux.intel.com, like.xu@linux.intel.com,
	yao.jin@linux.intel.com
Subject: Re: [PATCH 17/21] x86/fpu: Use proper mask to replace full instruction mask
Date: Mon, 22 Jun 2020 14:46:53 -0400	[thread overview]
Message-ID: <ca901df8-5765-9483-1898-a27efb5b87a2@linux.intel.com> (raw)
In-Reply-To: <5223f714-87eb-947e-e65c-886431cc7655@intel.com>



On 6/22/2020 2:05 PM, Dave Hansen wrote:
> On 6/22/20 10:47 AM, Liang, Kan wrote:
>>> I'm wondering if we should just take these copy_*regs_to_*() functions
>>> and uninline them.  Yeah, they are basically wrapping one instruction,
>>> but it might literally be the most heavyweight instruction in the
>>> whole ISA.
>> Thanks for the suggestions, but I'm not sure if I follow these methods.
>>
>> I don't think simply removing the "inline" keyword from the
>> copy_xregs_to_kernel() function would help here.
>> Do you mean exporting the copy_*regs_to_*() functions?
> The thing that worries me here is exporting "internal" FPU state like
> xfeatures_mask_all.  I'm much happier exporting a function with a much
> more defined purpose.
> 
> So, yes, I'm suggesting exporting the functions, *not* the data structures.
> 

I think we should just export copy_fpregs_to_fpstate() as below, because:
- KVM invokes this function directly; copy_xregs_to_kernel() is only
reached indirectly through it. We should export the function that other
modules actually call.
- copy_fpregs_to_fpstate() is a bigger function with many checks, so
uninlining it should not hurt performance.
- It is also a function, which is safer than exporting "internal" FPU
state: nothing outside the FPU core can change the state, intentionally
or unintentionally.


diff --git a/arch/x86/include/asm/fpu/internal.h b/arch/x86/include/asm/fpu/internal.h
index 0388c792..d3724dc 100644
--- a/arch/x86/include/asm/fpu/internal.h
+++ b/arch/x86/include/asm/fpu/internal.h
@@ -411,43 +411,7 @@ static inline int copy_kernel_to_xregs_err(struct xregs_state *xstate, u64 mask)
  	return err;
  }

-/*
- * These must be called with preempt disabled. Returns
- * 'true' if the FPU state is still intact and we can
- * keep registers active.
- *
- * The legacy FNSAVE instruction cleared all FPU state
- * unconditionally, so registers are essentially destroyed.
- * Modern FPU state can be kept in registers, if there are
- * no pending FP exceptions.
- */
-static inline int copy_fpregs_to_fpstate(struct fpu *fpu)
-{
-	if (likely(use_xsave())) {
-		copy_xregs_to_kernel(&fpu->state.xsave);
-
-		/*
-		 * AVX512 state is tracked here because its use is
-		 * known to slow the max clock speed of the core.
-		 */
-		if (fpu->state.xsave.header.xfeatures & XFEATURE_MASK_AVX512)
-			fpu->avx512_timestamp = jiffies;
-		return 1;
-	}
-
-	if (likely(use_fxsr())) {
-		copy_fxregs_to_kernel(fpu);
-		return 1;
-	}
-
-	/*
-	 * Legacy FPU register saving, FNSAVE always clears FPU registers,
-	 * so we have to mark them inactive:
-	 */
-	asm volatile("fnsave %[fp]; fwait" : [fp] "=m" (fpu->state.fsave));
-
-	return 0;
-}
+extern int copy_fpregs_to_fpstate(struct fpu *fpu);

  static inline void __copy_kernel_to_fpregs(union fpregs_state *fpstate, u64 mask)
  {
diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
index 06c8189..1bb7532 100644
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -82,6 +82,45 @@ bool irq_fpu_usable(void)
  }
  EXPORT_SYMBOL(irq_fpu_usable);

+/*
+ * These must be called with preempt disabled. Returns
+ * 'true' if the FPU state is still intact and we can
+ * keep registers active.
+ *
+ * The legacy FNSAVE instruction cleared all FPU state
+ * unconditionally, so registers are essentially destroyed.
+ * Modern FPU state can be kept in registers, if there are
+ * no pending FP exceptions.
+ */
+int copy_fpregs_to_fpstate(struct fpu *fpu)
+{
+	if (likely(use_xsave())) {
+		copy_xregs_to_kernel(&fpu->state.xsave);
+
+		/*
+		 * AVX512 state is tracked here because its use is
+		 * known to slow the max clock speed of the core.
+		 */
+		if (fpu->state.xsave.header.xfeatures & XFEATURE_MASK_AVX512)
+			fpu->avx512_timestamp = jiffies;
+		return 1;
+	}
+
+	if (likely(use_fxsr())) {
+		copy_fxregs_to_kernel(fpu);
+		return 1;
+	}
+
+	/*
+	 * Legacy FPU register saving, FNSAVE always clears FPU registers,
+	 * so we have to mark them inactive:
+	 */
+	asm volatile("fnsave %[fp]; fwait" : [fp] "=m" (fpu->state.fsave));
+
+	return 0;
+}
+EXPORT_SYMBOL(copy_fpregs_to_fpstate);
+
  void kernel_fpu_begin(void)
  {
  	preempt_disable();
diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
index 9c0541d..ca20029 100644
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -58,7 +58,6 @@ static short xsave_cpuid_features[] __initdata = {
   * XSAVE buffer, both supervisor and user xstates.
   */
  u64 xfeatures_mask_all __read_mostly;
-EXPORT_SYMBOL_GPL(xfeatures_mask_all);

  static unsigned int xstate_offsets[XFEATURE_MAX] = { [ 0 ... XFEATURE_MAX - 1] = -1};
  static unsigned int xstate_sizes[XFEATURE_MAX]   = { [ 0 ... XFEATURE_MAX - 1] = -1};
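
For reference, a module-side caller would then look something like the
sketch below. This is only a hypothetical example (example_save_fpu() is
made up for illustration; KVM's real call site looks different),
assuming the export above is in place:

#include <linux/preempt.h>
#include <asm/fpu/internal.h>

static void example_save_fpu(struct fpu *fpu)
{
	/* copy_fpregs_to_fpstate() must run with preemption disabled. */
	preempt_disable();
	if (!copy_fpregs_to_fpstate(fpu)) {
		/*
		 * A zero return means the legacy FNSAVE path ran and
		 * clobbered the registers, so mark them inactive, the
		 * same way switch_fpu_prepare() does.
		 */
		fpu->last_cpu = -1;
	}
	preempt_enable();
}

The point is that a module only ever sees the exported function;
xfeatures_mask_all stays private to the FPU core.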

Thread overview: 40+ messages
2020-06-19 14:03 [PATCH 00/21] Support Architectural LBR kan.liang
2020-06-19 14:03 ` [PATCH 01/21] x86/cpufeatures: Add Architectural LBRs feature bit kan.liang
2020-06-19 14:03 ` [PATCH 02/21] perf/x86/intel/lbr: Add pointers for LBR enable and disable kan.liang
2020-06-19 14:03 ` [PATCH 03/21] perf/x86/intel/lbr: Add pointer for LBR reset kan.liang
2020-06-19 14:03 ` [PATCH 04/21] perf/x86/intel/lbr: Add pointer for LBR read kan.liang
2020-06-19 14:03 ` [PATCH 05/21] perf/x86/intel/lbr: Add pointers for LBR save and restore kan.liang
2020-06-19 14:03 ` [PATCH 06/21] perf/x86/intel/lbr: Factor out a new struct for generic optimization kan.liang
2020-06-19 14:03 ` [PATCH 07/21] perf/x86/intel/lbr: Use dynamic data structure for task_ctx kan.liang
2020-06-19 14:03 ` [PATCH 08/21] x86/msr-index: Add bunch of MSRs for Arch LBR kan.liang
2020-06-19 19:11   ` Peter Zijlstra
2020-06-19 14:03 ` [PATCH 09/21] perf/x86: Expose CPUID enumeration bits for arch LBR kan.liang
2020-06-19 18:31   ` Peter Zijlstra
2020-06-19 14:03 ` [PATCH 10/21] perf/x86/intel: Check Arch LBR MSRs kan.liang
2020-06-19 14:03 ` [PATCH 11/21] perf/x86/intel/lbr: Support LBR_CTL kan.liang
2020-06-19 18:40   ` Peter Zijlstra
2020-06-19 19:15     ` Liang, Kan
2020-06-19 19:22       ` Peter Zijlstra
2020-06-19 14:04 ` [PATCH 12/21] perf/x86/intel/lbr: Support Architectural LBR kan.liang
2020-06-19 19:08   ` Peter Zijlstra
2020-06-19 19:40     ` Liang, Kan
2020-06-19 14:04 ` [PATCH 13/21] perf/core: Factor out functions to allocate/free the task_ctx_data kan.liang
2020-06-19 14:04 ` [PATCH 14/21] perf/core: Use kmem_cache to allocate the PMU specific data kan.liang
2020-06-19 14:04 ` [PATCH 15/21] perf/x86/intel/lbr: Create kmem_cache for the LBR context data kan.liang
2020-06-19 14:04 ` [PATCH 16/21] perf/x86: Remove task_ctx_size kan.liang
2020-06-19 14:04 ` [PATCH 17/21] x86/fpu: Use proper mask to replace full instruction mask kan.liang
2020-06-19 19:31   ` Peter Zijlstra
2020-06-22 14:52     ` Liang, Kan
2020-06-22 15:02       ` Dave Hansen
2020-06-22 17:47         ` Liang, Kan
2020-06-22 18:05           ` Dave Hansen
2020-06-22 18:46             ` Liang, Kan [this message]
2020-06-19 14:04 ` [PATCH 18/21] x86/fpu/xstate: Support dynamic supervisor feature for LBR kan.liang
2020-06-19 14:04 ` [PATCH 19/21] x86/fpu/xstate: Add helpers for LBR dynamic supervisor feature kan.liang
2020-06-19 14:04 ` [PATCH 20/21] perf/x86/intel/lbr: Support XSAVES/XRSTORS for LBR context switch kan.liang
2020-06-19 19:41   ` Peter Zijlstra
2020-06-19 22:28     ` Liang, Kan
2020-06-19 14:04 ` [PATCH 21/21] perf/x86/intel/lbr: Support XSAVES for arch LBR read kan.liang
2020-06-22 18:49   ` Cyrill Gorcunov
2020-06-22 19:11     ` Liang, Kan
2020-06-22 19:31       ` Cyrill Gorcunov
