linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "Chang S. Bae" <chang.seok.bae@intel.com>
To: bp@suse.de, luto@kernel.org, tglx@linutronix.de,
	mingo@kernel.org, x86@kernel.org
Cc: len.brown@intel.com, lenb@kernel.org, dave.hansen@intel.com,
	thiago.macieira@intel.com, jing2.liu@intel.com,
	ravi.v.shankar@intel.com, linux-kernel@vger.kernel.org,
	chang.seok.bae@intel.com
Subject: [PATCH v10 23/28] x86/fpu/xstate: Skip writing zeros to signal frame for dynamic user states if in INIT-state
Date: Wed, 25 Aug 2021 08:54:08 -0700	[thread overview]
Message-ID: <20210825155413.19673-24-chang.seok.bae@intel.com> (raw)
In-Reply-To: <20210825155413.19673-1-chang.seok.bae@intel.com>

By default, for XSTATE features in the INIT-state, XSAVE writes zeros to
the uncompressed destination buffer.

E.g., if you are not using AVX-512, you will still get a bunch of zeros on
the signal stack where live AVX-512 data would go.

For 'dynamic user state' (currently only XTILEDATA), explicitly skip this
data transfer. The result is that the user buffer for the AMX region will
not be touched by XSAVE.

[ Reading XINUSE takes about 20-30 cycles, but writing zeros consumes about
  5-times or more, e.g., for XTILEDATA. ]

Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com>
Reviewed-by: Len Brown <len.brown@intel.com>
Cc: x86@kernel.org
Cc: linux-kernel@vger.kernel.org
---
Changes from v9:
* Use cpu_feature_enabled() instead of boot_cpu_has(). (Borislav Petkov)

Changes from v5:
* Mentioned the optimization trade-offs in the changelog. (Dave Hansen)
* Added code comment.

Changes from v4:
* Added as a new patch.
---
 arch/x86/include/asm/fpu/internal.h | 38 +++++++++++++++++++++--------
 1 file changed, 28 insertions(+), 10 deletions(-)

diff --git a/arch/x86/include/asm/fpu/internal.h b/arch/x86/include/asm/fpu/internal.h
index a7e39862df30..0225ab63e5d1 100644
--- a/arch/x86/include/asm/fpu/internal.h
+++ b/arch/x86/include/asm/fpu/internal.h
@@ -337,8 +337,9 @@ static inline void os_xrstor(struct xregs_state *xstate, u64 mask)
  */
 static inline int xsave_to_user_sigframe(struct xregs_state __user *buf)
 {
+	struct fpu *fpu = &current->thread.fpu;
 	u32 lmask, hmask;
-	u64 mask;
+	u64 state_mask;
 	int err;
 
 	/*
@@ -346,21 +347,38 @@ static inline int xsave_to_user_sigframe(struct xregs_state __user *buf)
 	 * internally, e.g. PKRU. That's user space ABI and also required
 	 * to allow the signal handler to modify PKRU.
 	 */
-	mask = xfeatures_mask_uabi();
+	state_mask = xfeatures_mask_uabi();
+
+	if (!xfeatures_mask_user_dynamic)
+		goto mask_ready;
 
 	/*
 	 * Exclude dynamic user states for non-opt-in threads.
 	 */
-	if (xfeatures_mask_user_dynamic) {
-		struct fpu *fpu = &current->thread.fpu;
-
-		mask &= fpu->dynamic_state_perm ?
-			fpu->state_mask :
-			~xfeatures_mask_user_dynamic;
+	if (!fpu->dynamic_state_perm) {
+		state_mask &= ~xfeatures_mask_user_dynamic;
+	} else {
+		u64 dynamic_state_mask;
+
+		state_mask &= fpu->state_mask;
+
+		dynamic_state_mask = state_mask & xfeatures_mask_user_dynamic;
+		if (dynamic_state_mask && cpu_feature_enabled(X86_FEATURE_XGETBV1)) {
+			u64 dynamic_xinuse, dynamic_init;
+			u64 xinuse = xgetbv(1);
+
+			dynamic_xinuse = xinuse & dynamic_state_mask;
+			dynamic_init = ~xinuse & dynamic_state_mask;
+			if (dynamic_init) {
+				state_mask &= ~xfeatures_mask_user_dynamic;
+				state_mask |= dynamic_xinuse;
+			}
+		}
 	}
 
-	lmask = mask;
-	hmask = mask >> 32;
+mask_ready:
+	lmask = state_mask;
+	hmask = state_mask >> 32;
 
 	/*
 	 * Clear the xsave header first, so that reserved fields are
-- 
2.17.1


  parent reply	other threads:[~2021-08-25 16:01 UTC|newest]

Thread overview: 61+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-08-25 15:53 [PATCH v10 00/28] x86: Support Intel Advanced Matrix Extensions Chang S. Bae
2021-08-25 15:53 ` [PATCH v10 01/28] x86/fpu/xstate: Fix the state copy function to the XSTATE buffer Chang S. Bae
2021-10-01 12:44   ` Thomas Gleixner
2021-10-03 22:34     ` Bae, Chang Seok
2021-08-25 15:53 ` [PATCH v10 02/28] x86/fpu/xstate: Modify the initialization helper to handle both static and dynamic buffers Chang S. Bae
2021-10-01 12:45   ` Thomas Gleixner
2021-10-03 22:35     ` Bae, Chang Seok
2021-08-25 15:53 ` [PATCH v10 03/28] x86/fpu/xstate: Modify state copy helpers " Chang S. Bae
2021-10-01 12:47   ` Thomas Gleixner
2021-10-03 22:42     ` Bae, Chang Seok
2021-08-25 15:53 ` [PATCH v10 04/28] x86/fpu/xstate: Modify address finders " Chang S. Bae
2021-10-01 13:15   ` Thomas Gleixner
2021-10-03 22:35     ` Bae, Chang Seok
2021-10-04 12:54       ` Thomas Gleixner
2021-08-25 15:53 ` [PATCH v10 05/28] x86/fpu/xstate: Add a new variable to indicate dynamic user states Chang S. Bae
2021-10-01 13:16   ` Thomas Gleixner
2021-10-03 22:35     ` Bae, Chang Seok
2021-10-04 12:57       ` Thomas Gleixner
2021-08-25 15:53 ` [PATCH v10 06/28] x86/fpu/xstate: Add new variables to indicate dynamic XSTATE buffer size Chang S. Bae
2021-10-01 13:32   ` Thomas Gleixner
2021-10-03 22:36     ` Bae, Chang Seok
2021-08-25 15:53 ` [PATCH v10 07/28] x86/fpu/xstate: Calculate and remember dynamic XSTATE buffer sizes Chang S. Bae
2021-08-25 15:53 ` [PATCH v10 08/28] x86/fpu/xstate: Convert the struct fpu 'state' field to a pointer Chang S. Bae
2021-08-25 15:53 ` [PATCH v10 09/28] x86/fpu/xstate: Introduce helpers to manage the XSTATE buffer dynamically Chang S. Bae
2021-10-01 14:20   ` Thomas Gleixner
2021-10-03 22:36     ` Bae, Chang Seok
2021-08-25 15:53 ` [PATCH v10 10/28] x86/fpu/xstate: Update the XSTATE save function to support dynamic states Chang S. Bae
2021-10-01 15:41   ` Thomas Gleixner
2021-10-02 21:31     ` Thomas Gleixner
2021-10-02 22:54       ` Bae, Chang Seok
2021-10-05  8:16         ` Paolo Bonzini
2021-10-05  7:50       ` Paolo Bonzini
2021-10-05  9:55         ` Thomas Gleixner
2021-08-25 15:53 ` [PATCH v10 11/28] x86/fpu/xstate: Update the XSTATE buffer address finder " Chang S. Bae
2021-08-25 15:53 ` [PATCH v10 12/28] x86/fpu/xstate: Update the XSTATE context copy function " Chang S. Bae
2021-08-25 15:53 ` [PATCH v10 13/28] x86/fpu/xstate: Use feature disable (XFD) to protect dynamic user state Chang S. Bae
2021-10-01 15:02   ` Thomas Gleixner
2021-10-01 15:10     ` Thomas Gleixner
2021-10-03 22:38       ` Bae, Chang Seok
2021-10-04 12:35         ` Thomas Gleixner
2021-10-01 20:20     ` Thomas Gleixner
2021-10-03 22:39       ` Bae, Chang Seok
2021-10-04 19:03         ` Thomas Gleixner
2021-10-03 22:41     ` Bae, Chang Seok
2021-08-25 15:53 ` [PATCH v10 14/28] x86/fpu/xstate: Support ptracer-induced XSTATE buffer expansion Chang S. Bae
2021-08-25 15:54 ` [PATCH v10 15/28] x86/arch_prctl: Create ARCH_SET_STATE_ENABLE/ARCH_GET_STATE_ENABLE Chang S. Bae
2021-08-25 16:36   ` Bae, Chang Seok
2021-08-25 15:54 ` [PATCH v10 16/28] x86/fpu/xstate: Support both legacy and expanded signal XSTATE size Chang S. Bae
2021-08-25 15:54 ` [PATCH v10 17/28] x86/fpu/xstate: Adjust the XSAVE feature table to address gaps in state component numbers Chang S. Bae
2021-08-25 15:54 ` [PATCH v10 18/28] x86/fpu/xstate: Disable XSTATE support if an inconsistent state is detected Chang S. Bae
2021-08-25 15:54 ` [PATCH v10 19/28] x86/cpufeatures/amx: Enumerate Advanced Matrix Extension (AMX) feature bits Chang S. Bae
2021-08-25 15:54 ` [PATCH v10 20/28] x86/fpu/amx: Define AMX state components and have it used for boot-time checks Chang S. Bae
2021-08-25 15:54 ` [PATCH v10 21/28] x86/fpu/amx: Initialize child's AMX state Chang S. Bae
2021-08-25 15:54 ` [PATCH v10 22/28] x86/fpu/amx: Enable the AMX feature in 64-bit mode Chang S. Bae
2021-08-25 15:54 ` Chang S. Bae [this message]
2021-08-25 15:54 ` [PATCH v10 24/28] selftest/x86/amx: Test cases for the AMX state management Chang S. Bae
2021-08-25 15:54 ` [PATCH v10 25/28] x86/insn/amx: Add TILERELEASE instruction to the opcode map Chang S. Bae
2021-08-25 15:54 ` [PATCH v10 26/28] intel_idle/amx: Add SPR support with XTILEDATA capability Chang S. Bae
2021-08-25 15:54 ` [PATCH v10 27/28] x86/fpu/xstate: Add a sanity check for XFD state when saving XSTATE Chang S. Bae
2021-08-25 15:54 ` [PATCH v10 28/28] x86/arch_prctl: ARCH_GET_FEATURES_WITH_KERNEL_ASSISTANCE Chang S. Bae
2021-09-30 21:12 ` [PATCH v10 00/28] x86: Support Intel Advanced Matrix Extensions Len Brown

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210825155413.19673-24-chang.seok.bae@intel.com \
    --to=chang.seok.bae@intel.com \
    --cc=bp@suse.de \
    --cc=dave.hansen@intel.com \
    --cc=jing2.liu@intel.com \
    --cc=len.brown@intel.com \
    --cc=lenb@kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=luto@kernel.org \
    --cc=mingo@kernel.org \
    --cc=ravi.v.shankar@intel.com \
    --cc=tglx@linutronix.de \
    --cc=thiago.macieira@intel.com \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).