linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "Chang S. Bae" <chang.seok.bae@intel.com>
To: bp@suse.de, luto@kernel.org, tglx@linutronix.de,
	mingo@kernel.org, x86@kernel.org
Cc: len.brown@intel.com, dave.hansen@intel.com, jing2.liu@intel.com,
	ravi.v.shankar@intel.com, linux-kernel@vger.kernel.org,
	chang.seok.bae@intel.com
Subject: [PATCH v5 09/28] x86/fpu/xstate: Introduce helpers to manage the xstate buffer dynamically
Date: Sun, 23 May 2021 12:32:40 -0700	[thread overview]
Message-ID: <20210523193259.26200-10-chang.seok.bae@intel.com> (raw)
In-Reply-To: <20210523193259.26200-1-chang.seok.bae@intel.com>

The static xstate per-task buffer contains the extended register states --
but it is not expandable at runtime. Introduce runtime methods and a new
fpu struct field to support the expansion.

fpu->state_mask indicates which state components are reserved to be
saved in the xstate buffer.

alloc_xstate_buffer() uses vmalloc(). If use of this mechanism grows to
allocate buffers larger than 64KB, a more sophisticated allocation scheme
that includes purpose-built reclaim capability might be justified.

Introduce a new helper -- get_xstate_size() to calculate the buffer size.

Also, use the new field and helper to initialize the buffer.

Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com>
Reviewed-by: Len Brown <len.brown@intel.com>
Cc: x86@kernel.org
Cc: linux-kernel@vger.kernel.org
---
Changes from v3:
* Updated code comments. (Borislav Petkov)
* Used vzalloc() instead of vmalloc() with memset(). (Borislav Petkov)
* Removed the max size check for >64KB. (Borislav Petkov)
* Removed the allocation size check in the helper. (Borislav Petkov)
* Switched the function description in the kernel-doc style.
* Used them for buffer initialization -- moved from the next patch.

Changes from v2:
* Updated the changelog with task->fpu removed. (Borislav Petkov)
* Replaced 'area' with 'buffer' in the comments and the changelog.
* Updated the code comments.

Changes from v1:
* Removed unneeded interrupt masking (Andy Lutomirski)
* Added vmalloc() error tracing (Dave Hansen, PeterZ, and Andy Lutomirski)
---
 arch/x86/include/asm/fpu/types.h  |   7 ++
 arch/x86/include/asm/fpu/xstate.h |   4 +
 arch/x86/include/asm/trace/fpu.h  |   5 ++
 arch/x86/kernel/fpu/core.c        |  14 ++--
 arch/x86/kernel/fpu/xstate.c      | 125 ++++++++++++++++++++++++++++++
 5 files changed, 148 insertions(+), 7 deletions(-)

diff --git a/arch/x86/include/asm/fpu/types.h b/arch/x86/include/asm/fpu/types.h
index dcd28a545377..6fc707c14350 100644
--- a/arch/x86/include/asm/fpu/types.h
+++ b/arch/x86/include/asm/fpu/types.h
@@ -336,6 +336,13 @@ struct fpu {
 	 */
 	unsigned long			avx512_timestamp;
 
+	/*
+	 * @state_mask:
+	 *
+	 * The bitmap represents state components reserved to be saved in ->state.
+	 */
+	u64				state_mask;
+
 	/*
 	 * @state:
 	 *
diff --git a/arch/x86/include/asm/fpu/xstate.h b/arch/x86/include/asm/fpu/xstate.h
index 1fba2ca15874..cbb4795d2b45 100644
--- a/arch/x86/include/asm/fpu/xstate.h
+++ b/arch/x86/include/asm/fpu/xstate.h
@@ -112,6 +112,10 @@ extern unsigned int get_xstate_config(enum xstate_config cfg);
 void set_xstate_config(enum xstate_config cfg, unsigned int value);
 
 void *get_xsave_addr(struct fpu *fpu, int xfeature_nr);
+unsigned int get_xstate_size(u64 mask);
+int alloc_xstate_buffer(struct fpu *fpu, u64 mask);
+void free_xstate_buffer(struct fpu *fpu);
+
 const void *get_xsave_field_ptr(int xfeature_nr);
 int using_compacted_format(void);
 int xfeature_size(int xfeature_nr);
diff --git a/arch/x86/include/asm/trace/fpu.h b/arch/x86/include/asm/trace/fpu.h
index ef82f4824ce7..b691c2db47c7 100644
--- a/arch/x86/include/asm/trace/fpu.h
+++ b/arch/x86/include/asm/trace/fpu.h
@@ -89,6 +89,11 @@ DEFINE_EVENT(x86_fpu, x86_fpu_xstate_check_failed,
 	TP_ARGS(fpu)
 );
 
+DEFINE_EVENT(x86_fpu, x86_fpu_xstate_alloc_failed,
+	TP_PROTO(struct fpu *fpu),
+	TP_ARGS(fpu)
+);
+
 #undef TRACE_INCLUDE_PATH
 #define TRACE_INCLUDE_PATH asm/trace/
 #undef TRACE_INCLUDE_FILE
diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
index 44d41251b474..918930553290 100644
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -210,9 +210,8 @@ void fpstate_init(struct fpu *fpu)
 
 	if (likely(fpu)) {
 		state = fpu->state;
-		/* The dynamic user states are not prepared yet. */
-		mask = xfeatures_mask_all & ~xfeatures_mask_user_dynamic;
-		size = get_xstate_config(XSTATE_MIN_SIZE);
+		mask = fpu->state_mask;
+		size = get_xstate_size(fpu->state_mask);
 	} else {
 		state = &init_fpstate;
 		mask = xfeatures_mask_all;
@@ -248,14 +247,15 @@ int fpu__copy(struct task_struct *dst, struct task_struct *src)
 
 	WARN_ON_FPU(src_fpu != &current->thread.fpu);
 
+	/*
+	 * The child does not inherit the dynamic states. Thus, use the buffer
+	 * embedded in struct task_struct, which has the minimum size.
+	 */
+	dst_fpu->state_mask = (xfeatures_mask_all & ~xfeatures_mask_user_dynamic);
 	dst_fpu->state = &dst_fpu->__default_state;
-
 	/*
 	 * Don't let 'init optimized' areas of the XSAVE area
 	 * leak into the child task:
-	 *
-	 * The child does not inherit the dynamic states. So,
-	 * the xstate buffer has the minimum size.
 	 */
 	memset(&dst_fpu->state->xsave, 0, get_xstate_config(XSTATE_MIN_SIZE));
 
diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
index c3c7c57616eb..0e3f93b03b3f 100644
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -10,6 +10,7 @@
 #include <linux/pkeys.h>
 #include <linux/seq_file.h>
 #include <linux/proc_fs.h>
+#include <linux/vmalloc.h>
 
 #include <asm/fpu/api.h>
 #include <asm/fpu/internal.h>
@@ -19,6 +20,7 @@
 
 #include <asm/tlbflush.h>
 #include <asm/cpufeature.h>
+#include <asm/trace/fpu.h>
 
 /*
  * Although we spell it out in here, the Processor Trace
@@ -71,6 +73,11 @@ static unsigned int xstate_offsets[XFEATURE_MAX] = { [ 0 ... XFEATURE_MAX - 1] =
 static unsigned int xstate_sizes[XFEATURE_MAX]   = { [ 0 ... XFEATURE_MAX - 1] = -1};
 static unsigned int xstate_comp_offsets[XFEATURE_MAX] = { [ 0 ... XFEATURE_MAX - 1] = -1};
 static unsigned int xstate_supervisor_only_offsets[XFEATURE_MAX] = { [ 0 ... XFEATURE_MAX - 1] = -1};
+/*
+ * True if the buffer of the corresponding XFEATURE is located on the next 64
+ * byte boundary. Otherwise, it follows the preceding component immediately.
+ */
+static bool xstate_aligns[XFEATURE_MAX] = { [ 0 ... XFEATURE_MAX - 1] = false};
 
 /**
  * struct fpu_xstate_buffer_config - xstate per-task buffer configuration
@@ -168,6 +175,58 @@ static bool xfeature_is_supervisor(int xfeature_nr)
 	return ecx & 1;
 }
 
+/**
+ * get_xstate_size() - calculate an xstate buffer size
+ * @mask:	This bitmap tells which components reserved in the buffer.
+ *
+ * Available once those arrays for the offset, size, and alignment info are set up,
+ * by setup_xstate_features().
+ *
+ * Returns:	The buffer size
+ */
+unsigned int get_xstate_size(u64 mask)
+{
+	unsigned int size;
+	u64 xmask;
+	int i, nr;
+
+	if (!mask)
+		return 0;
+
+	/*
+	 * The minimum buffer size excludes the dynamic user state. When a task
+	 * uses the state, the buffer can grow up to the max size.
+	 */
+	if (mask == (xfeatures_mask_all & ~xfeatures_mask_user_dynamic))
+		return get_xstate_config(XSTATE_MIN_SIZE);
+	else if (mask == xfeatures_mask_all)
+		return get_xstate_config(XSTATE_MAX_SIZE);
+
+	nr = fls64(mask) - 1;
+
+	if (!using_compacted_format())
+		return xstate_offsets[nr] + xstate_sizes[nr];
+
+	xmask = BIT_ULL(nr + 1) - 1;
+
+	if (mask == (xmask & xfeatures_mask_all))
+		return xstate_comp_offsets[nr] + xstate_sizes[nr];
+
+	/*
+	 * With the given mask, no relevant size is found so far. So, calculate
+	 * it by summing up each state size.
+	 */
+	for (size = FXSAVE_SIZE + XSAVE_HDR_SIZE, i = FIRST_EXTENDED_XFEATURE; i <= nr; i++) {
+		if (!(mask & BIT_ULL(i)))
+			continue;
+
+		if (xstate_aligns[i])
+			size = ALIGN(size, 64);
+		size += xstate_sizes[i];
+	}
+	return size;
+}
+
 /*
  * When executing XSAVEOPT (or other optimized XSAVE instructions), if
  * a processor implementation detects that an FPU state component is still
@@ -308,10 +367,12 @@ static void __init setup_xstate_features(void)
 	xstate_offsets[XFEATURE_FP]	= 0;
 	xstate_sizes[XFEATURE_FP]	= offsetof(struct fxregs_state,
 						   xmm_space);
+	xstate_aligns[XFEATURE_FP]	= true;
 
 	xstate_offsets[XFEATURE_SSE]	= xstate_sizes[XFEATURE_FP];
 	xstate_sizes[XFEATURE_SSE]	= sizeof_field(struct fxregs_state,
 						       xmm_space);
+	xstate_aligns[XFEATURE_SSE]	= true;
 
 	for (i = FIRST_EXTENDED_XFEATURE; i < XFEATURE_MAX; i++) {
 		if (!xfeature_enabled(i))
@@ -329,6 +390,7 @@ static void __init setup_xstate_features(void)
 			continue;
 
 		xstate_offsets[i] = ebx;
+		xstate_aligns[i] = (ecx & 2) ? true : false;
 
 		/*
 		 * In our xstate size checks, we assume that the highest-numbered
@@ -915,6 +977,9 @@ void __init fpu__init_system_xstate(void)
 	if (err)
 		goto out_disable;
 
+	/* Make sure init_task does not include the dynamic user states. */
+	current->thread.fpu.state_mask = (xfeatures_mask_all & ~xfeatures_mask_user_dynamic);
+
 	/*
 	 * Update info used for ptrace frames; use standard-format size and no
 	 * supervisor xstates:
@@ -1141,6 +1206,66 @@ static inline bool xfeatures_mxcsr_quirk(u64 xfeatures)
 	return true;
 }
 
+void free_xstate_buffer(struct fpu *fpu)
+{
+	/* Free up only the dynamically-allocated memory. */
+	if (fpu->state != &fpu->__default_state)
+		vfree(fpu->state);
+}
+
+/**
+ * alloc_xstate_buffer() - allocate an xstate buffer with the size calculated based on @mask.
+ *
+ * @fpu:	A struct fpu * pointer
+ * @mask:	The bitmap tells which components to be reserved in the new buffer.
+ *
+ * Use vmalloc() simply here. If the task with a vmalloc()-allocated buffer tends
+ * to terminate quickly, vfree()-induced IPIs may be a concern. Caching may be
+ * helpful for this. But the task with large state is likely to live longer.
+ *
+ * Also, this method does not shrink or reclaim the buffer.
+ *
+ * Returns 0 on success, -ENOMEM on allocation error.
+ */
+int alloc_xstate_buffer(struct fpu *fpu, u64 mask)
+{
+	union fpregs_state *state;
+	unsigned int oldsz, newsz;
+	u64 state_mask;
+
+	state_mask = fpu->state_mask | mask;
+
+	oldsz = get_xstate_size(fpu->state_mask);
+	newsz = get_xstate_size(state_mask);
+
+	if (oldsz >= newsz)
+		return 0;
+
+	state = vzalloc(newsz);
+	if (!state) {
+		/*
+		 * When allocation requested from #NM, the error code may not be
+		 * populated well. Then, this tracepoint is useful for providing
+		 * the failure context.
+		 */
+		trace_x86_fpu_xstate_alloc_failed(fpu);
+		return -ENOMEM;
+	}
+
+	if (using_compacted_format())
+		fpstate_init_xstate(&state->xsave, state_mask);
+
+	/*
+	 * As long as the register state is intact, save the xstate in the new buffer
+	 * at the next context copy/switch or potentially ptrace-driven xstate writing.
+	 */
+
+	free_xstate_buffer(fpu);
+	fpu->state = state;
+	fpu->state_mask = state_mask;
+	return 0;
+}
+
 static void fill_gap(struct membuf *to, unsigned *last, unsigned offset)
 {
 	if (*last >= offset)
-- 
2.17.1


  parent reply	other threads:[~2021-05-23 19:38 UTC|newest]

Thread overview: 75+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-05-23 19:32 [PATCH v5 00/28] x86: Support Intel Advanced Matrix Extensions Chang S. Bae
2021-05-23 19:32 ` [PATCH v5 01/28] x86/fpu/xstate: Modify the initialization helper to handle both static and dynamic buffers Chang S. Bae
2021-05-23 19:32 ` [PATCH v5 02/28] x86/fpu/xstate: Modify state copy helpers " Chang S. Bae
2021-05-23 19:32 ` [PATCH v5 03/28] x86/fpu/xstate: Modify address finders " Chang S. Bae
2021-05-23 19:32 ` [PATCH v5 04/28] x86/fpu/xstate: Modify the context restore helper " Chang S. Bae
2021-05-23 19:32 ` [PATCH v5 05/28] x86/fpu/xstate: Add a new variable to indicate dynamic user states Chang S. Bae
2021-05-23 19:32 ` [PATCH v5 06/28] x86/fpu/xstate: Add new variables to indicate dynamic xstate buffer size Chang S. Bae
2021-05-23 19:32 ` [PATCH v5 07/28] x86/fpu/xstate: Calculate and remember dynamic xstate buffer sizes Chang S. Bae
2021-05-23 19:32 ` [PATCH v5 08/28] x86/fpu/xstate: Convert the struct fpu 'state' field to a pointer Chang S. Bae
2021-05-23 19:32 ` Chang S. Bae [this message]
2021-05-23 19:32 ` [PATCH v5 10/28] x86/fpu/xstate: Define the scope of the initial xstate data Chang S. Bae
2021-05-23 19:32 ` [PATCH v5 11/28] x86/fpu/xstate: Update the xstate save function to support dynamic states Chang S. Bae
2021-05-23 19:32 ` [PATCH v5 12/28] x86/fpu/xstate: Update the xstate buffer address finder " Chang S. Bae
2021-05-23 19:32 ` [PATCH v5 13/28] x86/fpu/xstate: Update the xstate context copy function " Chang S. Bae
2021-05-23 19:32 ` [PATCH v5 14/28] x86/fpu/xstate: Prevent unauthorised use of dynamic user state Chang S. Bae
2021-06-16 16:17   ` Dave Hansen
2021-06-16 16:27   ` Dave Hansen
2021-06-16 18:12     ` Andy Lutomirski
2021-06-16 18:47       ` Bae, Chang Seok
2021-06-16 19:01         ` Dave Hansen
2021-06-16 19:23           ` Bae, Chang Seok
2021-06-16 19:28             ` Dave Hansen
2021-06-16 19:37               ` Bae, Chang Seok
2021-06-28 10:11               ` Liu, Jing2
2021-06-29 17:43           ` Bae, Chang Seok
2021-06-29 17:54             ` Dave Hansen
2021-06-29 18:35               ` Bae, Chang Seok
2021-06-29 18:50                 ` Dave Hansen
2021-06-29 19:13                   ` Bae, Chang Seok
2021-06-29 19:26                     ` Dave Hansen
2021-05-23 19:32 ` [PATCH v5 15/28] x86/arch_prctl: Create ARCH_GET_XSTATE/ARCH_PUT_XSTATE Chang S. Bae
2021-05-24 23:10   ` Len Brown
2021-05-25 17:27     ` Borislav Petkov
2021-05-25 17:33       ` Dave Hansen
2021-05-26  0:38     ` Len Brown
2021-05-27 11:14       ` second, sync-alloc syscall Borislav Petkov
2021-05-27 13:59         ` Len Brown
2021-05-27 19:35           ` Andy Lutomirski
2021-05-25 15:46   ` [PATCH v5 15/28] x86/arch_prctl: Create ARCH_GET_XSTATE/ARCH_PUT_XSTATE Dave Hansen
2021-05-23 19:32 ` [PATCH v5 16/28] x86/fpu/xstate: Support ptracer-induced xstate buffer expansion Chang S. Bae
2021-05-23 19:32 ` [PATCH v5 17/28] x86/fpu/xstate: Adjust the XSAVE feature table to address gaps in state component numbers Chang S. Bae
2021-05-23 19:32 ` [PATCH v5 18/28] x86/fpu/xstate: Disable xstate support if an inconsistent state is detected Chang S. Bae
2021-05-23 19:32 ` [PATCH v5 19/28] x86/cpufeatures/amx: Enumerate Advanced Matrix Extension (AMX) feature bits Chang S. Bae
2021-05-23 19:32 ` [PATCH v5 20/28] x86/fpu/amx: Define AMX state components and have it used for boot-time checks Chang S. Bae
2021-05-23 19:32 ` [PATCH v5 21/28] x86/fpu/amx: Initialize child's AMX state Chang S. Bae
2021-05-24  3:09   ` Andy Lutomirski
2021-05-24 17:37     ` Len Brown
2021-05-24 18:13       ` Andy Lutomirski
2021-05-24 18:21         ` Len Brown
2021-05-25  3:44           ` Andy Lutomirski
2021-05-23 19:32 ` [PATCH v5 22/28] x86/fpu/amx: Enable the AMX feature in 64-bit mode Chang S. Bae
2021-05-23 19:32 ` [PATCH v5 23/28] selftest/x86/amx: Test cases for the AMX state management Chang S. Bae
2021-05-23 19:32 ` [PATCH v5 24/28] x86/fpu/xstate: Use per-task xstate mask for saving xstate in signal frame Chang S. Bae
2021-05-24  3:15   ` Andy Lutomirski
2021-05-24 18:06     ` Len Brown
2021-05-25  4:47       ` Andy Lutomirski
2021-05-25 14:04         ` Len Brown
2021-05-23 19:32 ` [PATCH v5 25/28] x86/fpu/xstate: Skip writing zeros to signal frame for dynamic user states if in INIT-state Chang S. Bae
2021-05-24  3:25   ` Andy Lutomirski
2021-05-24 18:15     ` Len Brown
2021-05-24 18:29       ` Dave Hansen
2021-05-25  4:46       ` Andy Lutomirski
2021-05-23 19:32 ` [PATCH v5 26/28] selftest/x86/amx: Test case for AMX state copy optimization in signal delivery Chang S. Bae
2021-05-23 19:32 ` [PATCH v5 27/28] x86/insn/amx: Add TILERELEASE instruction to the opcode map Chang S. Bae
2021-05-23 19:32 ` [PATCH v5 28/28] x86/fpu/amx: Clear the AMX state when appropriate Chang S. Bae
2021-05-24  3:13   ` Andy Lutomirski
2021-05-24 14:10     ` Dave Hansen
2021-05-24 17:32       ` Len Brown
2021-05-24 17:39         ` Dave Hansen
2021-05-24 18:24           ` Len Brown
2021-05-27 11:56             ` Peter Zijlstra
2021-05-27 14:02               ` Len Brown
2021-05-24 14:06   ` Dave Hansen
2021-05-24 17:34     ` Len Brown
2021-05-24 21:11       ` [PATCH v5-fix " Chang S. Bae

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20210523193259.26200-10-chang.seok.bae@intel.com \
    --to=chang.seok.bae@intel.com \
    --cc=bp@suse.de \
    --cc=dave.hansen@intel.com \
    --cc=jing2.liu@intel.com \
    --cc=len.brown@intel.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=luto@kernel.org \
    --cc=mingo@kernel.org \
    --cc=ravi.v.shankar@intel.com \
    --cc=tglx@linutronix.de \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).