From: "Chang S. Bae" <chang.seok.bae@intel.com>
To: bp@suse.de, luto@kernel.org, tglx@linutronix.de,
	mingo@kernel.org, x86@kernel.org
Cc: len.brown@intel.com, dave.hansen@intel.com, jing2.liu@intel.com,
	ravi.v.shankar@intel.com, linux-kernel@vger.kernel.org,
	chang.seok.bae@intel.com
Subject: [PATCH v3 07/21] x86/fpu/xstate: Introduce helpers to manage dynamic xstate buffers
Date: Wed, 23 Dec 2020 07:57:03 -0800	[thread overview]
Message-ID: <20201223155717.19556-8-chang.seok.bae@intel.com> (raw)
In-Reply-To: <20201223155717.19556-1-chang.seok.bae@intel.com>

The static per-task xstate buffer contains the extended register states, but
it is not expandable at runtime. Introduce runtime methods and new fpu struct
fields to support expanding it.

fpu->state_mask indicates the saved states per task and fpu->state_ptr
points to the dynamically allocated buffer.
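
For illustration only, a minimal sketch of how a consumer of these two fields
might pick the active buffer; the helper name active_xstate_buffer() is
hypothetical and the real accessor wrappers are introduced later in the
series:

	static union fpregs_state *active_xstate_buffer(struct fpu *fpu)
	{
		/* Prefer the dynamically allocated buffer once it exists. */
		return fpu->state_ptr ? fpu->state_ptr : &fpu->state;
	}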

alloc_xstate_buffer() uses vmalloc(). If use of this mechanism grows to
allocate buffers larger than 64KB, a more sophisticated allocation scheme
that includes purpose-built reclaim capability might be justified.
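
A hedged sketch of the expected call pattern, assuming a first-use expansion
path that lands later in the series; grow_xstate_buffer() is a hypothetical
name and not a call site added by this patch:

	static int grow_xstate_buffer(struct fpu *fpu)
	{
		/*
		 * alloc_xstate_buffer() is a no-op if the current buffer is
		 * already large enough; on vmalloc() failure it returns
		 * -ENOMEM and emits the x86_fpu_xstate_alloc_failed event.
		 */
		return alloc_xstate_buffer(fpu, xfeatures_mask_user_dynamic);
	}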

Introduce a new helper, get_xstate_size(), to calculate the buffer size.
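
As a rough worked example of the compacted-format summation path (the
extended component's size and alignment are illustrative assumptions): for a
mask covering FP, SSE, and one 64-byte-aligned extended component of 8192
bytes, the calculation is

	size = FXSAVE_SIZE (512) + XSAVE_HDR_SIZE (64) = 576
	size = ALIGN(576, 64) + 8192 = 8768 bytes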

No functional change until the kernel supports dynamic user states.

Signed-off-by: Chang S. Bae <chang.seok.bae@intel.com>
Reviewed-by: Len Brown <len.brown@intel.com>
Cc: x86@kernel.org
Cc: linux-kernel@vger.kernel.org
---
Changes from v2:
* Updated the changelog with task->fpu removed. (Boris Petkov)
* Replaced 'area' with 'buffer' in the comments and the changelog.
* Updated the code comments.

Changes from v1:
* Removed unneeded interrupt masking (Andy Lutomirski)
* Added vmalloc() error tracing (Dave Hansen, PeterZ, and Andy Lutomirski)
---
 arch/x86/include/asm/fpu/types.h  |  29 ++++++--
 arch/x86/include/asm/fpu/xstate.h |   3 +
 arch/x86/include/asm/trace/fpu.h  |   5 ++
 arch/x86/kernel/fpu/core.c        |   3 +
 arch/x86/kernel/fpu/xstate.c      | 115 ++++++++++++++++++++++++++++++
 5 files changed, 150 insertions(+), 5 deletions(-)

diff --git a/arch/x86/include/asm/fpu/types.h b/arch/x86/include/asm/fpu/types.h
index f5a38a5f3ae1..3fc6dbbe3ede 100644
--- a/arch/x86/include/asm/fpu/types.h
+++ b/arch/x86/include/asm/fpu/types.h
@@ -336,14 +336,33 @@ struct fpu {
 	 */
 	unsigned long			avx512_timestamp;
 
+	/*
+	 * @state_mask:
+	 *
+	 * The bitmap of state components saved for this task. It describes
+	 * the contents of @state until @state_ptr is allocated, and the
+	 * contents of @state_ptr once that buffer is in use.
+	 */
+	u64				state_mask;
+
+	/*
+	 * @state_ptr:
+	 *
+	 * In-memory copy of all extended register states, kept in a
+	 * dynamically allocated buffer. While the task is running, the
+	 * registers themselves hold the most recent state. When the task
+	 * context-switches away, the registers are saved here, and this
+	 * copy is then more recent than @state.
+	 */
+	union fpregs_state		*state_ptr;
+
 	/*
 	 * @state:
 	 *
-	 * In-memory copy of all FPU registers that we save/restore
-	 * over context switches. If the task is using the FPU then
-	 * the registers in the FPU are more recent than this state
-	 * copy. If the task context-switches away then they get
-	 * saved here and represent the FPU state.
+	 * In-memory copy of the register states that fit in this static
+	 * buffer. If the task uses the dynamically allocated @state_ptr
+	 * buffer, that copy is more recent than this one. Otherwise, this
+	 * copy is used in the same way as described for @state_ptr.
 	 */
 	union fpregs_state		state;
 	/*
diff --git a/arch/x86/include/asm/fpu/xstate.h b/arch/x86/include/asm/fpu/xstate.h
index 6ce8350672c2..379e8f8b8440 100644
--- a/arch/x86/include/asm/fpu/xstate.h
+++ b/arch/x86/include/asm/fpu/xstate.h
@@ -103,6 +103,9 @@ extern void __init update_regset_xstate_info(unsigned int size,
 					     u64 xstate_mask);
 
 void *get_xsave_addr(struct fpu *fpu, int xfeature_nr);
+int alloc_xstate_buffer(struct fpu *fpu, u64 mask);
+void free_xstate_buffer(struct fpu *fpu);
+
 const void *get_xsave_field_ptr(int xfeature_nr);
 int using_compacted_format(void);
 int xfeature_size(int xfeature_nr);
diff --git a/arch/x86/include/asm/trace/fpu.h b/arch/x86/include/asm/trace/fpu.h
index 879b77792f94..bf88b3333873 100644
--- a/arch/x86/include/asm/trace/fpu.h
+++ b/arch/x86/include/asm/trace/fpu.h
@@ -89,6 +89,11 @@ DEFINE_EVENT(x86_fpu, x86_fpu_xstate_check_failed,
 	TP_ARGS(fpu)
 );
 
+DEFINE_EVENT(x86_fpu, x86_fpu_xstate_alloc_failed,
+	TP_PROTO(struct fpu *fpu),
+	TP_ARGS(fpu)
+);
+
 #undef TRACE_INCLUDE_PATH
 #define TRACE_INCLUDE_PATH asm/trace/
 #undef TRACE_INCLUDE_FILE
diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
index 1a428803e6b2..6dafed34be4f 100644
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -235,6 +235,9 @@ int fpu__copy(struct task_struct *dst, struct task_struct *src)
 	 */
 	memset(&dst_fpu->state.xsave, 0, fpu_kernel_xstate_min_size);
 
+	dst_fpu->state_mask = xfeatures_mask_all & ~xfeatures_mask_user_dynamic;
+	dst_fpu->state_ptr = NULL;
+
 	/*
 	 * If the FPU registers are not current just memcpy() the state.
 	 * Otherwise save current FPU registers directly into the child's FPU
diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
index 2012b17b1793..af4d7d9aa977 100644
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -10,6 +10,7 @@
 #include <linux/pkeys.h>
 #include <linux/seq_file.h>
 #include <linux/proc_fs.h>
+#include <linux/vmalloc.h>
 
 #include <asm/fpu/api.h>
 #include <asm/fpu/internal.h>
@@ -19,6 +20,7 @@
 
 #include <asm/tlbflush.h>
 #include <asm/cpufeature.h>
+#include <asm/trace/fpu.h>
 
 /*
  * Although we spell it out in here, the Processor Trace
@@ -71,6 +73,7 @@ static unsigned int xstate_offsets[XFEATURE_MAX] = { [ 0 ... XFEATURE_MAX - 1] =
 static unsigned int xstate_sizes[XFEATURE_MAX]   = { [ 0 ... XFEATURE_MAX - 1] = -1};
 static unsigned int xstate_comp_offsets[XFEATURE_MAX] = { [ 0 ... XFEATURE_MAX - 1] = -1};
 static unsigned int xstate_supervisor_only_offsets[XFEATURE_MAX] = { [ 0 ... XFEATURE_MAX - 1] = -1};
+static bool xstate_aligns[XFEATURE_MAX] = { [ 0 ... XFEATURE_MAX - 1] = false};
 
 /*
  * The XSAVE area of kernel can be in standard or compacted format;
@@ -130,6 +133,48 @@ static bool xfeature_is_supervisor(int xfeature_nr)
 	return ecx & 1;
 }
 
+/*
+ * Calculate the xstate buffer size for the given mask. Available only after
+ * setup_xstate_features() has populated the offset, size, and alignment arrays.
+ */
+static unsigned int get_xstate_size(u64 mask)
+{
+	unsigned int size;
+	u64 xmask;
+	int i, nr;
+
+	if (!mask)
+		return 0;
+	else if (mask == (xfeatures_mask_all & ~xfeatures_mask_user_dynamic))
+		return fpu_kernel_xstate_min_size;
+	else if (mask == xfeatures_mask_all)
+		return fpu_kernel_xstate_max_size;
+
+	nr = fls64(mask) - 1;
+
+	if (!using_compacted_format())
+		return xstate_offsets[nr] + xstate_sizes[nr];
+
+	xmask = BIT_ULL(nr + 1) - 1;
+
+	if (mask == (xmask & xfeatures_mask_all))
+		return xstate_comp_offsets[nr] + xstate_sizes[nr];
+
+	/*
+	 * No precomputed size matches this mask, so calculate the size by
+	 * summing the sizes of each included state component.
+	 */
+	for (size = FXSAVE_SIZE + XSAVE_HDR_SIZE, i = FIRST_EXTENDED_XFEATURE; i <= nr; i++) {
+		if (!(mask & BIT_ULL(i)))
+			continue;
+
+		if (xstate_aligns[i])
+			size = ALIGN(size, 64);
+		size += xstate_sizes[i];
+	}
+	return size;
+}
+
 /*
  * When executing XSAVEOPT (or other optimized XSAVE instructions), if
  * a processor implementation detects that an FPU state component is still
@@ -270,10 +315,12 @@ static void __init setup_xstate_features(void)
 	xstate_offsets[XFEATURE_FP]	= 0;
 	xstate_sizes[XFEATURE_FP]	= offsetof(struct fxregs_state,
 						   xmm_space);
+	xstate_aligns[XFEATURE_FP]	= true;
 
 	xstate_offsets[XFEATURE_SSE]	= xstate_sizes[XFEATURE_FP];
 	xstate_sizes[XFEATURE_SSE]	= sizeof_field(struct fxregs_state,
 						       xmm_space);
+	xstate_aligns[XFEATURE_SSE]	= true;
 
 	for (i = FIRST_EXTENDED_XFEATURE; i < XFEATURE_MAX; i++) {
 		if (!xfeature_enabled(i))
@@ -291,6 +338,7 @@ static void __init setup_xstate_features(void)
 			continue;
 
 		xstate_offsets[i] = ebx;
+		xstate_aligns[i] = (ecx & 2) ? true : false;
 
 		/*
 		 * In our xstate size checks, we assume that the highest-numbered
@@ -755,6 +803,9 @@ static bool is_supported_xstate_size(unsigned int test_xstate_size)
 	return false;
 }
 
+/* Threshold above which a dynamically allocated xstate buffer warrants a warning */
+#define XSTATE_BUFFER_MAX_BYTES		(64 * 1024)
+
 static int __init init_xstate_size(void)
 {
 	/* Recompute the context size for enabled features: */
@@ -779,6 +830,14 @@ static int __init init_xstate_size(void)
 	if (!is_supported_xstate_size(fpu_kernel_xstate_min_size))
 		return -EINVAL;
 
+	/*
+	 * If buffers larger than this threshold become common, a more
+	 * sophisticated allocation mechanism might be worth considering.
+	 */
+	if (fpu_kernel_xstate_max_size > XSTATE_BUFFER_MAX_BYTES)
+		pr_warn("x86/fpu: xstate buffer too large (%u > %u)\n",
+			fpu_kernel_xstate_max_size, XSTATE_BUFFER_MAX_BYTES);
+
 	/*
 	 * User space is always in standard format.
 	 */
@@ -869,6 +928,9 @@ void __init fpu__init_system_xstate(void)
 	if (err)
 		goto out_disable;
 
+	/* Make sure init_task does not include the dynamic user states */
+	current->thread.fpu.state_mask = (xfeatures_mask_all & ~xfeatures_mask_user_dynamic);
+
 	/*
 	 * Update info used for ptrace frames; use standard-format size and no
 	 * supervisor xstates:
@@ -1089,6 +1151,59 @@ static inline bool xfeatures_mxcsr_quirk(u64 xfeatures)
 	return true;
 }
 
+void free_xstate_buffer(struct fpu *fpu)
+{
+	vfree(fpu->state_ptr);
+}
+
+/*
+ * Allocate an xstate buffer with the size calculated based on 'mask'.
+ *
+ * The allocation mechanism does not shrink or reclaim the buffer.
+ */
+int alloc_xstate_buffer(struct fpu *fpu, u64 mask)
+{
+	union fpregs_state *state_ptr;
+	unsigned int oldsz, newsz;
+	u64 state_mask;
+
+	state_mask = fpu->state_mask | mask;
+
+	oldsz = get_xstate_size(fpu->state_mask);
+	newsz = get_xstate_size(state_mask);
+
+	if (oldsz >= newsz)
+		return 0;
+
+	if (newsz > fpu_kernel_xstate_max_size) {
+		pr_warn_once("x86/fpu: xstate buffer too large (%u > %u bytes)\n",
+			     newsz, fpu_kernel_xstate_max_size);
+		XSTATE_WARN_ON(1);
+		return 0;
+	}
+
+	/* We need a 64-byte-aligned pointer; vmalloc()'s page-aligned address satisfies that. */
+	state_ptr = vmalloc(newsz);
+	if (!state_ptr) {
+		trace_x86_fpu_xstate_alloc_failed(fpu);
+		return -ENOMEM;
+	}
+
+	memset(state_ptr, 0, newsz);
+	if (using_compacted_format())
+		fpstate_init_xstate(&state_ptr->xsave, state_mask);
+
+	/*
+	 * The register state is still intact; it will be saved into the new
+	 * buffer at the next context switch, copy, or ptrace-driven xstate write.
+	 */
+
+	vfree(fpu->state_ptr);
+	fpu->state_ptr = state_ptr;
+	fpu->state_mask = state_mask;
+	return 0;
+}
+
 static void fill_gap(struct membuf *to, unsigned *last, unsigned offset)
 {
 	if (*last >= offset)
-- 
2.17.1

