linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Dave Hansen <dave.hansen@intel.com>
To: "Chang S. Bae" <chang.seok.bae@intel.com>,
	bp@suse.de, luto@kernel.org, tglx@linutronix.de,
	mingo@kernel.org, x86@kernel.org
Cc: len.brown@intel.com, jing2.liu@intel.com,
	ravi.v.shankar@intel.com, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v6-fix 06/26] x86/fpu/xstate: Calculate and remember dynamic XSTATE buffer sizes
Date: Fri, 2 Jul 2021 10:46:39 -0700	[thread overview]
Message-ID: <bcb5068b-8a33-c803-aa93-fddc005d9a19@intel.com> (raw)
In-Reply-To: <20210702151726.27293-1-chang.seok.bae@intel.com>

On 7/2/21 8:17 AM, Chang S. Bae wrote:
> The XSTATE per-task buffer is currently embedded into struct fpu with
> static size. To accommodate dynamic user XSTATEs, record the maximum and
> minimum buffer sizes.
> 
> Rename the size calculation function. It calculates the maximum XSTATE size
> and sanity checks it with CPUID. It also calculates the static embedded
> buffer size by excluding the dynamic user states from the maximum size.

This is missing a few things for me.  I'd probably start with this instead:

	The CPUID instruction separately enumerates sizes and alignments
	of individual xfeatures.  It independently enumerates the
	required size of an entire XSAVE buffer to store all enabled
	features.

	calculate_xstate_sizes() currently uses the individual feature
	size/alignment enumeration to independently recalculate the
	required XSAVE buffer size.  This is compared against the CPUID-
	provided value.

	Dynamic xstates introduce another wrinkle: with them, the
	'struct fpu' buffer no longer contains all of the enabled state
	components.  This means that there are now two separate sizes:

	1. The CPUID-enumerated all-feature "maxmimum" size
	2. The size of the 'struct fpu' inline buffer

	... then go on to explain how you added the #2 calculations

> diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
> index e03853bb2603..75969f1ef4b3 100644
> --- a/arch/x86/kernel/fpu/xstate.c
> +++ b/arch/x86/kernel/fpu/xstate.c
> @@ -590,24 +590,36 @@ static void check_xstate_against_struct(int nr)
>  	}
>  }
>  
> -/*
> - * This essentially double-checks what the cpu told us about
> - * how large the XSAVE buffer needs to be.  We are recalculating
> - * it to be safe.
> +/**
> + * calculate_xstate_sizes() - Calculate the xstate per-task buffer sizes.
> + *
> + * Record the minimum and double-check the maximum against what the CPU
> + * told.
>   *
>   * Independent XSAVE features allocate their own buffers and are not
>   * covered by these checks. Only the size of the buffer for task->fpu
>   * is checked here.
> + *
> + * Dynamic user states are stored in this per-task buffer. They account
> + * for the delta between the maximum and the minimum.
> + *
> + * Return: nothing.
>   */

I'm not sure "Return: nothing" is worth saying.  The "void" kinda gives
it away.

> -static void do_extra_xstate_size_checks(void)
> +static void calculate_xstate_sizes(void)
>  {
> -	int paranoid_xstate_size = FXSAVE_SIZE + XSAVE_HDR_SIZE;
> -	int i;
> +	int xstate_size = FXSAVE_SIZE + XSAVE_HDR_SIZE;
> +	int xstate_size_no_dynamic, i;

Both the changelog and the function description talk about the maximum
and minimum sizes of the xsave buffer.  Unfortunately, that doesn't seem
to be reflected in the naming at all.

> +	xstate_size_no_dynamic = xstate_size;
>  
>  	for (i = FIRST_EXTENDED_XFEATURE; i < XFEATURE_MAX; i++) {
> +		bool user_dynamic;
> +
>  		if (!xfeature_enabled(i))
>  			continue;
>  
> +		user_dynamic = (xfeatures_mask_user_dynamic & BIT_ULL(i)) ? true : false;
> +
>  		check_xstate_against_struct(i);
>  		/*
>  		 * Supervisor state components can be managed only by
> @@ -617,27 +629,39 @@ static void do_extra_xstate_size_checks(void)
>  			XSTATE_WARN_ON(xfeature_is_supervisor(i));
>  
>  		/* Align from the end of the previous feature */
> -		if (xfeature_is_aligned(i))
> -			paranoid_xstate_size = ALIGN(paranoid_xstate_size, 64);
> +		if (xfeature_is_aligned(i)) {
> +			xstate_size = ALIGN(xstate_size, 64);
> +			if (!user_dynamic)
> +				xstate_size_no_dynamic = ALIGN(xstate_size_no_dynamic, 64);
> +		}
>  		/*
>  		 * The offset of a given state in the non-compacted
>  		 * format is given to us in a CPUID leaf.  We check
>  		 * them for being ordered (increasing offsets) in
>  		 * setup_xstate_features(). XSAVES uses compacted format.
>  		 */
> -		if (!cpu_feature_enabled(X86_FEATURE_XSAVES))
> -			paranoid_xstate_size = xfeature_uncompacted_offset(i);
> +		if (!cpu_feature_enabled(X86_FEATURE_XSAVES)) {
> +			xstate_size = xfeature_uncompacted_offset(i);
> +			if (!user_dynamic)
> +				xstate_size_no_dynamic = xfeature_uncompacted_offset(i);
> +		}
>  		/*
>  		 * The compacted-format offset always depends on where
>  		 * the previous state ended.
>  		 */
> -		paranoid_xstate_size += xfeature_size(i);
> +		xstate_size += xfeature_size(i);
> +		if (!user_dynamic)
> +			xstate_size_no_dynamic += xfeature_size(i);
>  	}

I'm not a super big fan of how that loop turned out.  It's trying to
keep two variables in parallel.

How about making calculate_xstate_sizes() *return* the xstate size, and
pass 'user_dynamic' to it:

	/* 'false' to exclude dynamic states: */
	xstate_size_min = calculate_xstate_sizes(false);
	/* 'true' to include dynamic states: */
	xstate_size_max = calculate_xstate_sizes(true);

... then go on to do the checks.

>  	/*
>  	 * The size accounts for all the possible states reserved in the
>  	 * per-task buffer.  Check against the maximum size.
>  	 */
> -	XSTATE_WARN_ON(paranoid_xstate_size != get_xstate_config(XSTATE_MAX_SIZE));
> +	XSTATE_WARN_ON(xstate_size != get_xstate_config(XSTATE_MAX_SIZE));
> +
> +	/* Record the one without dynamic states as the minimum. */
> +	set_xstate_config(XSTATE_MIN_SIZE, xstate_size_no_dynamic);
>  }
>  
>  
> @@ -738,14 +762,11 @@ static int __init init_xstate_size(void)
>  	 */
>  	set_xstate_config(XSTATE_MAX_SIZE, possible_xstate_size);
>  
> -	/* Perform an extra check for the maximum size. */
> -	do_extra_xstate_size_checks();
> -
>  	/*
> -	 * Set the minimum to be the same as the maximum. The dynamic
> -	 * user states are not supported yet.
> +	 * Calculate and double-check the maximum size. Calculate and record
> +	 * the minimum size.
>  	 */
> -	set_xstate_config(XSTATE_MIN_SIZE, possible_xstate_size);
> +	calculate_xstate_sizes();
>  
>  	/* Ensure the minimum size fits in the statically-allocated buffer: */
>  	if (!is_supported_xstate_size(get_xstate_config(XSTATE_MIN_SIZE)))
> 


  reply	other threads:[~2021-07-02 17:46 UTC|newest]

Thread overview: 31+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-06-30  6:02 [PATCH v6 00/26] x86: Support Intel Advanced Matrix Extensions Chang S. Bae
2021-06-30  6:02 ` [PATCH v6 01/26] x86/fpu/xstate: Modify the initialization helper to handle both static and dynamic buffers Chang S. Bae
2021-06-30  6:02 ` [PATCH v6 02/26] x86/fpu/xstate: Modify state copy helpers " Chang S. Bae
2021-06-30  6:02 ` [PATCH v6 03/26] x86/fpu/xstate: Modify address finders " Chang S. Bae
2021-06-30  6:02 ` [PATCH v6 04/26] x86/fpu/xstate: Add a new variable to indicate dynamic user states Chang S. Bae
2021-06-30  6:02 ` [PATCH v6 05/26] x86/fpu/xstate: Add new variables to indicate dynamic XSTATE buffer size Chang S. Bae
2021-06-30  6:02 ` [PATCH v6 06/26] x86/fpu/xstate: Calculate and remember dynamic XSTATE buffer sizes Chang S. Bae
2021-07-02 15:19   ` Bae, Chang Seok
2021-07-02 15:17     ` [PATCH v6-fix " Chang S. Bae
2021-07-02 17:46       ` Dave Hansen [this message]
2021-06-30  6:02 ` [PATCH v6 07/26] x86/fpu/xstate: Convert the struct fpu 'state' field to a pointer Chang S. Bae
2021-06-30  6:02 ` [PATCH v6 08/26] x86/fpu/xstate: Introduce helpers to manage the XSTATE buffer dynamically Chang S. Bae
2021-06-30  6:02 ` [PATCH v6 09/26] x86/fpu/xstate: Update the XSTATE save function to support dynamic states Chang S. Bae
2021-06-30  6:02 ` [PATCH v6 10/26] x86/fpu/xstate: Update the XSTATE buffer address finder " Chang S. Bae
2021-06-30  6:02 ` [PATCH v6 11/26] x86/fpu/xstate: Update the XSTATE context copy function " Chang S. Bae
2021-06-30  6:02 ` [PATCH v6 12/26] x86/fpu/xstate: Use feature disable (XFD) to protect dynamic user state Chang S. Bae
2021-06-30  6:02 ` [PATCH v6 13/26] x86/fpu/xstate: Support ptracer-induced XSTATE buffer expansion Chang S. Bae
2021-06-30  6:02 ` [PATCH v6 14/26] x86/arch_prctl: Create ARCH_ENABLE_XSTATE Chang S. Bae
2021-06-30  6:02 ` [PATCH v6 15/26] x86/fpu/xstate: Support both legacy and expanded signal XSTATE size Chang S. Bae
2021-06-30  6:02 ` [PATCH v6 16/26] x86/fpu/xstate: Adjust the XSAVE feature table to address gaps in state component numbers Chang S. Bae
2021-06-30  6:02 ` [PATCH v6 17/26] x86/fpu/xstate: Disable XSTATE support if an inconsistent state is detected Chang S. Bae
2021-06-30  6:02 ` [PATCH v6 18/26] x86/cpufeatures/amx: Enumerate Advanced Matrix Extension (AMX) feature bits Chang S. Bae
2021-06-30  6:02 ` [PATCH v6 19/26] x86/fpu/amx: Define AMX state components and have it used for boot-time checks Chang S. Bae
2021-06-30  6:02 ` [PATCH v6 20/26] x86/fpu/amx: Initialize child's AMX state Chang S. Bae
2021-06-30  6:02 ` [PATCH v6 21/26] x86/fpu/amx: Enable the AMX feature in 64-bit mode Chang S. Bae
2021-06-30  6:02 ` [PATCH v6 22/26] x86/fpu/xstate: Skip writing zeros to signal frame for dynamic user states if in INIT-state Chang S. Bae
2021-06-30  6:02 ` [PATCH v6 23/26] selftest/x86/amx: Test cases for the AMX state management Chang S. Bae
2021-06-30  6:02 ` [PATCH v6 24/26] x86/insn/amx: Add TILERELEASE instruction to the opcode map Chang S. Bae
2021-06-30  6:02 ` [PATCH v6 25/26] intel_idle/amx: Clear the AMX state before entering idle Chang S. Bae
2021-06-30 13:11   ` Rafael J. Wysocki
2021-06-30  6:02 ` [PATCH v6 26/26] x86/fpu/xstate: Add a sanity check for XFD state when saving XSTATE Chang S. Bae

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=bcb5068b-8a33-c803-aa93-fddc005d9a19@intel.com \
    --to=dave.hansen@intel.com \
    --cc=bp@suse.de \
    --cc=chang.seok.bae@intel.com \
    --cc=jing2.liu@intel.com \
    --cc=len.brown@intel.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=luto@kernel.org \
    --cc=mingo@kernel.org \
    --cc=ravi.v.shankar@intel.com \
    --cc=tglx@linutronix.de \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).