All of lore.kernel.org
 help / color / mirror / Atom feed
From: Sean Christopherson <seanjc@google.com>
To: Weijiang Yang <weijiang.yang@intel.com>
Cc: Dave Hansen <dave.hansen@intel.com>,
	pbonzini@redhat.com, kvm@vger.kernel.org,
	linux-kernel@vger.kernel.org, peterz@infradead.org,
	chao.gao@intel.com, rick.p.edgecombe@intel.com,
	john.allen@amd.com
Subject: Re: [PATCH v6 06/25] x86/fpu/xstate: Opt-in kernel dynamic bits when calculate guest xstate size
Date: Thu, 26 Oct 2023 10:24:29 -0700	[thread overview]
Message-ID: <ZTqgzZl-reO1m01I@google.com> (raw)
In-Reply-To: <de1b148c-45c6-6517-0926-53d1aad8978e@intel.com>

On Wed, Oct 25, 2023, Weijiang Yang wrote:
> On 10/25/2023 1:07 AM, Sean Christopherson wrote:
> > On Fri, Sep 15, 2023, Weijiang Yang wrote:
> > IIUC, the "dynamic" features contains CET_KERNEL, whereas xfeatures_mask_supervisor()
> > conatins PASID, CET_USER, and CET_KERNEL.  PASID isn't virtualized by KVM, but
> > doesn't that mean CET_USER will get dropped/lost if userspace requests AMX/XTILE
> > enabling?
> 
> Yes, __state_size is correct for guest enabled xfeatures, including CET_USER,
> and it gets removed from __state_perm.
> 
> IIUC, from current qemu/kernel interaction for guest permission settings,
> __xstate_request_perm() is called only _ONCE_ to set AMX/XTILE for every vCPU
> thread, so the removal of guest supervisor xfeatures won't hurt guest! ;-/

Huh?  I don't follow.  What does calling __xstate_request_perm() only once have
to do with anything?

/me stares more

OMG, hell no.  First off, this code is a nightmare to follow.  The existing comment
is useless.  No shit the code is adding in supervisor states for the host.  What's
not AT ALL clear is *why*.

The commit says it's necessary because the "permission bitmap is only relevant
for user states":

  commit 781c64bfcb735960717d1cb45428047ff6a5030c
  Author: Thomas Gleixner <tglx@linutronix.de>
  Date:   Thu Mar 24 14:47:14 2022 +0100

    x86/fpu/xstate: Handle supervisor states in XSTATE permissions
    
    The size calculation in __xstate_request_perm() fails to take supervisor
    states into account because the permission bitmap is only relevant for user
    states.

But @permitted comes from:

  permitted = xstate_get_group_perm(guest);

which is either fpu->guest_perm.__state_perm or fpu->perm.__state_perm.  And
__state_perm is initialized to:

	fpu->perm.__state_perm		= fpu_kernel_cfg.default_features;

where fpu_kernel_cfg.default_features contains everything except the dynamic
xfeatures, i.e. everything except XFEATURE_MASK_XTILE_DATA:

	fpu_kernel_cfg.default_features = fpu_kernel_cfg.max_features;
	fpu_kernel_cfg.default_features &= ~XFEATURE_MASK_USER_DYNAMIC;

So why on earth does this code to force back xfeatures_mask_supervisor()?  Because
the code just below drops the supervisor bits to compute the user xstate size and
then clobbers __state_perm.

	/* Calculate the resulting user state size */
	mask &= XFEATURE_MASK_USER_SUPPORTED;
	usize = xstate_calculate_size(mask, false);

	...

	WRITE_ONCE(perm->__state_perm, mask);

That is beyond asinine.  IIUC, the intent is to apply the permission bitmap only
for user states, because the only dynamic states are user states.  Bbut the above
creates an inconsistent mess.  If userspace doesn't request XTILE_DATA,
__state_perm will contain supervisor states, but once userspace does request
XTILE_DATA, __state_perm will be lost.

And because that's not confusing enough, clobbering __state_perm would also drop
FPU_GUEST_PERM_LOCKED, except that __xstate_request_perm() can' be reached with
said LOCKED flag set.

fpu_xstate_prctl() already strips out supervisor features:

	case ARCH_GET_XCOMP_PERM:
		/*
		 * Lockless snapshot as it can also change right after the
		 * dropping the lock.
		 */
		permitted = xstate_get_host_group_perm();
		permitted &= XFEATURE_MASK_USER_SUPPORTED;
		return put_user(permitted, uptr);

	case ARCH_GET_XCOMP_GUEST_PERM:
		permitted = xstate_get_guest_group_perm();
		permitted &= XFEATURE_MASK_USER_SUPPORTED;
		return put_user(permitted, uptr);

and while KVM doesn't apply the __state_perm to supervisor states, if it did
there would be zero harm in doing so.

	case 0xd: {
		u64 permitted_xcr0 = kvm_get_filtered_xcr0();
		u64 permitted_xss = kvm_caps.supported_xss;

Second, the relying on QEMU to only trigger __xstate_request_perm() is not acceptable.
It "works" for the current code, but only because there's only a single dynamic
feature, i.e. this will short circuit and prevent computing a bad ksize.

	/* Check whether fully enabled */
	if ((permitted & requested) == requested)
		return 0;

I don't know how I can possibly make it any clearer: KVM absolutely must not assume
userspace behavior.

So rather than continue with the current madness, which will break if/when the
next dynamic feature comes along, just preserve non-user xfeatures/flags in
__guest_perm.
 
If there are no objections, I'll test the below and write a proper changelog.
 
--
From: Sean Christopherson <seanjc@google.com>
Date: Thu, 26 Oct 2023 10:17:33 -0700
Subject: [PATCH] x86/fpu/xstate: Always preserve non-user xfeatures/flags in
 __state_perm

Fixes: 781c64bfcb73 ("x86/fpu/xstate: Handle supervisor states in XSTATE permissions")
Signed-off-by: Sean Christopherson <seanjc@google.com>
---
 arch/x86/kernel/fpu/xstate.c | 18 +++++++++++-------
 1 file changed, 11 insertions(+), 7 deletions(-)

diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
index ef6906107c54..73f6bc00d178 100644
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -1601,16 +1601,20 @@ static int __xstate_request_perm(u64 permitted, u64 requested, bool guest)
 	if ((permitted & requested) == requested)
 		return 0;
 
-	/* Calculate the resulting kernel state size */
+	/*
+	 * Calculate the resulting kernel state size.  Note, @permitted also
+	 * contains supervisor xfeatures even though supervisor are always
+	 * permitted for kernel and guest FPUs, and never permitted for user
+	 * FPUs.
+	 */
 	mask = permitted | requested;
-	/* Take supervisor states into account on the host */
-	if (!guest)
-		mask |= xfeatures_mask_supervisor();
 	ksize = xstate_calculate_size(mask, compacted);
 
-	/* Calculate the resulting user state size */
-	mask &= XFEATURE_MASK_USER_SUPPORTED;
-	usize = xstate_calculate_size(mask, false);
+	/*
+	 * Calculate the resulting user state size.  Take care not to clobber
+	 * the supervisor xfeatures in the new mask!
+	 */
+	usize = xstate_calculate_size(mask & XFEATURE_MASK_USER_SUPPORTED, false);
 
 	if (!guest) {
 		ret = validate_sigaltstack(usize);

base-commit: c076acf10c78c0d7e1aa50670e9cc4c91e8d59b4
-- 

  reply	other threads:[~2023-10-26 17:24 UTC|newest]

Thread overview: 120+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-09-14  6:33 [PATCH v6 00/25] Enable CET Virtualization Yang Weijiang
2023-09-14  6:33 ` [PATCH v6 01/25] x86/fpu/xstate: Manually check and add XFEATURE_CET_USER xstate bit Yang Weijiang
2023-09-14 22:39   ` Edgecombe, Rick P
2023-09-15  2:32     ` Yang, Weijiang
2023-09-15 16:35       ` Edgecombe, Rick P
2023-09-18  7:16         ` Yang, Weijiang
2023-10-31 17:43   ` Maxim Levitsky
2023-11-01  9:19     ` Yang, Weijiang
2023-09-14  6:33 ` [PATCH v6 02/25] x86/fpu/xstate: Fix guest fpstate allocation size calculation Yang Weijiang
2023-09-14 22:45   ` Edgecombe, Rick P
2023-09-15  2:45     ` Yang, Weijiang
2023-09-15 16:35       ` Edgecombe, Rick P
2023-10-21  0:39   ` Sean Christopherson
2023-10-24  8:50     ` Yang, Weijiang
2023-10-24 16:32       ` Sean Christopherson
2023-10-25 13:49         ` Yang, Weijiang
2023-10-31 17:43         ` Maxim Levitsky
2023-09-14  6:33 ` [PATCH v6 03/25] x86/fpu/xstate: Add CET supervisor mode state support Yang Weijiang
2023-09-15  0:06   ` Edgecombe, Rick P
2023-09-15  6:30     ` Yang, Weijiang
2023-10-31 17:44       ` Maxim Levitsky
2023-09-14  6:33 ` [PATCH v6 04/25] x86/fpu/xstate: Introduce kernel dynamic xfeature set Yang Weijiang
2023-09-15  0:24   ` Edgecombe, Rick P
2023-09-15  6:42     ` Yang, Weijiang
2023-10-31 17:44       ` Maxim Levitsky
2023-09-14  6:33 ` [PATCH v6 05/25] x86/fpu/xstate: Remove kernel dynamic xfeatures from kernel default_features Yang Weijiang
2023-09-14 16:22   ` Dave Hansen
2023-09-15  1:52     ` Yang, Weijiang
2023-10-31 17:44     ` Maxim Levitsky
2023-09-14  6:33 ` [PATCH v6 06/25] x86/fpu/xstate: Opt-in kernel dynamic bits when calculate guest xstate size Yang Weijiang
2023-09-14 17:40   ` Dave Hansen
2023-09-15  2:22     ` Yang, Weijiang
2023-10-24 17:07       ` Sean Christopherson
2023-10-25 14:49         ` Yang, Weijiang
2023-10-26 17:24           ` Sean Christopherson [this message]
2023-10-26 22:06             ` Edgecombe, Rick P
2023-10-31 17:45             ` Maxim Levitsky
2023-11-01 14:16               ` Sean Christopherson
2023-11-02 18:20                 ` Maxim Levitsky
2023-11-03 14:33                   ` Sean Christopherson
2023-11-07 18:04                     ` Maxim Levitsky
2023-11-14  9:13                       ` Yang, Weijiang
2023-09-14  6:33 ` [PATCH v6 07/25] x86/fpu/xstate: Tweak guest fpstate to support kernel dynamic xfeatures Yang Weijiang
2023-10-31 17:45   ` Maxim Levitsky
2023-09-14  6:33 ` [PATCH v6 08/25] x86/fpu/xstate: WARN if normal fpstate contains " Yang Weijiang
2023-10-31 17:45   ` Maxim Levitsky
2023-09-14  6:33 ` [PATCH v6 09/25] KVM: x86: Rework cpuid_get_supported_xcr0() to operate on vCPU data Yang Weijiang
2023-10-31 17:46   ` Maxim Levitsky
2023-11-01 14:41     ` Sean Christopherson
2023-11-02 18:25       ` Maxim Levitsky
2023-09-14  6:33 ` [PATCH v6 10/25] KVM: x86: Add kvm_msr_{read,write}() helpers Yang Weijiang
2023-10-31 17:47   ` Maxim Levitsky
2023-11-01 19:32     ` Sean Christopherson
2023-11-02 18:26       ` Maxim Levitsky
2023-11-15  9:00         ` Yang, Weijiang
2023-09-14  6:33 ` [PATCH v6 11/25] KVM: x86: Report XSS as to-be-saved if there are supported features Yang Weijiang
2023-10-31 17:47   ` Maxim Levitsky
2023-11-01 19:18     ` Sean Christopherson
2023-11-02 18:31       ` Maxim Levitsky
2023-09-14  6:33 ` [PATCH v6 12/25] KVM: x86: Refresh CPUID on write to guest MSR_IA32_XSS Yang Weijiang
2023-10-08  5:54   ` Chao Gao
2023-10-10  0:49     ` Yang, Weijiang
2023-10-31 17:51   ` Maxim Levitsky
2023-11-01 17:20     ` Sean Christopherson
2023-11-15  7:18   ` Binbin Wu
2023-09-14  6:33 ` [PATCH v6 13/25] KVM: x86: Initialize kvm_caps.supported_xss Yang Weijiang
2023-10-31 17:51   ` Maxim Levitsky
2023-09-14  6:33 ` [PATCH v6 14/25] KVM: x86: Load guest FPU state when access XSAVE-managed MSRs Yang Weijiang
2023-10-31 17:51   ` Maxim Levitsky
2023-11-01 18:05     ` Sean Christopherson
2023-11-02 18:31       ` Maxim Levitsky
2023-11-03  8:46       ` Yang, Weijiang
2023-11-03 14:02         ` Sean Christopherson
2023-09-14  6:33 ` [PATCH v6 15/25] KVM: x86: Add fault checks for guest CR4.CET setting Yang Weijiang
2023-10-31 17:51   ` Maxim Levitsky
2023-09-14  6:33 ` [PATCH v6 16/25] KVM: x86: Report KVM supported CET MSRs as to-be-saved Yang Weijiang
2023-10-08  6:19   ` Chao Gao
2023-10-10  0:54     ` Yang, Weijiang
2023-10-31 17:52   ` Maxim Levitsky
2023-09-14  6:33 ` [PATCH v6 17/25] KVM: VMX: Introduce CET VMCS fields and control bits Yang Weijiang
2023-10-31 17:52   ` Maxim Levitsky
2023-09-14  6:33 ` [PATCH v6 18/25] KVM: x86: Use KVM-governed feature framework to track "SHSTK/IBT enabled" Yang Weijiang
2023-10-31 17:54   ` Maxim Levitsky
2023-11-01 15:46     ` Sean Christopherson
2023-11-02 18:35       ` Maxim Levitsky
2023-11-04  0:07         ` Sean Christopherson
2023-11-07 18:05           ` Maxim Levitsky
2023-09-14  6:33 ` [PATCH v6 19/25] KVM: VMX: Emulate read and write to CET MSRs Yang Weijiang
2023-10-31 17:55   ` Maxim Levitsky
2023-11-01 16:31     ` Sean Christopherson
2023-11-02 18:38       ` Maxim Levitsky
2023-11-02 23:58         ` Sean Christopherson
2023-11-07 18:12           ` Maxim Levitsky
2023-11-07 18:39             ` Sean Christopherson
2023-11-03  8:18       ` Yang, Weijiang
2023-11-03 22:26         ` Sean Christopherson
2023-09-14  6:33 ` [PATCH v6 20/25] KVM: x86: Save and reload SSP to/from SMRAM Yang Weijiang
2023-10-31 17:55   ` Maxim Levitsky
2023-09-14  6:33 ` [PATCH v6 21/25] KVM: VMX: Set up interception for CET MSRs Yang Weijiang
2023-10-31 17:56   ` Maxim Levitsky
2023-09-14  6:33 ` [PATCH v6 22/25] KVM: VMX: Set host constant supervisor states to VMCS fields Yang Weijiang
2023-10-31 17:56   ` Maxim Levitsky
2023-09-14  6:33 ` [PATCH v6 23/25] KVM: x86: Enable CET virtualization for VMX and advertise to userspace Yang Weijiang
2023-09-24 13:38   ` kernel test robot
2023-09-25  0:26     ` Yang, Weijiang
2023-10-31 17:56   ` Maxim Levitsky
2023-11-01 22:14     ` Sean Christopherson
2023-09-14  6:33 ` [PATCH v6 24/25] KVM: nVMX: Introduce new VMX_BASIC bit for event error_code delivery to L1 Yang Weijiang
2023-10-31 17:57   ` Maxim Levitsky
2023-11-01  4:21   ` Chao Gao
2023-11-15  8:31     ` Yang, Weijiang
2023-11-15 13:27       ` Sean Christopherson
2023-09-14  6:33 ` [PATCH v6 25/25] KVM: nVMX: Enable CET support for nested guest Yang Weijiang
2023-10-31 17:57   ` Maxim Levitsky
2023-11-01  2:09   ` Chao Gao
2023-11-01  9:22     ` Yang, Weijiang
2023-11-01  9:54     ` Maxim Levitsky
2023-11-15  8:56       ` Yang, Weijiang
2023-11-15  8:23     ` Yang, Weijiang
2023-09-25  0:31 ` [PATCH v6 00/25] Enable CET Virtualization Yang, Weijiang

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZTqgzZl-reO1m01I@google.com \
    --to=seanjc@google.com \
    --cc=chao.gao@intel.com \
    --cc=dave.hansen@intel.com \
    --cc=john.allen@amd.com \
    --cc=kvm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=pbonzini@redhat.com \
    --cc=peterz@infradead.org \
    --cc=rick.p.edgecombe@intel.com \
    --cc=weijiang.yang@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.