All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Yang, Weijiang" <weijiang.yang@intel.com>
To: Sean Christopherson <seanjc@google.com>
Cc: Chao Gao <chao.gao@intel.com>, <pbonzini@redhat.com>,
	<peterz@infradead.org>, <john.allen@amd.com>,
	<kvm@vger.kernel.org>, <linux-kernel@vger.kernel.org>,
	<rick.p.edgecombe@intel.com>, <binbin.wu@linux.intel.com>
Subject: Re: [PATCH v5 09/19] KVM:x86: Make guest supervisor states as non-XSAVE managed
Date: Wed, 9 Aug 2023 10:39:16 +0800	[thread overview]
Message-ID: <4826a406-c06b-c383-9e1e-60916b24332b@intel.com> (raw)
In-Reply-To: <ZM1jV3UPL0AMpVDI@google.com>

On 8/5/2023 4:45 AM, Sean Christopherson wrote:
> On Fri, Aug 04, 2023, Weijiang Yang wrote:
>> On 8/3/2023 7:15 PM, Chao Gao wrote:
>>> On Thu, Aug 03, 2023 at 12:27:22AM -0400, Yang Weijiang wrote:
>>>> +void save_cet_supervisor_ssp(struct kvm_vcpu *vcpu)
>>>> +{
>>>> +	if (unlikely(guest_can_use(vcpu, X86_FEATURE_SHSTK))) {
> Drop the unlikely, KVM should not speculate on the guest configuration or underlying
> hardware.
OK.
>>>> +		rdmsrl(MSR_IA32_PL0_SSP, vcpu->arch.cet_s_ssp[0]);
>>>> +		rdmsrl(MSR_IA32_PL1_SSP, vcpu->arch.cet_s_ssp[1]);
>>>> +		rdmsrl(MSR_IA32_PL2_SSP, vcpu->arch.cet_s_ssp[2]);
>>>> +		/*
>>>> +		 * Omit reset to host PL{1,2}_SSP because Linux will never use
>>>> +		 * these MSRs.
>>>> +		 */
>>>> +		wrmsrl(MSR_IA32_PL0_SSP, 0);
>>> This wrmsrl() can be dropped because host doesn't support SSS yet.
>> Frankly speaking, I want to remove this line of code. But that would mess up the MSR
>> on host side, i.e., from host perspective, the MSRs could be filled with garbage data,
>> and looks awful.
> So?  :-)
>
> That's the case for all of the MSRs that KVM defers restoring until the host
> returns to userspace, i.e. running in the host with bogus values in hardware is
> nothing new.
CET PL{0,1,2}_SSP are a bit different from other MSRs, the latter will be reloaded with host values
at some points after VM-Exit, but the CET MSRs are "leaked" and never be handled anywhere.
>
> And as I mentioned in the other thread regarding the assertion that SSS isn't
> enabled in the host, sanitizing hardware values for something that should never
> be consumed is a fools errand.
>
>> Anyway, I can remove it.
> Yes please, though it may be a moot point.
>
>>>> +	}
>>>> +}
>>>> +EXPORT_SYMBOL_GPL(save_cet_supervisor_ssp);
>>>> +
>>>> +void reload_cet_supervisor_ssp(struct kvm_vcpu *vcpu)
>>>> +{
>>>> +	if (unlikely(guest_can_use(vcpu, X86_FEATURE_SHSTK))) {
>>> ditto
>> Below is to reload guest supervisor SSPs instead of resetting host ones.
>>>> +		wrmsrl(MSR_IA32_PL0_SSP, vcpu->arch.cet_s_ssp[0]);
>>>> +		wrmsrl(MSR_IA32_PL1_SSP, vcpu->arch.cet_s_ssp[1]);
>>>> +		wrmsrl(MSR_IA32_PL2_SSP, vcpu->arch.cet_s_ssp[2]);
> Pulling back in the justification from v3:
>
>   the Pros:
>    - Super easy to implement for KVM.
>    - Automatically avoids saving and restoring this data when the vmexit
>      is handled within KVM.
>
>   the Cons:
>    - Unnecessarily restores XFEATURE_CET_KERNEL when switching to
>      non-KVM task's userspace.
>    - Forces allocating space for this state on all tasks, whether or not
>      they use KVM, and with likely zero users today and the near future.
>    - Complicates the FPU optimization thinking by including things that
>      can have no affect on userspace in the FPU
>
> IMO the pros far outweigh the cons.  3x RDMSR and 3x WRMSR when loading host/guest
> state is non-trivial overhead.  That can be mitigated, e.g. by utilizing the
> user return MSR framework, but it's still unpalatable.  It's unlikely many guests
> will SSS in the *near* future, but I don't want to end up with code that performs
> poorly in the future and needs to be rewritten.
> Especially because another big negative is that not utilizing XSTATE bleeds into
> KVM's ABI.  Userspace has to be told to manually save+restore MSRs instead of just
> letting KVM_{G,S}ET_XSAVE handle the state.  And that will create a bit of a
> snafu if Linux does gain support for SSS.
>
> On the other hand, the extra per-task memory is all of 24 bytes.  AFAICT, there's
> literally zero effect on guest XSTATE allocations because those are vmalloc'd and
> thus rounded up to PAGE_SIZE, i.e. the next 4KiB.  And XSTATE needs to be 64-byte
> aligned, so the 24 bytes is only actually meaningful if the current size is within
> 24 bytes of the next cahce line.  And the "current" size is variable depending on
> which features are present and enabled, i.e. it's a roll of the dice as to whether
> or not using XSTATE for supervisor CET would actually increase memory usage.  And
> _if_ it does increase memory consumption, I have a very hard time believing an
> extra 64 bytes in the worst case scenario is a dealbreaker.
>
> If the performance is a concern, i.e. we don't want to eat saving/restoring the
> MSRs when switching to/from host FPU context, then I *think* that's simply a matter
> of keeping guest state resident when loading non-guest FPU state.
>
> diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
> index 1015af1ae562..8e7599e3b923 100644
> --- a/arch/x86/kernel/fpu/core.c
> +++ b/arch/x86/kernel/fpu/core.c
> @@ -167,6 +167,16 @@ void restore_fpregs_from_fpstate(struct fpstate *fpstate, u64 mask)
>                   */
>                  xfd_update_state(fpstate);
>   
> +               /*
> +                * Leave supervisor CET state as-is when loading host state
> +                * (kernel or userspace).  Supervisor CET state is managed via
> +                * XSTATE for KVM guests, but the host never consumes said
> +                * state (doesn't support supervisor shadow stacks), i.e. it's
> +                * safe to keep guest state loaded into hardware.
> +                */
> +               if (!fpstate->is_guest)
> +                       mask &= ~XFEATURE_MASK_CET_KERNEL;
> +
>                  /*
>                   * Restoring state always needs to modify all features
>                   * which are in @mask even if the current task cannot use
>
>
> So unless I'm missing something, NAK to this approach, at least not without trying
> the kernel FPU approach, i.e. I want somelike like to PeterZ or tglx to actually
> full on NAK the kernel approach before we consider shoving a hack into KVM.
I will discuss it with the stakeholders, and get back to this when it's clear. Thanks!

  parent reply	other threads:[~2023-08-09  2:39 UTC|newest]

Thread overview: 82+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-08-03  4:27 [PATCH v5 00/19] Enable CET Virtualization Yang Weijiang
2023-08-03  4:27 ` [PATCH v5 01/19] x86/cpufeatures: Add CPU feature flags for shadow stacks Yang Weijiang
2023-08-03  4:27 ` [PATCH v5 02/19] x86/fpu/xstate: Introduce CET MSR and XSAVES supervisor states Yang Weijiang
2023-08-03  4:27 ` [PATCH v5 03/19] KVM:x86: Report XSS as to-be-saved if there are supported features Yang Weijiang
2023-08-03  4:27 ` [PATCH v5 04/19] KVM:x86: Refresh CPUID on write to guest MSR_IA32_XSS Yang Weijiang
2023-08-04 16:02   ` Sean Christopherson
2023-08-04 21:43     ` Paolo Bonzini
2023-08-09  3:11       ` Yang, Weijiang
2023-08-08 14:20     ` Yang, Weijiang
2023-08-04 18:27   ` Sean Christopherson
2023-08-07  6:55     ` Paolo Bonzini
2023-08-09  8:56     ` Yang, Weijiang
2023-08-10  0:01       ` Paolo Bonzini
2023-08-10  1:12         ` Yang, Weijiang
2023-08-03  4:27 ` [PATCH v5 05/19] KVM:x86: Initialize kvm_caps.supported_xss Yang Weijiang
2023-08-04 18:45   ` Sean Christopherson
2023-08-08 15:08     ` Yang, Weijiang
2023-08-03  4:27 ` [PATCH v5 06/19] KVM:x86: Load guest FPU state when access XSAVE-managed MSRs Yang Weijiang
2023-08-03  4:27 ` [PATCH v5 07/19] KVM:x86: Add fault checks for guest CR4.CET setting Yang Weijiang
2023-08-03  9:07   ` Chao Gao
2023-08-03  4:27 ` [PATCH v5 08/19] KVM:x86: Report KVM supported CET MSRs as to-be-saved Yang Weijiang
2023-08-03 10:39   ` Chao Gao
2023-08-04  3:13     ` Yang, Weijiang
2023-08-04  5:51       ` Chao Gao
2023-08-04 18:51         ` Sean Christopherson
2023-08-04 22:01           ` Paolo Bonzini
2023-08-08 15:16           ` Yang, Weijiang
2023-08-06  8:54         ` Yang, Weijiang
2023-08-04 18:55   ` Sean Christopherson
2023-08-08 15:26     ` Yang, Weijiang
2023-08-04 21:47   ` Paolo Bonzini
2023-08-09  3:14     ` Yang, Weijiang
2023-08-03  4:27 ` [PATCH v5 09/19] KVM:x86: Make guest supervisor states as non-XSAVE managed Yang Weijiang
2023-08-03 11:15   ` Chao Gao
2023-08-04  3:26     ` Yang, Weijiang
2023-08-04 20:45       ` Sean Christopherson
2023-08-04 20:59         ` Peter Zijlstra
2023-08-04 21:32         ` Paolo Bonzini
2023-08-09  2:51           ` Yang, Weijiang
2023-08-09  2:39         ` Yang, Weijiang [this message]
2023-08-10  9:29         ` Yang, Weijiang
2023-08-10 14:29           ` Dave Hansen
2023-08-10 15:15             ` Paolo Bonzini
2023-08-10 15:37               ` Sean Christopherson
2023-08-11  3:03               ` Yang, Weijiang
2023-08-28 21:00               ` Dave Hansen
2023-08-29  7:05                 ` Yang, Weijiang
2023-08-03  4:27 ` [PATCH v5 10/19] KVM:VMX: Introduce CET VMCS fields and control bits Yang Weijiang
2023-08-03  4:27 ` [PATCH v5 11/19] KVM:VMX: Emulate read and write to CET MSRs Yang Weijiang
2023-08-04  5:14   ` Chao Gao
2023-08-04 21:27     ` Sean Christopherson
2023-08-04 21:45       ` Paolo Bonzini
2023-08-04 22:21         ` Sean Christopherson
2023-08-07  7:03           ` Paolo Bonzini
2023-08-06  8:44       ` Yang, Weijiang
2023-08-07  7:00         ` Paolo Bonzini
2023-08-04  8:28   ` Chao Gao
2023-08-09  7:12     ` Yang, Weijiang
2023-08-04 21:40   ` Paolo Bonzini
2023-08-09  3:05     ` Yang, Weijiang
2023-08-03  4:27 ` [PATCH v5 12/19] KVM:x86: Save and reload SSP to/from SMRAM Yang Weijiang
2023-08-04  7:53   ` Chao Gao
2023-08-04 15:25     ` Sean Christopherson
2023-08-06  9:14       ` Yang, Weijiang
2023-08-03  4:27 ` [PATCH v5 13/19] KVM:VMX: Set up interception for CET MSRs Yang Weijiang
2023-08-04  8:16   ` Chao Gao
2023-08-06  9:22     ` Yang, Weijiang
2023-08-07  1:16       ` Chao Gao
2023-08-09  6:11         ` Yang, Weijiang
2023-08-03  4:27 ` [PATCH v5 14/19] KVM:VMX: Set host constant supervisor states to VMCS fields Yang Weijiang
2023-08-04  8:23   ` Chao Gao
2023-08-03  4:27 ` [PATCH v5 15/19] KVM:x86: Optimize CET supervisor SSP save/reload Yang Weijiang
2023-08-04  8:43   ` Chao Gao
2023-08-09  9:00     ` Yang, Weijiang
2023-08-03  4:27 ` [PATCH v5 16/19] KVM:x86: Enable CET virtualization for VMX and advertise to userspace Yang Weijiang
2023-08-03  4:27 ` [PATCH v5 17/19] KVM:x86: Enable guest CET supervisor xstate bit support Yang Weijiang
2023-08-04 22:02   ` Paolo Bonzini
2023-08-09  6:07     ` Yang, Weijiang
2023-08-03  4:27 ` [PATCH v5 18/19] KVM:nVMX: Refine error code injection to nested VM Yang Weijiang
2023-08-04 21:38   ` Sean Christopherson
2023-08-09  3:00     ` Yang, Weijiang
2023-08-03  4:27 ` [PATCH v5 19/19] KVM:nVMX: Enable CET support for " Yang Weijiang

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4826a406-c06b-c383-9e1e-60916b24332b@intel.com \
    --to=weijiang.yang@intel.com \
    --cc=binbin.wu@linux.intel.com \
    --cc=chao.gao@intel.com \
    --cc=john.allen@amd.com \
    --cc=kvm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=pbonzini@redhat.com \
    --cc=peterz@infradead.org \
    --cc=rick.p.edgecombe@intel.com \
    --cc=seanjc@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.