All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Yang, Weijiang" <weijiang.yang@intel.com>
To: Sean Christopherson <seanjc@google.com>
Cc: <pbonzini@redhat.com>, <kvm@vger.kernel.org>,
	<linux-kernel@vger.kernel.org>, <chao.gao@intel.com>,
	<rick.p.edgecombe@intel.com>, <mlevitsk@redhat.com>,
	<john.allen@amd.com>, Aaron Lewis <aaronlewis@google.com>,
	Jim Mattson <jmattson@google.com>,
	Oliver Upton <oupton@google.com>,
	Mingwei Zhang <mizhang@google.com>
Subject: Re: [PATCH v10 20/27] KVM: VMX: Emulate read and write to CET MSRs
Date: Wed, 13 Mar 2024 17:43:12 +0800	[thread overview]
Message-ID: <9f820b96-0e4b-4cdc-93ff-f63aed829f0d@intel.com> (raw)
In-Reply-To: <ZfDdS8rtVtyEr0UR@google.com>

On 3/13/2024 6:55 AM, Sean Christopherson wrote:
> -non-KVM people, +Mingwei, Aaron, Oliver, and Jim
>
> On Sun, Feb 18, 2024, Yang Weijiang wrote:
>>   	case MSR_IA32_PERF_CAPABILITIES:
>>   		if (data && !vcpu_to_pmu(vcpu)->version)
>>   			return 1;
> Ha, perfect, this is already in the diff context.
>
>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>> index c0ed69353674..281c3fe728c5 100644
>> --- a/arch/x86/kvm/x86.c
>> +++ b/arch/x86/kvm/x86.c
>> @@ -1849,6 +1849,36 @@ bool kvm_msr_allowed(struct kvm_vcpu *vcpu, u32 index, u32 type)
>>   }
>>   EXPORT_SYMBOL_GPL(kvm_msr_allowed);
>>   
>> +#define CET_US_RESERVED_BITS		GENMASK(9, 6)
>> +#define CET_US_SHSTK_MASK_BITS		GENMASK(1, 0)
>> +#define CET_US_IBT_MASK_BITS		(GENMASK_ULL(5, 2) | GENMASK_ULL(63, 10))
>> +#define CET_US_LEGACY_BITMAP_BASE(data)	((data) >> 12)
>> +
>> +static bool is_set_cet_msr_allowed(struct kvm_vcpu *vcpu, u32 index, u64 data,
>> +				   bool host_initiated)
>> +{
> ...
>
>> +	/*
>> +	 * If KVM supports the MSR, i.e. has enumerated the MSR existence to
>> +	 * userspace, then userspace is allowed to write '0' irrespective of
>> +	 * whether or not the MSR is exposed to the guest.
>> +	 */
>> +	if (!host_initiated || data)
>> +		return false;
> ...
>
>> @@ -1951,6 +2017,20 @@ static int __kvm_get_msr(struct kvm_vcpu *vcpu, u32 index, u64 *data,
>>   		    !guest_cpuid_has(vcpu, X86_FEATURE_RDPID))
>>   			return 1;
>>   		break;
>> +	case MSR_IA32_U_CET:
>> +	case MSR_IA32_S_CET:
>> +		if (!guest_can_use(vcpu, X86_FEATURE_SHSTK) &&
>> +		    !guest_can_use(vcpu, X86_FEATURE_IBT))
>> +			return 1;
> As pointed out by Mingwei in a conversation about PERF_CAPABILITIES, rejecting
> host *reads* while allowing host writes of '0' is inconsistent.  Which, while
> arguably par for the course for KVM's ABI, will likely result in the exact problem
> we're trying to avoid: killing userspace because it attempts to access an MSR KVM
> has said exists.

Thank you for the notification!
Agree on it.

>
> PERF_CAPABILITIES has a similar, but opposite, problem where KVM returns a non-zero
> value on reads, but rejects that same non-zero value on write.  PERF_CAPABILITIES
> is even more complicated because KVM stuff a non-zero value at vCPU creation, but
> that's not really relevant to this discussion, just another data point for how
> messed up this all is.
>
> Also relevant to this discussion are KVM's PV MSRs, e.g. MSR_KVM_ASYNC_PF_ACK,
> as KVM rejects attempts to write '0' if the guest doesn't support the MSR, but
> if and only userspace has enabled KVM_CAP_ENFORCE_PV_FEATURE_CPUID.
>
> Coming to the point, this mess is getting too hard to maintain, both from a code
> perspective and "what is KVM's ABI?" perspective.
>
> Rather than play whack-a-mole and inevitably end up with bugs and/or inconsistencies,
> what if we (a) return KVM_MSR_RET_INVALID when an MSR access is denied based on
> guest CPUID,

Can we define a new return value KVM_MSR_RET_REJECTED for this case in order to tell it from KVM_MSR_RET_INVALID which means the msr index doesn't exit?
> (b) wrap userspace MSR accesses at the very top level and convert
> KVM_MSR_RET_INVALID to "success" when KVM reported the MSR as savable and userspace
> is reading or writing '0',

Yes, this can limit the change on KVM side.

> and (c) drop all of the host_initiated checks that
> exist purely to exempt userspace access from guest CPUID checks.
>
> The only possible hiccup I can think of is that this could subtly break userspace
> that is setting CPUID _after_ MSRs, but my understanding is that we've agreed to
> draw a line and say that that's unsupported.

Yeah,  it would mess up things.

> And I think it's low risk, because
> I don't see how code like this:
>
> 	case MSR_TSC_AUX:
> 		if (!kvm_is_supported_user_return_msr(MSR_TSC_AUX))
> 			return 1;
>
> 		if (!host_initiated &&
> 		    !guest_cpuid_has(vcpu, X86_FEATURE_RDTSCP) &&
> 		    !guest_cpuid_has(vcpu, X86_FEATURE_RDPID))
> 			return 1;
>
> 		if (guest_cpuid_is_intel(vcpu) && (data >> 32) != 0)
> 			return 1;
>
> can possibly work if userspace sets MSRs first.  The RDTSCP/RDPID checks are
> exempt, but the vendor in guest CPUID would be '0', not Intel's magic string,
> and so setting MSRs before CPUID would fail, at least if the target vCPU model
> is Intel.
>
> P.S. I also want to rename KVM_MSR_RET_INVALID => KVM_MSR_RET_UNSUPPORTED, because
> I can never remember that "invalid" doesn't mean the value was invalid, it means
> the MSR index was invalid.

So do I :-)

>
> It'll take a few patches, but I believe we can end up with something like this:
>
> static bool kvm_is_msr_to_save(u32 msr_index)
> {
> 	unsigned int i;
>
> 	for (i = 0; i < num_msrs_to_save; i++) {
> 		if (msrs_to_save[i] == msr_index)
> 			return true;
> 	}

Should we also check emulated_msrs list here since KVM_GET_MSR_INDEX_LIST exposes it too?

>
> 	return false;
> }
> typedef int (*msr_uaccess_t)(struct kvm_vcpu *vcpu, u32 index, u64 *data,
> 			     bool host_initiated);
>
> static __always_inline int kvm_do_msr_uaccess(struct kvm_vcpu *vcpu, u32 msr,
> 					      u64 *data, bool host_initiated,
> 					      enum kvm_msr_access rw,
> 					      msr_uaccess_t msr_uaccess_fn)
> {
> 	const char *op = rw == MSR_TYPE_W ? "wrmsr" : "rdmsr";
> 	int ret;
>
> 	BUILD_BUG_ON(rw != MSR_TYPE_R && rw != MSR_TYPE_W);
>
> 	/*
> 	 * Zero the data on read failures to avoid leaking stack data to the
> 	 * guest and/or userspace, e.g. if the failure is ignored below.
> 	 */
> 	ret = msr_uaccess_fn(vcpu, msr, data, host_initiated);
> 	if (ret && rw == MSR_TYPE_R)
> 		*data = 0;
>
> 	if (ret != KVM_MSR_RET_UNSUPPORTED)
> 		return ret;
>
> 	/*
> 	 * Userspace is allowed to read MSRs, and write '0' to MSRs, that KVM
> 	 * reports as to-be-saved, even if an MSRs isn't fully supported.
> 	 * Simply check that @data is '0', which covers both the write '0' case
> 	 * and all reads (in which case @data is zeroed on failure; see above).
> 	 */
> 	if (kvm_is_msr_to_save(msr) && !*data)
> 		return 0;
>
> 	if (!ignore_msrs) {
> 		kvm_debug_ratelimited("unhandled %s: 0x%x data 0x%llx\n",
> 				      op, msr, *data);
> 		return ret;
> 	}
>
> 	if (report_ignored_msrs)
> 		kvm_pr_unimpl("ignored %s: 0x%x data 0x%llx\n", op, msr, *data);
> 	
> 	return 0;
> }

The handling flow looks good to me. Thanks a lot!



  reply	other threads:[~2024-03-13  9:43 UTC|newest]

Thread overview: 93+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-02-19  7:47 [PATCH v10 00/27] Enable CET Virtualization Yang Weijiang
2024-02-19  7:47 ` [PATCH v10 01/27] x86/fpu/xstate: Always preserve non-user xfeatures/flags in __state_perm Yang Weijiang
2024-02-19  7:47 ` [PATCH v10 02/27] x86/fpu/xstate: Refine CET user xstate bit enabling Yang Weijiang
2024-02-19  7:47 ` [PATCH v10 03/27] x86/fpu/xstate: Add CET supervisor mode state support Yang Weijiang
2024-02-19  7:47 ` [PATCH v10 04/27] x86/fpu/xstate: Introduce XFEATURE_MASK_KERNEL_DYNAMIC xfeature set Yang Weijiang
2024-05-01 18:45   ` Sean Christopherson
2024-05-02 17:46     ` Dave Hansen
2024-05-07 22:57       ` Sean Christopherson
2024-05-07 23:17         ` Dave Hansen
2024-05-08  1:19           ` Yang, Weijiang
2024-02-19  7:47 ` [PATCH v10 05/27] x86/fpu/xstate: Introduce fpu_guest_cfg for guest FPU configuration Yang Weijiang
2024-02-19  7:47 ` [PATCH v10 06/27] x86/fpu/xstate: Create guest fpstate with guest specific config Yang Weijiang
2024-02-19  7:47 ` [PATCH v10 07/27] x86/fpu/xstate: Warn if kernel dynamic xfeatures detected in normal fpstate Yang Weijiang
2024-02-19  7:47 ` [PATCH v10 08/27] KVM: x86: Rework cpuid_get_supported_xcr0() to operate on vCPU data Yang Weijiang
2024-02-19  7:47 ` [PATCH v10 09/27] KVM: x86: Rename kvm_{g,s}et_msr()* to menifest emulation operations Yang Weijiang
2024-05-01 18:54   ` Sean Christopherson
2024-05-06  5:58     ` Yang, Weijiang
2024-02-19  7:47 ` [PATCH v10 10/27] KVM: x86: Refine xsave-managed guest register/MSR reset handling Yang Weijiang
2024-02-20  3:04   ` Chao Gao
2024-02-20 13:23     ` Yang, Weijiang
2024-05-01 20:40   ` Sean Christopherson
2024-05-06  7:26     ` Yang, Weijiang
2024-02-19  7:47 ` [PATCH v10 11/27] KVM: x86: Add kvm_msr_{read,write}() helpers Yang Weijiang
2024-02-19  7:47 ` [PATCH v10 12/27] KVM: x86: Report XSS as to-be-saved if there are supported features Yang Weijiang
2024-02-19  7:47 ` [PATCH v10 13/27] KVM: x86: Refresh CPUID on write to guest MSR_IA32_XSS Yang Weijiang
2024-02-20  8:51   ` Chao Gao
2024-05-01 20:43   ` Sean Christopherson
2024-05-06  7:30     ` Yang, Weijiang
2024-02-19  7:47 ` [PATCH v10 14/27] KVM: x86: Initialize kvm_caps.supported_xss Yang Weijiang
2024-02-19  7:47 ` [PATCH v10 15/27] KVM: x86: Load guest FPU state when access XSAVE-managed MSRs Yang Weijiang
2024-02-19  7:47 ` [PATCH v10 16/27] KVM: x86: Add fault checks for guest CR4.CET setting Yang Weijiang
2024-02-19  7:47 ` [PATCH v10 17/27] KVM: x86: Report KVM supported CET MSRs as to-be-saved Yang Weijiang
2024-05-01 22:40   ` Sean Christopherson
2024-05-06  8:31     ` Yang, Weijiang
2024-05-07 17:27       ` Sean Christopherson
2024-05-08  7:00         ` Yang, Weijiang
2024-02-19  7:47 ` [PATCH v10 18/27] KVM: VMX: Introduce CET VMCS fields and control bits Yang Weijiang
2024-02-19  7:47 ` [PATCH v10 19/27] KVM: x86: Use KVM-governed feature framework to track "SHSTK/IBT enabled" Yang Weijiang
2024-02-19  7:47 ` [PATCH v10 20/27] KVM: VMX: Emulate read and write to CET MSRs Yang Weijiang
2024-03-12 22:55   ` Sean Christopherson
2024-03-13  9:43     ` Yang, Weijiang [this message]
2024-03-13 16:00       ` Sean Christopherson
2024-02-19  7:47 ` [PATCH v10 21/27] KVM: x86: Save and reload SSP to/from SMRAM Yang Weijiang
2024-05-01 22:50   ` Sean Christopherson
2024-05-06  8:41     ` Yang, Weijiang
2024-02-19  7:47 ` [PATCH v10 22/27] KVM: VMX: Set up interception for CET MSRs Yang Weijiang
2024-05-01 23:07   ` Sean Christopherson
2024-05-06  8:48     ` Yang, Weijiang
2024-02-19  7:47 ` [PATCH v10 23/27] KVM: VMX: Set host constant supervisor states to VMCS fields Yang Weijiang
2024-02-19  7:47 ` [PATCH v10 24/27] KVM: x86: Enable CET virtualization for VMX and advertise to userspace Yang Weijiang
2024-05-01 23:15   ` Sean Christopherson
2024-05-01 23:24     ` Edgecombe, Rick P
2024-05-06  9:19     ` Yang, Weijiang
2024-05-06 16:54       ` Sean Christopherson
2024-05-07  2:37         ` Yang, Weijiang
2024-05-06 17:05       ` Edgecombe, Rick P
2024-05-06 23:33         ` Sean Christopherson
2024-05-06 23:53           ` Edgecombe, Rick P
2024-05-07 14:21             ` Sean Christopherson
2024-05-07 14:45               ` Edgecombe, Rick P
2024-05-07 15:08                 ` Sean Christopherson
2024-05-07 15:33                   ` Edgecombe, Rick P
2024-05-16  7:13     ` Yang, Weijiang
2024-05-16 14:39       ` Sean Christopherson
2024-05-16 15:36         ` Dave Hansen
2024-05-16 16:58           ` Sean Christopherson
2024-05-17  8:27             ` Yang, Weijiang
2024-05-17  8:57         ` Thomas Gleixner
2024-05-17 14:26           ` Sean Christopherson
2024-05-20  9:43             ` Yang, Weijiang
2024-05-20 17:09               ` Sean Christopherson
2024-05-20 17:15                 ` Dave Hansen
2024-05-22  9:03                   ` Yang, Weijiang
2024-05-22 15:06                     ` Edgecombe, Rick P
2024-05-23 10:07                       ` Yang, Weijiang
2024-05-22  8:41                 ` Yang, Weijiang
2024-05-01 23:34   ` Sean Christopherson
2024-05-06  9:41     ` Yang, Weijiang
2024-05-16  7:20       ` Yang, Weijiang
2024-05-16 14:43         ` Sean Christopherson
2024-05-17  8:04           ` Yang, Weijiang
2024-02-19  7:47 ` [PATCH v10 25/27] KVM: nVMX: Introduce new VMX_BASIC bit for event error_code delivery to L1 Yang Weijiang
2024-05-01 23:19   ` Sean Christopherson
2024-05-06  9:19     ` Yang, Weijiang
2024-02-19  7:47 ` [PATCH v10 26/27] KVM: nVMX: Enable CET support for nested guest Yang Weijiang
2024-05-01 23:23   ` Sean Christopherson
2024-05-06  9:25     ` Yang, Weijiang
2024-02-19  7:47 ` [PATCH v10 27/27] KVM: x86: Don't emulate instructions guarded by CET Yang Weijiang
2024-05-01 23:24   ` Sean Christopherson
2024-05-06  9:26     ` Yang, Weijiang
2024-03-06 14:44 ` [PATCH v10 00/27] Enable CET Virtualization Yang, Weijiang
2024-05-01 23:27 ` Sean Christopherson
2024-05-06  9:31   ` Yang, Weijiang

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=9f820b96-0e4b-4cdc-93ff-f63aed829f0d@intel.com \
    --to=weijiang.yang@intel.com \
    --cc=aaronlewis@google.com \
    --cc=chao.gao@intel.com \
    --cc=jmattson@google.com \
    --cc=john.allen@amd.com \
    --cc=kvm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mizhang@google.com \
    --cc=mlevitsk@redhat.com \
    --cc=oupton@google.com \
    --cc=pbonzini@redhat.com \
    --cc=rick.p.edgecombe@intel.com \
    --cc=seanjc@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.