linux-kselftest.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "Tian, Kevin" <kevin.tian@intel.com>
To: "Christopherson,, Sean" <seanjc@google.com>,
	"Zhong, Yang" <yang.zhong@intel.com>
Cc: "x86@kernel.org" <x86@kernel.org>,
	"kvm@vger.kernel.org" <kvm@vger.kernel.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"linux-doc@vger.kernel.org" <linux-doc@vger.kernel.org>,
	"linux-kselftest@vger.kernel.org"
	<linux-kselftest@vger.kernel.org>,
	"tglx@linutronix.de" <tglx@linutronix.de>,
	"mingo@redhat.com" <mingo@redhat.com>,
	"bp@alien8.de" <bp@alien8.de>,
	"dave.hansen@linux.intel.com" <dave.hansen@linux.intel.com>,
	"pbonzini@redhat.com" <pbonzini@redhat.com>,
	"corbet@lwn.net" <corbet@lwn.net>,
	"shuah@kernel.org" <shuah@kernel.org>,
	"Nakajima, Jun" <jun.nakajima@intel.com>,
	"jing2.liu@linux.intel.com" <jing2.liu@linux.intel.com>,
	"Liu, Jing2" <jing2.liu@intel.com>,
	"Zeng, Guang" <guang.zeng@intel.com>,
	"Wang, Wei W" <wei.w.wang@intel.com>
Subject: RE: [PATCH v4 08/21] kvm: x86: Check and enable permitted dynamic xfeatures at KVM_SET_CPUID2
Date: Thu, 30 Dec 2021 02:28:25 +0000	[thread overview]
Message-ID: <BN9PR11MB5276220CE6BE21025797A7BC8C459@BN9PR11MB5276.namprd11.prod.outlook.com> (raw)
In-Reply-To: <YcyS8lG7vq+jJtLy@google.com>

> From: Sean Christopherson <seanjc@google.com>
> Sent: Thursday, December 30, 2021 12:55 AM
> 
> On Wed, Dec 29, 2021, Yang Zhong wrote:
> > From: Jing Liu <jing2.liu@intel.com>
> >
> > Guest xstate permissions should be set by userspace VMM before vcpu
> > creation. Extend KVM_SET_CPUID2 to verify that every feature reported
> > in CPUID[0xD] has proper permission set. If permission allows, enable
> > all xfeatures in guest cpuid with fpstate buffer sized accordingly.
> >
> > This avoids introducing new KVM exit reason for reporting permission
> > violation to userspace VMM at run-time and also removes the need of
> > tricky fpstate buffer expansion in the emulation and restore path of
> > XCR0 and IA32_XFD MSR.
> 
> How so?  __do_cpuid_func() restricts what is advertised to userspace based
> on
> xstate_get_guest_group_perm(), so it's not like KVM is advertising something
> it
> can't provide?  There should never be any danger to KVM that's mitigated by
> restricing guest CPUID because KVM can and should check vcpu-
> >arch.guest_fpu.perm
> instead of guest CPUID.

Well, above explains why we choose to expand fpstate buffer at 
KVM_SET_CPUID2 instead of in the emulation path when required
permissions have been set, as discussed here:

https://lore.kernel.org/all/20211214024948.048572883@linutronix.de/

> 
> In other words, I believe you're conflating the overall approach of requiring
> userspace to pre-acquire the necessary permissions with enforcing what
> userspace
> advertises to the guest.
> 
> > Signed-off-by: Jing Liu <jing2.liu@intel.com>
> > Signed-off-by: Kevin Tian <kevin.tian@intel.com>
> > Signed-off-by: Yang Zhong <yang.zhong@intel.com>
> > ---
> >  arch/x86/kvm/cpuid.c | 62 +++++++++++++++++++++++++++++++++----------
> -
> >  1 file changed, 47 insertions(+), 15 deletions(-)
> >
> > diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
> > index 4855344091b8..acbc10db550e 100644
> > --- a/arch/x86/kvm/cpuid.c
> > +++ b/arch/x86/kvm/cpuid.c
> > @@ -81,9 +81,12 @@ static inline struct kvm_cpuid_entry2
> *cpuid_entry2_find(
> >  	return NULL;
> >  }
> >
> > -static int kvm_check_cpuid(struct kvm_cpuid_entry2 *entries, int nent)
> > +static int kvm_check_cpuid(struct kvm_vcpu *vcpu,
> > +			   struct kvm_cpuid_entry2 *entries,
> > +			   int nent)
> >  {
> >  	struct kvm_cpuid_entry2 *best;
> > +	int r = 0;
> >
> >  	/*
> >  	 * The existing code assumes virtual address is 48-bit or 57-bit in the
> > @@ -93,11 +96,40 @@ static int kvm_check_cpuid(struct
> kvm_cpuid_entry2 *entries, int nent)
> >  	if (best) {
> >  		int vaddr_bits = (best->eax & 0xff00) >> 8;
> >
> > -		if (vaddr_bits != 48 && vaddr_bits != 57 && vaddr_bits != 0)
> > -			return -EINVAL;
> > +		if (vaddr_bits != 48 && vaddr_bits != 57 && vaddr_bits != 0) {
> > +			r = -EINVAL;
> > +			goto out;
> 
> Please don't change this to a goto, a return is perfectly ok and more readable
> as it doesn't imply there's some functional change that needs to be unwound
> at
> the end.

will fix

> 
> > +		}
> >  	}
> >
> > -	return 0;
> > +	/*
> > +	 * Check guest permissions for dynamically-enabled xfeatures.
> > +	 * Userspace VMM is expected to acquire permission before vCPU
> > +	 * creation. If permission allows, enable all xfeatures with
> > +	 * fpstate buffer sized accordingly. This avoids complexity of
> > +	 * run-time expansion in the emulation and restore path of XCR0
> > +	 * and IA32_XFD MSR.
> > +	 */
> > +	best = cpuid_entry2_find(entries, nent, 0xd, 0);
> > +	if (best) {
> > +		u64 xfeatures;
> > +
> > +		xfeatures = best->eax | ((u64)best->edx << 32);
> > +		if (xfeatures & ~vcpu->arch.guest_fpu.perm) {
> > +			r = -ENXIO;
> 
> ENXIO is a rather odd error code for insufficient permissions, especially since
> the FPU returns -EPERM for what is effectively the same check.
> 
> > +			goto out;
> > +		}
> > +
> > +		if (xfeatures != vcpu->arch.guest_fpu.xfeatures) {
> 
> xfeatures is obviously not consumed anywhere, which is super confusing and
> arguably wrong, e.g. if userspace advertises xfeatures that are a subset of
> vcpu->arch.guest_fpu.perm, this will expand XSAVE state beyond what
> userspace
> actually wants to advertise to the guest.  The really confusing case would be
> if
> userspace reduced xfeatures relative to vcpu->arch.guest_fpu.xfeatures and
> got
> an -ENOMEM due to the FPU failing to expand the XSAVE size.

You are right.

> 
> I don't care about the waste of memory, and IIUC userspace would have to
> intentionally request permissions for the guest that it then ignores, but that
> doesn't make the code any less confusing.  And as written, this check also
> prevents
> advertising non-XFD features that are not supported in hardware.  I doubt
> there's
> a production use case for that (though MPX deprecation comes close), but
> I've
> certainly exposed unsupported features to a guest for testing purposes.
> 
> Rather than bleed details from the FPU into KVM, why not have the FPU do
> any and
> all checks?  That also gives the FPU access to requested xfeatures so that it
> can opportunistically avoid unnecessary expansion.  We can also tweak the
> kernel
> APIs to be more particular about input values.

All above makes sense, especially when we combine permission check
and buffer expansion in one step now.

> 
> At that point, I would be ok with fpu_update_guest_perm_features()
> rejecting
> attempts to advertise features that are not permitted, because then it's an
> FPU
> policy, not a KVM policy, and there's a valid reason for said policy.  It's a bit
> of a pedantic distinction, but to me it matters because having KVM explicitly
> restrict guest CPUID implies that doing so is necessary for KVM correctness,
> which
> AFAICT is not the case.
> 
> E.g. in KVM
> 
> 	/*
> 	 * Exposing dynamic xfeatures to the guest requires additional
> enabling
> 	 * in the FPU, e.g. to expand the guest XSAVE state size.
> 	 */
> 	best = cpuid_entry2_find(entries, nent, 0xd, 0);
> 	if (!best)
> 		return 0;
> 
> 	xfeatures = best->eax | ((u64)best->edx << 32);
> 	xfeatures &= XFEATURE_MASK_USER_DYNAMIC;
> 	if (!xfeatures)
> 		return 0;
> 
> 	return fpu_enable_guest_xfd_features(&vcpu->arch.guest_fpu,
> xfeatures);
> 
> and then
> 
>   int fpu_enable_guest_xfd_features(struct fpu_guest *guest_fpu, u64
> xfeatures)
>   {
> 	lockdep_assert_preemption_enabled();
> 
> 	/* Nothing to do if all requested features are already enabled. */
> 	xfeatures &= ~guest_fpu->xfeatures;
> 	if (!xfeatures)
> 		return 0;
> 
> 	/* Dynamic xfeatures are not supported with 32-bit kernels. */
> 	if (!IS_ENABLED(CONFIG_X86_64))
> 		return -EPERM;
> 
> 	return __xfd_enable_feature(xfeatures, guest_fpu);
>   }
> 
> with
> 
>   int __xfd_enable_feature(u64 xfd_err, struct fpu_guest *guest_fpu)
>   {
> 	struct fpu_state_perm *perm;
> 	unsigned int ksize, usize;
> 	struct fpu *fpu;
> 
> 	if (WARN_ON_ONCE(!xfd_err || (xfd_err &
> ~XFEATURE_MASK_USER_DYNAMIC)))
> 		return 0;

Currently this is done as:

int __xfd_enable_feature(u64 xfd_err, struct fpu_guest *guest_fpu)
 {
 	u64 xfd_event = xfd_err & XFEATURE_MASK_USER_DYNAMIC;

	...
 	if (!xfd_event) {
		if (!guest_fpu)
			pr_err_once("XFD: Invalid xfd error: %016llx\n", xfd_err);
 		return 0;
 	}
	...
}

is it necessary to convert the error print to WARN_ON() (and also
apply to guest_fpu)?

> 
> 	...
>   }
> 
> which addresses several things:
> 
>   a) avoids explicitly restricing guest CPUID in KVM, and in particular doesn't
>      prevent userspace from advertising non-XFD features that aren't
> supported in
>      hardware, which for better or worse is allowed today.
> 
>   b) returns -EPERM instead of '0' when userspace attempts to enable
> dynamic
>      xfeatures with 32-bit kernels, which isn't a bug as posted only because
>      KVM pre-checks vcpu->arch.guest_fpu.perm.
> 
>   b) avoids reading guest_perm outside of siglock, which was technically a
> TOCTOU
>      "bug", though it didn't put the kernel at risk because
> __xstate_request_perm()
>      doesn't allow reducing permissions.
> 
>   c) allows __xfd_enable_feature() to require the caller to provide just XFD
>      features
> 

All the sample code looks good to me, except WARN_ON() introduced in
the last bullet.

If no objection from other maintainers, we can incorporate it in next version.

Thanks
Kevin

  reply	other threads:[~2021-12-30  2:29 UTC|newest]

Thread overview: 38+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-12-29 13:13 [PATCH v4 00/21] AMX Support in KVM Yang Zhong
2021-12-29 13:13 ` [PATCH v4 01/21] x86/fpu: Extend fpu_xstate_prctl() with guest permissions Yang Zhong
2021-12-29 13:13 ` [PATCH v4 02/21] x86/fpu: Prepare guest FPU for dynamically enabled FPU features Yang Zhong
2021-12-29 13:13 ` [PATCH v4 03/21] kvm: x86: Fix xstate_required_size() to follow XSTATE alignment rule Yang Zhong
2022-01-04 19:54   ` Sean Christopherson
2021-12-29 13:13 ` [PATCH v4 04/21] kvm: x86: Exclude unpermitted xfeatures at KVM_GET_SUPPORTED_CPUID Yang Zhong
2021-12-29 13:13 ` [PATCH v4 05/21] x86/fpu: Make XFD initialization in __fpstate_reset() a function argument Yang Zhong
2021-12-29 13:13 ` [PATCH v4 06/21] x86/fpu: Add guest support to xfd_enable_feature() Yang Zhong
2021-12-29 13:13 ` [PATCH v4 07/21] x86/fpu: Provide fpu_update_guest_perm_features() for guest Yang Zhong
2021-12-29 13:13 ` [PATCH v4 08/21] kvm: x86: Check and enable permitted dynamic xfeatures at KVM_SET_CPUID2 Yang Zhong
2021-12-29 16:55   ` Sean Christopherson
2021-12-30  2:28     ` Tian, Kevin [this message]
2021-12-29 13:13 ` [PATCH v4 09/21] x86/fpu: Provide fpu_update_guest_xfd() for IA32_XFD emulation Yang Zhong
2021-12-29 13:13 ` [PATCH v4 10/21] kvm: x86: Add emulation for IA32_XFD Yang Zhong
2022-01-04 19:32   ` Sean Christopherson
2022-01-05  0:22     ` Tian, Kevin
2021-12-29 13:13 ` [PATCH v4 11/21] x86/fpu: Prepare xfd_err in struct fpu_guest Yang Zhong
2021-12-29 13:13 ` [PATCH v4 12/21] kvm: x86: Intercept #NM for saving IA32_XFD_ERR Yang Zhong
2022-01-04 20:01   ` Sean Christopherson
2022-01-05  0:27     ` Tian, Kevin
2021-12-29 13:13 ` [PATCH v4 13/21] kvm: x86: Emulate IA32_XFD_ERR for guest Yang Zhong
2021-12-29 13:13 ` [PATCH v4 14/21] kvm: x86: Disable RDMSR interception of IA32_XFD_ERR Yang Zhong
2022-01-04 19:34   ` Sean Christopherson
2022-01-05  0:23     ` Tian, Kevin
2021-12-29 13:13 ` [PATCH v4 15/21] kvm: x86: Add XCR0 support for Intel AMX Yang Zhong
2021-12-29 13:13 ` [PATCH v4 16/21] kvm: x86: Add CPUID " Yang Zhong
2021-12-29 13:13 ` [PATCH v4 17/21] x86/fpu: Add uabi_size to guest_fpu Yang Zhong
2021-12-29 13:13 ` [PATCH v4 18/21] kvm: x86: Add support for getting/setting expanded xstate buffer Yang Zhong
2022-01-04 19:46   ` Sean Christopherson
2022-01-04 20:45     ` Paolo Bonzini
2022-01-05  4:03       ` Tian, Kevin
2021-12-29 13:13 ` [PATCH v4 19/21] kvm: selftests: Add support for KVM_CAP_XSAVE2 Yang Zhong
2021-12-29 13:13 ` [PATCH v4 20/21] x86/fpu: Provide fpu_sync_guest_vmexit_xfd_state() Yang Zhong
2021-12-29 13:13 ` [PATCH v4 21/21] kvm: x86: Disable interception for IA32_XFD on demand Yang Zhong
2021-12-30  8:46 ` [PATCH v4 00/21] AMX Support in KVM Tian, Kevin
2022-01-04 18:36 ` Paolo Bonzini
2022-01-04 18:54   ` Sean Christopherson
2022-01-05  0:10     ` Tian, Kevin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=BN9PR11MB5276220CE6BE21025797A7BC8C459@BN9PR11MB5276.namprd11.prod.outlook.com \
    --to=kevin.tian@intel.com \
    --cc=bp@alien8.de \
    --cc=corbet@lwn.net \
    --cc=dave.hansen@linux.intel.com \
    --cc=guang.zeng@intel.com \
    --cc=jing2.liu@intel.com \
    --cc=jing2.liu@linux.intel.com \
    --cc=jun.nakajima@intel.com \
    --cc=kvm@vger.kernel.org \
    --cc=linux-doc@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-kselftest@vger.kernel.org \
    --cc=mingo@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=seanjc@google.com \
    --cc=shuah@kernel.org \
    --cc=tglx@linutronix.de \
    --cc=wei.w.wang@intel.com \
    --cc=x86@kernel.org \
    --cc=yang.zhong@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).