All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Tian, Kevin" <kevin.tian@intel.com>
To: "Christopherson,, Sean" <seanjc@google.com>,
	"Zhong, Yang" <yang.zhong@intel.com>
Cc: "x86@kernel.org" <x86@kernel.org>,
	"kvm@vger.kernel.org" <kvm@vger.kernel.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"linux-doc@vger.kernel.org" <linux-doc@vger.kernel.org>,
	"linux-kselftest@vger.kernel.org"
	<linux-kselftest@vger.kernel.org>,
	"tglx@linutronix.de" <tglx@linutronix.de>,
	"mingo@redhat.com" <mingo@redhat.com>,
	"bp@alien8.de" <bp@alien8.de>,
	"dave.hansen@linux.intel.com" <dave.hansen@linux.intel.com>,
	"pbonzini@redhat.com" <pbonzini@redhat.com>,
	"corbet@lwn.net" <corbet@lwn.net>,
	"shuah@kernel.org" <shuah@kernel.org>,
	"Nakajima, Jun" <jun.nakajima@intel.com>,
	"jing2.liu@linux.intel.com" <jing2.liu@linux.intel.com>,
	"Liu, Jing2" <jing2.liu@intel.com>,
	"Zeng, Guang" <guang.zeng@intel.com>,
	"Wang, Wei W" <wei.w.wang@intel.com>
Subject: RE: [PATCH v4 08/21] kvm: x86: Check and enable permitted dynamic xfeatures at KVM_SET_CPUID2
Date: Thu, 30 Dec 2021 02:28:25 +0000	[thread overview]
Message-ID: <BN9PR11MB5276220CE6BE21025797A7BC8C459@BN9PR11MB5276.namprd11.prod.outlook.com> (raw)
In-Reply-To: <YcyS8lG7vq+jJtLy@google.com>

> From: Sean Christopherson <seanjc@google.com>
> Sent: Thursday, December 30, 2021 12:55 AM
> 
> On Wed, Dec 29, 2021, Yang Zhong wrote:
> > From: Jing Liu <jing2.liu@intel.com>
> >
> > Guest xstate permissions should be set by userspace VMM before vcpu
> > creation. Extend KVM_SET_CPUID2 to verify that every feature reported
> > in CPUID[0xD] has proper permission set. If permission allows, enable
> > all xfeatures in guest cpuid with fpstate buffer sized accordingly.
> >
> > This avoids introducing new KVM exit reason for reporting permission
> > violation to userspace VMM at run-time and also removes the need of
> > tricky fpstate buffer expansion in the emulation and restore path of
> > XCR0 and IA32_XFD MSR.
> 
> How so?  __do_cpuid_func() restricts what is advertised to userspace based
> on
> xstate_get_guest_group_perm(), so it's not like KVM is advertising something
> it
> can't provide?  There should never be any danger to KVM that's mitigated by
> restricing guest CPUID because KVM can and should check vcpu-
> >arch.guest_fpu.perm
> instead of guest CPUID.

Well, above explains why we choose to expand fpstate buffer at 
KVM_SET_CPUID2 instead of in the emulation path when required
permissions have been set, as discussed here:

https://lore.kernel.org/all/20211214024948.048572883@linutronix.de/

> 
> In other words, I believe you're conflating the overall approach of requiring
> userspace to pre-acquire the necessary permissions with enforcing what
> userspace
> advertises to the guest.
> 
> > Signed-off-by: Jing Liu <jing2.liu@intel.com>
> > Signed-off-by: Kevin Tian <kevin.tian@intel.com>
> > Signed-off-by: Yang Zhong <yang.zhong@intel.com>
> > ---
> >  arch/x86/kvm/cpuid.c | 62 +++++++++++++++++++++++++++++++++----------
> -
> >  1 file changed, 47 insertions(+), 15 deletions(-)
> >
> > diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
> > index 4855344091b8..acbc10db550e 100644
> > --- a/arch/x86/kvm/cpuid.c
> > +++ b/arch/x86/kvm/cpuid.c
> > @@ -81,9 +81,12 @@ static inline struct kvm_cpuid_entry2
> *cpuid_entry2_find(
> >  	return NULL;
> >  }
> >
> > -static int kvm_check_cpuid(struct kvm_cpuid_entry2 *entries, int nent)
> > +static int kvm_check_cpuid(struct kvm_vcpu *vcpu,
> > +			   struct kvm_cpuid_entry2 *entries,
> > +			   int nent)
> >  {
> >  	struct kvm_cpuid_entry2 *best;
> > +	int r = 0;
> >
> >  	/*
> >  	 * The existing code assumes virtual address is 48-bit or 57-bit in the
> > @@ -93,11 +96,40 @@ static int kvm_check_cpuid(struct
> kvm_cpuid_entry2 *entries, int nent)
> >  	if (best) {
> >  		int vaddr_bits = (best->eax & 0xff00) >> 8;
> >
> > -		if (vaddr_bits != 48 && vaddr_bits != 57 && vaddr_bits != 0)
> > -			return -EINVAL;
> > +		if (vaddr_bits != 48 && vaddr_bits != 57 && vaddr_bits != 0) {
> > +			r = -EINVAL;
> > +			goto out;
> 
> Please don't change this to a goto, a return is perfectly ok and more readable
> as it doesn't imply there's some functional change that needs to be unwound
> at
> the end.

will fix

> 
> > +		}
> >  	}
> >
> > -	return 0;
> > +	/*
> > +	 * Check guest permissions for dynamically-enabled xfeatures.
> > +	 * Userspace VMM is expected to acquire permission before vCPU
> > +	 * creation. If permission allows, enable all xfeatures with
> > +	 * fpstate buffer sized accordingly. This avoids complexity of
> > +	 * run-time expansion in the emulation and restore path of XCR0
> > +	 * and IA32_XFD MSR.
> > +	 */
> > +	best = cpuid_entry2_find(entries, nent, 0xd, 0);
> > +	if (best) {
> > +		u64 xfeatures;
> > +
> > +		xfeatures = best->eax | ((u64)best->edx << 32);
> > +		if (xfeatures & ~vcpu->arch.guest_fpu.perm) {
> > +			r = -ENXIO;
> 
> ENXIO is a rather odd error code for insufficient permissions, especially since
> the FPU returns -EPERM for what is effectively the same check.
> 
> > +			goto out;
> > +		}
> > +
> > +		if (xfeatures != vcpu->arch.guest_fpu.xfeatures) {
> 
> xfeatures is obviously not consumed anywhere, which is super confusing and
> arguably wrong, e.g. if userspace advertises xfeatures that are a subset of
> vcpu->arch.guest_fpu.perm, this will expand XSAVE state beyond what
> userspace
> actually wants to advertise to the guest.  The really confusing case would be
> if
> userspace reduced xfeatures relative to vcpu->arch.guest_fpu.xfeatures and
> got
> an -ENOMEM due to the FPU failing to expand the XSAVE size.

You are right.

> 
> I don't care about the waste of memory, and IIUC userspace would have to
> intentionally request permissions for the guest that it then ignores, but that
> doesn't make the code any less confusing.  And as written, this check also
> prevents
> advertising non-XFD features that are not supported in hardware.  I doubt
> there's
> a production use case for that (though MPX deprecation comes close), but
> I've
> certainly exposed unsupported features to a guest for testing purposes.
> 
> Rather than bleed details from the FPU into KVM, why not have the FPU do
> any and
> all checks?  That also gives the FPU access to requested xfeatures so that it
> can opportunistically avoid unnecessary expansion.  We can also tweak the
> kernel
> APIs to be more particular about input values.

All above makes sense, especially when we combine permission check
and buffer expansion in one step now.

> 
> At that point, I would be ok with fpu_update_guest_perm_features()
> rejecting
> attempts to advertise features that are not permitted, because then it's an
> FPU
> policy, not a KVM policy, and there's a valid reason for said policy.  It's a bit
> of a pedantic distinction, but to me it matters because having KVM explicitly
> restrict guest CPUID implies that doing so is necessary for KVM correctness,
> which
> AFAICT is not the case.
> 
> E.g. in KVM
> 
> 	/*
> 	 * Exposing dynamic xfeatures to the guest requires additional
> enabling
> 	 * in the FPU, e.g. to expand the guest XSAVE state size.
> 	 */
> 	best = cpuid_entry2_find(entries, nent, 0xd, 0);
> 	if (!best)
> 		return 0;
> 
> 	xfeatures = best->eax | ((u64)best->edx << 32);
> 	xfeatures &= XFEATURE_MASK_USER_DYNAMIC;
> 	if (!xfeatures)
> 		return 0;
> 
> 	return fpu_enable_guest_xfd_features(&vcpu->arch.guest_fpu,
> xfeatures);
> 
> and then
> 
>   int fpu_enable_guest_xfd_features(struct fpu_guest *guest_fpu, u64
> xfeatures)
>   {
> 	lockdep_assert_preemption_enabled();
> 
> 	/* Nothing to do if all requested features are already enabled. */
> 	xfeatures &= ~guest_fpu->xfeatures;
> 	if (!xfeatures)
> 		return 0;
> 
> 	/* Dynamic xfeatures are not supported with 32-bit kernels. */
> 	if (!IS_ENABLED(CONFIG_X86_64))
> 		return -EPERM;
> 
> 	return __xfd_enable_feature(xfeatures, guest_fpu);
>   }
> 
> with
> 
>   int __xfd_enable_feature(u64 xfd_err, struct fpu_guest *guest_fpu)
>   {
> 	struct fpu_state_perm *perm;
> 	unsigned int ksize, usize;
> 	struct fpu *fpu;
> 
> 	if (WARN_ON_ONCE(!xfd_err || (xfd_err &
> ~XFEATURE_MASK_USER_DYNAMIC)))
> 		return 0;

Currently this is done as:

int __xfd_enable_feature(u64 xfd_err, struct fpu_guest *guest_fpu)
 {
 	u64 xfd_event = xfd_err & XFEATURE_MASK_USER_DYNAMIC;

	...
 	if (!xfd_event) {
		if (!guest_fpu)
			pr_err_once("XFD: Invalid xfd error: %016llx\n", xfd_err);
 		return 0;
 	}
	...
}

is it necessary to convert the error print to WARN_ON() (and also
apply to guest_fpu)?

> 
> 	...
>   }
> 
> which addresses several things:
> 
>   a) avoids explicitly restricing guest CPUID in KVM, and in particular doesn't
>      prevent userspace from advertising non-XFD features that aren't
> supported in
>      hardware, which for better or worse is allowed today.
> 
>   b) returns -EPERM instead of '0' when userspace attempts to enable
> dynamic
>      xfeatures with 32-bit kernels, which isn't a bug as posted only because
>      KVM pre-checks vcpu->arch.guest_fpu.perm.
> 
>   b) avoids reading guest_perm outside of siglock, which was technically a
> TOCTOU
>      "bug", though it didn't put the kernel at risk because
> __xstate_request_perm()
>      doesn't allow reducing permissions.
> 
>   c) allows __xfd_enable_feature() to require the caller to provide just XFD
>      features
> 

All the sample code looks good to me, except WARN_ON() introduced in
the last bullet.

If no objection from other maintainers, we can incorporate it in next version.

Thanks
Kevin

  reply	other threads:[~2021-12-30  2:29 UTC|newest]

Thread overview: 38+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-12-29 13:13 [PATCH v4 00/21] AMX Support in KVM Yang Zhong
2021-12-29 13:13 ` [PATCH v4 01/21] x86/fpu: Extend fpu_xstate_prctl() with guest permissions Yang Zhong
2021-12-29 13:13 ` [PATCH v4 02/21] x86/fpu: Prepare guest FPU for dynamically enabled FPU features Yang Zhong
2021-12-29 13:13 ` [PATCH v4 03/21] kvm: x86: Fix xstate_required_size() to follow XSTATE alignment rule Yang Zhong
2022-01-04 19:54   ` Sean Christopherson
2021-12-29 13:13 ` [PATCH v4 04/21] kvm: x86: Exclude unpermitted xfeatures at KVM_GET_SUPPORTED_CPUID Yang Zhong
2021-12-29 13:13 ` [PATCH v4 05/21] x86/fpu: Make XFD initialization in __fpstate_reset() a function argument Yang Zhong
2021-12-29 13:13 ` [PATCH v4 06/21] x86/fpu: Add guest support to xfd_enable_feature() Yang Zhong
2021-12-29 13:13 ` [PATCH v4 07/21] x86/fpu: Provide fpu_update_guest_perm_features() for guest Yang Zhong
2021-12-29 13:13 ` [PATCH v4 08/21] kvm: x86: Check and enable permitted dynamic xfeatures at KVM_SET_CPUID2 Yang Zhong
2021-12-29 16:55   ` Sean Christopherson
2021-12-30  2:28     ` Tian, Kevin [this message]
2021-12-29 13:13 ` [PATCH v4 09/21] x86/fpu: Provide fpu_update_guest_xfd() for IA32_XFD emulation Yang Zhong
2021-12-29 13:13 ` [PATCH v4 10/21] kvm: x86: Add emulation for IA32_XFD Yang Zhong
2022-01-04 19:32   ` Sean Christopherson
2022-01-05  0:22     ` Tian, Kevin
2021-12-29 13:13 ` [PATCH v4 11/21] x86/fpu: Prepare xfd_err in struct fpu_guest Yang Zhong
2021-12-29 13:13 ` [PATCH v4 12/21] kvm: x86: Intercept #NM for saving IA32_XFD_ERR Yang Zhong
2022-01-04 20:01   ` Sean Christopherson
2022-01-05  0:27     ` Tian, Kevin
2021-12-29 13:13 ` [PATCH v4 13/21] kvm: x86: Emulate IA32_XFD_ERR for guest Yang Zhong
2021-12-29 13:13 ` [PATCH v4 14/21] kvm: x86: Disable RDMSR interception of IA32_XFD_ERR Yang Zhong
2022-01-04 19:34   ` Sean Christopherson
2022-01-05  0:23     ` Tian, Kevin
2021-12-29 13:13 ` [PATCH v4 15/21] kvm: x86: Add XCR0 support for Intel AMX Yang Zhong
2021-12-29 13:13 ` [PATCH v4 16/21] kvm: x86: Add CPUID " Yang Zhong
2021-12-29 13:13 ` [PATCH v4 17/21] x86/fpu: Add uabi_size to guest_fpu Yang Zhong
2021-12-29 13:13 ` [PATCH v4 18/21] kvm: x86: Add support for getting/setting expanded xstate buffer Yang Zhong
2022-01-04 19:46   ` Sean Christopherson
2022-01-04 20:45     ` Paolo Bonzini
2022-01-05  4:03       ` Tian, Kevin
2021-12-29 13:13 ` [PATCH v4 19/21] kvm: selftests: Add support for KVM_CAP_XSAVE2 Yang Zhong
2021-12-29 13:13 ` [PATCH v4 20/21] x86/fpu: Provide fpu_sync_guest_vmexit_xfd_state() Yang Zhong
2021-12-29 13:13 ` [PATCH v4 21/21] kvm: x86: Disable interception for IA32_XFD on demand Yang Zhong
2021-12-30  8:46 ` [PATCH v4 00/21] AMX Support in KVM Tian, Kevin
2022-01-04 18:36 ` Paolo Bonzini
2022-01-04 18:54   ` Sean Christopherson
2022-01-05  0:10     ` Tian, Kevin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=BN9PR11MB5276220CE6BE21025797A7BC8C459@BN9PR11MB5276.namprd11.prod.outlook.com \
    --to=kevin.tian@intel.com \
    --cc=bp@alien8.de \
    --cc=corbet@lwn.net \
    --cc=dave.hansen@linux.intel.com \
    --cc=guang.zeng@intel.com \
    --cc=jing2.liu@intel.com \
    --cc=jing2.liu@linux.intel.com \
    --cc=jun.nakajima@intel.com \
    --cc=kvm@vger.kernel.org \
    --cc=linux-doc@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-kselftest@vger.kernel.org \
    --cc=mingo@redhat.com \
    --cc=pbonzini@redhat.com \
    --cc=seanjc@google.com \
    --cc=shuah@kernel.org \
    --cc=tglx@linutronix.de \
    --cc=wei.w.wang@intel.com \
    --cc=x86@kernel.org \
    --cc=yang.zhong@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.