linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "Liu, Jing2" <jing2.liu@intel.com>
To: Paolo Bonzini <pbonzini@redhat.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	LKML <linux-kernel@vger.kernel.org>
Cc: "x86@kernel.org" <x86@kernel.org>,
	"Bae, Chang Seok" <chang.seok.bae@intel.com>,
	Dave Hansen <dave.hansen@linux.intel.com>,
	"Arjan van de Ven" <arjan@linux.intel.com>,
	"kvm@vger.kernel.org" <kvm@vger.kernel.org>,
	"Nakajima, Jun" <jun.nakajima@intel.com>,
	Jing Liu <jing2.liu@linux.intel.com>,
	"seanjc@google.com" <seanjc@google.com>,
	"Cooper, Andrew" <andrew.cooper3@citrix.com>
Subject: RE: [patch 13/31] x86/fpu: Move KVMs FPU swapping to FPU core
Date: Thu, 14 Oct 2021 11:21:37 +0000	[thread overview]
Message-ID: <BYAPR11MB3256A20F6BB9218BDB5B7988A9B89@BYAPR11MB3256.namprd11.prod.outlook.com> (raw)
In-Reply-To: <6bbc5184-a675-1937-eb98-639906a9cf15@redhat.com>

On 10/14/2021 5:01 PM, Paolo Bonzini wrote:

> On 14/10/21 10:02, Liu, Jing2 wrote:
> >> In principle I don't like it very much; it would be nicer to say "you
> >> enable it for QEMU itself via arch_prctl(ARCH_SET_STATE_ENABLE), and
> >> for the guests via ioctl(KVM_SET_CPUID2)".  But I can see why you
> >> want to keep things simple, so it's not a strong objection at all.
> >
> > Does this mean that KVM allocate 3 buffers via
> > 1) Qemu's request, instead of via 2) guest XCR0 trap?
> 
> Based on the input from Andy and Thomas, the new way would be like this:
> 
> 1) host_fpu must always be checked for reallocation in kvm_load_guest_fpu
> (or in the FPU functions that it calls, that depends on the rest of Thomas's
> patches).  That's because arch_prctl can enable AMX for QEMU at any point
> after KVM_CREATE_VCPU.

For Qemu's XFD, I'd like to confirm that:
Since the arch_prctl() onlys add current->group_leader->thread.fpu's  state_perm,
__state_size, (current->thread.fpu.* is not changed), thus in
kvm_load_guest_fpu, host_fpu->xfd is always 1. That is to say, Qemu's arch_prctl()
doesn't change any copies of XFD.

> 
> 2) every use of vcpu->arch.guest_supported_xcr0 is changed to only include
> those dynamic-feature bits that were enabled via arch_prctl.
> That is, something like:
> 
> static u64 kvm_guest_supported_cr0(struct kvm_vcpu *vcpu) {
> 	return vcpu->arch.guest_supported_xcr0 &
> 		(~xfeatures_mask_user_dynamic | \
> 		 current->thread.fpu.dynamic_state_perm);
> }
> 
> 3) Even with passthrough disabled, the guest can run with XFD set to
> vcpu->arch.guest_xfd (and likewise for XFD_ERR) which is much simpler
> than trapping #NM.  The traps for writing XCR0 and XFD are used to allocate
> dynamic state for guest_fpu, and start the passthrough of XFD and XFD_ERR.
> What we need is:
> 
> - if a dynamic state has XCR0[n]=0, bit n will never be set in XFD_ERR and the
> state will never be dirtied by the guest.
> 
> - if a dynamic state has XCR0[n]=1, but all enabled dynamic states have
> XFD[n]=1, the guest is not able to dirty any dynamic XSAVE state, because
> they all have either XCR0[n]=0 or XFD[n]=1.  An attempt to do so will cause an
> #NM trap and set the bit in XFD_ERR.
> 
> - if a dynamic state has XCR0[n]=1 and XFD[n]=0, the state for bit n is
> allocated in guest_fpu, and it can also disable the vmexits for XFD and
> XFD_ERR.
> 

Got it, the principle is once XCR0[n]=1 and XFD[n]=0, then guest is allowed
to use the dynamic XSAVE state, thus KVM must prepare all things well
before. This probably happens shortly after guest #NM.

Only one thing: it seems we assume that vcpu->arch.xfd is guest runtime
value. And before guest initializes XFD, KVM provides
vcpu->arch.xfd[18]=1, right? But the spec asks XFD reset value as zero.
If so, between guest init XCR0 to 1 and init XFD to 1, it's XCR0[n]=1 and
XFD[n]=0. If a guest never init XFD and directly use dynamic state...

Or do we want to provide guest a XFD[18]=1 value at the very beginning?

> Therefore:
> 
> - if passthrough is disabled, the XCR0 and XFD write traps can check
> guest_xcr0 & ~guest_xfd.  If it includes a dynamic state bit, dynamic state is
> allocated for all bits enabled in guest_xcr0 and passthrough is started; this
> should happen shortly after the guest gets its first #NM trap for AMX.
> 
> - if passthrough is enabled, the XCR0 write trap must still ensure that
> dynamic state is allocated for all bits enabled in guest_xcr0.
> 
> So something like this pseudocode is called by both XCR0 and XFD writes:
> 
> int kvm_alloc_fpu_dynamic_features(struct kvm_vcpu *vcpu) {
> 	u64 allowed_dynamic = current->thread.fpu.dynamic_state_perm;
> 	u64 enabled_dynamic =
> 		vcpu->arch.xcr0 & xfeatures_mask_user_dynamic;
> 
> 	/* All dynamic features have to be arch_prctl'd first.  */
> 	WARN_ON_ONCE(enabled_dynamic & ~allowed_dynamic);
> 
> 	if (!vcpu->arch.xfd_passthrough) {
> 		/* All dynamic states will #NM?  Wait and see.  */
> 		if ((enabled_dynamic & ~vcpu->arch.xfd) == 0)
Here, when guest init XCR0 to 1, vcpu->arch.xfd should be 1
otherwise XCR0 trap makes passthrough and allocates buffer, which
is not what we want.

> 			return 0;
> 
> 		kvm_x86_ops.enable_xfd_passthrough(vcpu);
> 	}
> 
> 	/* current->thread.fpu was already handled by arch_prctl.  */
It seems so far, arch_prctl does not change current->thread.fpu,
only #NM handler itself does it. We here alloc current too.

Thanks,
Jing
> 	return fpu_alloc_features(vcpu->guest_fpu,
> 		vcpu->guest_fpu.dynamic_state_perm | enabled_dynamic); }
> 
> Paolo


  reply	other threads:[~2021-10-14 11:21 UTC|newest]

Thread overview: 96+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-10-11 23:59 [patch 00/31] x86/fpu: Preparatory cleanups for AMX support (part 1) Thomas Gleixner
2021-10-11 23:59 ` [patch 01/31] x86/fpu: Remove pointless argument from switch_fpu_finish() Thomas Gleixner
2021-10-12  0:00 ` [patch 02/31] x86/fpu: Update stale comments Thomas Gleixner
2021-10-12  0:00 ` [patch 03/31] x86/pkru: Remove useless include Thomas Gleixner
2021-10-12  0:00 ` [patch 04/31] x86/fpu: Restrict xsaves()/xrstors() to independent states Thomas Gleixner
2021-10-12 14:24   ` Borislav Petkov
2021-10-12  0:00 ` [patch 05/31] x86/fpu: Cleanup the on_boot_cpu clutter Thomas Gleixner
2021-10-12  0:00 ` [patch 06/31] x86/fpu: Remove pointless memset in fpu_clone() Thomas Gleixner
2021-10-12  0:00 ` [patch 07/31] x86/process: Clone FPU in copy_thread() Thomas Gleixner
2021-10-12  0:00 ` [patch 08/31] x86/fpu: Do not inherit FPU context for kernel and IO worker threads Thomas Gleixner
2021-10-12  0:00 ` [patch 09/31] x86/fpu: Do not inherit FPU context for CLONE_THREAD Thomas Gleixner
2021-10-12 16:10   ` Borislav Petkov
2021-10-12 18:52     ` Thomas Gleixner
2021-10-12 19:01       ` Thomas Gleixner
2021-10-12  0:00 ` [patch 10/31] x86/fpu: Cleanup xstate xcomp_bv initialization Thomas Gleixner
2021-10-12  0:00 ` [patch 11/31] x86/fpu/xstate: Provide and use for_each_xfeature() Thomas Gleixner
2021-10-12 16:45   ` Borislav Petkov
2021-10-12  0:00 ` [patch 12/31] x86/fpu/xstate: Mark all init only functions __init Thomas Gleixner
2021-10-12  0:00 ` [patch 13/31] x86/fpu: Move KVMs FPU swapping to FPU core Thomas Gleixner
2021-10-12 16:53   ` Borislav Petkov
2021-10-12 18:25     ` Thomas Gleixner
2021-10-12 18:26       ` Thomas Gleixner
2021-10-12 17:22   ` Paolo Bonzini
2021-10-13  6:15     ` Liu, Jing2
2021-10-13  6:26       ` Paolo Bonzini
2021-10-13  7:46         ` Liu, Jing2
2021-10-13  8:42           ` Paolo Bonzini
2021-10-13 10:14             ` Andy Lutomirski
2021-10-13 12:26               ` Paolo Bonzini
2021-10-13 14:14                 ` Thomas Gleixner
2021-10-13 14:24                   ` Thomas Gleixner
2021-10-13 14:59                 ` Andy Lutomirski
2021-10-13 15:05                   ` Paolo Bonzini
2021-10-13 10:25             ` Liu, Jing2
2021-10-13 12:37               ` Paolo Bonzini
2021-10-13 14:06             ` Thomas Gleixner
2021-10-14  6:50               ` Paolo Bonzini
2021-10-14  8:02                 ` Liu, Jing2
2021-10-14  9:01                   ` Paolo Bonzini
2021-10-14 11:21                     ` Liu, Jing2 [this message]
2021-10-14 11:33                       ` Paolo Bonzini
2021-10-14 11:30                     ` Liu, Jing2
2021-10-14 11:39                       ` Paolo Bonzini
2021-11-22  8:50                         ` Liu, Jing2
2021-10-14 14:09                     ` Thomas Gleixner
2021-10-14 14:37                       ` Thomas Gleixner
2021-10-14 15:01                       ` Paolo Bonzini
2021-10-14 19:14                         ` Thomas Gleixner
2021-10-15  9:20                           ` Liu, Jing2
2021-10-15  9:36                           ` Thomas Gleixner
2021-10-15 14:24                             ` Liu, Jing2
2021-10-15 15:53                               ` Paolo Bonzini
2021-10-16 14:45                               ` Thomas Gleixner
2021-10-15  9:00                         ` Liu, Jing2
2021-10-15 10:50                           ` Thomas Gleixner
2021-10-15 11:17                             ` Paolo Bonzini
2021-10-15 13:01                             ` Liu, Jing2
2021-10-14 12:23                 ` Thomas Gleixner
2021-10-14 12:26                   ` Paolo Bonzini
2021-10-14 14:23                     ` Thomas Gleixner
2021-10-13 15:12       ` Thomas Gleixner
2021-10-14  8:21         ` Liu, Jing2
2021-10-14 13:08           ` Thomas Gleixner
2021-10-12  0:00 ` [patch 14/31] x86/fpu: Replace KVMs homebrewn FPU copy from user Thomas Gleixner
2021-10-12 17:00   ` Borislav Petkov
2021-10-13 14:57     ` Sean Christopherson
2021-10-13 15:12       ` Paolo Bonzini
2021-10-13 15:16       ` Thomas Gleixner
2021-10-12 17:30   ` Paolo Bonzini
2021-10-12  0:00 ` [patch 15/31] x86/fpu: Rework copy_xstate_to_uabi_buf() Thomas Gleixner
2021-10-12 17:30   ` Paolo Bonzini
2021-10-12  0:00 ` [patch 16/31] x86/fpu: Replace KVMs homebrewn FPU copy to user Thomas Gleixner
2021-10-12 17:10   ` Borislav Petkov
2021-10-12 17:36   ` Paolo Bonzini
2021-10-12 17:47     ` Thomas Gleixner
2021-10-12 18:40       ` [patch V2 16/31] x86/fpu: Replace KVMs home brewed " Thomas Gleixner
2021-10-13  5:34       ` [patch 16/31] x86/fpu: Replace KVMs homebrewn " Paolo Bonzini
2021-10-12  0:00 ` [patch 17/31] x86/fpu: Mark fpu__init_prepare_fx_sw_frame() as __init Thomas Gleixner
2021-10-12  0:00 ` [patch 18/31] x86/fpu: Move context switch and exit to user inlines into sched.h Thomas Gleixner
2021-10-12  0:00 ` [patch 19/31] x86/fpu: Clean up cpu feature tests Thomas Gleixner
2021-10-12  0:00 ` [patch 20/31] x86/fpu: Make os_xrstor_booting() private Thomas Gleixner
2021-10-12  0:00 ` [patch 21/31] x86/fpu: Move os_xsave() and os_xrstor() to core Thomas Gleixner
2021-10-12  0:00 ` [patch 22/31] x86/fpu: Move legacy ASM wrappers " Thomas Gleixner
2021-10-12  0:00 ` [patch 23/31] x86/fpu: Make WARN_ON_FPU() private Thomas Gleixner
2021-10-12  0:00 ` [patch 24/31] x86/fpu: Move fpregs_restore_userregs() to core Thomas Gleixner
2021-10-12 17:32   ` Borislav Petkov
2021-10-12  0:00 ` [patch 25/31] x86/fpu: Move mxcsr related code " Thomas Gleixner
2021-10-12  0:00 ` [patch 26/31] x86/fpu: Move fpstate functions to api.h Thomas Gleixner
2021-10-12 17:46   ` Borislav Petkov
2021-10-12  0:00 ` [patch 27/31] x86/fpu: Remove internal.h dependency from fpu/signal.h Thomas Gleixner
2021-10-12  0:00 ` [patch 28/31] x86/sev: Include fpu/xcr.h Thomas Gleixner
2021-10-12  7:24   ` Xiaoyao Li
2021-10-12  0:00 ` [patch 29/31] x86/fpu: Mop up the internal.h leftovers Thomas Gleixner
2021-10-12  0:00 ` [patch 30/31] x86/fpu: Replace the includes of fpu/internal.h Thomas Gleixner
2021-10-12  0:00 ` [patch 31/31] x86/fpu: Provide a proper function for ex_handler_fprestore() Thomas Gleixner
2021-10-12 21:15 ` [patch 00/31] x86/fpu: Preparatory cleanups for AMX support (part 1) Thomas Gleixner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=BYAPR11MB3256A20F6BB9218BDB5B7988A9B89@BYAPR11MB3256.namprd11.prod.outlook.com \
    --to=jing2.liu@intel.com \
    --cc=andrew.cooper3@citrix.com \
    --cc=arjan@linux.intel.com \
    --cc=chang.seok.bae@intel.com \
    --cc=dave.hansen@linux.intel.com \
    --cc=jing2.liu@linux.intel.com \
    --cc=jun.nakajima@intel.com \
    --cc=kvm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=pbonzini@redhat.com \
    --cc=seanjc@google.com \
    --cc=tglx@linutronix.de \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).