From: "Liu, Jing2" <jing2.liu@intel.com>
To: Paolo Bonzini <pbonzini@redhat.com>,
	Thomas Gleixner <tglx@linutronix.de>,
	LKML <linux-kernel@vger.kernel.org>
Cc: "x86@kernel.org" <x86@kernel.org>,
	"Bae, Chang Seok" <chang.seok.bae@intel.com>,
	Dave Hansen <dave.hansen@linux.intel.com>,
	"Arjan van de Ven" <arjan@linux.intel.com>,
	"kvm@vger.kernel.org" <kvm@vger.kernel.org>,
	"Nakajima, Jun" <jun.nakajima@intel.com>,
	Jing Liu <jing2.liu@linux.intel.com>,
	"seanjc@google.com" <seanjc@google.com>,
	"Cooper, Andrew" <andrew.cooper3@citrix.com>
Subject: RE: [patch 13/31] x86/fpu: Move KVMs FPU swapping to FPU core
Date: Thu, 14 Oct 2021 11:21:37 +0000
Message-ID: <BYAPR11MB3256A20F6BB9218BDB5B7988A9B89@BYAPR11MB3256.namprd11.prod.outlook.com>
In-Reply-To: <6bbc5184-a675-1937-eb98-639906a9cf15@redhat.com>

On 10/14/2021 5:01 PM, Paolo Bonzini wrote:

> On 14/10/21 10:02, Liu, Jing2 wrote:
> >> In principle I don't like it very much; it would be nicer to say "you
> >> enable it for QEMU itself via arch_prctl(ARCH_SET_STATE_ENABLE), and
> >> for the guests via ioctl(KVM_SET_CPUID2)".  But I can see why you
> >> want to keep things simple, so it's not a strong objection at all.
> >
> > Does this mean that KVM allocate 3 buffers via
> > 1) Qemu's request, instead of via 2) guest XCR0 trap?
> 
> Based on the input from Andy and Thomas, the new way would be like this:
> 
> 1) host_fpu must always be checked for reallocation in kvm_load_guest_fpu
> (or in the FPU functions that it calls, that depends on the rest of Thomas's
> patches).  That's because arch_prctl can enable AMX for QEMU at any point
> after KVM_CREATE_VCPU.

For Qemu's XFD, I'd like to confirm the following:
arch_prctl() only updates state_perm and __state_size in
current->group_leader->thread.fpu; current->thread.fpu.* is not changed.
So in kvm_load_guest_fpu, the AMX bit in host_fpu->xfd is always 1. That is
to say, Qemu's arch_prctl() doesn't change any copy of XFD.

> 
> 2) every use of vcpu->arch.guest_supported_xcr0 is changed to only include
> those dynamic-feature bits that were enabled via arch_prctl.
> That is, something like:
> 
> static u64 kvm_guest_supported_xcr0(struct kvm_vcpu *vcpu) {
> 	return vcpu->arch.guest_supported_xcr0 &
> 		(~xfeatures_mask_user_dynamic |
> 		 current->thread.fpu.dynamic_state_perm);
> }
> 
> 3) Even with passthrough disabled, the guest can run with XFD set to
> vcpu->arch.guest_xfd (and likewise for XFD_ERR) which is much simpler
> than trapping #NM.  The traps for writing XCR0 and XFD are used to allocate
> dynamic state for guest_fpu, and start the passthrough of XFD and XFD_ERR.
> What we need is:
> 
> - if a dynamic state has XCR0[n]=0, bit n will never be set in XFD_ERR and the
> state will never be dirtied by the guest.
> 
> - if a dynamic state has XCR0[n]=1, but all enabled dynamic states have
> XFD[n]=1, the guest is not able to dirty any dynamic XSAVE state, because
> they all have either XCR0[n]=0 or XFD[n]=1.  An attempt to do so will cause an
> #NM trap and set the bit in XFD_ERR.
> 
> - if a dynamic state has XCR0[n]=1 and XFD[n]=0, the state for bit n is
> allocated in guest_fpu, and it can also disable the vmexits for XFD and
> XFD_ERR.
> 

Got it. The principle is that once XCR0[n]=1 and XFD[n]=0, the guest is
allowed to use the dynamic XSAVE state, so KVM must have everything prepared
before that point. This will probably happen shortly after the guest's
first #NM.
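
To check my understanding of the three cases above, here is a minimal
user-space sketch in C. It is only a model, not KVM code: the enum, the
function names and the DYN_* labels are hypothetical, and the only thing
taken from this thread is the XCR0[n]/XFD[n] rule itself (bit 18 being the
AMX tile data bit discussed here).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit 18 is the dynamic (AMX tile data) bit discussed in this thread. */
#define DYN_FEATURE_BIT 18

/* Hypothetical labels for the three cases in the list above. */
enum dyn_state {
	DYN_DISABLED,	/* XCR0[n]=0: never dirtied, never in XFD_ERR */
	DYN_ARMED,	/* XCR0[n]=1, XFD[n]=1: first use raises #NM */
	DYN_USABLE,	/* XCR0[n]=1, XFD[n]=0: guest may dirty the state */
};

/* Classify feature bit n from the guest's XCR0 and XFD values. */
static enum dyn_state classify_dynamic_bit(uint64_t xcr0, uint64_t xfd,
					   unsigned int n)
{
	uint64_t bit = 1ULL << n;

	if (!(xcr0 & bit))
		return DYN_DISABLED;
	if (xfd & bit)
		return DYN_ARMED;
	return DYN_USABLE;
}

/* Only the usable case requires the guest_fpu buffer to exist already. */
static bool needs_guest_buffer(uint64_t xcr0, uint64_t xfd, unsigned int n)
{
	return classify_dynamic_bit(xcr0, xfd, n) == DYN_USABLE;
}

/* Walk through the three cases for bit 18. */
static void example_usage(void)
{
	uint64_t bit = 1ULL << DYN_FEATURE_BIT;

	assert(classify_dynamic_bit(0x7, 0, DYN_FEATURE_BIT) == DYN_DISABLED);
	assert(classify_dynamic_bit(0x7 | bit, bit, DYN_FEATURE_BIT) == DYN_ARMED);
	assert(classify_dynamic_bit(0x7 | bit, 0, DYN_FEATURE_BIT) == DYN_USABLE);
	assert(needs_guest_buffer(0x7 | bit, 0, DYN_FEATURE_BIT));
}
```

In this model the buffer is needed only in the third case, which is why the
allocation has to be finished before the guest can reach it.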

One question: it seems we assume that vcpu->arch.xfd holds the guest's
runtime value. Before the guest initializes XFD, does KVM provide
vcpu->arch.xfd[18]=1? The spec requires the XFD reset value to be zero.
If so, between the guest setting XCR0[n]=1 and setting XFD[n]=1, the state
is XCR0[n]=1 and XFD[n]=0. And what if a guest never initializes XFD and
uses the dynamic state directly?

Or do we want to give the guest XFD[18]=1 from the very beginning?

> Therefore:
> 
> - if passthrough is disabled, the XCR0 and XFD write traps can check
> guest_xcr0 & ~guest_xfd.  If it includes a dynamic state bit, dynamic state is
> allocated for all bits enabled in guest_xcr0 and passthrough is started; this
> should happen shortly after the guest gets its first #NM trap for AMX.
> 
> - if passthrough is enabled, the XCR0 write trap must still ensure that
> dynamic state is allocated for all bits enabled in guest_xcr0.
> 
> So something like this pseudocode is called by both XCR0 and XFD writes:
> 
> int kvm_alloc_fpu_dynamic_features(struct kvm_vcpu *vcpu) {
> 	u64 allowed_dynamic = current->thread.fpu.dynamic_state_perm;
> 	u64 enabled_dynamic =
> 		vcpu->arch.xcr0 & xfeatures_mask_user_dynamic;
> 
> 	/* All dynamic features have to be arch_prctl'd first.  */
> 	WARN_ON_ONCE(enabled_dynamic & ~allowed_dynamic);
> 
> 	if (!vcpu->arch.xfd_passthrough) {
> 		/* All dynamic states will #NM?  Wait and see.  */
> 		if ((enabled_dynamic & ~vcpu->arch.xfd) == 0)
Here, when the guest sets XCR0[n]=1, vcpu->arch.xfd[n] should already be 1;
otherwise the XCR0 trap enables passthrough and allocates the buffer, which
is not what we want.

> 			return 0;
> 
> 		kvm_x86_ops.enable_xfd_passthrough(vcpu);
> 	}
> 
> 	/* current->thread.fpu was already handled by arch_prctl.  */
As far as I can see, arch_prctl does not change current->thread.fpu;
only the #NM handler itself does. So here we would allocate for current too.

Thanks,
Jing
> 	return fpu_alloc_features(vcpu->guest_fpu,
> 		vcpu->guest_fpu.dynamic_state_perm | enabled_dynamic);
> }
> 
> Paolo
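
To make the reset-value question concrete, here is a small user-space
model of the pseudocode above. It is a sketch under assumptions, not a
proposal for the real implementation: struct vcpu_model, the model_*
functions and the guest_fpu_features field are hypothetical stand-ins,
DYNAMIC_MASK stands in for xfeatures_mask_user_dynamic with only bit 18
set, and the model returns -1 where the pseudocode would WARN_ON_ONCE.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for xfeatures_mask_user_dynamic: only bit 18 in this model. */
#define DYNAMIC_MASK (1ULL << 18)

/* Hypothetical, simplified view of the per-vCPU state discussed above. */
struct vcpu_model {
	uint64_t guest_supported_xcr0;	/* from KVM_SET_CPUID2 */
	uint64_t xcr0;			/* guest XCR0 */
	uint64_t xfd;			/* guest XFD */
	uint64_t guest_fpu_features;	/* dynamic states with a buffer */
	bool xfd_passthrough;
};

/* Item 2): hide dynamic bits that were not enabled via arch_prctl. */
static uint64_t model_guest_supported_xcr0(const struct vcpu_model *v,
					    uint64_t perm)
{
	return v->guest_supported_xcr0 & (~DYNAMIC_MASK | perm);
}

/* Model of the helper called from both the XCR0 and XFD write traps. */
static int model_alloc_dynamic(struct vcpu_model *v, uint64_t perm)
{
	uint64_t enabled = v->xcr0 & DYNAMIC_MASK;

	/* Model choice: fail where the pseudocode only warns. */
	if (enabled & ~perm)
		return -1;

	if (!v->xfd_passthrough) {
		/* Every enabled dynamic state would #NM: wait and see. */
		if ((enabled & ~v->xfd) == 0)
			return 0;
		v->xfd_passthrough = true;
	}

	v->guest_fpu_features |= enabled;
	return 0;
}

/* Compare an initial XFD of 0 (spec reset value) with XFD[18]=1. */
static void example_reset_value(void)
{
	struct vcpu_model a = { .guest_supported_xcr0 = 0x7 | DYNAMIC_MASK };
	struct vcpu_model b = a;

	/* XFD starts at 0: the XCR0 write alone triggers allocation. */
	a.xcr0 = 0x7 | DYNAMIC_MASK;
	assert(model_alloc_dynamic(&a, DYNAMIC_MASK) == 0);
	assert(a.xfd_passthrough && a.guest_fpu_features == DYNAMIC_MASK);

	/* XFD[18] starts at 1: allocation waits until the guest clears it. */
	b.xfd = DYNAMIC_MASK;
	b.xcr0 = 0x7 | DYNAMIC_MASK;
	assert(model_alloc_dynamic(&b, DYNAMIC_MASK) == 0);
	assert(!b.xfd_passthrough && b.guest_fpu_features == 0);
	b.xfd = 0;
	assert(model_alloc_dynamic(&b, DYNAMIC_MASK) == 0);
	assert(b.xfd_passthrough && b.guest_fpu_features == DYNAMIC_MASK);
}
```

In this model, an initial guest XFD of 0 means the very first XCR0 write
with bit 18 set already allocates the buffer and starts passthrough, while
starting with XFD[18]=1 defers both until the guest clears the bit.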

