From: Andrew Cooper <andrew.cooper3@citrix.com>
To: Jan Beulich <JBeulich@suse.com>,
	xen-devel <xen-devel@lists.xenproject.org>
Cc: Juergen Gross <jgross@suse.com>,
	Boris Ostrovsky <boris.ostrovsky@oracle.com>,
	Razvan Cojocaru <rcojocaru@bitdefender.com>,
	Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
Subject: Re: [PATCH] SVM: re-work VMCB sync-ing
Date: Thu, 26 Apr 2018 13:43:59 +0100
Message-ID: <22c156d4-c841-654c-c181-1ba4f4d4e0ab@citrix.com>
In-Reply-To: <5AE1BF9D02000078001BEC18@prv1-mh.provo.novell.com>

On 26/04/18 13:01, Jan Beulich wrote:
> While the main problem to be addressed here is the issue of what so far
> was named "vmcb_in_sync" starting out with the wrong value (should have
> been true instead of false, to prevent performing a VMSAVE without ever
> having VMLOADed the vCPU's state), go a step further and make the
> sync-ed state a tristate: CPU and memory may be in sync or an update
> may be required in either direction. Rename the field and introduce an
> enum. Callers of svm_sync_vmcb() now indicate the intended new state.
> With that, there's no need to VMLOAD the state perhaps multiple times;
> all that's needed is loading it once before VM entry.
>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> ---
> I've been considering to put the VMLOAD invocation in
> svm_asid_handle_vmrun() (instead of the two copies in svm_do_resume()
> and svm_vmexit_handler()), but that seemed a little too abusive of the
> function. Perhaps a more general helper function would be wanted, but
> perhaps that's something to be done when we've decided whether to put
> both VMSAVE and VMLOAD inside the CLGI protected region.
> I'm also not really certain about svm_vmexit_do_vmload(): All I'm doing
> here is a 1:1 change from previous behavior, but I'm unconvinced this
> was/is really correct.

FWIW, in my not-yet-complete patch for the issue, I'd gone with:

@@ -90,7 +91,14 @@ UNLIKELY_END(svm_trace)
         pop  %r13
         pop  %r12
         pop  %rbp
+
         mov  VCPU_svm_vmcb_pa(%rbx),%rax
+        cmpb $0, VCPU_svm_vmload_needed(%rbx)
+        je  1f
+        VMLOAD
+        movb $0, VCPU_svm_vmload_needed(%rbx)
+1:
+
         pop  %rbx
         pop  %r11
         pop  %r10

(albeit with a slightly different state structure) and was considering
putting the VMLOAD in an unlikely section to get the static prediction
the correct way around.
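
Something like this (an untested sketch, using the UNLIKELY_START/UNLIKELY_END
pair already present in this file for svm_trace, and the
VCPU_svm_vmload_needed offset from the fragment above):

         mov  VCPU_svm_vmcb_pa(%rbx),%rax
         cmpb $0, VCPU_svm_vmload_needed(%rbx)
         /* Out-of-line the VMLOAD path; %rax still holds the VMCB PA. */
         UNLIKELY_START(ne, svm_vmload)
         VMLOAD
         movb $0, VCPU_svm_vmload_needed(%rbx)
         UNLIKELY_END(svm_vmload)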

> --- a/xen/arch/x86/hvm/svm/svm.c
> +++ b/xen/arch/x86/hvm/svm/svm.c
> @@ -680,16 +680,15 @@ static void svm_cpuid_policy_changed(str
>                        cp->extd.ibpb ? MSR_INTERCEPT_NONE : MSR_INTERCEPT_RW);
>  }
>  
> -static void svm_sync_vmcb(struct vcpu *v)
> +static void svm_sync_vmcb(struct vcpu *v, enum vmcb_sync_state new_state)
>  {
>      struct arch_svm_struct *arch_svm = &v->arch.hvm_svm;
>  
> -    if ( arch_svm->vmcb_in_sync )
> -        return;
> -
> -    arch_svm->vmcb_in_sync = 1;
> +    if ( arch_svm->vmcb_sync_state == vmcb_needs_vmsave )
> +        svm_vmsave(arch_svm->vmcb);
>  
> -    svm_vmsave(arch_svm->vmcb);
> +    if ( arch_svm->vmcb_sync_state != vmcb_needs_vmload )
> +        arch_svm->vmcb_sync_state = new_state;

This is slightly awkward for a couple of reasons.  First, passing
vmcb_in_sync in forgets the fact that a vmload is still needed.

In my patch, I introduced svm_sync_vmcb_for_update(), rather than
requiring a parameter to be passed in.  I think this is a better API,
and it shrinks the size of the patch.
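
As a rough sketch of the shape I mean (reusing the enum from your patch;
not the exact code from mine):

    static void svm_sync_vmcb_for_update(struct vcpu *v)
    {
        struct arch_svm_struct *arch_svm = &v->arch.hvm_svm;

        /* Bring the VMCB up to date with the hardware register state... */
        if ( arch_svm->vmcb_sync_state == vmcb_needs_vmsave )
            svm_vmsave(arch_svm->vmcb);

        /* ...and mark it dirty, as the caller is about to modify it. */
        arch_svm->vmcb_sync_state = vmcb_needs_vmload;
    }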

Also, we currently need segment register codepaths to work from
non-current context, and will very shortly want the MSR paths to do the
same.

By putting an ASSERT(v == current) beside the
svm_vmsave(arch_svm->vmcb), we can drop the (v == current) check in the
callers, and catch corner cases where state isn't in sync for a remote
update.
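
i.e. (sketch):

    if ( arch_svm->vmcb_sync_state == vmcb_needs_vmsave )
    {
        /* VMSAVE reads this CPU's register state, so only makes
         * sense for the currently-scheduled vCPU. */
        ASSERT(v == current);
        svm_vmsave(arch_svm->vmcb);
    }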

> @@ -1086,7 +1079,7 @@ static void svm_ctxt_switch_from(struct
>      svm_lwp_save(v);
>      svm_tsc_ratio_save(v);
>  
> -    svm_sync_vmcb(v);
> +    svm_sync_vmcb(v, vmcb_needs_vmload);
>      svm_vmload_pa(per_cpu(host_vmcb, cpu));
>  
>      /* Resume use of ISTs now that the host TR is reinstated. */
> @@ -1114,7 +1107,6 @@ static void svm_ctxt_switch_to(struct vc
>      svm_restore_dr(v);
>  
>      svm_vmsave_pa(per_cpu(host_vmcb, cpu));

As an observation, this vmsave isn't needed.  All state in host_vmcb is
either constant after the AP bringup-path, or loaded properly in the PV
ctxt_switch_to() path.

This can be a future perf improvement, though.

> -    svm_vmload(vmcb);
>      vmcb->cleanbits.bytes = 0;
>      svm_lwp_load(v);
>      svm_tsc_ratio_load(v);
> --- a/xen/include/asm-x86/hvm/svm/vmcb.h
> +++ b/xen/include/asm-x86/hvm/svm/vmcb.h
> @@ -495,12 +495,19 @@ struct vmcb_struct {
>  struct svm_domain {
>  };
>  
> +enum vmcb_sync_state {
> +    vmcb_in_sync,
> +    vmcb_needs_vmsave,    /* VMCB out of sync (VMSAVE needed)? */
> +    vmcb_needs_vmload     /* VMCB dirty (VMLOAD needed)? */
> +};

If I were you, I'd go with

enum vmcb_sync_state {
    vmcb_needs_vmsave,
    vmcb_in_sync,
    vmcb_needs_vmload,
};

So it numerically follows the logical progression.

> +
>  struct arch_svm_struct {
>      struct vmcb_struct *vmcb;
>      u64    vmcb_pa;
>      unsigned long *msrpm;
>      int    launch_core;
> -    bool_t vmcb_in_sync;    /* VMCB sync'ed with VMSAVE? */
> +

/*
 * VMRUN doesn't switch fs/gs/tr/ldtr and SHADOWGS/SYSCALL/SYSENTER state.
 * Therefore, guest state is in the hardware registers when servicing a
 * VMExit.
 *
 * Immediately after a VMExit, the vmcb is stale, and needs to be brought
 * into sync by VMSAVE.  If state in the vmcb is modified, a VMLOAD is
 * needed before the following VMRUN.
 */

~Andrew

> +    uint8_t vmcb_sync_state; /* enum vmcb_sync_state */
>  
>      /* VMCB has a cached instruction from #PF/#NPF Decode Assist? */
>      uint8_t cached_insn_len; /* Zero if no cached instruction. */
>

