From: "Jan Beulich" <JBeulich@suse.com>
To: xen-devel <xen-devel@lists.xenproject.org>
Cc: Sergey Dyasli <sergey.dyasli@citrix.com>,
	Kevin Tian <kevin.tian@intel.com>,
	Kevin.Mayer@gdata.de, Andrew Cooper <andrew.cooper3@citrix.com>,
	Anshul Makkar <anshul.makkar@citrix.com>,
	Jun Nakajima <jun.nakajima@intel.com>
Subject: [PATCH 1/2] VMX: fix VMCS race on context-switch paths
Date: Tue, 14 Feb 2017 03:28:57 -0700
Message-ID: <58A2E9F9020000780013995D@prv-mh.provo.novell.com>
In-Reply-To: <58A2E8B70200007800139946@prv-mh.provo.novell.com>

When __context_switch() is being bypassed during original context
switch handling, the vCPU "owning" the VMCS partially loses control of
it: it will appear non-running to remote CPUs, and hence their attempt
to pause the owning vCPU will have no effect on it (as it already
looks to be paused). At the same time the "owning" CPU will eventually
re-enable interrupts (at the latest when entering the idle loop) and
hence becomes subject to IPIs from other CPUs requesting access to the
VMCS. As a result, when __context_switch() finally gets run, the CPU
may no longer have the VMCS loaded, and hence any accesses to it would
fail. Therefore we may need to re-load the VMCS in vmx_ctxt_switch_from().
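
To make the ordering concrete, here is a minimal stand-alone C model of
the race. This is not Xen code: struct vcpu, current_vmcs[] and the
pCPU numbering below are simplified stand-ins for the real per-vCPU and
per-CPU state.

/* toy_vmcs_race.c - simplified model of the race, not Xen code.
 * Build/run: cc -o toy_vmcs_race toy_vmcs_race.c && ./toy_vmcs_race
 */
#include <stdint.h>
#include <stdio.h>

struct vcpu {
    uint64_t vmcs_pa;    /* address of this vCPU's VMCS */
    int      active_cpu; /* pCPU the VMCS is loaded on, -1 if none */
    int      is_running; /* scheduler's "running" flag */
};

static uint64_t current_vmcs[2]; /* per-pCPU "currently loaded VMCS" */

int main(void)
{
    struct vcpu v = { .vmcs_pa = 0x1000, .active_cpu = 0, .is_running = 1 };

    current_vmcs[0] = v.vmcs_pa;          /* pCPU0 runs v, VMCS loaded */

    /* pCPU0: switching to idle takes the lazy path, so __context_switch()
     * (and with it vmx_ctxt_switch_from()) is skipped; v now looks paused
     * to everyone else, and pCPU0 re-enables interrupts. */
    v.is_running = 0;

    /* pCPU1: wants to access v's VMCS; since v looks paused, pausing it
     * completes immediately, the VMCS gets flushed from pCPU0 (via IPI)
     * and loaded on pCPU1. */
    current_vmcs[0] = 0;
    current_vmcs[1] = v.vmcs_pa;
    v.active_cpu = 1;

    /* pCPU0: __context_switch() finally runs for v.  Without the reload
     * added by this patch, vmx_ctxt_switch_from() would now issue
     * VMREAD/VMWRITE against a VMCS that is no longer loaded here. */
    printf("pCPU0 still owns v's VMCS: %s\n",
           current_vmcs[0] == v.vmcs_pa ? "yes" : "no (race!)");

    return 0;
}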

Similarly, when __context_switch() is also bypassed on the second
(switch-in) path, VMCS ownership may have been lost and hence needs
re-establishing. Since there's no existing hook to put this in, add a
new one.
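
Expressed in the same toy model (again, not Xen code), the idea behind
the reload is simply: return early if the VMCS is still loaded locally,
otherwise wait for the foreign user to be done with it and load it back.

/* Continuation of the toy model above; illustrative only (in the
 * single-threaded model the wait loop would never be entered). */
void toy_vmcs_reload(struct vcpu *v, int cpu)
{
    /* Fast path: this pCPU still has v's VMCS loaded - nothing to do. */
    if ( current_vmcs[cpu] == v->vmcs_pa )
        return;

    /* Wait for the foreign user (another pCPU) to drop the VMCS; the
     * real code uses cpu_relax() in this loop. */
    while ( v->active_cpu != -1 )
        ;

    /* Re-load the VMCS on this pCPU. */
    current_vmcs[cpu] = v->vmcs_pa;
    v->active_cpu = cpu;
}

In the patch itself this check runs from vmx_ctxt_switch_from() when the
vCPU is no longer marked running, and via the new ctxt_switch_same hook
on the lazy switch-in path.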

Reported-by: Kevin Mayer <Kevin.Mayer@gdata.de>
Reported-by: Anshul Makkar <anshul.makkar@citrix.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>

--- a/xen/arch/x86/domain.c
+++ b/xen/arch/x86/domain.c
@@ -2098,11 +2098,14 @@ void context_switch(struct vcpu *prev, s
 
     set_current(next);
 
-    if ( (per_cpu(curr_vcpu, cpu) == next) ||
-         (is_idle_domain(nextd) && cpu_online(cpu)) )
+    if ( (per_cpu(curr_vcpu, cpu) == next) )
     {
+        if ( next->arch.ctxt_switch_same )
+            next->arch.ctxt_switch_same(next);
         local_irq_enable();
     }
+    else if ( is_idle_domain(nextd) && cpu_online(cpu) )
+        local_irq_enable();
     else
     {
         __context_switch();
--- a/xen/arch/x86/hvm/vmx/vmcs.c
+++ b/xen/arch/x86/hvm/vmx/vmcs.c
@@ -552,6 +552,27 @@ static void vmx_load_vmcs(struct vcpu *v
     local_irq_restore(flags);
 }
 
+void vmx_vmcs_reload(struct vcpu *v)
+{
+    /*
+     * As we're running with interrupts disabled, we can't acquire
+     * v->arch.hvm_vmx.vmcs_lock here. However, with interrupts disabled
+     * the VMCS can't be taken away from us anymore if we still own it.
+     */
+    ASSERT(!local_irq_is_enabled());
+    if ( v->arch.hvm_vmx.vmcs_pa == this_cpu(current_vmcs) )
+        return;
+    ASSERT(!this_cpu(current_vmcs));
+
+    /*
+     * Wait for the remote side to be done with the VMCS before loading
+     * it here.
+     */
+    while ( v->arch.hvm_vmx.active_cpu != -1 )
+        cpu_relax();
+    vmx_load_vmcs(v);
+}
+
 int vmx_cpu_up_prepare(unsigned int cpu)
 {
     /*
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -298,6 +298,7 @@ static int vmx_vcpu_initialise(struct vc
     v->arch.schedule_tail    = vmx_do_resume;
     v->arch.ctxt_switch_from = vmx_ctxt_switch_from;
     v->arch.ctxt_switch_to   = vmx_ctxt_switch_to;
+    v->arch.ctxt_switch_same = vmx_vmcs_reload;
 
     if ( (rc = vmx_create_vmcs(v)) != 0 )
     {
@@ -936,6 +937,18 @@ static void vmx_ctxt_switch_from(struct
     if ( unlikely(!this_cpu(vmxon)) )
         return;
 
+    if ( !v->is_running )
+    {
+        /*
+         * When this vCPU isn't marked as running anymore, a remote pCPU's
+         * attempt to pause us (from vmx_vmcs_enter()) won't have a reason
+         * to spin in vcpu_sleep_sync(), and hence that pCPU might have taken
+         * away the VMCS from us. As we're running with interrupts disabled,
+         * we also can't call vmx_vmcs_enter().
+         */
+        vmx_vmcs_reload(v);
+    }
+
     vmx_fpu_leave(v);
     vmx_save_guest_msrs(v);
     vmx_restore_host_msrs();
--- a/xen/include/asm-x86/domain.h
+++ b/xen/include/asm-x86/domain.h
@@ -514,6 +514,7 @@ struct arch_vcpu
 
     void (*ctxt_switch_from) (struct vcpu *);
     void (*ctxt_switch_to) (struct vcpu *);
+    void (*ctxt_switch_same) (struct vcpu *);
 
     struct vpmu_struct vpmu;
 
--- a/xen/include/asm-x86/hvm/vmx/vmcs.h
+++ b/xen/include/asm-x86/hvm/vmx/vmcs.h
@@ -174,6 +174,7 @@ void vmx_destroy_vmcs(struct vcpu *v);
 void vmx_vmcs_enter(struct vcpu *v);
 bool_t __must_check vmx_vmcs_try_enter(struct vcpu *v);
 void vmx_vmcs_exit(struct vcpu *v);
+void vmx_vmcs_reload(struct vcpu *v);
 
 #define CPU_BASED_VIRTUAL_INTR_PENDING        0x00000004
 #define CPU_BASED_USE_TSC_OFFSETING           0x00000008



Thread overview: 27+ messages
2017-02-14 10:23 [PATCH 0/2] x86: context switch handling adjustments Jan Beulich
2017-02-14 10:28 ` Jan Beulich [this message]
2017-02-14 15:16   ` [PATCH 1/2] VMX: fix VMCS race on context-switch paths Andrew Cooper
2017-02-15  8:37     ` Jan Beulich
2017-02-15 11:29       ` Andrew Cooper
2017-02-15 10:27   ` Sergey Dyasli
2017-02-15 11:00     ` Jan Beulich
2017-02-15 11:13       ` Sergey Dyasli
2017-02-15 11:24         ` Jan Beulich
2017-02-15 11:39     ` Jan Beulich
2017-02-15 11:48       ` Sergey Dyasli
2017-02-15 11:55         ` Jan Beulich
2017-02-15 13:03           ` Jan Beulich
2017-02-15 13:40             ` Sergey Dyasli
2017-02-15 14:29               ` Jan Beulich
2017-02-15 14:44                 ` Jan Beulich
2017-02-15 13:20     ` Jan Beulich
2017-02-15 14:55       ` Sergey Dyasli
2017-02-15 15:15         ` Jan Beulich
2017-02-16  8:29           ` Sergey Dyasli
2017-02-16  9:26             ` Jan Beulich
2017-02-14 10:29 ` [PATCH 2/2] x86: package up context switch hook pointers Jan Beulich
2017-02-14 15:26   ` Andrew Cooper
2017-02-15  8:42     ` Jan Beulich
2017-02-15 11:34       ` Andrew Cooper
2017-02-15 11:40         ` Jan Beulich
2017-02-14 22:18   ` Boris Ostrovsky
