* [Xen-devel] [PATCH for-4.13 v3 0/2] x86/vmx: posted interrupt fixes
@ 2019-11-26 13:26 Roger Pau Monne
  2019-11-26 13:26 ` [Xen-devel] [PATCH for-4.13 v3 1/2] x86/vmx: add ASSERT to prevent syncing PIR to IRR Roger Pau Monne
                   ` (2 more replies)
  0 siblings, 3 replies; 18+ messages in thread
From: Roger Pau Monne @ 2019-11-26 13:26 UTC (permalink / raw)
  To: xen-devel
  Cc: Juergen Gross, Kevin Tian, Jun Nakajima, Wei Liu, Andrew Cooper,
	Jan Beulich, Roger Pau Monne

Hello,

The following series aims to solve the issue reported by Joe Jin related
to posted interrupts.

I've decided to send a new version because the previous one was missing
the first patch, and I've also taken the opportunity to address Jan's
comments related to patch 2. It's still missing feedback from Intel,
however.

Thanks, Roger.

Roger Pau Monne (2):
  x86/vmx: add ASSERT to prevent syncing PIR to IRR...
  x86/vmx: always sync PIR to IRR before vmentry

 xen/arch/x86/hvm/irq.c     |  7 +++-
 xen/arch/x86/hvm/vmx/vmx.c | 77 ++++++++++++++++----------------------
 2 files changed, 37 insertions(+), 47 deletions(-)

-- 
2.24.0



* [Xen-devel] [PATCH for-4.13 v3 1/2] x86/vmx: add ASSERT to prevent syncing PIR to IRR...
  2019-11-26 13:26 [Xen-devel] [PATCH for-4.13 v3 0/2] x86/vmx: posted interrupt fixes Roger Pau Monne
@ 2019-11-26 13:26 ` Roger Pau Monne
  2019-11-26 16:32   ` Jan Beulich
  2019-11-26 13:26 ` [Xen-devel] [PATCH for-4.13 v3 2/2] x86/vmx: always sync PIR to IRR before vmentry Roger Pau Monne
  2019-11-26 16:35 ` [Xen-devel] [PATCH for-4.13 v3 0/2] x86/vmx: posted interrupt fixes Roger Pau Monné
  2 siblings, 1 reply; 18+ messages in thread
From: Roger Pau Monne @ 2019-11-26 13:26 UTC (permalink / raw)
  To: xen-devel
  Cc: Juergen Gross, Kevin Tian, Jun Nakajima, Wei Liu, Andrew Cooper,
	Jan Beulich, Roger Pau Monne

... if the vCPU is different from the one currently running and is not
paused. Note that syncing PIR to IRR while the vCPU is running is not
allowed, since the hardware is in control of the IRR field in the
virtual APIC page.

Allow syncing PIR to IRR when the vCPU is paused; this is required in
order to save the local APIC state.
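
For illustration, the intended save-side usage is roughly the following
(a sketch only, with approximate names; the exact save handler looks
different):

    /* The save logic pauses the vCPU, so the sync cannot race with hw. */
    ASSERT(v == current || atomic_read(&v->pause_count));
    if ( hvm_funcs.sync_pir_to_irr )
        hvm_funcs.sync_pir_to_irr(v);
    /* The IRR in the virtual APIC page is now up to date and can be saved. */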

No functional change intended.

Suggested-by: Andrew Cooper <andrew.cooper3@citrix.com>
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
---
Cc: Juergen Gross <jgross@suse.com>
---
Changes since v2:
 - Only allow syncing if the vCPU is the current one or if it's
   paused.

Changes since v1:
 - Use vcpu_runnable instead of is_running.
---
 xen/arch/x86/hvm/vmx/vmx.c | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index a55ff37733..c817aec75d 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -2054,6 +2054,19 @@ static void vmx_sync_pir_to_irr(struct vcpu *v)
     unsigned int group, i;
     DECLARE_BITMAP(pending_intr, NR_VECTORS);
 
+    if ( v != current && !atomic_read(&v->pause_count) )
+    {
+        /*
+         * Syncing PIR to IRR must not be done behind the back of the CPU,
+         * since the IRR is controlled by the hardware when the vCPU is
+         * executing. Only allow Xen to do such sync if the vCPU is the current
+         * one or if it's paused: that's required in order to sync the lapic
+         * state before saving it.
+         */
+        ASSERT_UNREACHABLE();
+        return;
+    }
+
     if ( !pi_test_and_clear_on(&v->arch.hvm.vmx.pi_desc) )
         return;
 
-- 
2.24.0



* [Xen-devel] [PATCH for-4.13 v3 2/2] x86/vmx: always sync PIR to IRR before vmentry
  2019-11-26 13:26 [Xen-devel] [PATCH for-4.13 v3 0/2] x86/vmx: posted interrupt fixes Roger Pau Monne
  2019-11-26 13:26 ` [Xen-devel] [PATCH for-4.13 v3 1/2] x86/vmx: add ASSERT to prevent syncing PIR to IRR Roger Pau Monne
@ 2019-11-26 13:26 ` Roger Pau Monne
  2019-11-26 16:33   ` Joe Jin
                     ` (2 more replies)
  2019-11-26 16:35 ` [Xen-devel] [PATCH for-4.13 v3 0/2] x86/vmx: posted interrupt fixes Roger Pau Monné
  2 siblings, 3 replies; 18+ messages in thread
From: Roger Pau Monne @ 2019-11-26 13:26 UTC (permalink / raw)
  To: xen-devel
  Cc: Juergen Gross, Kevin Tian, Jan Beulich, Wei Liu, Andrew Cooper,
	Joe Jin, Jun Nakajima, Roger Pau Monne

When using posted interrupts on Intel hardware it's possible that the
vCPU resumes execution with a stale local APIC IRR register because,
depending on the interrupts to be injected, vlapic_has_pending_irq
might not be called, and thus PIR won't be synced into IRR. For
example, if hvm_vcpu_has_pending_irq returns early for a pending vPIC
interrupt, the vlapic_has_pending_irq call is never reached and any
vectors pending in PIR are left out of IRR.

Fix this by making sure PIR is always synced to IRR in
hvm_vcpu_has_pending_irq regardless of what interrupts are pending.

While there, also simplify the code in __vmx_deliver_posted_interrupt:
only raise a softirq if the vCPU is the one currently running and
__vmx_deliver_posted_interrupt is called from interrupt context. The
softirq is raised to make sure vmx_intr_assist is retried if the
interrupt happens to arrive after vmx_intr_assist but before
interrupts are disabled in vmx_do_vmentry. Also simplify the logic for
IPIing other pCPUs: there's no need to check v->processor, since the
IPI should be sent as long as the vCPU is not the current one and it's
running.

Reported-by: Joe Jin <joe.jin@oracle.com>
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
---
Cc: Juergen Gross <jgross@suse.com>
---
Changes since v2:
 - Raise a softirq if in interrupt context and the vCPU is the current
   one.
 - Use is_running instead of runnable.
 - Remove the call to vmx_sync_pir_to_irr in vmx_intr_assist and
   instead always call vlapic_has_pending_irq in
   hvm_vcpu_has_pending_irq.
---
 xen/arch/x86/hvm/irq.c     |  7 +++--
 xen/arch/x86/hvm/vmx/vmx.c | 64 +++++++++++---------------------------
 2 files changed, 24 insertions(+), 47 deletions(-)

diff --git a/xen/arch/x86/hvm/irq.c b/xen/arch/x86/hvm/irq.c
index e03a87ad50..b50ac62a16 100644
--- a/xen/arch/x86/hvm/irq.c
+++ b/xen/arch/x86/hvm/irq.c
@@ -515,7 +515,11 @@ void hvm_set_callback_via(struct domain *d, uint64_t via)
 struct hvm_intack hvm_vcpu_has_pending_irq(struct vcpu *v)
 {
     struct hvm_domain *plat = &v->domain->arch.hvm;
-    int vector;
+    /*
+     * Always call vlapic_has_pending_irq so that PIR is synced into IRR when
+     * using posted interrupts.
+     */
+    int vector = vlapic_has_pending_irq(v);
 
     if ( unlikely(v->nmi_pending) )
         return hvm_intack_nmi;
@@ -530,7 +534,6 @@ struct hvm_intack hvm_vcpu_has_pending_irq(struct vcpu *v)
     if ( vlapic_accept_pic_intr(v) && plat->vpic[0].int_output )
         return hvm_intack_pic(0);
 
-    vector = vlapic_has_pending_irq(v);
     if ( vector != -1 )
         return hvm_intack_lapic(vector);
 
diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
index c817aec75d..4dea868cda 100644
--- a/xen/arch/x86/hvm/vmx/vmx.c
+++ b/xen/arch/x86/hvm/vmx/vmx.c
@@ -1945,57 +1945,31 @@ static void vmx_process_isr(int isr, struct vcpu *v)
 
 static void __vmx_deliver_posted_interrupt(struct vcpu *v)
 {
-    bool_t running = v->is_running;
-
     vcpu_unblock(v);
-    /*
-     * Just like vcpu_kick(), nothing is needed for the following two cases:
-     * 1. The target vCPU is not running, meaning it is blocked or runnable.
-     * 2. The target vCPU is the current vCPU and we're in non-interrupt
-     * context.
-     */
-    if ( running && (in_irq() || (v != current)) )
-    {
+    if ( v->is_running && v != current )
         /*
-         * Note: Only two cases will reach here:
-         * 1. The target vCPU is running on other pCPU.
-         * 2. The target vCPU is the current vCPU.
+         * If the vCPU is running on another pCPU send an IPI to the pCPU. When
+         * the IPI arrives, the target vCPU may be running in non-root mode,
+         * running in root mode, runnable or blocked. If the target vCPU is
+         * running in non-root mode, the hardware will sync PIR to vIRR for
+         * 'posted_intr_vector' is a special vector handled directly by the
+         * hardware.
          *
-         * Note2: Don't worry the v->processor may change. The vCPU being
-         * moved to another processor is guaranteed to sync PIR to vIRR,
-         * due to the involved scheduling cycle.
+         * If the target vCPU is running in root-mode, the interrupt handler
+         * starts to run. Considering an IPI may arrive in the window between
+         * the call to vmx_intr_assist() and interrupts getting disabled, the
+         * interrupt handler should raise a softirq to ensure events will be
+         * delivered in time.
          */
-        unsigned int cpu = v->processor;
-
-        /*
-         * For case 1, we send an IPI to the pCPU. When an IPI arrives, the
-         * target vCPU maybe is running in non-root mode, running in root
-         * mode, runnable or blocked. If the target vCPU is running in
-         * non-root mode, the hardware will sync PIR to vIRR for
-         * 'posted_intr_vector' is special to the pCPU. If the target vCPU is
-         * running in root-mode, the interrupt handler starts to run.
-         * Considering an IPI may arrive in the window between the call to
-         * vmx_intr_assist() and interrupts getting disabled, the interrupt
-         * handler should raise a softirq to ensure events will be delivered
-         * in time. If the target vCPU is runnable, it will sync PIR to
-         * vIRR next time it is chose to run. In this case, a IPI and a
-         * softirq is sent to a wrong vCPU which will not have any adverse
-         * effect. If the target vCPU is blocked, since vcpu_block() checks
-         * whether there is an event to be delivered through
-         * local_events_need_delivery() just after blocking, the vCPU must
-         * have synced PIR to vIRR. Similarly, there is a IPI and a softirq
-         * sent to a wrong vCPU.
-         */
-        if ( cpu != smp_processor_id() )
-            send_IPI_mask(cpumask_of(cpu), posted_intr_vector);
+        send_IPI_mask(cpumask_of(v->processor), posted_intr_vector);
+    else if ( v == current && in_irq() && !softirq_pending(smp_processor_id()) )
         /*
-         * For case 2, raising a softirq ensures PIR will be synced to vIRR.
-         * As any softirq will do, as an optimization we only raise one if
-         * none is pending already.
+         * If on interrupt context raise a softirq so that vmx_intr_assist is
+         * retried in case the interrupt arrives after the call to
+         * vmx_intr_assist and before interrupts are disabled in
+         * vmx_do_vmentry.
          */
-        else if ( !softirq_pending(cpu) )
-            raise_softirq(VCPU_KICK_SOFTIRQ);
-    }
+        raise_softirq(VCPU_KICK_SOFTIRQ);
 }
 
 static void vmx_deliver_posted_intr(struct vcpu *v, u8 vector)
-- 
2.24.0



* Re: [Xen-devel] [PATCH for-4.13 v3 1/2] x86/vmx: add ASSERT to prevent syncing PIR to IRR...
  2019-11-26 13:26 ` [Xen-devel] [PATCH for-4.13 v3 1/2] x86/vmx: add ASSERT to prevent syncing PIR to IRR Roger Pau Monne
@ 2019-11-26 16:32   ` Jan Beulich
  2019-11-26 16:47     ` Roger Pau Monné
  0 siblings, 1 reply; 18+ messages in thread
From: Jan Beulich @ 2019-11-26 16:32 UTC (permalink / raw)
  To: Roger Pau Monne
  Cc: Juergen Gross, Kevin Tian, Wei Liu, Andrew Cooper, Jun Nakajima,
	xen-devel

On 26.11.2019 14:26, Roger Pau Monne wrote:
> --- a/xen/arch/x86/hvm/vmx/vmx.c
> +++ b/xen/arch/x86/hvm/vmx/vmx.c
> @@ -2054,6 +2054,19 @@ static void vmx_sync_pir_to_irr(struct vcpu *v)
>      unsigned int group, i;
>      DECLARE_BITMAP(pending_intr, NR_VECTORS);
>  
> +    if ( v != current && !atomic_read(&v->pause_count) )
> +    {
> +        /*
> +         * Syncing PIR to IRR must not be done behind the back of the CPU,
> +         * since the IRR is controlled by the hardware when the vCPU is
> +         * executing. Only allow Xen to do such sync if the vCPU is the current
> +         * one or if it's paused: that's required in order to sync the lapic
> +         * state before saving it.
> +         */

Is this stated this way by the SDM anywhere? I ask because the
comment then really doesn't apply to just this function, but to
vlapic_{,test_and_}{set,clear}_vector() more generally. It's
not clear to me at all whether the CPU caches (in an incoherent
fashion) IRR (and maybe other APIC page elements), rather than
honoring the atomic updates these macros do.
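
(For reference, those macros boil down to locked bit operations on the
virtual APIC page, roughly along the lines of:

    #define VEC_POS(v) ((v) % 32)
    #define REG_POS(v) (((v) / 32) * 0x10)
    #define vlapic_set_vector(vec, bitmap) \
        set_bit(VEC_POS(vec), (uint32_t *)((bitmap) + REG_POS(vec)))

so the question is whether such locked updates are honored by a CPU
that has the page in use as its virtual-APIC page while in non-root
mode.)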

Jan


* Re: [Xen-devel] [PATCH for-4.13 v3 2/2] x86/vmx: always sync PIR to IRR before vmentry
  2019-11-26 13:26 ` [Xen-devel] [PATCH for-4.13 v3 2/2] x86/vmx: always sync PIR to IRR before vmentry Roger Pau Monne
@ 2019-11-26 16:33   ` Joe Jin
  2019-11-26 16:50   ` Jan Beulich
  2019-11-27  2:23   ` Tian, Kevin
  2 siblings, 0 replies; 18+ messages in thread
From: Joe Jin @ 2019-11-26 16:33 UTC (permalink / raw)
  To: Roger Pau Monne, xen-devel
  Cc: Juergen Gross, Kevin Tian, Jan Beulich, Wei Liu, Andrew Cooper,
	Jun Nakajima

On 11/26/19 5:26 AM, Roger Pau Monne wrote:
> When using posted interrupts on Intel hardware it's possible that the
> vCPU resumes execution with a stale local APIC IRR register because
> depending on the interrupts to be injected vlapic_has_pending_irq
> might not be called, and thus PIR won't be synced into IRR.
> 
> Fix this by making sure PIR is always synced to IRR in
> hvm_vcpu_has_pending_irq regardless of what interrupts are pending.
> 
> While there also simplify the code in __vmx_deliver_posted_interrupt:
> only raise a softirq if the vCPU is the one currently running and
> __vmx_deliver_posted_interrupt is called from interrupt context. The
> softirq is raised to make sure vmx_intr_assist is retried if the
> interrupt happens to arrive after vmx_intr_assist but before
> interrupts are disabled in vmx_do_vmentry. Also simplify the logic for
> IPIing other pCPUs, there's no need to check v->processor since the
> IPI should be sent as long as the vCPU is not the current one and it's
> running.
> 
> Reported-by: Joe Jin <joe.jin@oracle.com>
> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
> ---
> Cc: Juergen Gross <jgross@suse.com>

Patch works for me.
Tested-by: Joe Jin <joe.jin@oracle.com>

Thanks,
Joe


* Re: [Xen-devel] [PATCH for-4.13 v3 0/2] x86/vmx: posted interrupt fixes
  2019-11-26 13:26 [Xen-devel] [PATCH for-4.13 v3 0/2] x86/vmx: posted interrupt fixes Roger Pau Monne
  2019-11-26 13:26 ` [Xen-devel] [PATCH for-4.13 v3 1/2] x86/vmx: add ASSERT to prevent syncing PIR to IRR Roger Pau Monne
  2019-11-26 13:26 ` [Xen-devel] [PATCH for-4.13 v3 2/2] x86/vmx: always sync PIR to IRR before vmentry Roger Pau Monne
@ 2019-11-26 16:35 ` Roger Pau Monné
  2 siblings, 0 replies; 18+ messages in thread
From: Roger Pau Monné @ 2019-11-26 16:35 UTC (permalink / raw)
  To: Juergen Gross
  Cc: Kevin Tian, Jun Nakajima, Wei Liu, Andrew Cooper, Jan Beulich, xen-devel

On Tue, Nov 26, 2019 at 02:26:46PM +0100, Roger Pau Monne wrote:
> Hello,
> 
> The following series aim to solve the issue reported by Joe Jin related
> to posted interrupts.

Regarding the release blockers email, and the qualification of this
series:

 - a regression introduced since 4.12

This is not a regression, since AFAICT the posted interrupt code has
always been like this.

 - a severe bug of a 4.13 feature

The bug seems to impact people using PCI passthrough on Intel hardware
that supports posted interrupts (aka APICv). In my opinion, we either
fix it or disable APICv by default (it's currently enabled by default).

Thanks, Roger.


* Re: [Xen-devel] [PATCH for-4.13 v3 1/2] x86/vmx: add ASSERT to prevent syncing PIR to IRR...
  2019-11-26 16:32   ` Jan Beulich
@ 2019-11-26 16:47     ` Roger Pau Monné
  2019-11-26 16:58       ` Jan Beulich
  0 siblings, 1 reply; 18+ messages in thread
From: Roger Pau Monné @ 2019-11-26 16:47 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Juergen Gross, Kevin Tian, Wei Liu, Andrew Cooper, Jun Nakajima,
	xen-devel

On Tue, Nov 26, 2019 at 05:32:04PM +0100, Jan Beulich wrote:
> On 26.11.2019 14:26, Roger Pau Monne wrote:
> > --- a/xen/arch/x86/hvm/vmx/vmx.c
> > +++ b/xen/arch/x86/hvm/vmx/vmx.c
> > @@ -2054,6 +2054,19 @@ static void vmx_sync_pir_to_irr(struct vcpu *v)
> >      unsigned int group, i;
> >      DECLARE_BITMAP(pending_intr, NR_VECTORS);
> >  
> > +    if ( v != current && !atomic_read(&v->pause_count) )
> > +    {
> > +        /*
> > +         * Syncing PIR to IRR must not be done behind the back of the CPU,
> > +         * since the IRR is controlled by the hardware when the vCPU is
> > +         * executing. Only allow Xen to do such sync if the vCPU is the current
> > +         * one or if it's paused: that's required in order to sync the lapic
> > +         * state before saving it.
> > +         */
> 
> Is this stated this way by the SDM anywhere?

No, I think the SDM is not very clear on this; there's a paragraph
about PIR:

"The logical processor performs a logical-OR of PIR into VIRR and
clears PIR. No other agent can read or write a PIR bit (or group of
bits) between the time it is read (to determine what to OR into VIRR)
and when it is cleared."

> I ask because the
> comment then really doesn't apply to just this function, but to
> vlapic_{,test_and_}{set,clear}_vector() more generally. It's
> not clear to me at all whether the CPU caches (in an incoherent
> fashion) IRR (and maybe other APIC page elements), rather than
> honoring the atomic updates these macros do.

IMO syncing PIR to IRR when the vCPU is running on a different pCPU is
likely to at least defeat the purpose of posted interrupts: when the
CPU receives the posted interrupt vector it won't see the
outstanding-notification bit in the posted-interrupt descriptor
because the sync done from a different pCPU would have cleared it, at
which point it's not clear to me that the processor will check vIRR
for pending interrupts. The description in section 29.6
POSTED-INTERRUPT PROCESSING doesn't explicitly mention whether the
value of the outstanding-notification bit affects the logic of posted
interrupt processing.
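
For reference, the sequence described in that section is roughly the
following (paraphrased, not a verbatim quote of the SDM):

 1. The local APIC is acknowledged, providing the core with the vector.
 2. The vector is compared with the notification vector configured in
    the VMCS; the remaining steps only happen if they match.
 3. The outstanding-notification bit in the posted-interrupt descriptor
    is cleared atomically.
 4. Zero is written to the local APIC EOI register, dismissing the
    notification vector.
 5. PIR is ORed into vIRR and PIR is cleared, atomically.
 6. RVI is set to the maximum of its old value and the highest vector
    set in PIR (left unchanged if PIR was empty).
 7. Pending virtual interrupts are evaluated.

Nothing in there spells out what happens if the bit is found already
clear at step 3.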

Thanks, Roger.


* Re: [Xen-devel] [PATCH for-4.13 v3 2/2] x86/vmx: always sync PIR to IRR before vmentry
  2019-11-26 13:26 ` [Xen-devel] [PATCH for-4.13 v3 2/2] x86/vmx: always sync PIR to IRR before vmentry Roger Pau Monne
  2019-11-26 16:33   ` Joe Jin
@ 2019-11-26 16:50   ` Jan Beulich
  2019-11-26 18:02     ` Roger Pau Monné
  2019-11-27 11:22     ` Roger Pau Monné
  2019-11-27  2:23   ` Tian, Kevin
  2 siblings, 2 replies; 18+ messages in thread
From: Jan Beulich @ 2019-11-26 16:50 UTC (permalink / raw)
  To: Roger Pau Monne
  Cc: Juergen Gross, Kevin Tian, Wei Liu, Andrew Cooper, Joe Jin,
	Jun Nakajima, xen-devel

On 26.11.2019 14:26, Roger Pau Monne wrote:
> --- a/xen/arch/x86/hvm/irq.c
> +++ b/xen/arch/x86/hvm/irq.c
> @@ -515,7 +515,11 @@ void hvm_set_callback_via(struct domain *d, uint64_t via)
>  struct hvm_intack hvm_vcpu_has_pending_irq(struct vcpu *v)
>  {
>      struct hvm_domain *plat = &v->domain->arch.hvm;
> -    int vector;
> +    /*
> +     * Always call vlapic_has_pending_irq so that PIR is synced into IRR when
> +     * using posted interrupts.
> +     */
> +    int vector = vlapic_has_pending_irq(v);

Did you consider doing this conditionally either here ...

> @@ -530,7 +534,6 @@ struct hvm_intack hvm_vcpu_has_pending_irq(struct vcpu *v)
>      if ( vlapic_accept_pic_intr(v) && plat->vpic[0].int_output )
>          return hvm_intack_pic(0);
>  
> -    vector = vlapic_has_pending_irq(v);
>      if ( vector != -1 )
>          return hvm_intack_lapic(vector);

... or here? I ask not only because the function isn't exactly
cheap to call (as iirc you did also mention during the v2
discussion), but also because of its interaction with Viridian
and nested mode. In case of problems there, avoiding the use
of interrupt posting would be a workaround in such cases then.

> --- a/xen/arch/x86/hvm/vmx/vmx.c
> +++ b/xen/arch/x86/hvm/vmx/vmx.c
> @@ -1945,57 +1945,31 @@ static void vmx_process_isr(int isr, struct vcpu *v)
>  
>  static void __vmx_deliver_posted_interrupt(struct vcpu *v)
>  {
> -    bool_t running = v->is_running;
> -
>      vcpu_unblock(v);
> -    /*
> -     * Just like vcpu_kick(), nothing is needed for the following two cases:
> -     * 1. The target vCPU is not running, meaning it is blocked or runnable.
> -     * 2. The target vCPU is the current vCPU and we're in non-interrupt
> -     * context.
> -     */
> -    if ( running && (in_irq() || (v != current)) )
> -    {
> +    if ( v->is_running && v != current )

I continue to be concerned by this evaluation of ->is_running
_after_ vcpu_unblock(), when previously (just like vcpu_kick()
does) it was intentionally done before. I wonder anyway
whether this and the change to irq.c should really be in a
single patch, the more that you start the respective part of
the description with "While there also simplify ...". But in
the end it is up to Kevin's or Jun's judgement anyway.

Jan


* Re: [Xen-devel] [PATCH for-4.13 v3 1/2] x86/vmx: add ASSERT to prevent syncing PIR to IRR...
  2019-11-26 16:47     ` Roger Pau Monné
@ 2019-11-26 16:58       ` Jan Beulich
  2019-11-27  3:09         ` Tian, Kevin
  0 siblings, 1 reply; 18+ messages in thread
From: Jan Beulich @ 2019-11-26 16:58 UTC (permalink / raw)
  To: Roger Pau Monné
  Cc: Juergen Gross, Kevin Tian, Wei Liu, Andrew Cooper, Jun Nakajima,
	xen-devel

On 26.11.2019 17:47, Roger Pau Monné  wrote:
> On Tue, Nov 26, 2019 at 05:32:04PM +0100, Jan Beulich wrote:
>> On 26.11.2019 14:26, Roger Pau Monne wrote:
>>> --- a/xen/arch/x86/hvm/vmx/vmx.c
>>> +++ b/xen/arch/x86/hvm/vmx/vmx.c
>>> @@ -2054,6 +2054,19 @@ static void vmx_sync_pir_to_irr(struct vcpu *v)
>>>      unsigned int group, i;
>>>      DECLARE_BITMAP(pending_intr, NR_VECTORS);
>>>  
>>> +    if ( v != current && !atomic_read(&v->pause_count) )
>>> +    {
>>> +        /*
>>> +         * Syncing PIR to IRR must not be done behind the back of the CPU,
>>> +         * since the IRR is controlled by the hardware when the vCPU is
>>> +         * executing. Only allow Xen to do such sync if the vCPU is the current
>>> +         * one or if it's paused: that's required in order to sync the lapic
>>> +         * state before saving it.
>>> +         */
>>
>> Is this stated this way by the SDM anywhere?
> 
> No, I think the SDM is not very clear on this, there's a paragraph
> about PIR:
> 
> "The logical processor performs a logical-OR of PIR into VIRR and
> clears PIR. No other agent can read or write a PIR bit (or group of
> bits) between the time it is read (to determine what to OR into VIRR)
> and when it is cleared."

Well, this is about PIR, but my question was rather towards the
effects on vIRR.

>> I ask because the
>> comment then really doesn't apply to just this function, but to
>> vlapic_{,test_and_}{set,clear}_vector() more generally. It's
>> not clear to me at all whether the CPU caches (in an incoherent
>> fashion) IRR (and maybe other APIC page elements), rather than
>> honoring the atomic updates these macros do.
> 
> IMO syncing PIR to IRR when the vCPU is running on a different pCPU is
> likely to at least defeat the purpose of posted interrupts:

I agree here.

> when the
> CPU receives the posted interrupt vector it won't see the
> outstanding-notification bit in the posted-interrupt descriptor
> because the sync done from a different pCPU would have cleared it, at
> which point it's not clear to me that the processor will check vIRR
> for pending interrupts. The description in section 29.6
> POSTED-INTERRUPT PROCESSING doesn't explicitly mention whether the
> value of the outstanding-notification bit affects the logic of posted
> interrupt processing.

But overall this again is all posted-interrupt centric, when my
question was about vIRR, in particular whether the assertion you
add may need to be even more rigid.

Anyway, let's see what the VMX maintainers have to say.

Jan


* Re: [Xen-devel] [PATCH for-4.13 v3 2/2] x86/vmx: always sync PIR to IRR before vmentry
  2019-11-26 16:50   ` Jan Beulich
@ 2019-11-26 18:02     ` Roger Pau Monné
  2019-11-27  9:52       ` Jan Beulich
  2019-11-27 11:22     ` Roger Pau Monné
  1 sibling, 1 reply; 18+ messages in thread
From: Roger Pau Monné @ 2019-11-26 18:02 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Juergen Gross, Kevin Tian, Wei Liu, Andrew Cooper, Joe Jin,
	Jun Nakajima, xen-devel

On Tue, Nov 26, 2019 at 05:50:32PM +0100, Jan Beulich wrote:
> On 26.11.2019 14:26, Roger Pau Monne wrote:
> > --- a/xen/arch/x86/hvm/irq.c
> > +++ b/xen/arch/x86/hvm/irq.c
> > @@ -515,7 +515,11 @@ void hvm_set_callback_via(struct domain *d, uint64_t via)
> >  struct hvm_intack hvm_vcpu_has_pending_irq(struct vcpu *v)
> >  {
> >      struct hvm_domain *plat = &v->domain->arch.hvm;
> > -    int vector;
> > +    /*
> > +     * Always call vlapic_has_pending_irq so that PIR is synced into IRR when
> > +     * using posted interrupts.
> > +     */
> > +    int vector = vlapic_has_pending_irq(v);
> 
> Did you consider doing this conditionally either here ...
> 
> > @@ -530,7 +534,6 @@ struct hvm_intack hvm_vcpu_has_pending_irq(struct vcpu *v)
> >      if ( vlapic_accept_pic_intr(v) && plat->vpic[0].int_output )
> >          return hvm_intack_pic(0);
> >  
> > -    vector = vlapic_has_pending_irq(v);
> >      if ( vector != -1 )
> >          return hvm_intack_lapic(vector);
> 
> ... or here? I ask not only because the function isn't exactly
> cheap to call (as iirc you did also mention during the v2
> discussion), but also because of its interaction with Viridian
> and nested mode. In case of problems there, avoiding the use
> of interrupt posting would be a workaround in such cases then.

TBH my preference was to do the PIR to IRR sync in vmx_intr_assist by
directly calling vmx_sync_pir_to_irr because it was IMO less intrusive
and confined to VMX code. I think this approach is more risky as
vlapic_has_pending_irq does way more than simply syncing PIR to IRR.

> > --- a/xen/arch/x86/hvm/vmx/vmx.c
> > +++ b/xen/arch/x86/hvm/vmx/vmx.c
> > @@ -1945,57 +1945,31 @@ static void vmx_process_isr(int isr, struct vcpu *v)
> >  
> >  static void __vmx_deliver_posted_interrupt(struct vcpu *v)
> >  {
> > -    bool_t running = v->is_running;
> > -
> >      vcpu_unblock(v);
> > -    /*
> > -     * Just like vcpu_kick(), nothing is needed for the following two cases:
> > -     * 1. The target vCPU is not running, meaning it is blocked or runnable.
> > -     * 2. The target vCPU is the current vCPU and we're in non-interrupt
> > -     * context.
> > -     */
> > -    if ( running && (in_irq() || (v != current)) )
> > -    {
> > +    if ( v->is_running && v != current )
> 
> I continue to be concerned by this evaluation of ->is_running
> _after_ vcpu_unblock(), when previously (just like vcpu_kick()
> does) it was intentionally done before.

If the unblock sets v->is_running to true that's fine: Xen will send a
posted interrupt IPI, and the destination pCPU will either be in root
mode and thus raise a softirq, or in non-root mode and perform the PIR
to IRR sync and possible interrupt injection.

I see that caching the value of is_running might be helpful in order
to prevent pointless IPI'ing. If the vCPU wasn't running before the
unblock there's no reason to send an IPI to it, because the sync of
PIR to IRR will happen at vmentry anyway.
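
I.e. something along the lines of (sketch):

    bool running = v->is_running;   /* sample before the unblock */

    vcpu_unblock(v);

    if ( running && v != current )
        send_IPI_mask(cpumask_of(v->processor), posted_intr_vector);
    else if ( v == current && in_irq() &&
              !softirq_pending(smp_processor_id()) )
        raise_softirq(VCPU_KICK_SOFTIRQ);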

> I wonder anyway
> whether this and the change to irq.c should really be in a
> single patch, the more that you start the respective part of
> the description with "While there also simplify ...". But in
> the end it is up to Kevin's or Jun's judgement anyway.

Yes, that makes sense. Will wait for feedback from Kevin or Jun before
sending a new version anyway.

Thanks, Roger.


* Re: [Xen-devel] [PATCH for-4.13 v3 2/2] x86/vmx: always sync PIR to IRR before vmentry
  2019-11-26 13:26 ` [Xen-devel] [PATCH for-4.13 v3 2/2] x86/vmx: always sync PIR to IRR before vmentry Roger Pau Monne
  2019-11-26 16:33   ` Joe Jin
  2019-11-26 16:50   ` Jan Beulich
@ 2019-11-27  2:23   ` Tian, Kevin
  2 siblings, 0 replies; 18+ messages in thread
From: Tian, Kevin @ 2019-11-27  2:23 UTC (permalink / raw)
  To: Roger Pau Monne, xen-devel
  Cc: Juergen Gross, Jan Beulich, Wei Liu, Andrew Cooper, Joe Jin,
	Nakajima, Jun

> From: Roger Pau Monne <roger.pau@citrix.com>
> Sent: Tuesday, November 26, 2019 9:27 PM
> 
> When using posted interrupts on Intel hardware it's possible that the
> vCPU resumes execution with a stale local APIC IRR register because
> depending on the interrupts to be injected vlapic_has_pending_irq
> might not be called, and thus PIR won't be synced into IRR.
> 
> Fix this by making sure PIR is always synced to IRR in
> hvm_vcpu_has_pending_irq regardless of what interrupts are pending.
> 
> While there also simplify the code in __vmx_deliver_posted_interrupt:
> only raise a softirq if the vCPU is the one currently running and
> __vmx_deliver_posted_interrupt is called from interrupt context. The

As commented earlier, this is exactly what the original code does. Then
what is the simplification?

> softirq is raised to make sure vmx_intr_assist is retried if the
> interrupt happens to arrive after vmx_intr_assist but before
> interrupts are disabled in vmx_do_vmentry. Also simplify the logic for
> IPIing other pCPUs, there's no need to check v->processor since the
> IPI should be sent as long as the vCPU is not the current one and it's
> running.
> 
> Reported-by: Joe Jin <joe.jin@oracle.com>
> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
> ---
> Cc: Juergen Gross <jgross@suse.com>
> ---
> Changes since v2:
>  - Raise a softirq if in interrupt context and the vCPU is the current
>    one.
>  - Use is_running instead of runnable.
>  - Remove the call to vmx_sync_pir_to_irr in vmx_intr_assist and
>    instead always call vlapic_has_pending_irq in
>    hvm_vcpu_has_pending_irq.
> ---
>  xen/arch/x86/hvm/irq.c     |  7 +++--
>  xen/arch/x86/hvm/vmx/vmx.c | 64 +++++++++++---------------------------
>  2 files changed, 24 insertions(+), 47 deletions(-)
> 
> diff --git a/xen/arch/x86/hvm/irq.c b/xen/arch/x86/hvm/irq.c
> index e03a87ad50..b50ac62a16 100644
> --- a/xen/arch/x86/hvm/irq.c
> +++ b/xen/arch/x86/hvm/irq.c
> @@ -515,7 +515,11 @@ void hvm_set_callback_via(struct domain *d,
> uint64_t via)
>  struct hvm_intack hvm_vcpu_has_pending_irq(struct vcpu *v)
>  {
>      struct hvm_domain *plat = &v->domain->arch.hvm;
> -    int vector;
> +    /*
> +     * Always call vlapic_has_pending_irq so that PIR is synced into IRR when
> +     * using posted interrupts.
> +     */
> +    int vector = vlapic_has_pending_irq(v);
> 
>      if ( unlikely(v->nmi_pending) )
>          return hvm_intack_nmi;
> @@ -530,7 +534,6 @@ struct hvm_intack hvm_vcpu_has_pending_irq(struct
> vcpu *v)
>      if ( vlapic_accept_pic_intr(v) && plat->vpic[0].int_output )
>          return hvm_intack_pic(0);
> 
> -    vector = vlapic_has_pending_irq(v);
>      if ( vector != -1 )
>          return hvm_intack_lapic(vector);
> 
> diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
> index c817aec75d..4dea868cda 100644
> --- a/xen/arch/x86/hvm/vmx/vmx.c
> +++ b/xen/arch/x86/hvm/vmx/vmx.c
> @@ -1945,57 +1945,31 @@ static void vmx_process_isr(int isr, struct vcpu
> *v)
> 
>  static void __vmx_deliver_posted_interrupt(struct vcpu *v)
>  {
> -    bool_t running = v->is_running;
> -
>      vcpu_unblock(v);
> -    /*
> -     * Just like vcpu_kick(), nothing is needed for the following two cases:
> -     * 1. The target vCPU is not running, meaning it is blocked or runnable.
> -     * 2. The target vCPU is the current vCPU and we're in non-interrupt
> -     * context.
> -     */
> -    if ( running && (in_irq() || (v != current)) )
> -    {
> +    if ( v->is_running && v != current )
>          /*
> -         * Note: Only two cases will reach here:
> -         * 1. The target vCPU is running on other pCPU.
> -         * 2. The target vCPU is the current vCPU.
> +         * If the vCPU is running on another pCPU send an IPI to the pCPU.
> When
> +         * the IPI arrives, the target vCPU may be running in non-root mode,
> +         * running in root mode, runnable or blocked. If the target vCPU is
> +         * running in non-root mode, the hardware will sync PIR to vIRR for
> +         * 'posted_intr_vector' is a special vector handled directly by the
> +         * hardware.
>           *
> -         * Note2: Don't worry the v->processor may change. The vCPU being
> -         * moved to another processor is guaranteed to sync PIR to vIRR,
> -         * due to the involved scheduling cycle.
> +         * If the target vCPU is running in root-mode, the interrupt handler
> +         * starts to run. Considering an IPI may arrive in the window between
> +         * the call to vmx_intr_assist() and interrupts getting disabled, the
> +         * interrupt handler should raise a softirq to ensure events will be
> +         * delivered in time.

I prefer the original comment, which covers all possible conditions that
the target vCPU might be in. You may help improve it if some words are
not well written, but removing useful information is not good.

>           */
> -        unsigned int cpu = v->processor;
> -
> -        /*
> -         * For case 1, we send an IPI to the pCPU. When an IPI arrives, the
> -         * target vCPU maybe is running in non-root mode, running in root
> -         * mode, runnable or blocked. If the target vCPU is running in
> -         * non-root mode, the hardware will sync PIR to vIRR for
> -         * 'posted_intr_vector' is special to the pCPU. If the target vCPU is
> -         * running in root-mode, the interrupt handler starts to run.
> -         * Considering an IPI may arrive in the window between the call to
> -         * vmx_intr_assist() and interrupts getting disabled, the interrupt
> -         * handler should raise a softirq to ensure events will be delivered
> -         * in time. If the target vCPU is runnable, it will sync PIR to
> -         * vIRR next time it is chose to run. In this case, a IPI and a
> -         * softirq is sent to a wrong vCPU which will not have any adverse
> -         * effect. If the target vCPU is blocked, since vcpu_block() checks
> -         * whether there is an event to be delivered through
> -         * local_events_need_delivery() just after blocking, the vCPU must
> -         * have synced PIR to vIRR. Similarly, there is a IPI and a softirq
> -         * sent to a wrong vCPU.
> -         */
> -        if ( cpu != smp_processor_id() )
> -            send_IPI_mask(cpumask_of(cpu), posted_intr_vector);
> +        send_IPI_mask(cpumask_of(v->processor), posted_intr_vector);
> +    else if ( v == current && in_irq()
> && !softirq_pending(smp_processor_id()) )
>          /*
> -         * For case 2, raising a softirq ensures PIR will be synced to vIRR.
> -         * As any softirq will do, as an optimization we only raise one if
> -         * none is pending already.
> +         * If on interrupt context raise a softirq so that vmx_intr_assist is
> +         * retried in case the interrupt arrives after the call to
> +         * vmx_intr_assist and before interrupts are disabled in
> +         * vmx_do_vmentry.
>           */
> -        else if ( !softirq_pending(cpu) )
> -            raise_softirq(VCPU_KICK_SOFTIRQ);
> -    }
> +        raise_softirq(VCPU_KICK_SOFTIRQ);
>  }
> 
>  static void vmx_deliver_posted_intr(struct vcpu *v, u8 vector)
> --
> 2.24.0


* Re: [Xen-devel] [PATCH for-4.13 v3 1/2] x86/vmx: add ASSERT to prevent syncing PIR to IRR...
  2019-11-26 16:58       ` Jan Beulich
@ 2019-11-27  3:09         ` Tian, Kevin
  2019-11-27 11:18           ` Roger Pau Monné
  0 siblings, 1 reply; 18+ messages in thread
From: Tian, Kevin @ 2019-11-27  3:09 UTC (permalink / raw)
  To: Jan Beulich, Roger Pau Monné
  Cc: Juergen Gross, Andrew Cooper, Wei Liu, Nakajima, Jun, xen-devel

> From: Jan Beulich <jbeulich@suse.com>
> Sent: Wednesday, November 27, 2019 12:59 AM
> 
> On 26.11.2019 17:47, Roger Pau Monné  wrote:
> > On Tue, Nov 26, 2019 at 05:32:04PM +0100, Jan Beulich wrote:
> >> On 26.11.2019 14:26, Roger Pau Monne wrote:
> >>> --- a/xen/arch/x86/hvm/vmx/vmx.c
> >>> +++ b/xen/arch/x86/hvm/vmx/vmx.c
> >>> @@ -2054,6 +2054,19 @@ static void vmx_sync_pir_to_irr(struct vcpu
> *v)
> >>>      unsigned int group, i;
> >>>      DECLARE_BITMAP(pending_intr, NR_VECTORS);
> >>>
> >>> +    if ( v != current && !atomic_read(&v->pause_count) )
> >>> +    {
> >>> +        /*
> >>> +         * Syncing PIR to IRR must not be done behind the back of the CPU,
> >>> +         * since the IRR is controlled by the hardware when the vCPU is
> >>> +         * executing. Only allow Xen to do such sync if the vCPU is the
> current
> >>> +         * one or if it's paused: that's required in order to sync the lapic
> >>> +         * state before saving it.
> >>> +         */
> >>
> >> Is this stated this way by the SDM anywhere?
> >
> > No, I think the SDM is not very clear on this, there's a paragraph
> > about PIR:
> >
> > "The logical processor performs a logical-OR of PIR into VIRR and
> > clears PIR. No other agent can read or write a PIR bit (or group of
> > bits) between the time it is read (to determine what to OR into VIRR)
> > and when it is cleared."
> 
> Well, this is about PIR, but my question was rather towards the
> effects on vIRR.
> 
> >> I ask because the
> >> comment then really doesn't apply to just this function, but to
> >> vlapic_{,test_and_}{set,clear}_vector() more generally. It's
> >> not clear to me at all whether the CPU caches (in an incoherent
> >> fashion) IRR (and maybe other APIC page elements), rather than
> >> honoring the atomic updates these macros do.
> >
> > IMO syncing PIR to IRR when the vCPU is running on a different pCPU is
> > likely to at least defeat the purpose of posted interrupts:
> 
> I agree here.
> 
> > when the
> > CPU receives the posted interrupt vector it won't see the
> > outstanding-notification bit in the posted-interrupt descriptor
> > because the sync done from a different pCPU would have cleared it, at
> > which point it's not clear to me that the processor will check vIRR
> > for pending interrupts. The description in section 29.6
> > POSTED-INTERRUPT PROCESSING doesn't explicitly mention whether the
> > value of the outstanding-notification bit affects the logic of posted
> > interrupt processing.

I think the outstanding-notification bit is checked once, to trigger the
interrupt posting process. Once the process starts, there is no need to
look at it again. Step 3 of the posting process in 29.6 clearly says:

"The processor clears the outstanding-notification bit in the posted-
interrupt descriptor. This is done atomically so as to leave the remainder 
of the descriptor unmodified (e.g., with a locked AND operation)."

But regardless of the hardware behavior, I think it's safe to restrict
sync_pir_to_irr as this patch does.

> 
> But overall this again is all posted interrupt centric when my
> question was about vIRR, in particular whether the asserting you
> add may need to be even more rigid.
> 
> Anyway, let's see what the VMX maintainers have to say.
> 

There is one paragraph in 29.6:

"Use of the posted-interrupt descriptor differs from that of other data 
structures that are referenced by pointers in a VMCS. There is a general 
requirement that software ensure that each such data structure is 
modified only when no logical processor with a current VMCS that 
references it is in VMX non-root operation. That requirement does
not apply to the posted-interrupt descriptor. There is a requirement, 
however, that such modifications be done using locked read-modify-write 
instructions."

The virtual-APIC page is pointer-referenced by the VMCS, and thus falls
into the above general requirement. But I suppose there should be some
exception for this page too, otherwise the point of posted interrupts is
defeated (if we had to kick the destination vCPU into root mode to update
the vIRR). Let me confirm internally.

Thanks
Kevin

* Re: [Xen-devel] [PATCH for-4.13 v3 2/2] x86/vmx: always sync PIR to IRR before vmentry
  2019-11-26 18:02     ` Roger Pau Monné
@ 2019-11-27  9:52       ` Jan Beulich
  0 siblings, 0 replies; 18+ messages in thread
From: Jan Beulich @ 2019-11-27  9:52 UTC (permalink / raw)
  To: Roger Pau Monné
  Cc: Juergen Gross, Kevin Tian, Wei Liu, Andrew Cooper, Joe Jin,
	Jun Nakajima, xen-devel

On 26.11.2019 19:02, Roger Pau Monné  wrote:
> On Tue, Nov 26, 2019 at 05:50:32PM +0100, Jan Beulich wrote:
>> On 26.11.2019 14:26, Roger Pau Monne wrote:
>>> --- a/xen/arch/x86/hvm/irq.c
>>> +++ b/xen/arch/x86/hvm/irq.c
>>> @@ -515,7 +515,11 @@ void hvm_set_callback_via(struct domain *d, uint64_t via)
>>>  struct hvm_intack hvm_vcpu_has_pending_irq(struct vcpu *v)
>>>  {
>>>      struct hvm_domain *plat = &v->domain->arch.hvm;
>>> -    int vector;
>>> +    /*
>>> +     * Always call vlapic_has_pending_irq so that PIR is synced into IRR when
>>> +     * using posted interrupts.
>>> +     */
>>> +    int vector = vlapic_has_pending_irq(v);
>>
>> Did you consider doing this conditionally either here ...
>>
>>> @@ -530,7 +534,6 @@ struct hvm_intack hvm_vcpu_has_pending_irq(struct vcpu *v)
>>>      if ( vlapic_accept_pic_intr(v) && plat->vpic[0].int_output )
>>>          return hvm_intack_pic(0);
>>>  
>>> -    vector = vlapic_has_pending_irq(v);
>>>      if ( vector != -1 )
>>>          return hvm_intack_lapic(vector);
>>
>> ... or here? I ask not only because the function isn't exactly
>> cheap to call (as iirc you did also mention during the v2
>> discussion), but also because of its interaction with Viridian
>> and nested mode. In case of problems there, avoiding the use
>> of interrupt posting would be a workaround in such cases then.
> 
> TBH my preference was to do the PIR to IRR sync in vmx_intr_assist by
> directly calling vmx_sync_pir_to_irr because it was IMO less intrusive
> and confined to VMX code. I think this approach is more risky as
> vlapic_has_pending_irq does way more than simply syncing PIR to IRR.

Confining to VMX code may indeed make sense as long as we don't
support the SVM equivalent, but from an abstract pov such syncing
will likely need to happen in a vendor neutral way. In any event,
if the VMX maintainers prefer the other placement, so be it.

Jan


* Re: [Xen-devel] [PATCH for-4.13 v3 1/2] x86/vmx: add ASSERT to prevent syncing PIR to IRR...
  2019-11-27  3:09         ` Tian, Kevin
@ 2019-11-27 11:18           ` Roger Pau Monné
  0 siblings, 0 replies; 18+ messages in thread
From: Roger Pau Monné @ 2019-11-27 11:18 UTC (permalink / raw)
  To: Tian, Kevin
  Cc: Juergen Gross, Jan Beulich, Wei Liu, Andrew Cooper, Nakajima,
	Jun, xen-devel

On Wed, Nov 27, 2019 at 03:09:51AM +0000, Tian, Kevin wrote:
> > From: Jan Beulich <jbeulich@suse.com>
> > Sent: Wednesday, November 27, 2019 12:59 AM
> > 
> > On 26.11.2019 17:47, Roger Pau Monné  wrote:
> > > On Tue, Nov 26, 2019 at 05:32:04PM +0100, Jan Beulich wrote:
> > >> On 26.11.2019 14:26, Roger Pau Monne wrote:
> > >>> --- a/xen/arch/x86/hvm/vmx/vmx.c
> > >>> +++ b/xen/arch/x86/hvm/vmx/vmx.c
> > >>> @@ -2054,6 +2054,19 @@ static void vmx_sync_pir_to_irr(struct vcpu
> > *v)
> > >>>      unsigned int group, i;
> > >>>      DECLARE_BITMAP(pending_intr, NR_VECTORS);
> > >>>
> > >>> +    if ( v != current && !atomic_read(&v->pause_count) )
> > >>> +    {
> > >>> +        /*
> > >>> +         * Syncing PIR to IRR must not be done behind the back of the CPU,
> > >>> +         * since the IRR is controlled by the hardware when the vCPU is
> > >>> +         * executing. Only allow Xen to do such sync if the vCPU is the
> > current
> > >>> +         * one or if it's paused: that's required in order to sync the lapic
> > >>> +         * state before saving it.
> > >>> +         */
> > >>
> > >> Is this stated this way by the SDM anywhere?
> > >
> > > No, I think the SDM is not very clear on this, there's a paragraph
> > > about PIR:
> > >
> > > "The logical processor performs a logical-OR of PIR into VIRR and
> > > clears PIR. No other agent can read or write a PIR bit (or group of
> > > bits) between the time it is read (to determine what to OR into VIRR)
> > > and when it is cleared."
> > 
> > Well, this is about PIR, but my question was rather towards the
> > effects on vIRR.
> > 
> > >> I ask because the
> > >> comment then really doesn't apply to just this function, but to
> > >> vlapic_{,test_and_}{set,clear}_vector() more generally. It's
> > >> not clear to me at all whether the CPU caches (in an incoherent
> > >> fashion) IRR (and maybe other APIC page elements), rather than
> > >> honoring the atomic updates these macros do.
> > >
> > > IMO syncing PIR to IRR when the vCPU is running on a different pCPU is
> > > likely to at least defeat the purpose of posted interrupts:
> > 
> > I agree here.
> > 
> > > when the
> > > CPU receives the posted interrupt vector it won't see the
> > > outstanding-notification bit in the posted-interrupt descriptor
> > > because the sync done from a different pCPU would have cleared it, at
> > > which point it's not clear to me that the processor will check vIRR
> > > for pending interrupts. The description in section 29.6
> > > POSTED-INTERRUPT PROCESSING doesn't explicitly mention whether the
> > > value of the outstanding-notification bit affects the logic of posted
> > > interrupt processing.
> 
> I think the outstanding-notification is one-off checked for triggering 
> interrupt posting process. Once the process starts, there is no need to 
> look at it again. The step 3 of posting process in 29.6 clearly says:
> 
> "The processor clears the outstanding-notification bit in the posted-
> interrupt descriptor. This is done atomically so as to leave the remainder 
> of the descriptor unmodified (e.g., with a locked AND operation)."

Yes, my question would be what happens if the outstanding-notification
bit is 0: does the processor jump to step 6 then?

Does it just ignore the value of the outstanding-notification bit and
continue to step 4?

> But regardless of the hardware behavior, I think it's safe to restrict
> sync_pir_to_irr as this patch does.
> 
> > 
> > But overall this again is all posted interrupt centric when my
> > question was about vIRR, in particular whether the asserting you
> > add may need to be even more rigid.
> > 
> > Anyway, let's see what the VMX maintainers have to say.
> > 
> 
> There is one paragraph in 29.6:
> 
> "Use of the posted-interrupt descriptor differs from that of other data 
> structures that are referenced by pointers in a VMCS. There is a general 
> requirement that software ensure that each such data structure is 
> modified only when no logical processor with a current VMCS that 
> references it is in VMX non-root operation. That requirement does
> not apply to the posted-interrupt descriptor. There is a requirement, 
> however, that such modifications be done using locked read-modify-write 
> instructions."
> 
> virtual-APIC page is pointer-referenced by VMCS, thus it falls into above
> general requirement. But I suppose there should be some exception with
> this page too, otherwise the point of posted interrupt is killed (if we have
> to kick the dest vcpu into root to update the vIRR). Let me confirm
> internally.

Ack, thanks. I think we can hold off this improvement/restriction
until we get confirmation of the intended software behavior here.

Thanks, Roger.


* Re: [Xen-devel] [PATCH for-4.13 v3 2/2] x86/vmx: always sync PIR to IRR before vmentry
  2019-11-26 16:50   ` Jan Beulich
  2019-11-26 18:02     ` Roger Pau Monné
@ 2019-11-27 11:22     ` Roger Pau Monné
  2019-11-27 11:30       ` Jan Beulich
  1 sibling, 1 reply; 18+ messages in thread
From: Roger Pau Monné @ 2019-11-27 11:22 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Juergen Gross, Kevin Tian, Wei Liu, Andrew Cooper, Joe Jin,
	Jun Nakajima, xen-devel

On Tue, Nov 26, 2019 at 05:50:32PM +0100, Jan Beulich wrote:
> On 26.11.2019 14:26, Roger Pau Monne wrote:
> > --- a/xen/arch/x86/hvm/irq.c
> > +++ b/xen/arch/x86/hvm/irq.c
> > @@ -515,7 +515,11 @@ void hvm_set_callback_via(struct domain *d, uint64_t via)
> >  struct hvm_intack hvm_vcpu_has_pending_irq(struct vcpu *v)
> >  {
> >      struct hvm_domain *plat = &v->domain->arch.hvm;
> > -    int vector;
> > +    /*
> > +     * Always call vlapic_has_pending_irq so that PIR is synced into IRR when
> > +     * using posted interrupts.
> > +     */
> > +    int vector = vlapic_has_pending_irq(v);
> 
> Did you consider doing this conditionally either here ...
> 
> > @@ -530,7 +534,6 @@ struct hvm_intack hvm_vcpu_has_pending_irq(struct vcpu *v)
> >      if ( vlapic_accept_pic_intr(v) && plat->vpic[0].int_output )
> >          return hvm_intack_pic(0);
> >  
> > -    vector = vlapic_has_pending_irq(v);
> >      if ( vector != -1 )
> >          return hvm_intack_lapic(vector);
> 
> ... or here?

I'm afraid I don't follow. The whole point of this change is to ensure
vlapic_has_pending_irq is unconditionally called in
hvm_vcpu_has_pending_irq, so I'm not sure what you mean by "doing this
conditionally...".

> I ask not only because the function isn't exactly
> cheap to call (as iirc you did also mention during the v2
> discussion), but also because of its interaction with Viridian
> and nested mode. In case of problems there, avoiding the use
> of interrupt posting would be a workaround in such cases then.

Would you like me to export something like vlapic_sync_pir_to_irr and
call it unconditionally instead of calling vlapic_has_pending_irq?

Thanks, Roger.


* Re: [Xen-devel] [PATCH for-4.13 v3 2/2] x86/vmx: always sync PIR to IRR before vmentry
  2019-11-27 11:22     ` Roger Pau Monné
@ 2019-11-27 11:30       ` Jan Beulich
  2019-11-27 11:56         ` Roger Pau Monné
  0 siblings, 1 reply; 18+ messages in thread
From: Jan Beulich @ 2019-11-27 11:30 UTC (permalink / raw)
  To: Roger Pau Monné
  Cc: Juergen Gross, Kevin Tian, Wei Liu, Andrew Cooper, Joe Jin,
	Jun Nakajima, xen-devel

On 27.11.2019 12:22, Roger Pau Monné  wrote:
> On Tue, Nov 26, 2019 at 05:50:32PM +0100, Jan Beulich wrote:
>> On 26.11.2019 14:26, Roger Pau Monne wrote:
>>> --- a/xen/arch/x86/hvm/irq.c
>>> +++ b/xen/arch/x86/hvm/irq.c
>>> @@ -515,7 +515,11 @@ void hvm_set_callback_via(struct domain *d, uint64_t via)
>>>  struct hvm_intack hvm_vcpu_has_pending_irq(struct vcpu *v)
>>>  {
>>>      struct hvm_domain *plat = &v->domain->arch.hvm;
>>> -    int vector;
>>> +    /*
>>> +     * Always call vlapic_has_pending_irq so that PIR is synced into IRR when
>>> +     * using posted interrupts.
>>> +     */
>>> +    int vector = vlapic_has_pending_irq(v);
>>
>> Did you consider doing this conditionally either here ...
>>
>>> @@ -530,7 +534,6 @@ struct hvm_intack hvm_vcpu_has_pending_irq(struct vcpu *v)
>>>      if ( vlapic_accept_pic_intr(v) && plat->vpic[0].int_output )
>>>          return hvm_intack_pic(0);
>>>  
>>> -    vector = vlapic_has_pending_irq(v);
>>>      if ( vector != -1 )
>>>          return hvm_intack_lapic(vector);
>>
>> ... or here?
> 
> I'm afraid I don't follow. The whole point of this change is to ensure
> vlapic_has_pending_irq is unconditionally called in
> hvm_vcpu_has_pending_irq, so I'm not sure what you mean by "doing this
> conditionally...".

Do it early when using interrupt posting, and keep it in its
current place otherwise.

>> I ask not only because the function isn't exactly
>> cheap to call (as iirc you did also mention during the v2
>> discussion), but also because of its interaction with Viridian
>> and nested mode. In case of problems there, avoiding the use
>> of interrupt posting would be a workaround in such cases then.
> 
> Would you like me to export something like vlapic_sync_pir_to_irr and
> call it unconditionally instead of calling vlapic_has_pending_irq?

This looks to be another option, yes. Albeit instead of making
non-static (which I assume is what you mean by "export"), maybe
simply make this a static inline in vlapic.h then.

Jan


* Re: [Xen-devel] [PATCH for-4.13 v3 2/2] x86/vmx: always sync PIR to IRR before vmentry
  2019-11-27 11:30       ` Jan Beulich
@ 2019-11-27 11:56         ` Roger Pau Monné
  2019-11-27 15:39           ` Jan Beulich
  0 siblings, 1 reply; 18+ messages in thread
From: Roger Pau Monné @ 2019-11-27 11:56 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Juergen Gross, Kevin Tian, Wei Liu, Andrew Cooper, Joe Jin,
	Jun Nakajima, xen-devel

On Wed, Nov 27, 2019 at 12:30:06PM +0100, Jan Beulich wrote:
> On 27.11.2019 12:22, Roger Pau Monné  wrote:
> > On Tue, Nov 26, 2019 at 05:50:32PM +0100, Jan Beulich wrote:
> >> On 26.11.2019 14:26, Roger Pau Monne wrote:
> >>> --- a/xen/arch/x86/hvm/irq.c
> >>> +++ b/xen/arch/x86/hvm/irq.c
> >>> @@ -515,7 +515,11 @@ void hvm_set_callback_via(struct domain *d, uint64_t via)
> >>>  struct hvm_intack hvm_vcpu_has_pending_irq(struct vcpu *v)
> >>>  {
> >>>      struct hvm_domain *plat = &v->domain->arch.hvm;
> >>> -    int vector;
> >>> +    /*
> >>> +     * Always call vlapic_has_pending_irq so that PIR is synced into IRR when
> >>> +     * using posted interrupts.
> >>> +     */
> >>> +    int vector = vlapic_has_pending_irq(v);
> >>
> >> Did you consider doing this conditionally either here ...
> >>
> >>> @@ -530,7 +534,6 @@ struct hvm_intack hvm_vcpu_has_pending_irq(struct vcpu *v)
> >>>      if ( vlapic_accept_pic_intr(v) && plat->vpic[0].int_output )
> >>>          return hvm_intack_pic(0);
> >>>  
> >>> -    vector = vlapic_has_pending_irq(v);
> >>>      if ( vector != -1 )
> >>>          return hvm_intack_lapic(vector);
> >>
> >> ... or here?
> > 
> > I'm afraid I don't follow. The whole point of this change is to ensure
> > vlapic_has_pending_irq is unconditionally called in
> > hvm_vcpu_has_pending_irq, so I'm not sure what you mean by "doing this
> > conditionally...".
> 
> Do it early when using interrupt posting, and keep it in its
> current place otherwise.
> 
> >> I ask not only because the function isn't exactly
> >> cheap to call (as iirc you did also mention during the v2
> >> discussion), but also because of its interaction with Viridian
> >> and nested mode. In case of problems there, avoiding the use
> >> of interrupt posting would be a workaround in such cases then.
> > 
> > Would you like me to export something like vlapic_sync_pir_to_irr and
> > call it unconditionally instead of calling vlapic_has_pending_irq?
> 
> This looks to be another option, yes. Albeit instead of making
> non-static (which I assume is what you mean by "export"), maybe
> simply make this a static inline in vlapic.h then.

Yes, that would work and IMO is better than moving the call to
vlapic_has_pending_irq around. Are you OK with this approach?
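
I.e. something like (sketch):

    static inline void vlapic_sync_pir_to_irr(struct vcpu *v)
    {
        if ( hvm_funcs.sync_pir_to_irr )
            hvm_funcs.sync_pir_to_irr(v);
    }

in vlapic.h, called from hvm_vcpu_has_pending_irq instead of relying on
vlapic_has_pending_irq to do the sync.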

Thanks, Roger.


* Re: [Xen-devel] [PATCH for-4.13 v3 2/2] x86/vmx: always sync PIR to IRR before vmentry
  2019-11-27 11:56         ` Roger Pau Monné
@ 2019-11-27 15:39           ` Jan Beulich
  0 siblings, 0 replies; 18+ messages in thread
From: Jan Beulich @ 2019-11-27 15:39 UTC (permalink / raw)
  To: Roger Pau Monné
  Cc: Juergen Gross, Kevin Tian, Wei Liu, Andrew Cooper, Joe Jin,
	Jun Nakajima, xen-devel

On 27.11.2019 12:56, Roger Pau Monné  wrote:
> On Wed, Nov 27, 2019 at 12:30:06PM +0100, Jan Beulich wrote:
>> On 27.11.2019 12:22, Roger Pau Monné  wrote:
>>> On Tue, Nov 26, 2019 at 05:50:32PM +0100, Jan Beulich wrote:
>>>> On 26.11.2019 14:26, Roger Pau Monne wrote:
>>>>> --- a/xen/arch/x86/hvm/irq.c
>>>>> +++ b/xen/arch/x86/hvm/irq.c
>>>>> @@ -515,7 +515,11 @@ void hvm_set_callback_via(struct domain *d, uint64_t via)
>>>>>  struct hvm_intack hvm_vcpu_has_pending_irq(struct vcpu *v)
>>>>>  {
>>>>>      struct hvm_domain *plat = &v->domain->arch.hvm;
>>>>> -    int vector;
>>>>> +    /*
>>>>> +     * Always call vlapic_has_pending_irq so that PIR is synced into IRR when
>>>>> +     * using posted interrupts.
>>>>> +     */
>>>>> +    int vector = vlapic_has_pending_irq(v);
>>>>
>>>> Did you consider doing this conditionally either here ...
>>>>
>>>>> @@ -530,7 +534,6 @@ struct hvm_intack hvm_vcpu_has_pending_irq(struct vcpu *v)
>>>>>      if ( vlapic_accept_pic_intr(v) && plat->vpic[0].int_output )
>>>>>          return hvm_intack_pic(0);
>>>>>  
>>>>> -    vector = vlapic_has_pending_irq(v);
>>>>>      if ( vector != -1 )
>>>>>          return hvm_intack_lapic(vector);
>>>>
>>>> ... or here?
>>>
>>> I'm afraid I don't follow. The whole point of this change is to ensure
>>> vlapic_has_pending_irq is unconditionally called in
>>> hvm_vcpu_has_pending_irq, so I'm not sure what you mean by "doing this
>>> conditionally...".
>>
>> Do it early when using interrupt posting, and keep it in its
>> current place otherwise.
>>
>>>> I ask not only because the function isn't exactly
>>>> cheap to call (as iirc you did also mention during the v2
>>>> discussion), but also because of its interaction with Viridian
>>>> and nested mode. In case of problems there, avoiding the use
>>>> of interrupt posting would be a workaround in such cases then.
>>>
>>> Would you like me to export something like vlapic_sync_pir_to_irr and
>>> call it unconditionally instead of calling vlapic_has_pending_irq?
>>
>> This looks to be another option, yes. Albeit instead of making
>> non-static (which I assume is what you mean by "export"), maybe
>> simply make this a static inline in vlapic.h then.
> 
> Yes, that would work and IMO is better than moving the call to
> vlapic_has_pending_irq around. Are you OK with this approach?

I think so, yes.

Jan


Thread overview: 18+ messages
2019-11-26 13:26 [Xen-devel] [PATCH for-4.13 v3 0/2] x86/vmx: posted interrupt fixes Roger Pau Monne
2019-11-26 13:26 ` [Xen-devel] [PATCH for-4.13 v3 1/2] x86/vmx: add ASSERT to prevent syncing PIR to IRR Roger Pau Monne
2019-11-26 16:32   ` Jan Beulich
2019-11-26 16:47     ` Roger Pau Monné
2019-11-26 16:58       ` Jan Beulich
2019-11-27  3:09         ` Tian, Kevin
2019-11-27 11:18           ` Roger Pau Monné
2019-11-26 13:26 ` [Xen-devel] [PATCH for-4.13 v3 2/2] x86/vmx: always sync PIR to IRR before vmentry Roger Pau Monne
2019-11-26 16:33   ` Joe Jin
2019-11-26 16:50   ` Jan Beulich
2019-11-26 18:02     ` Roger Pau Monné
2019-11-27  9:52       ` Jan Beulich
2019-11-27 11:22     ` Roger Pau Monné
2019-11-27 11:30       ` Jan Beulich
2019-11-27 11:56         ` Roger Pau Monné
2019-11-27 15:39           ` Jan Beulich
2019-11-27  2:23   ` Tian, Kevin
2019-11-26 16:35 ` [Xen-devel] [PATCH for-4.13 v3 0/2] x86/vmx: posted interrupt fixes Roger Pau Monné
