Xen-Devel Archive on lore.kernel.org
 help / color / Atom feed
* [Xen-devel] [PATCH 0/3] x86/nvmx: attempt to fix interrupt injection
@ 2020-03-20 19:07 Roger Pau Monne
  2020-03-20 19:07 ` [Xen-devel] [PATCH 1/3] Revert "x86/vvmx: fix virtual interrupt injection when Ack on exit control is used" Roger Pau Monne
                   ` (2 more replies)
  0 siblings, 3 replies; 22+ messages in thread
From: Roger Pau Monne @ 2020-03-20 19:07 UTC (permalink / raw)
  To: xen-devel
  Cc: Kevin Tian, Jun Nakajima, Wei Liu, Andrew Cooper, Jan Beulich,
	Roger Pau Monne

Hello,

This is a fixup of a wrong previous bugfix. It has been tested on the
debina0 osstest host and it fixes the interrupt injection issues seen
there when running the nested HVM tests. I mention this explicitly
because albeit I don't expect it I don't discard it might cause issues
on other boxes with a different set of VMX features.

I've tried to make the patches as small and contained as possible in
order to avoid breaking other stuff.

Roger Pau Monne (3):
  Revert "x86/vvmx: fix virtual interrupt injection when Ack on exit
    control is used"
  x86/nvmx: clarify and fix usage of nvmx_update_apicv
  x86/nvmx: update exit bitmap on vmexit

 xen/arch/x86/hvm/vmx/intr.c       | 21 +++++++++++++--------
 xen/arch/x86/hvm/vmx/vvmx.c       | 19 ++++++++++++-------
 xen/include/asm-x86/hvm/vmx/vmx.h |  2 ++
 3 files changed, 27 insertions(+), 15 deletions(-)

-- 
2.25.0


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [Xen-devel] [PATCH 1/3] Revert "x86/vvmx: fix virtual interrupt injection when Ack on exit control is used"
  2020-03-20 19:07 [Xen-devel] [PATCH 0/3] x86/nvmx: attempt to fix interrupt injection Roger Pau Monne
@ 2020-03-20 19:07 ` Roger Pau Monne
  2020-03-23  8:09   ` Jan Beulich
  2020-03-20 19:07 ` [Xen-devel] [PATCH 2/3] x86/nvmx: clarify and fix usage of nvmx_update_apicv Roger Pau Monne
  2020-03-20 19:07 ` [Xen-devel] [PATCH 3/3] x86/nvmx: update exit bitmap on vmexit Roger Pau Monne
  2 siblings, 1 reply; 22+ messages in thread
From: Roger Pau Monne @ 2020-03-20 19:07 UTC (permalink / raw)
  To: xen-devel
  Cc: Kevin Tian, Jun Nakajima, Wei Liu, Andrew Cooper, Jan Beulich,
	Roger Pau Monne

This reverts commit f96e1469ad06b61796c60193daaeb9f8a96d7458.

The commit is wrong, as the whole point of nvmx_update_apicv is to
update the guest interrupt status field when the Ack on exit VMEXIT
control feature is enabled.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
---
 xen/arch/x86/hvm/vmx/vvmx.c | 7 +------
 1 file changed, 1 insertion(+), 6 deletions(-)

diff --git a/xen/arch/x86/hvm/vmx/vvmx.c b/xen/arch/x86/hvm/vmx/vvmx.c
index f049920196..1b8461ba30 100644
--- a/xen/arch/x86/hvm/vmx/vvmx.c
+++ b/xen/arch/x86/hvm/vmx/vvmx.c
@@ -1456,12 +1456,7 @@ static void virtual_vmexit(struct cpu_user_regs *regs)
     /* updating host cr0 to sync TS bit */
     __vmwrite(HOST_CR0, v->arch.hvm.vmx.host_cr0);
 
-    if ( cpu_has_vmx_virtual_intr_delivery &&
-         /*
-          * Only inject the vector if the Ack on exit bit is not set, else the
-          * interrupt will be signaled in the vmcs VM_EXIT_INTR_INFO field.
-          */
-         !(get_vvmcs(v, VM_EXIT_CONTROLS) & VM_EXIT_ACK_INTR_ON_EXIT) )
+    if ( cpu_has_vmx_virtual_intr_delivery )
         nvmx_update_apicv(v);
 
     nvcpu->nv_vmswitch_in_progress = 0;
-- 
2.25.0


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [Xen-devel] [PATCH 2/3] x86/nvmx: clarify and fix usage of nvmx_update_apicv
  2020-03-20 19:07 [Xen-devel] [PATCH 0/3] x86/nvmx: attempt to fix interrupt injection Roger Pau Monne
  2020-03-20 19:07 ` [Xen-devel] [PATCH 1/3] Revert "x86/vvmx: fix virtual interrupt injection when Ack on exit control is used" Roger Pau Monne
@ 2020-03-20 19:07 ` Roger Pau Monne
  2020-03-24  6:03   ` Tian, Kevin
  2020-03-20 19:07 ` [Xen-devel] [PATCH 3/3] x86/nvmx: update exit bitmap on vmexit Roger Pau Monne
  2 siblings, 1 reply; 22+ messages in thread
From: Roger Pau Monne @ 2020-03-20 19:07 UTC (permalink / raw)
  To: xen-devel
  Cc: Kevin Tian, Jun Nakajima, Wei Liu, Andrew Cooper, Jan Beulich,
	Roger Pau Monne

The current usage of nvmx_update_apicv is not clear: it is deeply
intertwined with the Ack interrupt on exit VMEXIT control.

The code in nvmx_update_apicv should update the SVI (in service interrupt)
field of the guest interrupt status only when the Ack interrupt on
exit is set, as it is used to record that the interrupt being
serviced is signaled in a vmcs field, and hence hasn't been injected
as on native. It's important to record the current in service
interrupt on the guest interrupt status field, or else further
interrupts won't respect the priority of the in service one.

While clarifying the usage make sure that the SVI is only updated when
Ack on exit is set and the nested vmcs interrupt info field is valid. Or
else a guest not using the Ack on exit feature would loose interrupts as
they would be signaled as being in service on the guest interrupt
status field but won't actually be recorded on the interrupt info vmcs
field, neither injected in any other way.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
---
 xen/arch/x86/hvm/vmx/vvmx.c | 11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/xen/arch/x86/hvm/vmx/vvmx.c b/xen/arch/x86/hvm/vmx/vvmx.c
index 1b8461ba30..180d01e385 100644
--- a/xen/arch/x86/hvm/vmx/vvmx.c
+++ b/xen/arch/x86/hvm/vmx/vvmx.c
@@ -1383,7 +1383,7 @@ static void nvmx_update_apicv(struct vcpu *v)
 {
     struct nestedvmx *nvmx = &vcpu_2_nvmx(v);
     unsigned long reason = get_vvmcs(v, VM_EXIT_REASON);
-    uint32_t intr_info = nvmx->intr.intr_info;
+    unsigned long intr_info = get_vvmcs(v, VM_EXIT_INTR_INFO);
 
     if ( reason == EXIT_REASON_EXTERNAL_INTERRUPT &&
          nvmx->intr.source == hvm_intsrc_lapic &&
@@ -1399,6 +1399,15 @@ static void nvmx_update_apicv(struct vcpu *v)
         ppr = vlapic_set_ppr(vlapic);
         WARN_ON((ppr & 0xf0) != (vector & 0xf0));
 
+        /*
+         * SVI must be updated when the interrupt has been signaled using the
+         * Ack on exit feature, or else the currently in-service interrupt
+         * won't be respected.
+         *
+         * Note that this is specific to the fact that when doing a VMEXIT an
+         * interrupt might get delivered using the interrupt info vmcs field
+         * instead of being injected normally.
+         */
         status = vector << VMX_GUEST_INTR_STATUS_SVI_OFFSET;
         rvi = vlapic_has_pending_irq(v);
         if ( rvi != -1 )
-- 
2.25.0


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 22+ messages in thread

* [Xen-devel] [PATCH 3/3] x86/nvmx: update exit bitmap on vmexit
  2020-03-20 19:07 [Xen-devel] [PATCH 0/3] x86/nvmx: attempt to fix interrupt injection Roger Pau Monne
  2020-03-20 19:07 ` [Xen-devel] [PATCH 1/3] Revert "x86/vvmx: fix virtual interrupt injection when Ack on exit control is used" Roger Pau Monne
  2020-03-20 19:07 ` [Xen-devel] [PATCH 2/3] x86/nvmx: clarify and fix usage of nvmx_update_apicv Roger Pau Monne
@ 2020-03-20 19:07 ` Roger Pau Monne
  2020-03-24  6:22   ` Tian, Kevin
  2 siblings, 1 reply; 22+ messages in thread
From: Roger Pau Monne @ 2020-03-20 19:07 UTC (permalink / raw)
  To: xen-devel
  Cc: Kevin Tian, Jun Nakajima, Wei Liu, Andrew Cooper, Jan Beulich,
	Roger Pau Monne

Current code in nvmx_update_apicv set the guest interrupt status field
but doesn't update the exit bitmap, which can cause issues of lost
interrupts on the L1 hypervisor if vmx_intr_assist gets
short-circuited by nvmx_intr_intercept returning true.

Extract the code to update the exit bitmap from vmx_intr_assist into a
helper and use it in nvmx_update_apicv when updating the guest
interrupt status field.

Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
---
 xen/arch/x86/hvm/vmx/intr.c       | 21 +++++++++++++--------
 xen/arch/x86/hvm/vmx/vvmx.c       |  1 +
 xen/include/asm-x86/hvm/vmx/vmx.h |  2 ++
 3 files changed, 16 insertions(+), 8 deletions(-)

diff --git a/xen/arch/x86/hvm/vmx/intr.c b/xen/arch/x86/hvm/vmx/intr.c
index 49a1295f09..000e14af49 100644
--- a/xen/arch/x86/hvm/vmx/intr.c
+++ b/xen/arch/x86/hvm/vmx/intr.c
@@ -224,6 +224,18 @@ static int nvmx_intr_intercept(struct vcpu *v, struct hvm_intack intack)
     return 0;
 }
 
+void vmx_sync_exit_bitmap(struct vcpu *v)
+{
+    const unsigned int n = ARRAY_SIZE(v->arch.hvm.vmx.eoi_exit_bitmap);
+    unsigned int i;
+
+    while ( (i = find_first_bit(&v->arch.hvm.vmx.eoi_exitmap_changed, n)) < n )
+    {
+        clear_bit(i, &v->arch.hvm.vmx.eoi_exitmap_changed);
+        __vmwrite(EOI_EXIT_BITMAP(i), v->arch.hvm.vmx.eoi_exit_bitmap[i]);
+    }
+}
+
 void vmx_intr_assist(void)
 {
     struct hvm_intack intack;
@@ -318,7 +330,6 @@ void vmx_intr_assist(void)
               intack.source != hvm_intsrc_vector )
     {
         unsigned long status;
-        unsigned int i, n;
 
        /*
         * intack.vector is the highest priority vector. So we set eoi_exit_bitmap
@@ -379,13 +390,7 @@ void vmx_intr_assist(void)
                     intack.vector;
         __vmwrite(GUEST_INTR_STATUS, status);
 
-        n = ARRAY_SIZE(v->arch.hvm.vmx.eoi_exit_bitmap);
-        while ( (i = find_first_bit(&v->arch.hvm.vmx.eoi_exitmap_changed,
-                                    n)) < n )
-        {
-            clear_bit(i, &v->arch.hvm.vmx.eoi_exitmap_changed);
-            __vmwrite(EOI_EXIT_BITMAP(i), v->arch.hvm.vmx.eoi_exit_bitmap[i]);
-        }
+        vmx_sync_exit_bitmap(v);
 
         pt_intr_post(v, intack);
     }
diff --git a/xen/arch/x86/hvm/vmx/vvmx.c b/xen/arch/x86/hvm/vmx/vvmx.c
index 180d01e385..e041ecc115 100644
--- a/xen/arch/x86/hvm/vmx/vvmx.c
+++ b/xen/arch/x86/hvm/vmx/vvmx.c
@@ -1414,6 +1414,7 @@ static void nvmx_update_apicv(struct vcpu *v)
             status |= rvi & VMX_GUEST_INTR_STATUS_SUBFIELD_BITMASK;
 
         __vmwrite(GUEST_INTR_STATUS, status);
+        vmx_sync_exit_bitmap(v);
     }
 }
 
diff --git a/xen/include/asm-x86/hvm/vmx/vmx.h b/xen/include/asm-x86/hvm/vmx/vmx.h
index b334e1ec94..111ccd7e61 100644
--- a/xen/include/asm-x86/hvm/vmx/vmx.h
+++ b/xen/include/asm-x86/hvm/vmx/vmx.h
@@ -610,6 +610,8 @@ void update_guest_eip(void);
 void vmx_pi_per_cpu_init(unsigned int cpu);
 void vmx_pi_desc_fixup(unsigned int cpu);
 
+void vmx_sync_exit_bitmap(struct vcpu *v);
+
 #ifdef CONFIG_HVM
 void vmx_pi_hooks_assign(struct domain *d);
 void vmx_pi_hooks_deassign(struct domain *d);
-- 
2.25.0


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [Xen-devel] [PATCH 1/3] Revert "x86/vvmx: fix virtual interrupt injection when Ack on exit control is used"
  2020-03-20 19:07 ` [Xen-devel] [PATCH 1/3] Revert "x86/vvmx: fix virtual interrupt injection when Ack on exit control is used" Roger Pau Monne
@ 2020-03-23  8:09   ` Jan Beulich
  2020-03-23 14:48     ` Roger Pau Monné
  0 siblings, 1 reply; 22+ messages in thread
From: Jan Beulich @ 2020-03-23  8:09 UTC (permalink / raw)
  To: Roger Pau Monne
  Cc: xen-devel, Kevin Tian, Wei Liu, Jun Nakajima, Andrew Cooper

On 20.03.2020 20:07, Roger Pau Monne wrote:
> This reverts commit f96e1469ad06b61796c60193daaeb9f8a96d7458.
> 
> The commit is wrong, as the whole point of nvmx_update_apicv is to
> update the guest interrupt status field when the Ack on exit VMEXIT
> control feature is enabled.
> 
> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>

Before anyone gets to look at the other two patches, should this
be thrown in right away?

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [Xen-devel] [PATCH 1/3] Revert "x86/vvmx: fix virtual interrupt injection when Ack on exit control is used"
  2020-03-23  8:09   ` Jan Beulich
@ 2020-03-23 14:48     ` Roger Pau Monné
  2020-03-24  5:41       ` Tian, Kevin
  0 siblings, 1 reply; 22+ messages in thread
From: Roger Pau Monné @ 2020-03-23 14:48 UTC (permalink / raw)
  To: Jan Beulich; +Cc: xen-devel, Kevin Tian, Wei Liu, Jun Nakajima, Andrew Cooper

On Mon, Mar 23, 2020 at 09:09:59AM +0100, Jan Beulich wrote:
> On 20.03.2020 20:07, Roger Pau Monne wrote:
> > This reverts commit f96e1469ad06b61796c60193daaeb9f8a96d7458.
> > 
> > The commit is wrong, as the whole point of nvmx_update_apicv is to
> > update the guest interrupt status field when the Ack on exit VMEXIT
> > control feature is enabled.
> > 
> > Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
> 
> Before anyone gets to look at the other two patches, should this
> be thrown in right away?

I would like if possible get a confirmation from Kevin (or anyone
else) that my understanding is correct. I find the nested code very
confusing, and I've already made a mistake while trying to fix it.
That being said, this was spotted by osstest as introducing a
regression, so I guess it's safe to just toss it in now.

FWIW patch 2/3 attempts to provide a description of my understanding
of how nvmx_update_apicv works.

Thanks, Roger.

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [Xen-devel] [PATCH 1/3] Revert "x86/vvmx: fix virtual interrupt injection when Ack on exit control is used"
  2020-03-23 14:48     ` Roger Pau Monné
@ 2020-03-24  5:41       ` Tian, Kevin
  2020-03-24  8:10         ` Jan Beulich
  0 siblings, 1 reply; 22+ messages in thread
From: Tian, Kevin @ 2020-03-24  5:41 UTC (permalink / raw)
  To: Roger Pau Monné, Jan Beulich
  Cc: xen-devel, Wei Liu, Nakajima, Jun, Andrew Cooper

> From: Roger Pau Monné <roger.pau@citrix.com>
> Sent: Monday, March 23, 2020 10:49 PM
> 
> On Mon, Mar 23, 2020 at 09:09:59AM +0100, Jan Beulich wrote:
> > On 20.03.2020 20:07, Roger Pau Monne wrote:
> > > This reverts commit f96e1469ad06b61796c60193daaeb9f8a96d7458.
> > >
> > > The commit is wrong, as the whole point of nvmx_update_apicv is to
> > > update the guest interrupt status field when the Ack on exit VMEXIT
> > > control feature is enabled.
> > >
> > > Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
> >
> > Before anyone gets to look at the other two patches, should this
> > be thrown in right away?
> 
> I would like if possible get a confirmation from Kevin (or anyone
> else) that my understanding is correct. I find the nested code very
> confusing, and I've already made a mistake while trying to fix it.
> That being said, this was spotted by osstest as introducing a
> regression, so I guess it's safe to just toss it in now.
> 
> FWIW patch 2/3 attempts to provide a description of my understanding
> of how nvmx_update_apicv works.
> 

I feel it is not good to take this patch alone, as it was introduced to fix
another problem. W/o understanding whether the whole series can
fix both old and new problems, we may risk putting nested interrupt
logic in an even worse state...

Thanks
Kevin

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [Xen-devel] [PATCH 2/3] x86/nvmx: clarify and fix usage of nvmx_update_apicv
  2020-03-20 19:07 ` [Xen-devel] [PATCH 2/3] x86/nvmx: clarify and fix usage of nvmx_update_apicv Roger Pau Monne
@ 2020-03-24  6:03   ` Tian, Kevin
  2020-03-24  9:50     ` Roger Pau Monné
  0 siblings, 1 reply; 22+ messages in thread
From: Tian, Kevin @ 2020-03-24  6:03 UTC (permalink / raw)
  To: Roger Pau Monne, xen-devel
  Cc: Andrew Cooper, Wei Liu, Jan Beulich, Nakajima, Jun

> From: Roger Pau Monne <roger.pau@citrix.com>
> Sent: Saturday, March 21, 2020 3:08 AM
> 
> The current usage of nvmx_update_apicv is not clear: it is deeply
> intertwined with the Ack interrupt on exit VMEXIT control.
> 
> The code in nvmx_update_apicv should update the SVI (in service interrupt)
> field of the guest interrupt status only when the Ack interrupt on
> exit is set, as it is used to record that the interrupt being
> serviced is signaled in a vmcs field, and hence hasn't been injected
> as on native. It's important to record the current in service
> interrupt on the guest interrupt status field, or else further
> interrupts won't respect the priority of the in service one.
> 
> While clarifying the usage make sure that the SVI is only updated when
> Ack on exit is set and the nested vmcs interrupt info field is valid. Or
> else a guest not using the Ack on exit feature would loose interrupts as
> they would be signaled as being in service on the guest interrupt
> status field but won't actually be recorded on the interrupt info vmcs
> field, neither injected in any other way.

It is insufficient. You also need to update RVI to enable virtual injection
when Ack on exit is cleared.

> 
> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
> ---
>  xen/arch/x86/hvm/vmx/vvmx.c | 11 ++++++++++-
>  1 file changed, 10 insertions(+), 1 deletion(-)
> 
> diff --git a/xen/arch/x86/hvm/vmx/vvmx.c b/xen/arch/x86/hvm/vmx/vvmx.c
> index 1b8461ba30..180d01e385 100644
> --- a/xen/arch/x86/hvm/vmx/vvmx.c
> +++ b/xen/arch/x86/hvm/vmx/vvmx.c
> @@ -1383,7 +1383,7 @@ static void nvmx_update_apicv(struct vcpu *v)
>  {
>      struct nestedvmx *nvmx = &vcpu_2_nvmx(v);
>      unsigned long reason = get_vvmcs(v, VM_EXIT_REASON);
> -    uint32_t intr_info = nvmx->intr.intr_info;
> +    unsigned long intr_info = get_vvmcs(v, VM_EXIT_INTR_INFO);

well, above reminds me an interesting question. Combining last and this
patch, we'd see essentially that it goes back to the state before f96e1469
(at least when Ack on exit is true). iirc, that commit was introduced to enable
nested x2apic with apicv, and your very first version even just removed
the whole nvmx_update_apicv. Then now with the new reverted logic,
are you still suffering x2apic problem? If not, does it imply the real fix
is actually coming from patch 3/3 for eoi bitmap update?

> 
>      if ( reason == EXIT_REASON_EXTERNAL_INTERRUPT &&
>           nvmx->intr.source == hvm_intsrc_lapic &&
> @@ -1399,6 +1399,15 @@ static void nvmx_update_apicv(struct vcpu *v)
>          ppr = vlapic_set_ppr(vlapic);
>          WARN_ON((ppr & 0xf0) != (vector & 0xf0));
> 
> +        /*
> +         * SVI must be updated when the interrupt has been signaled using the
> +         * Ack on exit feature, or else the currently in-service interrupt
> +         * won't be respected.
> +         *
> +         * Note that this is specific to the fact that when doing a VMEXIT an
> +         * interrupt might get delivered using the interrupt info vmcs field
> +         * instead of being injected normally.
> +         */

I'm not sure mentioning SVI specifically here is necessary, since all steps
here are required - updating iir, isr, rvi, svi, ppr, etc. It is just an emulation
of updating virtual APIC state as if a virtual interrupt delivery happens.

>          status = vector << VMX_GUEST_INTR_STATUS_SVI_OFFSET;
>          rvi = vlapic_has_pending_irq(v);
>          if ( rvi != -1 )
> --
> 2.25.0


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [Xen-devel] [PATCH 3/3] x86/nvmx: update exit bitmap on vmexit
  2020-03-20 19:07 ` [Xen-devel] [PATCH 3/3] x86/nvmx: update exit bitmap on vmexit Roger Pau Monne
@ 2020-03-24  6:22   ` Tian, Kevin
  2020-03-24  9:59     ` Roger Pau Monné
  0 siblings, 1 reply; 22+ messages in thread
From: Tian, Kevin @ 2020-03-24  6:22 UTC (permalink / raw)
  To: Roger Pau Monne, xen-devel
  Cc: Andrew Cooper, Wei Liu, Jan Beulich, Nakajima, Jun

> From: Roger Pau Monne <roger.pau@citrix.com>
> Sent: Saturday, March 21, 2020 3:08 AM
> 
> Current code in nvmx_update_apicv set the guest interrupt status field
> but doesn't update the exit bitmap, which can cause issues of lost
> interrupts on the L1 hypervisor if vmx_intr_assist gets
> short-circuited by nvmx_intr_intercept returning true.

Above is not accurate. Currently Xen didn't choose to update the EOI
exit bitmap every time when there is a change. Instead, it chose to 
batch the update before resuming to the guest. sort of optimization.
So it is not related to whether SVI is changed. We should always do the 
bitmap update in nvmx_update_apicv, regardless of the setting of
Ack-on-exit ...

Thanks
Kevin

> 
> Extract the code to update the exit bitmap from vmx_intr_assist into a
> helper and use it in nvmx_update_apicv when updating the guest
> interrupt status field.
> 
> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
> ---
>  xen/arch/x86/hvm/vmx/intr.c       | 21 +++++++++++++--------
>  xen/arch/x86/hvm/vmx/vvmx.c       |  1 +
>  xen/include/asm-x86/hvm/vmx/vmx.h |  2 ++
>  3 files changed, 16 insertions(+), 8 deletions(-)
> 
> diff --git a/xen/arch/x86/hvm/vmx/intr.c b/xen/arch/x86/hvm/vmx/intr.c
> index 49a1295f09..000e14af49 100644
> --- a/xen/arch/x86/hvm/vmx/intr.c
> +++ b/xen/arch/x86/hvm/vmx/intr.c
> @@ -224,6 +224,18 @@ static int nvmx_intr_intercept(struct vcpu *v, struct
> hvm_intack intack)
>      return 0;
>  }
> 
> +void vmx_sync_exit_bitmap(struct vcpu *v)
> +{
> +    const unsigned int n = ARRAY_SIZE(v->arch.hvm.vmx.eoi_exit_bitmap);
> +    unsigned int i;
> +
> +    while ( (i = find_first_bit(&v->arch.hvm.vmx.eoi_exitmap_changed, n)) <
> n )
> +    {
> +        clear_bit(i, &v->arch.hvm.vmx.eoi_exitmap_changed);
> +        __vmwrite(EOI_EXIT_BITMAP(i), v->arch.hvm.vmx.eoi_exit_bitmap[i]);
> +    }
> +}
> +
>  void vmx_intr_assist(void)
>  {
>      struct hvm_intack intack;
> @@ -318,7 +330,6 @@ void vmx_intr_assist(void)
>                intack.source != hvm_intsrc_vector )
>      {
>          unsigned long status;
> -        unsigned int i, n;
> 
>         /*
>          * intack.vector is the highest priority vector. So we set eoi_exit_bitmap
> @@ -379,13 +390,7 @@ void vmx_intr_assist(void)
>                      intack.vector;
>          __vmwrite(GUEST_INTR_STATUS, status);
> 
> -        n = ARRAY_SIZE(v->arch.hvm.vmx.eoi_exit_bitmap);
> -        while ( (i = find_first_bit(&v->arch.hvm.vmx.eoi_exitmap_changed,
> -                                    n)) < n )
> -        {
> -            clear_bit(i, &v->arch.hvm.vmx.eoi_exitmap_changed);
> -            __vmwrite(EOI_EXIT_BITMAP(i), v->arch.hvm.vmx.eoi_exit_bitmap[i]);
> -        }
> +        vmx_sync_exit_bitmap(v);
> 
>          pt_intr_post(v, intack);
>      }
> diff --git a/xen/arch/x86/hvm/vmx/vvmx.c b/xen/arch/x86/hvm/vmx/vvmx.c
> index 180d01e385..e041ecc115 100644
> --- a/xen/arch/x86/hvm/vmx/vvmx.c
> +++ b/xen/arch/x86/hvm/vmx/vvmx.c
> @@ -1414,6 +1414,7 @@ static void nvmx_update_apicv(struct vcpu *v)
>              status |= rvi & VMX_GUEST_INTR_STATUS_SUBFIELD_BITMASK;
> 
>          __vmwrite(GUEST_INTR_STATUS, status);
> +        vmx_sync_exit_bitmap(v);
>      }
>  }
> 
> diff --git a/xen/include/asm-x86/hvm/vmx/vmx.h b/xen/include/asm-
> x86/hvm/vmx/vmx.h
> index b334e1ec94..111ccd7e61 100644
> --- a/xen/include/asm-x86/hvm/vmx/vmx.h
> +++ b/xen/include/asm-x86/hvm/vmx/vmx.h
> @@ -610,6 +610,8 @@ void update_guest_eip(void);
>  void vmx_pi_per_cpu_init(unsigned int cpu);
>  void vmx_pi_desc_fixup(unsigned int cpu);
> 
> +void vmx_sync_exit_bitmap(struct vcpu *v);
> +
>  #ifdef CONFIG_HVM
>  void vmx_pi_hooks_assign(struct domain *d);
>  void vmx_pi_hooks_deassign(struct domain *d);
> --
> 2.25.0


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [Xen-devel] [PATCH 1/3] Revert "x86/vvmx: fix virtual interrupt injection when Ack on exit control is used"
  2020-03-24  5:41       ` Tian, Kevin
@ 2020-03-24  8:10         ` Jan Beulich
  2020-03-24  8:49           ` Tian, Kevin
  0 siblings, 1 reply; 22+ messages in thread
From: Jan Beulich @ 2020-03-24  8:10 UTC (permalink / raw)
  To: Tian, Kevin
  Cc: xen-devel, Andrew Cooper, Wei Liu, Nakajima, Jun, Roger Pau Monné

On 24.03.2020 06:41, Tian, Kevin wrote:
>> From: Roger Pau Monné <roger.pau@citrix.com>
>> Sent: Monday, March 23, 2020 10:49 PM
>>
>> On Mon, Mar 23, 2020 at 09:09:59AM +0100, Jan Beulich wrote:
>>> On 20.03.2020 20:07, Roger Pau Monne wrote:
>>>> This reverts commit f96e1469ad06b61796c60193daaeb9f8a96d7458.
>>>>
>>>> The commit is wrong, as the whole point of nvmx_update_apicv is to
>>>> update the guest interrupt status field when the Ack on exit VMEXIT
>>>> control feature is enabled.
>>>>
>>>> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
>>>
>>> Before anyone gets to look at the other two patches, should this
>>> be thrown in right away?
>>
>> I would like if possible get a confirmation from Kevin (or anyone
>> else) that my understanding is correct. I find the nested code very
>> confusing, and I've already made a mistake while trying to fix it.
>> That being said, this was spotted by osstest as introducing a
>> regression, so I guess it's safe to just toss it in now.
>>
>> FWIW patch 2/3 attempts to provide a description of my understanding
>> of how nvmx_update_apicv works.
>>
> 
> I feel it is not good to take this patch alone, as it was introduced to fix
> another problem. W/o understanding whether the whole series can
> fix both old and new problems, we may risk putting nested interrupt
> logic in an even worse state...

Well, okay, I'll wait then, but it would seem to me that reverting
wouldn't put us in a worse state than we were in before that change
was put in.

Jan


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [Xen-devel] [PATCH 1/3] Revert "x86/vvmx: fix virtual interrupt injection when Ack on exit control is used"
  2020-03-24  8:10         ` Jan Beulich
@ 2020-03-24  8:49           ` Tian, Kevin
  2020-03-24 10:47             ` Roger Pau Monné
  0 siblings, 1 reply; 22+ messages in thread
From: Tian, Kevin @ 2020-03-24  8:49 UTC (permalink / raw)
  To: Jan Beulich
  Cc: xen-devel, Andrew Cooper, Wei Liu, Nakajima, Jun, Roger Pau Monné

> From: Jan Beulich <jbeulich@suse.com>
> Sent: Tuesday, March 24, 2020 4:10 PM
> 
> On 24.03.2020 06:41, Tian, Kevin wrote:
> >> From: Roger Pau Monné <roger.pau@citrix.com>
> >> Sent: Monday, March 23, 2020 10:49 PM
> >>
> >> On Mon, Mar 23, 2020 at 09:09:59AM +0100, Jan Beulich wrote:
> >>> On 20.03.2020 20:07, Roger Pau Monne wrote:
> >>>> This reverts commit f96e1469ad06b61796c60193daaeb9f8a96d7458.
> >>>>
> >>>> The commit is wrong, as the whole point of nvmx_update_apicv is to
> >>>> update the guest interrupt status field when the Ack on exit VMEXIT
> >>>> control feature is enabled.
> >>>>
> >>>> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
> >>>
> >>> Before anyone gets to look at the other two patches, should this
> >>> be thrown in right away?
> >>
> >> I would like if possible get a confirmation from Kevin (or anyone
> >> else) that my understanding is correct. I find the nested code very
> >> confusing, and I've already made a mistake while trying to fix it.
> >> That being said, this was spotted by osstest as introducing a
> >> regression, so I guess it's safe to just toss it in now.
> >>
> >> FWIW patch 2/3 attempts to provide a description of my understanding
> >> of how nvmx_update_apicv works.
> >>
> >
> > I feel it is not good to take this patch alone, as it was introduced to fix
> > another problem. W/o understanding whether the whole series can
> > fix both old and new problems, we may risk putting nested interrupt
> > logic in an even worse state...
> 
> Well, okay, I'll wait then, but it would seem to me that reverting
> wouldn't put us in a worse state than we were in before that change
> was put in.

Roger needs to make the call, i.e. which problem is more severe, old or
new one.

Thanks
Kevin

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [Xen-devel] [PATCH 2/3] x86/nvmx: clarify and fix usage of nvmx_update_apicv
  2020-03-24  6:03   ` Tian, Kevin
@ 2020-03-24  9:50     ` Roger Pau Monné
  2020-03-24 10:11       ` Tian, Kevin
  0 siblings, 1 reply; 22+ messages in thread
From: Roger Pau Monné @ 2020-03-24  9:50 UTC (permalink / raw)
  To: Tian, Kevin; +Cc: Nakajima, Jun, xen-devel, Jan Beulich, Wei Liu, Andrew Cooper

On Tue, Mar 24, 2020 at 06:03:28AM +0000, Tian, Kevin wrote:
> > From: Roger Pau Monne <roger.pau@citrix.com>
> > Sent: Saturday, March 21, 2020 3:08 AM
> > 
> > The current usage of nvmx_update_apicv is not clear: it is deeply
> > intertwined with the Ack interrupt on exit VMEXIT control.
> > 
> > The code in nvmx_update_apicv should update the SVI (in service interrupt)
> > field of the guest interrupt status only when the Ack interrupt on
> > exit is set, as it is used to record that the interrupt being
> > serviced is signaled in a vmcs field, and hence hasn't been injected
> > as on native. It's important to record the current in service
> > interrupt on the guest interrupt status field, or else further
> > interrupts won't respect the priority of the in service one.
> > 
> > While clarifying the usage make sure that the SVI is only updated when
> > Ack on exit is set and the nested vmcs interrupt info field is valid. Or
> > else a guest not using the Ack on exit feature would loose interrupts as
> > they would be signaled as being in service on the guest interrupt
> > status field but won't actually be recorded on the interrupt info vmcs
> > field, neither injected in any other way.
> 
> It is insufficient. You also need to update RVI to enable virtual injection
> when Ack on exit is cleared.

But RVI should be updated in vmx_intr_assist in that case, since
nvmx_intr_intercept shouldn't intercept the interrupt, as it should be
handled normally.

> > 
> > Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
> > ---
> >  xen/arch/x86/hvm/vmx/vvmx.c | 11 ++++++++++-
> >  1 file changed, 10 insertions(+), 1 deletion(-)
> > 
> > diff --git a/xen/arch/x86/hvm/vmx/vvmx.c b/xen/arch/x86/hvm/vmx/vvmx.c
> > index 1b8461ba30..180d01e385 100644
> > --- a/xen/arch/x86/hvm/vmx/vvmx.c
> > +++ b/xen/arch/x86/hvm/vmx/vvmx.c
> > @@ -1383,7 +1383,7 @@ static void (struct vcpu *v)
> >  {
> >      struct nestedvmx *nvmx = &vcpu_2_nvmx(v);
> >      unsigned long reason = get_vvmcs(v, VM_EXIT_REASON);
> > -    uint32_t intr_info = nvmx->intr.intr_info;
> > +    unsigned long intr_info = get_vvmcs(v, VM_EXIT_INTR_INFO);
> 
> well, above reminds me an interesting question. Combining last and this
> patch, we'd see essentially that it goes back to the state before f96e1469
> (at least when Ack on exit is true).

Well, patch 1/3 is a revert of f96e1469, so just reverting f96e1469
gets us to that state.

This patch is an attempt to clarify that nvmx_update_apicv is closely
related to the Ack on exit feature, as it modifies SVI in order to
signal the vector currently being serviced by the EXIT_INTR_INFO vmcs
field. This was not obvious to me, as at first sight I assumed
nvmx_update_apicv was actually injecting that vector into the guest.

> iirc, that commit was introduced to enable
> nested x2apic with apicv, and your very first version even just removed
> the whole nvmx_update_apicv. Then now with the new reverted logic,
> are you still suffering x2apic problem? If not, does it imply the real fix
> is actually coming from patch 3/3 for eoi bitmap update?

Yes, patch 3/3 is the one that fixed the issue. Note however that
strangely enough removing the call to nvmx_update_apicv (as my first
attempt to solve the issue) did fix it on one of my boxes. It depends
a lot on the available vmx features AFAICT.

> > 
> >      if ( reason == EXIT_REASON_EXTERNAL_INTERRUPT &&
> >           nvmx->intr.source == hvm_intsrc_lapic &&
> > @@ -1399,6 +1399,15 @@ static void nvmx_update_apicv(struct vcpu *v)
> >          ppr = vlapic_set_ppr(vlapic);
> >          WARN_ON((ppr & 0xf0) != (vector & 0xf0));
> > 
> > +        /*
> > +         * SVI must be updated when the interrupt has been signaled using the
> > +         * Ack on exit feature, or else the currently in-service interrupt
> > +         * won't be respected.
> > +         *
> > +         * Note that this is specific to the fact that when doing a VMEXIT an
> > +         * interrupt might get delivered using the interrupt info vmcs field
> > +         * instead of being injected normally.
> > +         */
> 
> I'm not sure mentioning SVI specifically here is necessary, since all steps
> here are required - updating iir, isr, rvi, svi, ppr, etc. It is just an emulation
> of updating virtual APIC state as if a virtual interrupt delivery happens.

Hm, it was hard for me to figure out that SVI is modified here in
order to signal that the Ack on exit vector is currently in service,
as opposed to being injected using the virtual interrupt delivery
mechanism.

I just wanted to clarify that the purpose of this function is not to
inject the vector in intr_info, but rather to signal that such vector
has already been injected using a different mechanism.

I'm happy to reword this, but IMO it should be clear that the purpose
of the function is not so much to inject an interrupt, but to update
the virtual interrupt delivery field in order to signal that an
interrupt has been injected using a different mechanism.

Thanks, Roger.


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [Xen-devel] [PATCH 3/3] x86/nvmx: update exit bitmap on vmexit
  2020-03-24  6:22   ` Tian, Kevin
@ 2020-03-24  9:59     ` Roger Pau Monné
  2020-03-24 10:16       ` Tian, Kevin
  0 siblings, 1 reply; 22+ messages in thread
From: Roger Pau Monné @ 2020-03-24  9:59 UTC (permalink / raw)
  To: Tian, Kevin; +Cc: xen-devel, Wei Liu, Jan Beulich, Nakajima, Jun, Andrew Cooper

On Tue, Mar 24, 2020 at 06:22:43AM +0000, Tian, Kevin wrote:
> > From: Roger Pau Monne <roger.pau@citrix.com>
> > Sent: Saturday, March 21, 2020 3:08 AM
> > 
> > Current code in nvmx_update_apicv set the guest interrupt status field
> > but doesn't update the exit bitmap, which can cause issues of lost
> > interrupts on the L1 hypervisor if vmx_intr_assist gets
> > short-circuited by nvmx_intr_intercept returning true.
> 
> Above is not accurate. Currently Xen didn't choose to update the EOI
> exit bitmap every time when there is a change. Instead, it chose to 
> batch the update before resuming to the guest. sort of optimization.
> So it is not related to whether SVI is changed. We should always do the 
> bitmap update in nvmx_update_apicv, regardless of the setting of
> Ack-on-exit ...

But if Ack on exit is not set the GUEST_INTR_STATUS won't be changed
by nvmx_intr_intercept, and hence there's no need to update the EOI
exit map?

If OTOH the GUEST_INTR_STATUS field is changed by vmx_intr_assist the
bitmap will already be updated there.

Thanks, Roger.


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [Xen-devel] [PATCH 2/3] x86/nvmx: clarify and fix usage of nvmx_update_apicv
  2020-03-24  9:50     ` Roger Pau Monné
@ 2020-03-24 10:11       ` Tian, Kevin
  2020-03-24 11:22         ` Roger Pau Monné
  0 siblings, 1 reply; 22+ messages in thread
From: Tian, Kevin @ 2020-03-24 10:11 UTC (permalink / raw)
  To: Roger Pau Monné
  Cc: Nakajima, Jun, xen-devel, Jan Beulich, Wei Liu, Andrew Cooper

> From: Roger Pau Monné <roger.pau@citrix.com>
> Sent: Tuesday, March 24, 2020 5:51 PM
> 
> On Tue, Mar 24, 2020 at 06:03:28AM +0000, Tian, Kevin wrote:
> > > From: Roger Pau Monne <roger.pau@citrix.com>
> > > Sent: Saturday, March 21, 2020 3:08 AM
> > >
> > > The current usage of nvmx_update_apicv is not clear: it is deeply
> > > intertwined with the Ack interrupt on exit VMEXIT control.
> > >
> > > The code in nvmx_update_apicv should update the SVI (in service
> interrupt)
> > > field of the guest interrupt status only when the Ack interrupt on
> > > exit is set, as it is used to record that the interrupt being
> > > serviced is signaled in a vmcs field, and hence hasn't been injected
> > > as on native. It's important to record the current in service
> > > interrupt on the guest interrupt status field, or else further
> > > interrupts won't respect the priority of the in service one.
> > >
> > > While clarifying the usage make sure that the SVI is only updated when
> > > Ack on exit is set and the nested vmcs interrupt info field is valid. Or
> > > else a guest not using the Ack on exit feature would loose interrupts as
> > > they would be signaled as being in service on the guest interrupt
> > > status field but won't actually be recorded on the interrupt info vmcs
> > > field, neither injected in any other way.
> >
> > It is insufficient. You also need to update RVI to enable virtual injection
> > when Ack on exit is cleared.
> 
> But RVI should be updated in vmx_intr_assist in that case, since
> nvmx_intr_intercept shouldn't intercept the interrupt, as it should be
> handled normally.

As we discussed before, vmx_intr_assist is invoked before nvmx_switch_guest.
It is incorrectly to update RVI at that time since it might be still vmcs02 being 
active (if no pending softirq to make it invoked again).

Also nvmx_intr_intercept does intercept Ack-on-exit=0 case:

        if ( intack.source == hvm_intsrc_pic ||
                 intack.source == hvm_intsrc_lapic )
        {
            vmx_inject_extint(intack.vector, intack.source);

            ctrl = get_vvmcs(v, VM_EXIT_CONTROLS);
            if ( ctrl & VM_EXIT_ACK_INTR_ON_EXIT )
            {
                /* for now, duplicate the ack path in vmx_intr_assist */
                hvm_vcpu_ack_pending_irq(v, intack);
                pt_intr_post(v, intack);

                intack = hvm_vcpu_has_pending_irq(v);
                if ( unlikely(intack.source != hvm_intsrc_none) )
                    vmx_enable_intr_window(v, intack);
            }
            else if ( !cpu_has_vmx_virtual_intr_delivery )
                vmx_enable_intr_window(v, intack);

            return 1; <<<<<<<<
        }

> 
> > >
> > > Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
> > > ---
> > >  xen/arch/x86/hvm/vmx/vvmx.c | 11 ++++++++++-
> > >  1 file changed, 10 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/xen/arch/x86/hvm/vmx/vvmx.c
> b/xen/arch/x86/hvm/vmx/vvmx.c
> > > index 1b8461ba30..180d01e385 100644
> > > --- a/xen/arch/x86/hvm/vmx/vvmx.c
> > > +++ b/xen/arch/x86/hvm/vmx/vvmx.c
> > > @@ -1383,7 +1383,7 @@ static void (struct vcpu *v)
> > >  {
> > >      struct nestedvmx *nvmx = &vcpu_2_nvmx(v);
> > >      unsigned long reason = get_vvmcs(v, VM_EXIT_REASON);
> > > -    uint32_t intr_info = nvmx->intr.intr_info;
> > > +    unsigned long intr_info = get_vvmcs(v, VM_EXIT_INTR_INFO);
> >
> > well, above reminds me an interesting question. Combining last and this
> > patch, we'd see essentially that it goes back to the state before f96e1469
> > (at least when Ack on exit is true).
> 
> Well, patch 1/3 is a revert of f96e1469, so just reverting f96e1469
> gets us to that state.

you are right. I just wanted to point out that this patch alone doesn't
do any real fix for ack-on-exit=1 case. It just makes sure that ack-on-exit=0
is skipped from that function. So it won't be the one fixing your previous
problem. 😊

> 
> This patch is an attempt to clarify that nvmx_update_apicv is closely
> related to the Ack on exit feature, as it modifies SVI in order to
> signal the vector currently being serviced by the EXIT_INTR_INFO vmcs
> field. This was not obvious to me, as at first sight I assumed
> nvmx_update_apicv was actually injecting that vector into the guest.
> 
> > iirc, that commit was introduced to enable
> > nested x2apic with apicv, and your very first version even just removed
> > the whole nvmx_update_apicv. Then now with the new reverted logic,
> > are you still suffering x2apic problem? If not, does it imply the real fix
> > is actually coming from patch 3/3 for eoi bitmap update?
> 
> Yes, patch 3/3 is the one that fixed the issue. Note however that
> strangely enough removing the call to nvmx_update_apicv (as my first
> attempt to solve the issue) did fix it on one of my boxes. It depends
> a lot on the available vmx features AFAICT.

Did you confirm that with 3/3 alone can fix that issue? Just want to make
sure the real gain of each patch, so we can reflect it in the commit msg
in updated version.

> 
> > >
> > >      if ( reason == EXIT_REASON_EXTERNAL_INTERRUPT &&
> > >           nvmx->intr.source == hvm_intsrc_lapic &&
> > > @@ -1399,6 +1399,15 @@ static void nvmx_update_apicv(struct vcpu *v)
> > >          ppr = vlapic_set_ppr(vlapic);
> > >          WARN_ON((ppr & 0xf0) != (vector & 0xf0));
> > >
> > > +        /*
> > > +         * SVI must be updated when the interrupt has been signaled using
> the
> > > +         * Ack on exit feature, or else the currently in-service interrupt
> > > +         * won't be respected.
> > > +         *
> > > +         * Note that this is specific to the fact that when doing a VMEXIT an
> > > +         * interrupt might get delivered using the interrupt info vmcs field
> > > +         * instead of being injected normally.
> > > +         */
> >
> > I'm not sure mentioning SVI specifically here is necessary, since all steps
> > here are required - updating iir, isr, rvi, svi, ppr, etc. It is just an emulation
> > of updating virtual APIC state as if a virtual interrupt delivery happens.
> 
> Hm, it was hard for me to figure out that SVI is modified here in
> order to signal that the Ack on exit vector is currently in service,
> as opposed to being injected using the virtual interrupt delivery
> mechanism.
> 
> I just wanted to clarify that the purpose of this function is not to
> inject the vector in intr_info, but rather to signal that such vector
> has already been injected using a different mechanism.
> 
> I'm happy to reword this, but IMO it should be clear that the purpose
> of the function is not so much to inject an interrupt, but to update
> the virtual interrupt delivery field in order to signal that an
> interrupt has been injected using a different mechanism.
> 

reading it again I feel possibly fine to put the comment there. But
I disagree the statement above. The purpose of this function is indeed 
for injecting an interrupt. Both RVI and SVI are additional requirements 
when virtual APIC page is in-use while virtual interrupt delivery is not 
used, i.e. when ack-on-exit is set.

Thanks
Kevin

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [Xen-devel] [PATCH 3/3] x86/nvmx: update exit bitmap on vmexit
  2020-03-24  9:59     ` Roger Pau Monné
@ 2020-03-24 10:16       ` Tian, Kevin
  2020-03-24 11:24         ` Roger Pau Monné
  0 siblings, 1 reply; 22+ messages in thread
From: Tian, Kevin @ 2020-03-24 10:16 UTC (permalink / raw)
  To: Roger Pau Monné
  Cc: xen-devel, Wei Liu, Jan Beulich, Nakajima, Jun, Andrew Cooper

> From: Roger Pau Monné <roger.pau@citrix.com>
> Sent: Tuesday, March 24, 2020 5:59 PM
> 
> On Tue, Mar 24, 2020 at 06:22:43AM +0000, Tian, Kevin wrote:
> > > From: Roger Pau Monne <roger.pau@citrix.com>
> > > Sent: Saturday, March 21, 2020 3:08 AM
> > >
> > > Current code in nvmx_update_apicv set the guest interrupt status field
> > > but doesn't update the exit bitmap, which can cause issues of lost
> > > interrupts on the L1 hypervisor if vmx_intr_assist gets
> > > short-circuited by nvmx_intr_intercept returning true.
> >
> > Above is not accurate. Currently Xen didn't choose to update the EOI
> > exit bitmap every time when there is a change. Instead, it chose to
> > batch the update before resuming to the guest. sort of optimization.
> > So it is not related to whether SVI is changed. We should always do the
> > bitmap update in nvmx_update_apicv, regardless of the setting of
> > Ack-on-exit ...
> 
> But if Ack on exit is not set the GUEST_INTR_STATUS won't be changed
> by nvmx_intr_intercept, and hence there's no need to update the EOI
> exit map?
> 
> If OTOH the GUEST_INTR_STATUS field is changed by vmx_intr_assist the
> bitmap will already be updated there.
> 

If you agree with my comment in patch 2/3 about setting RVI for 
ack-on-exit=0, then EOI bitmap update should be done there too.

Thanks
Kevin

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [Xen-devel] [PATCH 1/3] Revert "x86/vvmx: fix virtual interrupt injection when Ack on exit control is used"
  2020-03-24  8:49           ` Tian, Kevin
@ 2020-03-24 10:47             ` Roger Pau Monné
  2020-03-24 11:00               ` Tian, Kevin
  0 siblings, 1 reply; 22+ messages in thread
From: Roger Pau Monné @ 2020-03-24 10:47 UTC (permalink / raw)
  To: Tian, Kevin; +Cc: xen-devel, Wei Liu, Nakajima, Jun, Jan Beulich, Andrew Cooper

On Tue, Mar 24, 2020 at 08:49:27AM +0000, Tian, Kevin wrote:
> > From: Jan Beulich <jbeulich@suse.com>
> > Sent: Tuesday, March 24, 2020 4:10 PM
> > 
> > On 24.03.2020 06:41, Tian, Kevin wrote:
> > >> From: Roger Pau Monné <roger.pau@citrix.com>
> > >> Sent: Monday, March 23, 2020 10:49 PM
> > >>
> > >> On Mon, Mar 23, 2020 at 09:09:59AM +0100, Jan Beulich wrote:
> > >>> On 20.03.2020 20:07, Roger Pau Monne wrote:
> > >>>> This reverts commit f96e1469ad06b61796c60193daaeb9f8a96d7458.
> > >>>>
> > >>>> The commit is wrong, as the whole point of nvmx_update_apicv is to
> > >>>> update the guest interrupt status field when the Ack on exit VMEXIT
> > >>>> control feature is enabled.
> > >>>>
> > >>>> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
> > >>>
> > >>> Before anyone gets to look at the other two patches, should this
> > >>> be thrown in right away?
> > >>
> > >> I would like if possible get a confirmation from Kevin (or anyone
> > >> else) that my understanding is correct. I find the nested code very
> > >> confusing, and I've already made a mistake while trying to fix it.
> > >> That being said, this was spotted by osstest as introducing a
> > >> regression, so I guess it's safe to just toss it in now.
> > >>
> > >> FWIW patch 2/3 attempts to provide a description of my understanding
> > >> of how nvmx_update_apicv works.
> > >>
> > >
> > > I feel it is not good to take this patch alone, as it was introduced to fix
> > > another problem. W/o understanding whether the whole series can
> > > fix both old and new problems, we may risk putting nested interrupt
> > > logic in an even worse state...
> > 
> > Well, okay, I'll wait then, but it would seem to me that reverting
> > wouldn't put us in a worse state than we were in before that change
> > was put in.
> 
> Roger needs to make the call, i.e. which problem is more severe, old or
> new one.

AFAICT those problems affect different kind of hardware, so it doesn't
make much difference. IMO f96e1469ad06b6 should be reverted because
it's plain wrong, and albeit it seemed to fix the issue in one of my
boxes it was just luck.

I don't thinks it's going to make matters worse, as osstest is already
blocked, but reverting it alone without the rest of the series is not
going to unblock osstest either. If you agree it's wrong I think it
could go in now, even if it's just to avoid having to repost it with
newer versions of the series.

Thanks, Roger.


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [Xen-devel] [PATCH 1/3] Revert "x86/vvmx: fix virtual interrupt injection when Ack on exit control is used"
  2020-03-24 10:47             ` Roger Pau Monné
@ 2020-03-24 11:00               ` Tian, Kevin
  0 siblings, 0 replies; 22+ messages in thread
From: Tian, Kevin @ 2020-03-24 11:00 UTC (permalink / raw)
  To: Roger Pau Monné
  Cc: xen-devel, Wei Liu, Nakajima, Jun, Jan Beulich, Andrew Cooper

> From: Roger Pau Monné <roger.pau@citrix.com>
> Sent: Tuesday, March 24, 2020 6:47 PM
> 
> On Tue, Mar 24, 2020 at 08:49:27AM +0000, Tian, Kevin wrote:
> > > From: Jan Beulich <jbeulich@suse.com>
> > > Sent: Tuesday, March 24, 2020 4:10 PM
> > >
> > > On 24.03.2020 06:41, Tian, Kevin wrote:
> > > >> From: Roger Pau Monné <roger.pau@citrix.com>
> > > >> Sent: Monday, March 23, 2020 10:49 PM
> > > >>
> > > >> On Mon, Mar 23, 2020 at 09:09:59AM +0100, Jan Beulich wrote:
> > > >>> On 20.03.2020 20:07, Roger Pau Monne wrote:
> > > >>>> This reverts commit f96e1469ad06b61796c60193daaeb9f8a96d7458.
> > > >>>>
> > > >>>> The commit is wrong, as the whole point of nvmx_update_apicv is
> to
> > > >>>> update the guest interrupt status field when the Ack on exit VMEXIT
> > > >>>> control feature is enabled.
> > > >>>>
> > > >>>> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
> > > >>>
> > > >>> Before anyone gets to look at the other two patches, should this
> > > >>> be thrown in right away?
> > > >>
> > > >> I would like if possible get a confirmation from Kevin (or anyone
> > > >> else) that my understanding is correct. I find the nested code very
> > > >> confusing, and I've already made a mistake while trying to fix it.
> > > >> That being said, this was spotted by osstest as introducing a
> > > >> regression, so I guess it's safe to just toss it in now.
> > > >>
> > > >> FWIW patch 2/3 attempts to provide a description of my
> understanding
> > > >> of how nvmx_update_apicv works.
> > > >>
> > > >
> > > > I feel it is not good to take this patch alone, as it was introduced to fix
> > > > another problem. W/o understanding whether the whole series can
> > > > fix both old and new problems, we may risk putting nested interrupt
> > > > logic in an even worse state...
> > >
> > > Well, okay, I'll wait then, but it would seem to me that reverting
> > > wouldn't put us in a worse state than we were in before that change
> > > was put in.
> >
> > Roger needs to make the call, i.e. which problem is more severe, old or
> > new one.
> 
> AFAICT those problems affect different kind of hardware, so it doesn't
> make much difference. IMO f96e1469ad06b6 should be reverted because
> it's plain wrong, and albeit it seemed to fix the issue in one of my
> boxes it was just luck.
> 
> I don't thinks it's going to make matters worse, as osstest is already
> blocked, but reverting it alone without the rest of the series is not
> going to unblock osstest either. If you agree it's wrong I think it
> could go in now, even if it's just to avoid having to repost it with
> newer versions of the series.
> 

yes, I agree it's wrong.

Reviewed-by: Kevin Tian <kevin.tian@intel.com>

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [Xen-devel] [PATCH 2/3] x86/nvmx: clarify and fix usage of nvmx_update_apicv
  2020-03-24 10:11       ` Tian, Kevin
@ 2020-03-24 11:22         ` Roger Pau Monné
  2020-03-24 11:33           ` Tian, Kevin
  0 siblings, 1 reply; 22+ messages in thread
From: Roger Pau Monné @ 2020-03-24 11:22 UTC (permalink / raw)
  To: Tian, Kevin; +Cc: Nakajima, Jun, xen-devel, Jan Beulich, Wei Liu, Andrew Cooper

On Tue, Mar 24, 2020 at 10:11:15AM +0000, Tian, Kevin wrote:
> > From: Roger Pau Monné <roger.pau@citrix.com>
> > Sent: Tuesday, March 24, 2020 5:51 PM
> > 
> > On Tue, Mar 24, 2020 at 06:03:28AM +0000, Tian, Kevin wrote:
> > > > From: Roger Pau Monne <roger.pau@citrix.com>
> > > > Sent: Saturday, March 21, 2020 3:08 AM
> > > >
> > > > The current usage of nvmx_update_apicv is not clear: it is deeply
> > > > intertwined with the Ack interrupt on exit VMEXIT control.
> > > >
> > > > The code in nvmx_update_apicv should update the SVI (in service
> > interrupt)
> > > > field of the guest interrupt status only when the Ack interrupt on
> > > > exit is set, as it is used to record that the interrupt being
> > > > serviced is signaled in a vmcs field, and hence hasn't been injected
> > > > as on native. It's important to record the current in service
> > > > interrupt on the guest interrupt status field, or else further
> > > > interrupts won't respect the priority of the in service one.
> > > >
> > > > While clarifying the usage make sure that the SVI is only updated when
> > > > Ack on exit is set and the nested vmcs interrupt info field is valid. Or
> > > > else a guest not using the Ack on exit feature would loose interrupts as
> > > > they would be signaled as being in service on the guest interrupt
> > > > status field but won't actually be recorded on the interrupt info vmcs
> > > > field, neither injected in any other way.
> > >
> > > It is insufficient. You also need to update RVI to enable virtual injection
> > > when Ack on exit is cleared.
> > 
> > But RVI should be updated in vmx_intr_assist in that case, since
> > nvmx_intr_intercept shouldn't intercept the interrupt, as it should be
> > handled normally.
> 
> As we discussed before, vmx_intr_assist is invoked before nvmx_switch_guest.
> 
> It is incorrectly to update RVI at that time since it might be still vmcs02 being 
> active (if no pending softirq to make it invoked again).
> 
> Also nvmx_intr_intercept does intercept Ack-on-exit=0 case:
> 
>         if ( intack.source == hvm_intsrc_pic ||
>                  intack.source == hvm_intsrc_lapic )
>         {
>             vmx_inject_extint(intack.vector, intack.source);
> 
>             ctrl = get_vvmcs(v, VM_EXIT_CONTROLS);
>             if ( ctrl & VM_EXIT_ACK_INTR_ON_EXIT )
>             {
>                 /* for now, duplicate the ack path in vmx_intr_assist */
>                 hvm_vcpu_ack_pending_irq(v, intack);
>                 pt_intr_post(v, intack);
> 
>                 intack = hvm_vcpu_has_pending_irq(v);
>                 if ( unlikely(intack.source != hvm_intsrc_none) )
>                     vmx_enable_intr_window(v, intack);
>             }
>             else if ( !cpu_has_vmx_virtual_intr_delivery )
>                 vmx_enable_intr_window(v, intack);
> 
>             return 1; <<<<<<<<

Right, I always get confused by the switch happening in the vmentry
path.

That only happens when the vcpu is in nested mode
(nestedhvm_vcpu_in_guestmode == true). That would be the case before a
vmexit, at least for the first call to vmx_intr_assist if there are
pending softirqs.

Note that if there are pending softirqs the second time
nvmx_intr_intercept will return 0.

>         }
> 
> > 
> > > >
> > > > Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
> > > > ---
> > > >  xen/arch/x86/hvm/vmx/vvmx.c | 11 ++++++++++-
> > > >  1 file changed, 10 insertions(+), 1 deletion(-)
> > > >
> > > > diff --git a/xen/arch/x86/hvm/vmx/vvmx.c
> > b/xen/arch/x86/hvm/vmx/vvmx.c
> > > > index 1b8461ba30..180d01e385 100644
> > > > --- a/xen/arch/x86/hvm/vmx/vvmx.c
> > > > +++ b/xen/arch/x86/hvm/vmx/vvmx.c
> > > > @@ -1383,7 +1383,7 @@ static void (struct vcpu *v)
> > > >  {
> > > >      struct nestedvmx *nvmx = &vcpu_2_nvmx(v);
> > > >      unsigned long reason = get_vvmcs(v, VM_EXIT_REASON);
> > > > -    uint32_t intr_info = nvmx->intr.intr_info;
> > > > +    unsigned long intr_info = get_vvmcs(v, VM_EXIT_INTR_INFO);
> > >
> > > well, above reminds me an interesting question. Combining last and this
> > > patch, we'd see essentially that it goes back to the state before f96e1469
> > > (at least when Ack on exit is true).
> > 
> > Well, patch 1/3 is a revert of f96e1469, so just reverting f96e1469
> > gets us to that state.
> 
> you are right. I just wanted to point out that this patch alone doesn't
> do any real fix for ack-on-exit=1 case. It just makes sure that ack-on-exit=0
> is skipped from that function. So it won't be the one fixing your previous
> problem. 😊

Yes, that's correct.

> > 
> > This patch is an attempt to clarify that nvmx_update_apicv is closely
> > related to the Ack on exit feature, as it modifies SVI in order to
> > signal the vector currently being serviced by the EXIT_INTR_INFO vmcs
> > field. This was not obvious to me, as at first sight I assumed
> > nvmx_update_apicv was actually injecting that vector into the guest.
> > 
> > > iirc, that commit was introduced to enable
> > > nested x2apic with apicv, and your very first version even just removed
> > > the whole nvmx_update_apicv. Then now with the new reverted logic,
> > > are you still suffering x2apic problem? If not, does it imply the real fix
> > > is actually coming from patch 3/3 for eoi bitmap update?
> > 
> > Yes, patch 3/3 is the one that fixed the issue. Note however that
> > strangely enough removing the call to nvmx_update_apicv (as my first
> > attempt to solve the issue) did fix it on one of my boxes. It depends
> > a lot on the available vmx features AFAICT.
> 
> Did you confirm that with 3/3 alone can fix that issue? Just want to make
> sure the real gain of each patch, so we can reflect it in the commit msg
> in updated version.

Yes, the patch that actually fixes the issue in the box I've been
testing with is 3/3.

Xen will always use the Ack on exit feature, I currently have no way
to test whether not using Ack on exit works at all.

> > 
> > > >
> > > >      if ( reason == EXIT_REASON_EXTERNAL_INTERRUPT &&
> > > >           nvmx->intr.source == hvm_intsrc_lapic &&
> > > > @@ -1399,6 +1399,15 @@ static void nvmx_update_apicv(struct vcpu *v)
> > > >          ppr = vlapic_set_ppr(vlapic);
> > > >          WARN_ON((ppr & 0xf0) != (vector & 0xf0));
> > > >
> > > > +        /*
> > > > +         * SVI must be updated when the interrupt has been signaled using
> > the
> > > > +         * Ack on exit feature, or else the currently in-service interrupt
> > > > +         * won't be respected.
> > > > +         *
> > > > +         * Note that this is specific to the fact that when doing a VMEXIT an
> > > > +         * interrupt might get delivered using the interrupt info vmcs field
> > > > +         * instead of being injected normally.
> > > > +         */
> > >
> > > I'm not sure mentioning SVI specifically here is necessary, since all steps
> > > here are required - updating iir, isr, rvi, svi, ppr, etc. It is just an emulation
> > > of updating virtual APIC state as if a virtual interrupt delivery happens.
> > 
> > Hm, it was hard for me to figure out that SVI is modified here in
> > order to signal that the Ack on exit vector is currently in service,
> > as opposed to being injected using the virtual interrupt delivery
> > mechanism.
> > 
> > I just wanted to clarify that the purpose of this function is not to
> > inject the vector in intr_info, but rather to signal that such vector
> > has already been injected using a different mechanism.
> > 
> > I'm happy to reword this, but IMO it should be clear that the purpose
> > of the function is not so much to inject an interrupt, but to update
> > the virtual interrupt delivery field in order to signal that an
> > interrupt has been injected using a different mechanism.
> > 
> 
> reading it again I feel possibly fine to put the comment there. But
> I disagree the statement above. The purpose of this function is indeed 
> for injecting an interrupt. Both RVI and SVI are additional requirements 
> when virtual APIC page is in-use while virtual interrupt delivery is not 
> used, i.e. when ack-on-exit is set.

Right, so if I understand this correctly:

 - SVI: must be updated when Ack on exit is used, to signal the
   current in service interrupt which has been injected using a
   mechanism different than virtual interrupts.

 - RVI: should always be updated if there are pending interrupts to
   be delivered. Note that the interrupt signaled in the INTR_INFO
   field if Ack on exit is enabled is not considered pending anymore.

So I think RVI should always be updated, regardless of whether Ack on
exit is used or not. Do you agree?

Thanks, Roger.


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [Xen-devel] [PATCH 3/3] x86/nvmx: update exit bitmap on vmexit
  2020-03-24 10:16       ` Tian, Kevin
@ 2020-03-24 11:24         ` Roger Pau Monné
  0 siblings, 0 replies; 22+ messages in thread
From: Roger Pau Monné @ 2020-03-24 11:24 UTC (permalink / raw)
  To: Tian, Kevin; +Cc: xen-devel, Wei Liu, Jan Beulich, Nakajima, Jun, Andrew Cooper

On Tue, Mar 24, 2020 at 10:16:52AM +0000, Tian, Kevin wrote:
> > From: Roger Pau Monné <roger.pau@citrix.com>
> > Sent: Tuesday, March 24, 2020 5:59 PM
> > 
> > On Tue, Mar 24, 2020 at 06:22:43AM +0000, Tian, Kevin wrote:
> > > > From: Roger Pau Monne <roger.pau@citrix.com>
> > > > Sent: Saturday, March 21, 2020 3:08 AM
> > > >
> > > > Current code in nvmx_update_apicv set the guest interrupt status field
> > > > but doesn't update the exit bitmap, which can cause issues of lost
> > > > interrupts on the L1 hypervisor if vmx_intr_assist gets
> > > > short-circuited by nvmx_intr_intercept returning true.
> > >
> > > Above is not accurate. Currently Xen didn't choose to update the EOI
> > > exit bitmap every time when there is a change. Instead, it chose to
> > > batch the update before resuming to the guest. sort of optimization.
> > > So it is not related to whether SVI is changed. We should always do the
> > > bitmap update in nvmx_update_apicv, regardless of the setting of
> > > Ack-on-exit ...
> > 
> > But if Ack on exit is not set the GUEST_INTR_STATUS won't be changed
> > by nvmx_intr_intercept, and hence there's no need to update the EOI
> > exit map?
> > 
> > If OTOH the GUEST_INTR_STATUS field is changed by vmx_intr_assist the
> > bitmap will already be updated there.
> > 
> 
> If you agree with my comment in patch 2/3 about setting RVI for 
> ack-on-exit=0, then EOI bitmap update should be done there too.

Yes, I agree, see my reply to your comment. I (again) misremembered the
sequence and assumed the switch will happen prior to calling
vmx_intr_assist which is not the case.

Thanks, Roger.


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [Xen-devel] [PATCH 2/3] x86/nvmx: clarify and fix usage of nvmx_update_apicv
  2020-03-24 11:22         ` Roger Pau Monné
@ 2020-03-24 11:33           ` Tian, Kevin
  2020-03-24 12:18             ` Roger Pau Monné
  0 siblings, 1 reply; 22+ messages in thread
From: Tian, Kevin @ 2020-03-24 11:33 UTC (permalink / raw)
  To: Roger Pau Monné
  Cc: Nakajima, Jun, xen-devel, Jan Beulich, Wei Liu, Andrew Cooper

> From: Roger Pau Monné <roger.pau@citrix.com>
> Sent: Tuesday, March 24, 2020 7:23 PM
> 
> On Tue, Mar 24, 2020 at 10:11:15AM +0000, Tian, Kevin wrote:
> > > From: Roger Pau Monné <roger.pau@citrix.com>
> > > Sent: Tuesday, March 24, 2020 5:51 PM
> > >
> > > On Tue, Mar 24, 2020 at 06:03:28AM +0000, Tian, Kevin wrote:
> > > > > From: Roger Pau Monne <roger.pau@citrix.com>
> > > > > Sent: Saturday, March 21, 2020 3:08 AM
> > > > >
> > > > > The current usage of nvmx_update_apicv is not clear: it is deeply
> > > > > intertwined with the Ack interrupt on exit VMEXIT control.
> > > > >
> > > > > The code in nvmx_update_apicv should update the SVI (in service
> > > interrupt)
> > > > > field of the guest interrupt status only when the Ack interrupt on
> > > > > exit is set, as it is used to record that the interrupt being
> > > > > serviced is signaled in a vmcs field, and hence hasn't been injected
> > > > > as on native. It's important to record the current in service
> > > > > interrupt on the guest interrupt status field, or else further
> > > > > interrupts won't respect the priority of the in service one.
> > > > >
> > > > > While clarifying the usage make sure that the SVI is only updated
> when
> > > > > Ack on exit is set and the nested vmcs interrupt info field is valid. Or
> > > > > else a guest not using the Ack on exit feature would loose interrupts
> as
> > > > > they would be signaled as being in service on the guest interrupt
> > > > > status field but won't actually be recorded on the interrupt info vmcs
> > > > > field, neither injected in any other way.
> > > >
> > > > It is insufficient. You also need to update RVI to enable virtual injection
> > > > when Ack on exit is cleared.
> > >
> > > But RVI should be updated in vmx_intr_assist in that case, since
> > > nvmx_intr_intercept shouldn't intercept the interrupt, as it should be
> > > handled normally.
> >
> > As we discussed before, vmx_intr_assist is invoked before
> nvmx_switch_guest.
> >
> > It is incorrectly to update RVI at that time since it might be still vmcs02
> being
> > active (if no pending softirq to make it invoked again).
> >
> > Also nvmx_intr_intercept does intercept Ack-on-exit=0 case:
> >
> >         if ( intack.source == hvm_intsrc_pic ||
> >                  intack.source == hvm_intsrc_lapic )
> >         {
> >             vmx_inject_extint(intack.vector, intack.source);
> >
> >             ctrl = get_vvmcs(v, VM_EXIT_CONTROLS);
> >             if ( ctrl & VM_EXIT_ACK_INTR_ON_EXIT )
> >             {
> >                 /* for now, duplicate the ack path in vmx_intr_assist */
> >                 hvm_vcpu_ack_pending_irq(v, intack);
> >                 pt_intr_post(v, intack);
> >
> >                 intack = hvm_vcpu_has_pending_irq(v);
> >                 if ( unlikely(intack.source != hvm_intsrc_none) )
> >                     vmx_enable_intr_window(v, intack);
> >             }
> >             else if ( !cpu_has_vmx_virtual_intr_delivery )
> >                 vmx_enable_intr_window(v, intack);
> >
> >             return 1; <<<<<<<<
> 
> Right, I always get confused by the switch happening in the vmentry
> path.
> 
> That only happens when the vcpu is in nested mode
> (nestedhvm_vcpu_in_guestmode == true). That would be the case before a
> vmexit, at least for the first call to vmx_intr_assist if there are
> pending softirqs.
> 
> Note that if there are pending softirqs the second time
> nvmx_intr_intercept will return 0.

Exactly.

> 
> >         }
> >
> > >
> > > > >
> > > > > Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
> > > > > ---
> > > > >  xen/arch/x86/hvm/vmx/vvmx.c | 11 ++++++++++-
> > > > >  1 file changed, 10 insertions(+), 1 deletion(-)
> > > > >
> > > > > diff --git a/xen/arch/x86/hvm/vmx/vvmx.c
> > > b/xen/arch/x86/hvm/vmx/vvmx.c
> > > > > index 1b8461ba30..180d01e385 100644
> > > > > --- a/xen/arch/x86/hvm/vmx/vvmx.c
> > > > > +++ b/xen/arch/x86/hvm/vmx/vvmx.c
> > > > > @@ -1383,7 +1383,7 @@ static void (struct vcpu *v)
> > > > >  {
> > > > >      struct nestedvmx *nvmx = &vcpu_2_nvmx(v);
> > > > >      unsigned long reason = get_vvmcs(v, VM_EXIT_REASON);
> > > > > -    uint32_t intr_info = nvmx->intr.intr_info;
> > > > > +    unsigned long intr_info = get_vvmcs(v, VM_EXIT_INTR_INFO);
> > > >
> > > > well, above reminds me an interesting question. Combining last and this
> > > > patch, we'd see essentially that it goes back to the state before
> f96e1469
> > > > (at least when Ack on exit is true).
> > >
> > > Well, patch 1/3 is a revert of f96e1469, so just reverting f96e1469
> > > gets us to that state.
> >
> > you are right. I just wanted to point out that this patch alone doesn't
> > do any real fix for ack-on-exit=1 case. It just makes sure that ack-on-exit=0
> > is skipped from that function. So it won't be the one fixing your previous
> > problem. 😊
> 
> Yes, that's correct.
> 
> > >
> > > This patch is an attempt to clarify that nvmx_update_apicv is closely
> > > related to the Ack on exit feature, as it modifies SVI in order to
> > > signal the vector currently being serviced by the EXIT_INTR_INFO vmcs
> > > field. This was not obvious to me, as at first sight I assumed
> > > nvmx_update_apicv was actually injecting that vector into the guest.
> > >
> > > > iirc, that commit was introduced to enable
> > > > nested x2apic with apicv, and your very first version even just removed
> > > > the whole nvmx_update_apicv. Then now with the new reverted logic,
> > > > are you still suffering x2apic problem? If not, does it imply the real fix
> > > > is actually coming from patch 3/3 for eoi bitmap update?
> > >
> > > Yes, patch 3/3 is the one that fixed the issue. Note however that
> > > strangely enough removing the call to nvmx_update_apicv (as my first
> > > attempt to solve the issue) did fix it on one of my boxes. It depends
> > > a lot on the available vmx features AFAICT.
> >
> > Did you confirm that with 3/3 alone can fix that issue? Just want to make
> > sure the real gain of each patch, so we can reflect it in the commit msg
> > in updated version.
> 
> Yes, the patch that actually fixes the issue in the box I've been
> testing with is 3/3.
> 
> Xen will always use the Ack on exit feature, I currently have no way
> to test whether not using Ack on exit works at all.
> 
> > >
> > > > >
> > > > >      if ( reason == EXIT_REASON_EXTERNAL_INTERRUPT &&
> > > > >           nvmx->intr.source == hvm_intsrc_lapic &&
> > > > > @@ -1399,6 +1399,15 @@ static void nvmx_update_apicv(struct vcpu
> *v)
> > > > >          ppr = vlapic_set_ppr(vlapic);
> > > > >          WARN_ON((ppr & 0xf0) != (vector & 0xf0));
> > > > >
> > > > > +        /*
> > > > > +         * SVI must be updated when the interrupt has been signaled
> using
> > > the
> > > > > +         * Ack on exit feature, or else the currently in-service interrupt
> > > > > +         * won't be respected.
> > > > > +         *
> > > > > +         * Note that this is specific to the fact that when doing a VMEXIT
> an
> > > > > +         * interrupt might get delivered using the interrupt info vmcs
> field
> > > > > +         * instead of being injected normally.
> > > > > +         */
> > > >
> > > > I'm not sure mentioning SVI specifically here is necessary, since all steps
> > > > here are required - updating iir, isr, rvi, svi, ppr, etc. It is just an
> emulation
> > > > of updating virtual APIC state as if a virtual interrupt delivery happens.
> > >
> > > Hm, it was hard for me to figure out that SVI is modified here in
> > > order to signal that the Ack on exit vector is currently in service,
> > > as opposed to being injected using the virtual interrupt delivery
> > > mechanism.
> > >
> > > I just wanted to clarify that the purpose of this function is not to
> > > inject the vector in intr_info, but rather to signal that such vector
> > > has already been injected using a different mechanism.
> > >
> > > I'm happy to reword this, but IMO it should be clear that the purpose
> > > of the function is not so much to inject an interrupt, but to update
> > > the virtual interrupt delivery field in order to signal that an
> > > interrupt has been injected using a different mechanism.
> > >
> >
> > reading it again I feel possibly fine to put the comment there. But
> > I disagree the statement above. The purpose of this function is indeed
> > for injecting an interrupt. Both RVI and SVI are additional requirements
> > when virtual APIC page is in-use while virtual interrupt delivery is not
> > used, i.e. when ack-on-exit is set.
> 
> Right, so if I understand this correctly:
> 
>  - SVI: must be updated when Ack on exit is used, to signal the
>    current in service interrupt which has been injected using a
>    mechanism different than virtual interrupts.
> 
>  - RVI: should always be updated if there are pending interrupts to
>    be delivered. Note that the interrupt signaled in the INTR_INFO
>    field if Ack on exit is enabled is not considered pending anymore.
> 
> So I think RVI should always be updated, regardless of whether Ack on
> exit is used or not. Do you agree?
> 

Yes, I agree.

Thanks
Kevin

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [Xen-devel] [PATCH 2/3] x86/nvmx: clarify and fix usage of nvmx_update_apicv
  2020-03-24 11:33           ` Tian, Kevin
@ 2020-03-24 12:18             ` Roger Pau Monné
  2020-03-26  5:49               ` Tian, Kevin
  0 siblings, 1 reply; 22+ messages in thread
From: Roger Pau Monné @ 2020-03-24 12:18 UTC (permalink / raw)
  To: Tian, Kevin; +Cc: Nakajima, Jun, xen-devel, Jan Beulich, Wei Liu, Andrew Cooper

On Tue, Mar 24, 2020 at 11:33:00AM +0000, Tian, Kevin wrote:
> > From: Roger Pau Monné <roger.pau@citrix.com>
> > Sent: Tuesday, March 24, 2020 7:23 PM
> > 
> > On Tue, Mar 24, 2020 at 10:11:15AM +0000, Tian, Kevin wrote:
> > > > From: Roger Pau Monné <roger.pau@citrix.com>
> > > > Sent: Tuesday, March 24, 2020 5:51 PM
> > > >
> > > > On Tue, Mar 24, 2020 at 06:03:28AM +0000, Tian, Kevin wrote:
> > > > > > From: Roger Pau Monne <roger.pau@citrix.com>
> > > > > > Sent: Saturday, March 21, 2020 3:08 AM
> > > > > >
> > > > > > The current usage of nvmx_update_apicv is not clear: it is deeply
> > > > > > intertwined with the Ack interrupt on exit VMEXIT control.
> > > > > >
> > > > > > The code in nvmx_update_apicv should update the SVI (in service
> > > > interrupt)
> > > > > > field of the guest interrupt status only when the Ack interrupt on
> > > > > > exit is set, as it is used to record that the interrupt being
> > > > > > serviced is signaled in a vmcs field, and hence hasn't been injected
> > > > > > as on native. It's important to record the current in service
> > > > > > interrupt on the guest interrupt status field, or else further
> > > > > > interrupts won't respect the priority of the in service one.
> > > > > >
> > > > > > While clarifying the usage make sure that the SVI is only updated
> > when
> > > > > > Ack on exit is set and the nested vmcs interrupt info field is valid. Or
> > > > > > else a guest not using the Ack on exit feature would loose interrupts
> > as
> > > > > > they would be signaled as being in service on the guest interrupt
> > > > > > status field but won't actually be recorded on the interrupt info vmcs
> > > > > > field, neither injected in any other way.
> > > > >
> > > > > It is insufficient. You also need to update RVI to enable virtual injection
> > > > > when Ack on exit is cleared.
> > > >
> > > > But RVI should be updated in vmx_intr_assist in that case, since
> > > > nvmx_intr_intercept shouldn't intercept the interrupt, as it should be
> > > > handled normally.
> > >
> > > As we discussed before, vmx_intr_assist is invoked before
> > nvmx_switch_guest.
> > >
> > > It is incorrectly to update RVI at that time since it might be still vmcs02
> > being
> > > active (if no pending softirq to make it invoked again).
> > >
> > > Also nvmx_intr_intercept does intercept Ack-on-exit=0 case:
> > >
> > >         if ( intack.source == hvm_intsrc_pic ||
> > >                  intack.source == hvm_intsrc_lapic )

I've realized that nvmx_intr_intercept will return 1 for interrupts
originating from the lapic or the pic, while nvmx_update_apicv will
only update GUEST_INTR_STATUS for interrupts originating from the
lapic. Is this correct?

Shouldn't both be in sync and handle the same interrupt sources?

Thanks, Roger.


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [Xen-devel] [PATCH 2/3] x86/nvmx: clarify and fix usage of nvmx_update_apicv
  2020-03-24 12:18             ` Roger Pau Monné
@ 2020-03-26  5:49               ` Tian, Kevin
  0 siblings, 0 replies; 22+ messages in thread
From: Tian, Kevin @ 2020-03-26  5:49 UTC (permalink / raw)
  To: Roger Pau Monné
  Cc: Nakajima, Jun, xen-devel, Jan Beulich, Wei Liu, Andrew Cooper

> From: Roger Pau Monné <roger.pau@citrix.com>
> Sent: Tuesday, March 24, 2020 8:19 PM
> 
> On Tue, Mar 24, 2020 at 11:33:00AM +0000, Tian, Kevin wrote:
> > > From: Roger Pau Monné <roger.pau@citrix.com>
> > > Sent: Tuesday, March 24, 2020 7:23 PM
> > >
> > > On Tue, Mar 24, 2020 at 10:11:15AM +0000, Tian, Kevin wrote:
> > > > > From: Roger Pau Monné <roger.pau@citrix.com>
> > > > > Sent: Tuesday, March 24, 2020 5:51 PM
> > > > >
> > > > > On Tue, Mar 24, 2020 at 06:03:28AM +0000, Tian, Kevin wrote:
> > > > > > > From: Roger Pau Monne <roger.pau@citrix.com>
> > > > > > > Sent: Saturday, March 21, 2020 3:08 AM
> > > > > > >
> > > > > > > The current usage of nvmx_update_apicv is not clear: it is deeply
> > > > > > > intertwined with the Ack interrupt on exit VMEXIT control.
> > > > > > >
> > > > > > > The code in nvmx_update_apicv should update the SVI (in service
> > > > > interrupt)
> > > > > > > field of the guest interrupt status only when the Ack interrupt on
> > > > > > > exit is set, as it is used to record that the interrupt being
> > > > > > > serviced is signaled in a vmcs field, and hence hasn't been injected
> > > > > > > as on native. It's important to record the current in service
> > > > > > > interrupt on the guest interrupt status field, or else further
> > > > > > > interrupts won't respect the priority of the in service one.
> > > > > > >
> > > > > > > While clarifying the usage make sure that the SVI is only updated
> > > when
> > > > > > > Ack on exit is set and the nested vmcs interrupt info field is valid.
> Or
> > > > > > > else a guest not using the Ack on exit feature would loose
> interrupts
> > > as
> > > > > > > they would be signaled as being in service on the guest interrupt
> > > > > > > status field but won't actually be recorded on the interrupt info
> vmcs
> > > > > > > field, neither injected in any other way.
> > > > > >
> > > > > > It is insufficient. You also need to update RVI to enable virtual
> injection
> > > > > > when Ack on exit is cleared.
> > > > >
> > > > > But RVI should be updated in vmx_intr_assist in that case, since
> > > > > nvmx_intr_intercept shouldn't intercept the interrupt, as it should be
> > > > > handled normally.
> > > >
> > > > As we discussed before, vmx_intr_assist is invoked before
> > > nvmx_switch_guest.
> > > >
> > > > It is incorrectly to update RVI at that time since it might be still vmcs02
> > > being
> > > > active (if no pending softirq to make it invoked again).
> > > >
> > > > Also nvmx_intr_intercept does intercept Ack-on-exit=0 case:
> > > >
> > > >         if ( intack.source == hvm_intsrc_pic ||
> > > >                  intack.source == hvm_intsrc_lapic )
> 
> I've realized that nvmx_intr_intercept will return 1 for interrupts
> originating from the lapic or the pic, while nvmx_update_apicv will
> only update GUEST_INTR_STATUS for interrupts originating from the
> lapic. Is this correct?

Isn't apicv for virtual lapic instead of virtual pic?

> 
> Shouldn't both be in sync and handle the same interrupt sources?
> 

But I do realize one potential issue about 67f9d0b9, which may break
vpic delivery when ack-on-exit is off. We should always use interrupt
window to handle that situation for vpic. Sorry I didn't catch it when
proposing that change...

Thanks
Kevin

^ permalink raw reply	[flat|nested] 22+ messages in thread

end of thread, back to index

Thread overview: 22+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-03-20 19:07 [Xen-devel] [PATCH 0/3] x86/nvmx: attempt to fix interrupt injection Roger Pau Monne
2020-03-20 19:07 ` [Xen-devel] [PATCH 1/3] Revert "x86/vvmx: fix virtual interrupt injection when Ack on exit control is used" Roger Pau Monne
2020-03-23  8:09   ` Jan Beulich
2020-03-23 14:48     ` Roger Pau Monné
2020-03-24  5:41       ` Tian, Kevin
2020-03-24  8:10         ` Jan Beulich
2020-03-24  8:49           ` Tian, Kevin
2020-03-24 10:47             ` Roger Pau Monné
2020-03-24 11:00               ` Tian, Kevin
2020-03-20 19:07 ` [Xen-devel] [PATCH 2/3] x86/nvmx: clarify and fix usage of nvmx_update_apicv Roger Pau Monne
2020-03-24  6:03   ` Tian, Kevin
2020-03-24  9:50     ` Roger Pau Monné
2020-03-24 10:11       ` Tian, Kevin
2020-03-24 11:22         ` Roger Pau Monné
2020-03-24 11:33           ` Tian, Kevin
2020-03-24 12:18             ` Roger Pau Monné
2020-03-26  5:49               ` Tian, Kevin
2020-03-20 19:07 ` [Xen-devel] [PATCH 3/3] x86/nvmx: update exit bitmap on vmexit Roger Pau Monne
2020-03-24  6:22   ` Tian, Kevin
2020-03-24  9:59     ` Roger Pau Monné
2020-03-24 10:16       ` Tian, Kevin
2020-03-24 11:24         ` Roger Pau Monné

Xen-Devel Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/xen-devel/0 xen-devel/git/0.git
	git clone --mirror https://lore.kernel.org/xen-devel/1 xen-devel/git/1.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 xen-devel xen-devel/ https://lore.kernel.org/xen-devel \
		xen-devel@lists.xenproject.org xen-devel@lists.xen.org
	public-inbox-index xen-devel

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.xenproject.lists.xen-devel


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git