* [PATCH v2] x86/apicv: fix RTC periodic timer and apicv issue
@ 2016-09-20 13:30 Xuquan (Euler)
  2016-09-23 15:33 ` Jan Beulich
  0 siblings, 1 reply; 23+ messages in thread
From: Xuquan (Euler) @ 2016-09-20 13:30 UTC (permalink / raw)
  To: xen-devel
  Cc: yang.zhang.wz, Kevin Tian, jbeulich, George.Dunlap,
	Andrew Cooper, Hanweidong (Randy),
	Jiangyifei, Nakajima, Jun

From 97760602b5c94745e76ed78d23e8fdf9988d234e Mon Sep 17 00:00:00 2001
From: Quan Xu <xuquan8@huawei.com>
Date: Tue, 20 Sep 2016 21:12:54 +0800
Subject: [PATCH v2] x86/apicv: fix RTC periodic timer and apicv issue

When Xen apicv is enabled, wall clock time runs fast in a Windows7-32
guest under high load (with 2 vCPUs; xentrace captures show that under
high load the number of IPIs between these vCPUs rises rapidly).

If an IPI (vector 0xe1) and a periodic timer interrupt (vector 0xd1)
are both pending (their bits set in vIRR), the IPI unfortunately has
higher priority than the periodic timer interrupt. Xen updates the
IPI's bit set in vIRR into the guest interrupt status (RVI) as the
higher priority vector, and apicv (virtual-interrupt delivery)
delivers the IPI within VMX non-root operation without a VM exit.
Then, still within VMX non-root operation, once the periodic timer
interrupt's bit is the highest one set in vIRR, apicv delivers the
periodic timer interrupt within VMX non-root operation as well.
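
(As background, the priority comparison above follows from how the
x86 APIC orders vectors: the upper nibble of the vector number is its
priority class. A minimal sketch; the helper name is made up, not a
Xen function:)

    /* 16 priority classes of 16 vectors each; a higher class wins.
     * 0xe1 -> class 14 (the IPI) beats 0xd1 -> class 13 (the
     * periodic timer interrupt). */
    static unsigned int apic_priority_class(unsigned int vector)
    {
        return vector >> 4;
    }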

But in the current code, because Xen does not itself update the
periodic timer interrupt's vIRR bit into the guest interrupt status
(RVI) in this case, it is not aware that the interrupt has already
been delivered and never decrements the count of pending periodic
timer interrupts (pending_intr_nr), so Xen delivers the periodic
timer interrupt again.
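
(A deliberately simplified model of that accounting -- not the actual
Xen code; only the names pending_intr_nr and irq_issued match the
real struct periodic_time, and the default missed-ticks policy is
assumed:)

    #include <stdbool.h>

    struct pt_model {
        unsigned int pending_intr_nr;  /* timer ticks owed to the guest */
        bool irq_issued;               /* an instance is in flight */
    };

    /* VM entry: if ticks are owed, inject one (its bit set in vIRR). */
    static void model_inject(struct pt_model *pt)
    {
        if ( pt->pending_intr_nr )
            pt->irq_issued = true;
    }

    /* Accounting step, normally reached via pt_intr_post().  If
     * delivery completes inside VMX non-root operation and this never
     * runs, the same tick stays counted and is injected again on the
     * next VM entry. */
    static void model_post(struct pt_model *pt)
    {
        if ( pt->pending_intr_nr )
            pt->pending_intr_nr--;
        pt->irq_issued = false;
    }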

Also, since we update the periodic timer interrupt on every VM entry,
an already-injected instance (before its EOI-induced exit happens)
can incur another pending IRR setting if a VM exit happens between
the virtual interrupt injection (vIRR->0, vISR->1) and the
EOI-induced exit (vISR->0): pt_intr_post has not been invoked yet,
so the guest receives extra periodic timer interrupts.
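
(A sketch of that sequence on one vCPU:)

    VM entry    : timer vector injected       (vIRR bit -> 0, vISR bit -> 1)
    VM exit     : unrelated exit before the guest writes EOI
    VM entry    : the periodic timer is evaluated again; pt_intr_post()
                  has not run, so the same instance sets vIRR once more
    guest EOI   : EOI-induced exit (vISR bit -> 0), but a second
                  instance is already pending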

So move the pt_intr_post() call into the EOI-induced exit handler,
and skip a periodic timer while it has not been completely consumed
(irq_issued is true).

Signed-off-by: Yifei Jiang <jiangyifei@huawei.com>
Signed-off-by: Rongguang He <herongguang.he@huawei.com>
Signed-off-by: Quan Xu <xuquan8@huawei.com>

---
v2:
  -call pt_intr_post from the EOI-induced exit handler instead.
  -skip a periodic timer while it has not been completely consumed
   (irq_issued is true).
---
 xen/arch/x86/hvm/vlapic.c   | 6 ++++++
 xen/arch/x86/hvm/vmx/intr.c | 2 --
 xen/arch/x86/hvm/vpt.c      | 3 ++-
 3 files changed, 8 insertions(+), 3 deletions(-)

diff --git a/xen/arch/x86/hvm/vlapic.c b/xen/arch/x86/hvm/vlapic.c
index 1d5d287..f83d6ab 100644
--- a/xen/arch/x86/hvm/vlapic.c
+++ b/xen/arch/x86/hvm/vlapic.c
@@ -433,6 +433,12 @@ void vlapic_EOI_set(struct vlapic *vlapic)
 void vlapic_handle_EOI(struct vlapic *vlapic, u8 vector)
 {
     struct domain *d = vlapic_domain(vlapic);
+    struct vcpu *v = vlapic_vcpu(vlapic);
+    struct hvm_intack pt_intack;
+
+    pt_intack.vector = vector;
+    pt_intack.source = hvm_intsrc_lapic;
+    pt_intr_post(v, pt_intack);

     if ( vlapic_test_and_clear_vector(vector, &vlapic->regs->data[APIC_TMR]) )
         vioapic_update_EOI(d, vector);
diff --git a/xen/arch/x86/hvm/vmx/intr.c b/xen/arch/x86/hvm/vmx/intr.c
index 8fca08c..29d9bbf 100644
--- a/xen/arch/x86/hvm/vmx/intr.c
+++ b/xen/arch/x86/hvm/vmx/intr.c
@@ -333,8 +333,6 @@ void vmx_intr_assist(void)
             clear_bit(i, &v->arch.hvm_vmx.eoi_exitmap_changed);
             __vmwrite(EOI_EXIT_BITMAP(i), v->arch.hvm_vmx.eoi_exit_bitmap[i]);
         }
-
-        pt_intr_post(v, intack);
     }
     else
     {
diff --git a/xen/arch/x86/hvm/vpt.c b/xen/arch/x86/hvm/vpt.c
index 5c48fdb..a9da436 100644
--- a/xen/arch/x86/hvm/vpt.c
+++ b/xen/arch/x86/hvm/vpt.c
@@ -252,7 +252,8 @@ int pt_update_irq(struct vcpu *v)
             }
             else
             {
-                if ( (pt->last_plt_gtime + pt->period) < max_lag )
+                if ( (pt->last_plt_gtime + pt->period) < max_lag &&
+                     !pt->irq_issued )
                 {
                     max_lag = pt->last_plt_gtime + pt->period;
                     earliest_pt = pt;
--
1.8.3.4


