From: "Tian, Kevin" <kevin.tian@intel.com>
To: "Xuquan (Euler)" <xuquan8@huawei.com>,
"xen-devel@lists.xen.org" <xen-devel@lists.xen.org>
Cc: "yang.zhang.wz@gmail.com" <yang.zhang.wz@gmail.com>,
"jbeulich@suse.com" <jbeulich@suse.com>,
"George.Dunlap@eu.citrix.com" <George.Dunlap@eu.citrix.com>,
Andrew Cooper <andrew.cooper3@citrix.com>,
"Nakajima, Jun" <jun.nakajima@intel.com>
Subject: Re: [RFC PATCH] x86/apicv: fix RTC periodic timer and apicv issue
Date: Mon, 12 Sep 2016 07:57:44 +0000 [thread overview]
Message-ID: <AADFC41AFE54684AB9EE6CBC0274A5D18DED4739@SHSMSX101.ccr.corp.intel.com> (raw)
In-Reply-To: <E0A769A898ADB6449596C41F51EF62C6AA3E8E@SZXEMI506-MBX.china.huawei.com>
> From: Xuquan (Euler) [mailto:xuquan8@huawei.com]
> Sent: Friday, September 09, 2016 11:02 AM
>
> On August 30, 2016 1:58 PM, Tian Kevin <kevin.tian@intel.com> wrote:
> >> From: Xuquan (Euler) [mailto:xuquan8@huawei.com]
> >> Sent: Friday, August 19, 2016 8:59 PM
> >>
> >> When Xen apicv is enabled, wall clock time is faster on a Windows7-32
> >> guest with high payload (with 2 vCPUs; captured from xentrace: under high
> >> payload, the count of IPI interrupts between these vCPUs increases rapidly).
> >>
> >> If an IPI interrupt (vector 0xe1) and a periodic timer interrupt (vector
> >> 0xd1) are both pending (bits set in VIRR), unfortunately, the
> >> IPI interrupt has higher priority than the periodic timer interrupt. Xen
> >> updates the IPI interrupt bit set in VIRR to guest interrupt status (RVI)
> >> as the higher priority one, and apicv (Virtual-Interrupt Delivery) delivers
> >> the IPI interrupt within VMX non-root operation without a VM exit. Within
> >> VMX non-root operation, if the periodic timer interrupt bit is set in VIRR
> >> and is the highest, apicv delivers the periodic timer interrupt within VMX
> >> non-root operation as well.
> >>
> >> But in the current code, Xen does not update the periodic timer interrupt
> >> bit set in VIRR to guest interrupt status (RVI) directly, so Xen is not
> >> aware of this case and does not decrease the count (pending_intr_nr) of
> >> pending periodic timer interrupts; Xen will then deliver a periodic timer
> >> interrupt again. The guest receives extra periodic timer interrupts.
> >>
> >> If the periodic timer interrupt is delivered and is not the highest
> >> priority, make Xen aware of this case to decrease the count of
> >> pending periodic timer interrupts.
> >>
> >> Signed-off-by: Yifei Jiang <jiangyifei@huawei.com>
> >> Signed-off-by: Rongguang He <herongguang.he@huawei.com>
> >> Signed-off-by: Quan Xu <xuquan8@huawei.com>
> >> ---
> >> Why RFC:
> >> 1. I am not quite sure about other cases, such as the nested case.
> >> 2. Within VMX non-root operation, an asynchronous event (including
> >>    external interrupts, non-maskable interrupts, system-management
> >>    interrupts, exceptions and VM exits) may occur before delivery of a
> >>    periodic timer interrupt, so the periodic timer interrupt may be lost
> >>    when the next periodic timer interrupt is delivered. Actually, the
> >>    current code behaves the same way.
> >> ---
> >> xen/arch/x86/hvm/vmx/intr.c | 16 +++++++++++++++-
> >> 1 file changed, 15 insertions(+), 1 deletion(-)
> >>
> >> diff --git a/xen/arch/x86/hvm/vmx/intr.c b/xen/arch/x86/hvm/vmx/intr.c
> >> index 8fca08c..d3a034e 100644
> >> --- a/xen/arch/x86/hvm/vmx/intr.c
> >> +++ b/xen/arch/x86/hvm/vmx/intr.c
> >> @@ -334,7 +334,21 @@ void vmx_intr_assist(void)
> >>              __vmwrite(EOI_EXIT_BITMAP(i), v->arch.hvm_vmx.eoi_exit_bitmap[i]);
> >>          }
> >>
> >> -        pt_intr_post(v, intack);
> >> +        /*
> >> +         * If the periodic timer interrupt is delivered and is not the
> >> +         * highest priority, make Xen aware of this case to decrease the
> >> +         * count of pending periodic timer interrupts.
> >> +         */
> >> +        if ( pt_vector != -1 && intack.vector > pt_vector )
> >> +        {
> >> +            struct hvm_intack pt_intack;
> >> +
> >> +            pt_intack.vector = pt_vector;
> >> +            pt_intack.source = hvm_intsrc_lapic;
> >> +            pt_intr_post(v, pt_intack);
> >> +        }
> >> +        else
> >> +            pt_intr_post(v, intack);
> >>      }
> >
> >Here is my thought: w/o the above patch, one pending pt interrupt may result in
> >multiple injections of the guest timer interrupt, as you identified, when pt_vector
> >happens not to be the highest pending vector. Then w/ your patch, multiple
> >pending pt interrupt instances may be combined into one injection of the guest
> >timer interrupt. Especially, intack.vector > pt_vector doesn't mean pt_intr is
> >eligible for injection; it might be blocked for other reasons. As Jan pointed
> >out, this path is very fragile, so better we have a fix that sustains the original
> >behavior, with one pending pt instance causing one actual injection.
> >
> >What about doing pt_intr_post in the EOI-induced VM-exit instead? The EOI exit
> >bit for pt_intr is already set to 1, which means every injection would incur an
> >EOI-induced VM-exit:
> >
> >    /*
> >     * Set eoi_exit_bitmap for the periodic timer interrupt to cause an
> >     * EOI-induced VM exit, then pending periodic timer interrupts have
> >     * the chance to be injected for compensation.
> >     */
> >    if ( pt_vector != -1 )
> >        vmx_set_eoi_exit_bitmap(v, pt_vector);
> >
> >I don't think delaying the post makes much difference here. Even if we post
> >earlier, as your patch does, further pending instances have to wait until the
> >current instance is completed. So as long as the post happens before EOI is
> >completed, it should be fine.
>
> Kevin, I verified your suggestion with the modification below. Wall clock time is __still__ faster.
> I hope this modification correctly reflects your suggestion.
>
> I __guess__ that VM entries are more frequent than PT interrupts.
> Specifically, if there is a PT interrupt pending, the count (pending_intr_nr) is 1.
>
> Before the next PT interrupt arrives, each PT interrupt injection may not incur an
> EOI-induced VM-exit directly, so multiple PT interrupts are injected for one pending
> PT interrupt, and then multiple EOI-induced VM-exits occur, with multiple pt_intr_post()
> calls to decrease the count (pending_intr_nr). But the count (pending_intr_nr) is still 1
> before the next PT interrupt arrives, so only one pt_intr_post() is effective.
I don't quite understand your description here, but regarding your patch, see below...
>
> Thanks for your time!!
> Quan
>
> ======= modification ========
>
> diff --git a/xen/arch/x86/hvm/vlapic.c b/xen/arch/x86/hvm/vlapic.c
> index 1d5d287..cc247c3 100644
> --- a/xen/arch/x86/hvm/vlapic.c
> +++ b/xen/arch/x86/hvm/vlapic.c
> @@ -433,6 +433,11 @@ void vlapic_EOI_set(struct vlapic *vlapic)
>  void vlapic_handle_EOI(struct vlapic *vlapic, u8 vector)
>  {
>      struct domain *d = vlapic_domain(vlapic);
> +    struct hvm_intack pt_intack;
> +
> +    pt_intack.vector = vector;
> +    pt_intack.source = hvm_intsrc_lapic;
> +    pt_intr_post(vlapic_vcpu(vlapic), pt_intack);
>
>      if ( vlapic_test_and_clear_vector(vector, &vlapic->regs->data[APIC_TMR]) )
>          vioapic_update_EOI(d, vector);
> diff --git a/xen/arch/x86/hvm/vmx/intr.c b/xen/arch/x86/hvm/vmx/intr.c
> index 8fca08c..29d9bbf 100644
> --- a/xen/arch/x86/hvm/vmx/intr.c
> +++ b/xen/arch/x86/hvm/vmx/intr.c
> @@ -333,8 +333,6 @@ void vmx_intr_assist(void)
>              clear_bit(i, &v->arch.hvm_vmx.eoi_exitmap_changed);
>              __vmwrite(EOI_EXIT_BITMAP(i), v->arch.hvm_vmx.eoi_exit_bitmap[i]);
>          }
> -
> -        pt_intr_post(v, intack);
>      }
>      else
>      {
>
Because we update the pt irq on every VM entry, there is a chance that an
already-injected instance (before the EOI-induced exit happens) will
incur another pending IRR setting, if a VM exit happens
between HW virtual interrupt injection (vIRR->0, vISR->1) and the
EOI-induced exit (vISR->0), since pt_intr_post hasn't been invoked
yet. I guess this is the reason why you still see a faster wallclock.

I think you need to mark this pending_intr_post situation explicitly.
Then pt_update_irq should skip such a pt timer when pending_intr_post
of that timer is true (otherwise the update is meaningless, since the
previous one hasn't been posted yet). Then, with your change to post
in the EOI-induced exit handler, it should work correctly to meet the
goal - one virtual interrupt delivery for one pending pt intr.
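
Just to make the intended sequencing concrete, here is a standalone toy
sketch. The field pending_intr_post and all the helper names below are
made up for illustration; they are not the real Xen structures or
functions:

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model (not Xen code): one timer with a count of pending instances
 * and a flag recording that an injected instance has not yet been
 * posted back at EOI time. */
struct toy_periodic_time {
    int pending_intr_nr;     /* pending pt interrupt instances */
    bool pending_intr_post;  /* injected, but not yet posted at EOI */
};

/* pt_update_irq() would skip a timer whose previous injection is still
 * waiting for its EOI-induced exit. */
static bool pt_irq_eligible(const struct toy_periodic_time *pt)
{
    return pt->pending_intr_nr > 0 && !pt->pending_intr_post;
}

/* Injection marks the instance as awaiting its post, blocking
 * re-injection of the same instance on later VM entries. */
static void pt_inject(struct toy_periodic_time *pt)
{
    if ( pt_irq_eligible(pt) )
        pt->pending_intr_post = true;
}

/* The EOI-induced VM-exit handler does the post: one delivery consumes
 * exactly one pending instance, then the next instance becomes
 * eligible. */
static void pt_post_on_eoi(struct toy_periodic_time *pt)
{
    if ( pt->pending_intr_post )
    {
        pt->pending_intr_nr--;
        pt->pending_intr_post = false;
    }
}
```

With this, a VM exit between injection and the EOI-induced exit cannot
trigger a second injection for the same instance, which is the
one-delivery-per-pending-instance behavior we want.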
Thanks
Kevin
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel