All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Xuquan (Euler)" <xuquan8@huawei.com>
To: "Tian, Kevin" <kevin.tian@intel.com>,
	"xen-devel@lists.xen.org" <xen-devel@lists.xen.org>
Cc: "yang.zhang.wz@gmail.com" <yang.zhang.wz@gmail.com>,
	"jbeulich@suse.com" <jbeulich@suse.com>,
	"George.Dunlap@eu.citrix.com" <George.Dunlap@eu.citrix.com>,
	Andrew Cooper <andrew.cooper3@citrix.com>,
	"Hanweidong (Randy)" <hanweidong@huawei.com>,
	"Nakajima, Jun" <jun.nakajima@intel.com>
Subject: Re: [RFC PATCH] x86/apicv: fix RTC periodic timer and apicv issue
Date: Tue, 6 Sep 2016 08:53:21 +0000	[thread overview]
Message-ID: <E0A769A898ADB6449596C41F51EF62C6AA3D8E@SZXEMI506-MBX.china.huawei.com> (raw)
In-Reply-To: <AADFC41AFE54684AB9EE6CBC0274A5D18DEBAC55@SHSMSX101.ccr.corp.intel.com>

On August 30, 2016 1:58 PM, Tian, Kevin <kevin.tian@intel.com> wrote:
>> From: Xuquan (Euler) [mailto:xuquan8@huawei.com]
>> Sent: Friday, August 19, 2016 8:59 PM
>>
>> When Xen apicv is enabled, wall clock time is faster on Windows7-32
>> guest with high payload (with 2vCPU, captured from xentrace, in high
>> payload, the count of IPI interrupt increases rapidly between these vCPUs).
>>
>> If IPI intrrupt (vector 0xe1) and periodic timer interrupt (vector
>> 0xd1) are both pending (index of bit set in VIRR), unfortunately, the
>> IPI intrrupt is high priority than periodic timer interrupt. Xen
>> updates IPI interrupt bit set in VIRR to guest interrupt status (RVI)
>> as a high priority and apicv (Virtual-Interrupt Delivery) delivers IPI
>> interrupt within VMX non-root operation without a VM exit. Within VMX
>> non-root operation, if periodic timer interrupt index of bit is set in
>> VIRR and highest, the apicv delivers periodic timer interrupt within VMX
>non-root operation as well.
>>
>> But in current code, if Xen doesn't update periodic timer interrupt
>> bit set in VIRR to guest interrupt status (RVI) directly, Xen is not
>> aware of this case to decrease the count (pending_intr_nr) of pending
>> periodic timer interrupt, then Xen will deliver a periodic timer
>> interrupt again. The guest receives more periodic timer interrupt.
>>
>> If the periodic timer interrut is delivered and not the highest
>> priority, make Xen be aware of this case to decrease the count of
>> pending periodic timer interrupt.
>>
>> Signed-off-by: Yifei Jiang <jiangyifei@huawei.com>
>> Signed-off-by: Rongguang He <herongguang.he@huawei.com>
>> Signed-off-by: Quan Xu <xuquan8@huawei.com>
>> ---
>> Why RFC:
>> 1. I am not quite sure for other cases, such as nested case.
>> 2. Within VMX non-root operation, an Asynchronous Enclave Exit (including
>external
>>    interrupts, non-maskable interrupt system-management interrrupts,
>exceptions
>>    and VM exit) may occur before delivery of a periodic timer interrupt, the
>periodic
>>    timer interrupt may be lost when a coming periodic timer interrupt is
>delivered.
>>    Actually, and so current code is.
>> ---
>>  xen/arch/x86/hvm/vmx/intr.c | 16 +++++++++++++++-
>>  1 file changed, 15 insertions(+), 1 deletion(-)
>>
>> diff --git a/xen/arch/x86/hvm/vmx/intr.c b/xen/arch/x86/hvm/vmx/intr.c
>> index 8fca08c..d3a034e 100644
>> --- a/xen/arch/x86/hvm/vmx/intr.c
>> +++ b/xen/arch/x86/hvm/vmx/intr.c
>> @@ -334,7 +334,21 @@ void vmx_intr_assist(void)
>>              __vmwrite(EOI_EXIT_BITMAP(i),
>v->arch.hvm_vmx.eoi_exit_bitmap[i]);
>>          }
>>
>> -        pt_intr_post(v, intack);
>> +       /*
>> +        * If the periodic timer interrut is delivered and not the highest
>priority,
>> +        * make Xen be aware of this case to decrease the count of pending
>periodic
>> +        * timer interrupt.
>> +        */
>> +        if ( pt_vector != -1 && intack.vector > pt_vector )
>> +        {
>> +            struct hvm_intack pt_intack;
>> +
>> +            pt_intack.vector = pt_vector;
>> +            pt_intack.source = hvm_intsrc_lapic;
>> +            pt_intr_post(v, pt_intack);
>> +        }
>> +        else
>> +            pt_intr_post(v, intack);
>>      }
>
>Here is my thought. w/o above patch one pending pt interrupt may result in
>multiple injections of guest timer interrupt, as you identified, when pt_vector
>happens to be not the highest pending vector. Then w/ your patch, multiple
>pending pt interrupt instances may be combined as one injection of guest timer
>interrupt. Especially intack.vector
>>pt_vector doesn't mean pt_intr is eligible for injection, which might
>be blocked due to other reasons. As Jan pointed out, this path is very fragile, so
>better we can have a fix to sustain the original behavior with one pending pt
>instance causing one actual injection.
>

Agreed. Good summary..

>What about doing pt_intr_post in EOI-induced VM-exit instead? EOI exit bit for
>pt_intr is already set to 1 which means every injection would incur an
>EOI-induced VM-exit:
>
>       /*
>        * Set eoi_exit_bitmap for periodic timer interrup to cause EOI-induced
>VM
>        * exit, then pending periodic time interrups have the chance to be
>injected
>        * for compensation
>        */
>        if (pt_vector != -1)
>            vmx_set_eoi_exit_bitmap(v, pt_vector);
>
>I don't think delaying post makes much difference here. Even we do post earlier
>like your patch, further pending instances have to wait until current instance is
>completed. So as long as post happens before EOI is completed, it should be
>fine.

Although I really wanted to fix here, I gave up.. 
We need to try our best to narrow down this 'window' -- from pt_update_irq() to pt_intr_post()..
I think It is too later to call pt_intr_post() in EOI-induced VM-exit, _iiuc_, we cannot make sure that each PT interrupt injection incurs an EOI-induced VM-exit before the next VM entry..
then one pending pt interrupt may result in multiple injections of guest timer interrupt.. 

to be honest, my fix is also a tradeoff.. w/ my patch, multiple pending pt interrupt instances may be combined as one injection of guest timer interrupt..
 but as Yang said, It should be ok since it happens rarely and even native OS will encounter the same problem if it disable the interrupt for too long.



Thanks for your comments!!

Quan
Huawei 2012 lab

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

  reply	other threads:[~2016-09-06  8:53 UTC|newest]

Thread overview: 30+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-08-19 12:58 [RFC PATCH] x86/apicv: fix RTC periodic timer and apicv issue Xuquan (Euler)
2016-08-22  9:38 ` Yang Zhang
2016-08-22 10:52   ` Xuquan (Euler)
2016-08-22 10:35 ` Jan Beulich
2016-08-22 11:40   ` Yang Zhang
2016-08-22 11:52     ` Xuquan (Euler)
2016-08-22 12:01     ` Jan Beulich
2016-08-22 11:41   ` Xuquan (Euler)
2016-08-22 12:04     ` Jan Beulich
2016-08-22 13:09       ` Yang Zhang
2016-08-22 13:18         ` Xuquan (Euler)
2016-08-22 14:57         ` Jan Beulich
2016-08-23  2:11           ` Yang Zhang
2016-08-22 14:02       ` Xuquan (Euler)
2016-08-22 14:07         ` Yang Zhang
2016-08-22 14:15         ` Yang Zhang
2016-08-22 15:11         ` Jan Beulich
2016-08-23  0:40           ` Xuquan (Euler)
2016-08-30  5:58 ` Tian, Kevin
2016-09-06  8:53   ` Xuquan (Euler) [this message]
2016-09-06  9:04     ` Jan Beulich
2016-09-06 11:22       ` Xuquan (Euler)
2016-09-06 13:20         ` Jan Beulich
2016-09-07  0:47           ` Xuquan (Euler)
2016-09-09  3:02   ` Xuquan (Euler)
2016-09-12  7:57     ` Tian, Kevin
2016-09-12  9:07       ` Xuquan (Euler)
2016-09-19  9:25         ` Tian, Kevin
2016-09-20  0:25           ` Xuquan (Euler)
2016-09-20  0:46             ` Tian, Kevin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=E0A769A898ADB6449596C41F51EF62C6AA3D8E@SZXEMI506-MBX.china.huawei.com \
    --to=xuquan8@huawei.com \
    --cc=George.Dunlap@eu.citrix.com \
    --cc=andrew.cooper3@citrix.com \
    --cc=hanweidong@huawei.com \
    --cc=jbeulich@suse.com \
    --cc=jun.nakajima@intel.com \
    --cc=kevin.tian@intel.com \
    --cc=xen-devel@lists.xen.org \
    --cc=yang.zhang.wz@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.