kvm.vger.kernel.org archive mirror
* [PATCH] KVM: lapic: restart counter on change to periodic mode
@ 2019-08-19 23:04 Matt delco
  2019-08-19 23:42 ` Paolo Bonzini
  0 siblings, 1 reply; 12+ messages in thread
From: Matt delco @ 2019-08-19 23:04 UTC (permalink / raw)
  To: pbonzini, rkrcmar; +Cc: kvm, Matt Delco

From: Matt Delco <delco@google.com>

Time seems to eventually stop in a Windows VM when using Skype.
Instrumentation shows that the OS is frequently switching the APIC
timer between one-shot and periodic mode.  The OS is typically writing
to both LVTT and TMICT.  When time stops the sequence observed is that
the APIC was in one-shot mode, the timer expired, and the OS writes to
LVTT (but not TMICT) to change to periodic mode.  No future timer events
are received by the OS since the timer is only re-armed on TMICT writes.

With this change time continues to advance in the VM.  TBD if physical
hardware will reset the current count if/when the mode is changed to
periodic and the current count is zero.

Signed-off-by: Matt Delco <delco@google.com>
---
 arch/x86/kvm/lapic.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index 685d17c11461..fddd810eeca5 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -1935,14 +1935,19 @@ int kvm_lapic_reg_write(struct kvm_lapic *apic, u32 reg, u32 val)
 
 		break;
 
-	case APIC_LVTT:
+	case APIC_LVTT: {
+		u32 timer_mode = apic->lapic_timer.timer_mode;
 		if (!kvm_apic_sw_enabled(apic))
 			val |= APIC_LVT_MASKED;
 		val &= (apic_lvt_mask[0] | apic->lapic_timer.timer_mode_mask);
 		kvm_lapic_set_reg(apic, APIC_LVTT, val);
 		apic_update_lvtt(apic);
+		if (timer_mode == APIC_LVT_TIMER_ONESHOT &&
+		    apic_lvtt_period(apic) &&
+		    !hrtimer_active(&apic->lapic_timer.timer))
+			start_apic_timer(apic);
 		break;
-
+	}
 	case APIC_TMICT:
 		if (apic_lvtt_tscdeadline(apic))
 			break;
-- 
2.23.0.rc1.153.gdeed80330f-goog
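
[Editorial sketch, not part of the patch.]  The stall described in the
commit message can be reproduced in a toy model.  Everything below is
hypothetical: the toy_* names and the rearm_on_lvtt switch are made up
for illustration and do not mirror KVM's data structures.  The only
assumption taken from the patch description is that the emulated timer
is re-armed on TMICT writes and, with the patch, also on a
one-shot -> periodic LVTT write while the timer is inactive.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical toy model; these names are illustrative, not KVM's. */
enum timer_mode { MODE_ONESHOT, MODE_PERIODIC };

struct toy_apic {
	enum timer_mode mode;
	uint32_t tmict;		/* initial count */
	uint32_t tmcct;		/* current count */
	bool armed;		/* stands in for hrtimer_active() */
	unsigned int irqs;	/* timer interrupts delivered so far */
	bool rearm_on_lvtt;	/* true models the behaviour the patch proposes */
};

static void toy_arm(struct toy_apic *a)
{
	a->tmcct = a->tmict;
	a->armed = a->tmict != 0;
}

/* Guest writes the initial count: this always (re)arms the timer. */
static void toy_write_tmict(struct toy_apic *a, uint32_t val)
{
	a->tmict = val;
	toy_arm(a);
}

/* Guest writes LVTT to change the timer mode. */
static void toy_write_lvtt(struct toy_apic *a, enum timer_mode new_mode)
{
	enum timer_mode old_mode = a->mode;

	a->mode = new_mode;
	if (a->rearm_on_lvtt && old_mode == MODE_ONESHOT &&
	    new_mode == MODE_PERIODIC && !a->armed)
		toy_arm(a);
}

/* Advance time by 'ticks' timer ticks, delivering interrupts on expiry. */
static void toy_advance(struct toy_apic *a, uint32_t ticks)
{
	while (ticks > 0 && a->armed) {
		uint32_t step = ticks < a->tmcct ? ticks : a->tmcct;

		a->tmcct -= step;
		ticks -= step;
		if (a->tmcct == 0) {
			a->irqs++;
			if (a->mode == MODE_PERIODIC)
				toy_arm(a);
			else
				a->armed = false;
		}
	}
}

/* Replay the reported sequence and return the number of interrupts seen. */
unsigned int toy_replay(bool rearm_on_lvtt)
{
	struct toy_apic a = { .mode = MODE_ONESHOT,
			      .rearm_on_lvtt = rearm_on_lvtt };

	toy_write_tmict(&a, 100);
	toy_advance(&a, 100);			/* one-shot timer expires */
	assert(a.irqs == 1);
	toy_write_lvtt(&a, MODE_PERIODIC);	/* mode change, no TMICT write */
	toy_advance(&a, 300);
	return a.irqs;
}
```

In this model toy_replay(false) stops at the single one-shot interrupt,
which is the "time stops" symptom, while toy_replay(true) keeps
delivering periodic interrupts.  Whether real hardware behaves like
either variant is exactly the open question in the commit message.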



* Re: [PATCH] KVM: lapic: restart counter on change to periodic mode
  2019-08-19 23:04 [PATCH] KVM: lapic: restart counter on change to periodic mode Matt delco
@ 2019-08-19 23:42 ` Paolo Bonzini
  2019-08-20  0:37   ` Sean Christopherson
  0 siblings, 1 reply; 12+ messages in thread
From: Paolo Bonzini @ 2019-08-19 23:42 UTC (permalink / raw)
  To: Matt delco, rkrcmar; +Cc: kvm

On 20/08/19 01:04, Matt delco wrote:
> From: Matt Delco <delco@google.com>
> 
> Time seems to eventually stop in a Windows VM when using Skype.
> Instrumentation shows that the OS is frequently switching the APIC
> timer between one-shot and periodic mode.  The OS is typically writing
> to both LVTT and TMICT.  When time stops the sequence observed is that
> the APIC was in one-shot mode, the timer expired, and the OS writes to
> LVTT (but not TMICT) to change to periodic mode.  No future timer events
> are received by the OS since the timer is only re-armed on TMICT writes.
> 
> With this change time continues to advance in the VM.  TBD if physical
> hardware will reset the current count if/when the mode is changed to
> period and the current count is zero.
> 
> Signed-off-by: Matt Delco <delco@google.com>
> ---
>  arch/x86/kvm/lapic.c | 9 +++++++--
>  1 file changed, 7 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
> index 685d17c11461..fddd810eeca5 100644
> --- a/arch/x86/kvm/lapic.c
> +++ b/arch/x86/kvm/lapic.c
> @@ -1935,14 +1935,19 @@ int kvm_lapic_reg_write(struct kvm_lapic *apic, u32 reg, u32 val)
>  
>  		break;
>  
> -	case APIC_LVTT:
> +	case APIC_LVTT: {
> +		u32 timer_mode = apic->lapic_timer.timer_mode;
>  		if (!kvm_apic_sw_enabled(apic))
>  			val |= APIC_LVT_MASKED;
>  		val &= (apic_lvt_mask[0] | apic->lapic_timer.timer_mode_mask);
>  		kvm_lapic_set_reg(apic, APIC_LVTT, val);
>  		apic_update_lvtt(apic);
> +		if (timer_mode == APIC_LVT_TIMER_ONESHOT &&
> +		    apic_lvtt_period(apic) &&
> +		    !hrtimer_active(&apic->lapic_timer.timer))
> +			start_apic_timer(apic);

The manual says "A write to the LVT Timer Register that changes the
timer mode disarms the local APIC timer", but we already know this is
not true (commit dedf9c5e216902c6d34b5a0d0c40f4acbb3706d8).

Still, this needs some more explanation.  Can you cover this, as well as
the oneshot->periodic transition, in kvm-unit-tests' x86/apic.c
testcase?  Then we could try running it on bare metal and see what happens.

Thanks,

Paolo


>  		break;
> -
> +	}
>  	case APIC_TMICT:
>  		if (apic_lvtt_tscdeadline(apic))
>  			break;
> 



* Re: [PATCH] KVM: lapic: restart counter on change to periodic mode
  2019-08-19 23:42 ` Paolo Bonzini
@ 2019-08-20  0:37   ` Sean Christopherson
       [not found]     ` <CAHGX9VrZyPQ8OxnYnOWg-ES3=kghSx1LSyzrX8i3=O+o0JAsig@mail.gmail.com>
  0 siblings, 1 reply; 12+ messages in thread
From: Sean Christopherson @ 2019-08-20  0:37 UTC (permalink / raw)
  To: Paolo Bonzini; +Cc: Matt delco, rkrcmar, kvm

On Tue, Aug 20, 2019 at 01:42:37AM +0200, Paolo Bonzini wrote:
> On 20/08/19 01:04, Matt delco wrote:
> > From: Matt Delco <delco@google.com>
> > 
> > Time seems to eventually stop in a Windows VM when using Skype.
> > Instrumentation shows that the OS is frequently switching the APIC
> > timer between one-shot and periodic mode.  The OS is typically writing
> > to both LVTT and TMICT.  When time stops the sequence observed is that
> > the APIC was in one-shot mode, the timer expired, and the OS writes to
> > LVTT (but not TMICT) to change to periodic mode.  No future timer events
> > are received by the OS since the timer is only re-armed on TMICT writes.
> > 
> > With this change time continues to advance in the VM.  TBD if physical
> > hardware will reset the current count if/when the mode is changed to
> > period and the current count is zero.
> > 
> > Signed-off-by: Matt Delco <delco@google.com>
> > ---
> >  arch/x86/kvm/lapic.c | 9 +++++++--
> >  1 file changed, 7 insertions(+), 2 deletions(-)
> > 
> > diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
> > index 685d17c11461..fddd810eeca5 100644
> > --- a/arch/x86/kvm/lapic.c
> > +++ b/arch/x86/kvm/lapic.c
> > @@ -1935,14 +1935,19 @@ int kvm_lapic_reg_write(struct kvm_lapic *apic, u32 reg, u32 val)
> >  
> >  		break;
> >  
> > -	case APIC_LVTT:
> > +	case APIC_LVTT: {
> > +		u32 timer_mode = apic->lapic_timer.timer_mode;
> >  		if (!kvm_apic_sw_enabled(apic))
> >  			val |= APIC_LVT_MASKED;
> >  		val &= (apic_lvt_mask[0] | apic->lapic_timer.timer_mode_mask);
> >  		kvm_lapic_set_reg(apic, APIC_LVTT, val);
> >  		apic_update_lvtt(apic);
> > +		if (timer_mode == APIC_LVT_TIMER_ONESHOT &&
> > +		    apic_lvtt_period(apic) &&
> > +		    !hrtimer_active(&apic->lapic_timer.timer))
> > +			start_apic_timer(apic);
> 
> The manual says "A write to the LVT Timer Register that changes the
> timer mode disarms the local APIC timer", but we already know this is
> not true (commit dedf9c5e216902c6d34b5a0d0c40f4acbb3706d8).

That was a confirmed SDM bug that has been fixed as of the May 2019
version of the SDM.

> 
> Still, this needs some more explanation.  Can you cover this, as well as
> the oneshot->periodic transition, in kvm-unit-tests' x86/apic.c
> testcase?  Then we could try running it on bare metal and see what happens.

Only transitions to/from deadline should disable the timer, i.e. this
blurb from the SDM was found to be correct.

  Transitioning between TSC-deadline mode and other timer modes also
  disarms the timer.

But yeah, tests are in order, at least for oneshot->periodic and vice
versa.  I can't find any internal code that tests whether transitioning
between oneshot and periodic actually rearms the timer or if it simply
doesn't disable it, and the SDM doesn't clarify what constitutes
"reprogrammed".

If possible, we should also test what happens if APIC_TMCCT != 0, though
that might be tricky and/or fragile.  If the timer is rearmed on a
transition between oneshot and periodic, then I would expect it to happen
for both APIC_TMCCT==0 and APIC_TMCCT!=0.
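
[Editorial sketch.]  The two candidate behaviours for a
oneshot <-> periodic transition can be written down as a small decision
table.  The policy names and helper functions are hypothetical and
model only the value TMCCT would hold immediately after the LVTT write;
they say nothing about what hardware actually does.

```c
#include <stdint.h>

/* Hypothetical names for the two behaviours under discussion. */
enum switch_policy {
	POLICY_KEEP_COUNT,	/* mode change leaves TMCCT alone */
	POLICY_REARM,		/* mode change reloads TMCCT from TMICT */
};

/* TMCCT as it would read immediately after the mode-changing LVTT write. */
uint32_t tmcct_after_switch(enum switch_policy policy, uint32_t tmict,
			    uint32_t tmcct_before)
{
	return policy == POLICY_REARM ? tmict : tmcct_before;
}

/* Nonzero if a TMCCT read after the switch can tell the policies apart. */
int observation_is_decisive(uint32_t tmict, uint32_t tmcct_before)
{
	return tmcct_after_switch(POLICY_KEEP_COUNT, tmict, tmcct_before) !=
	       tmcct_after_switch(POLICY_REARM, tmict, tmcct_before);
}
```

With TMCCT == 0 the two policies differ by the full initial count, so
the read is easy to interpret.  With TMCCT != 0 they differ only by
TMICT - TMCCT, and on real hardware the counter keeps decrementing
between the write and the read, so a test would presumably need some
tolerance; that is one way to read the "tricky and/or fragile" concern.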

> 
> Thanks,
> 
> Paolo
> 
> 
> >  		break;
> > -
> > +	}
> >  	case APIC_TMICT:
> >  		if (apic_lvtt_tscdeadline(apic))
> >  			break;
> > 
> 


* Re: [PATCH] KVM: lapic: restart counter on change to periodic mode
       [not found]     ` <CAHGX9VrZyPQ8OxnYnOWg-ES3=kghSx1LSyzrX8i3=O+o0JAsig@mail.gmail.com>
@ 2019-08-20  1:56       ` Sean Christopherson
  2019-08-20  4:08         ` Nadav Amit
  0 siblings, 1 reply; 12+ messages in thread
From: Sean Christopherson @ 2019-08-20  1:56 UTC (permalink / raw)
  To: Matt Delco; +Cc: Paolo Bonzini, rkrcmar, kvm, Nadav Amit

+Cc Nadav

On Mon, Aug 19, 2019 at 06:07:01PM -0700, Matt Delco wrote:
> On Mon, Aug 19, 2019 at 5:37 PM Sean Christopherson <
> sean.j.christopherson@intel.com> wrote:
> 
> > On Tue, Aug 20, 2019 at 01:42:37AM +0200, Paolo Bonzini wrote:
> > > On 20/08/19 01:04, Matt delco wrote:
> > > > From: Matt Delco <delco@google.com>
> > > >
> > > > Time seems to eventually stop in a Windows VM when using Skype.
> > > > Instrumentation shows that the OS is frequently switching the APIC
> > > > timer between one-shot and periodic mode.  The OS is typically writing
> > > > to both LVTT and TMICT.  When time stops the sequence observed is that
> > > > the APIC was in one-shot mode, the timer expired, and the OS writes to
> > > > LVTT (but not TMICT) to change to periodic mode.  No future timer
> > events
> > > > are received by the OS since the timer is only re-armed on TMICT
> > writes.
> > > >
> > > > With this change time continues to advance in the VM.  TBD if physical
> > > > hardware will reset the current count if/when the mode is changed to
> > > > period and the current count is zero.
> > > >
> > > > Signed-off-by: Matt Delco <delco@google.com>
> > > > ---
> > > >  arch/x86/kvm/lapic.c | 9 +++++++--
> > > >  1 file changed, 7 insertions(+), 2 deletions(-)
> > > >
> > > > diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
> > > > index 685d17c11461..fddd810eeca5 100644
> > > > --- a/arch/x86/kvm/lapic.c
> > > > +++ b/arch/x86/kvm/lapic.c
> > > > @@ -1935,14 +1935,19 @@ int kvm_lapic_reg_write(struct kvm_lapic
> > *apic, u32 reg, u32 val)
> > > >
> > > >             break;
> > > >
> > > > -   case APIC_LVTT:
> > > > +   case APIC_LVTT: {
> > > > +           u32 timer_mode = apic->lapic_timer.timer_mode;
> > > >             if (!kvm_apic_sw_enabled(apic))
> > > >                     val |= APIC_LVT_MASKED;
> > > >             val &= (apic_lvt_mask[0] |
> > apic->lapic_timer.timer_mode_mask);
> > > >             kvm_lapic_set_reg(apic, APIC_LVTT, val);
> > > >             apic_update_lvtt(apic);
> > > > +           if (timer_mode == APIC_LVT_TIMER_ONESHOT &&
> > > > +               apic_lvtt_period(apic) &&
> > > > +               !hrtimer_active(&apic->lapic_timer.timer))
> > > > +                   start_apic_timer(apic);
> > >
> > > Still, this needs some more explanation.  Can you cover this, as well as
> > > the oneshot->periodic transition, in kvm-unit-tests' x86/apic.c
> > > testcase?  Then we could try running it on bare metal and see what
> > happens.
> >
> 
> I looked at apic.c and test_apic_change_mode() might already be testing
> this.  It sets oneshot & TMICT, waits for the current value to get
> half-way, changes the mode to periodic, and then tries to test that the
> value wraps back to the upper half.  It then waits again for the half-way
> point, changes the mode back to oneshot, and waits for zero.  After
> reaching zero it does:
> 
> /* now tmcct == 0 and tmict != 0 */
> apic_change_mode(APIC_LVT_TIMER_PERIODIC);
> report("TMCCT should stay at zero", !apic_read(APIC_TMCCT));
> 
> which seems to be testing that oneshot->periodic won't reset the timer if
> it's already zero.  A possible caveat is there's hardly any delay between
> the mode change and the timer read.  Emulated hardware will react
> instantaneously (at least as seen from within the VM), but hardware might
> need more time to react (though offhand I'd expect HW to be fast enough for
> this particular timer).
> 
> So, it looks like the code might already be ready to run on physical
> hardware, and if it has (or does already as part of a regular test), then
> that does raise some doubt on what's the appropriate code change to make
> this work.
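
[Editorial sketch.]  The caveat about the missing delay between the
mode change and the TMCCT read could be addressed by reading the
counter twice, once immediately and once after a settle delay.  The
apic_ops hooks and the late_reload mock below are hypothetical and
exist only so the sketch runs without APIC access; they are not
kvm-unit-tests APIs.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical hooks standing in for real register access and delays. */
struct apic_ops {
	uint32_t (*read_tmcct)(void *ctx);
	void (*delay)(void *ctx, uint32_t ticks);
	void *ctx;
};

/* True only if TMCCT reads zero now and still reads zero after the delay. */
bool tmcct_stays_zero(const struct apic_ops *ops, uint32_t settle_ticks)
{
	if (ops->read_tmcct(ops->ctx) != 0)
		return false;
	ops->delay(ops->ctx, settle_ticks);
	return ops->read_tmcct(ops->ctx) == 0;
}

/* Mock device that reloads its counter 'reload_at' ticks after the switch. */
struct late_reload {
	uint32_t now;
	uint32_t reload_at;
	uint32_t tmict;
};

static uint32_t late_read(void *ctx)
{
	struct late_reload *hw = ctx;

	return hw->now >= hw->reload_at ? hw->tmict : 0;
}

static void late_delay(void *ctx, uint32_t ticks)
{
	struct late_reload *hw = ctx;

	hw->now += ticks;
}

/* Does a check with the given settle delay notice the late reload? */
bool demo_detects_late_reload(uint32_t latency, uint32_t settle)
{
	struct late_reload hw = { .now = 0, .reload_at = latency,
				  .tmict = 1000 };
	struct apic_ops ops = { late_read, late_delay, &hw };

	return !tmcct_stays_zero(&ops, settle);
}
```

In the mock, a reload that lands 5 ticks after the switch is missed by
a check that settles for 2 ticks and caught by one that settles for 10.
An immediate read alone corresponds to a settle delay of zero.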

Nadav has been running tests on bare metal, maybe he can weigh in on
whether or not test_apic_change_mode() passes on bare metal.


* Re: [PATCH] KVM: lapic: restart counter on change to periodic mode
  2019-08-20  1:56       ` Sean Christopherson
@ 2019-08-20  4:08         ` Nadav Amit
  2019-08-20  5:08           ` Wanpeng Li
  0 siblings, 1 reply; 12+ messages in thread
From: Nadav Amit @ 2019-08-20  4:08 UTC (permalink / raw)
  To: Sean Christopherson; +Cc: Matt Delco, Paolo Bonzini, rkrcmar, kvm

> On Aug 19, 2019, at 6:56 PM, Sean Christopherson <sean.j.christopherson@intel.com> wrote:
> 
> +Cc Nadav
> 
> On Mon, Aug 19, 2019 at 06:07:01PM -0700, Matt Delco wrote:
>> On Mon, Aug 19, 2019 at 5:37 PM Sean Christopherson <
>> sean.j.christopherson@intel.com> wrote:
>> 
>>> On Tue, Aug 20, 2019 at 01:42:37AM +0200, Paolo Bonzini wrote:
>>>> On 20/08/19 01:04, Matt delco wrote:
>>>>> From: Matt Delco <delco@google.com>
>>>>> 
>>>>> Time seems to eventually stop in a Windows VM when using Skype.
>>>>> Instrumentation shows that the OS is frequently switching the APIC
>>>>> timer between one-shot and periodic mode.  The OS is typically writing
>>>>> to both LVTT and TMICT.  When time stops the sequence observed is that
>>>>> the APIC was in one-shot mode, the timer expired, and the OS writes to
>>>>> LVTT (but not TMICT) to change to periodic mode.  No future timer
>>> events
>>>>> are received by the OS since the timer is only re-armed on TMICT
>>> writes.
>>>>> With this change time continues to advance in the VM.  TBD if physical
>>>>> hardware will reset the current count if/when the mode is changed to
>>>>> period and the current count is zero.
>>>>> 
>>>>> Signed-off-by: Matt Delco <delco@google.com>
>>>>> ---
>>>>> arch/x86/kvm/lapic.c | 9 +++++++--
>>>>> 1 file changed, 7 insertions(+), 2 deletions(-)
>>>>> 
>>>>> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
>>>>> index 685d17c11461..fddd810eeca5 100644
>>>>> --- a/arch/x86/kvm/lapic.c
>>>>> +++ b/arch/x86/kvm/lapic.c
>>>>> @@ -1935,14 +1935,19 @@ int kvm_lapic_reg_write(struct kvm_lapic
>>> *apic, u32 reg, u32 val)
>>>>>            break;
>>>>> 
>>>>> -   case APIC_LVTT:
>>>>> +   case APIC_LVTT: {
>>>>> +           u32 timer_mode = apic->lapic_timer.timer_mode;
>>>>>            if (!kvm_apic_sw_enabled(apic))
>>>>>                    val |= APIC_LVT_MASKED;
>>>>>            val &= (apic_lvt_mask[0] |
>>> apic->lapic_timer.timer_mode_mask);
>>>>>            kvm_lapic_set_reg(apic, APIC_LVTT, val);
>>>>>            apic_update_lvtt(apic);
>>>>> +           if (timer_mode == APIC_LVT_TIMER_ONESHOT &&
>>>>> +               apic_lvtt_period(apic) &&
>>>>> +               !hrtimer_active(&apic->lapic_timer.timer))
>>>>> +                   start_apic_timer(apic);
>>>> 
>>>> Still, this needs some more explanation.  Can you cover this, as well as
>>>> the oneshot->periodic transition, in kvm-unit-tests' x86/apic.c
>>>> testcase?  Then we could try running it on bare metal and see what
>>> happens.
>> 
>> I looked at apic.c and test_apic_change_mode() might already be testing
>> this.  It sets oneshot & TMICT, waits for the current value to get
>> half-way, changes the mode to periodic, and then tries to test that the
>> value wraps back to the upper half.  It then waits again for the half-way
>> point, changes the mode back to oneshot, and waits for zero.  After
>> reaching zero it does:
>> 
>> /* now tmcct == 0 and tmict != 0 */
>> apic_change_mode(APIC_LVT_TIMER_PERIODIC);
>> report("TMCCT should stay at zero", !apic_read(APIC_TMCCT));
>> 
>> which seems to be testing that oneshot->periodic won't reset the timer if
>> it's already zero.  A possible caveat is there's hardly any delay between
>> the mode change and the timer read.  Emulated hardware will react
>> instantaneously (at least as seen from within the VM), but hardware might
>> need more time to react (though offhand I'd expect HW to be fast enough for
>> this particular timer).
>> 
>> So, it looks like the code might already be ready to run on physical
>> hardware, and if it has (or does already as part of a regular test), then
>> that does raise some doubt on what's the appropriate code change to make
>> this work.
> 
> Nadav has been running tests on bare metal, maybe he can weigh in on
> whether or not test_apic_change_mode() passes on bare metal.

These tests pass on bare-metal.



* Re: [PATCH] KVM: lapic: restart counter on change to periodic mode
  2019-08-20  4:08         ` Nadav Amit
@ 2019-08-20  5:08           ` Wanpeng Li
  2019-08-20  7:34             ` Matt Delco
  2019-08-20 16:33             ` Nadav Amit
  0 siblings, 2 replies; 12+ messages in thread
From: Wanpeng Li @ 2019-08-20  5:08 UTC (permalink / raw)
  To: Nadav Amit
  Cc: Sean Christopherson, Matt Delco, Paolo Bonzini, Radim Krcmar, kvm

On Tue, 20 Aug 2019 at 12:10, Nadav Amit <nadav.amit@gmail.com> wrote:
>
> > On Aug 19, 2019, at 6:56 PM, Sean Christopherson <sean.j.christopherson@intel.com> wrote:
> >
> > +Cc Nadav
> >
> > On Mon, Aug 19, 2019 at 06:07:01PM -0700, Matt Delco wrote:
> >> On Mon, Aug 19, 2019 at 5:37 PM Sean Christopherson <
> >> sean.j.christopherson@intel.com> wrote:
> >>
> >>> On Tue, Aug 20, 2019 at 01:42:37AM +0200, Paolo Bonzini wrote:
> >>>> On 20/08/19 01:04, Matt delco wrote:
> >>>>> From: Matt Delco <delco@google.com>
> >>>>>
> >>>>> Time seems to eventually stop in a Windows VM when using Skype.
> >>>>> Instrumentation shows that the OS is frequently switching the APIC
> >>>>> timer between one-shot and periodic mode.  The OS is typically writing
> >>>>> to both LVTT and TMICT.  When time stops the sequence observed is that
> >>>>> the APIC was in one-shot mode, the timer expired, and the OS writes to
> >>>>> LVTT (but not TMICT) to change to periodic mode.  No future timer
> >>> events
> >>>>> are received by the OS since the timer is only re-armed on TMICT
> >>> writes.
> >>>>> With this change time continues to advance in the VM.  TBD if physical
> >>>>> hardware will reset the current count if/when the mode is changed to
> >>>>> period and the current count is zero.
> >>>>>
> >>>>> Signed-off-by: Matt Delco <delco@google.com>
> >>>>> ---
> >>>>> arch/x86/kvm/lapic.c | 9 +++++++--
> >>>>> 1 file changed, 7 insertions(+), 2 deletions(-)
> >>>>>
> >>>>> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
> >>>>> index 685d17c11461..fddd810eeca5 100644
> >>>>> --- a/arch/x86/kvm/lapic.c
> >>>>> +++ b/arch/x86/kvm/lapic.c
> >>>>> @@ -1935,14 +1935,19 @@ int kvm_lapic_reg_write(struct kvm_lapic
> >>> *apic, u32 reg, u32 val)
> >>>>>            break;
> >>>>>
> >>>>> -   case APIC_LVTT:
> >>>>> +   case APIC_LVTT: {
> >>>>> +           u32 timer_mode = apic->lapic_timer.timer_mode;
> >>>>>            if (!kvm_apic_sw_enabled(apic))
> >>>>>                    val |= APIC_LVT_MASKED;
> >>>>>            val &= (apic_lvt_mask[0] |
> >>> apic->lapic_timer.timer_mode_mask);
> >>>>>            kvm_lapic_set_reg(apic, APIC_LVTT, val);
> >>>>>            apic_update_lvtt(apic);
> >>>>> +           if (timer_mode == APIC_LVT_TIMER_ONESHOT &&
> >>>>> +               apic_lvtt_period(apic) &&
> >>>>> +               !hrtimer_active(&apic->lapic_timer.timer))
> >>>>> +                   start_apic_timer(apic);
> >>>>
> >>>> Still, this needs some more explanation.  Can you cover this, as well as
> >>>> the oneshot->periodic transition, in kvm-unit-tests' x86/apic.c
> >>>> testcase?  Then we could try running it on bare metal and see what
> >>> happens.
> >>
> >> I looked at apic.c and test_apic_change_mode() might already be testing
> >> this.  It sets oneshot & TMICT, waits for the current value to get
> >> half-way, changes the mode to periodic, and then tries to test that the
> >> value wraps back to the upper half.  It then waits again for the half-way
> >> point, changes the mode back to oneshot, and waits for zero.  After
> >> reaching zero it does:
> >>
> >> /* now tmcct == 0 and tmict != 0 */
> >> apic_change_mode(APIC_LVT_TIMER_PERIODIC);
> >> report("TMCCT should stay at zero", !apic_read(APIC_TMCCT));
> >>
> >> which seems to be testing that oneshot->periodic won't reset the timer if
> >> it's already zero.  A possible caveat is there's hardly any delay between
> >> the mode change and the timer read.  Emulated hardware will react
> >> instantaneously (at least as seen from within the VM), but hardware might
> >> need more time to react (though offhand I'd expect HW to be fast enough for
> >> this particular timer).
> >>
> >> So, it looks like the code might already be ready to run on physical
> >> hardware, and if it has (or does already as part of a regular test), then
> >> that does raise some doubt on what's the appropriate code change to make
> >> this work.
> >
> > Nadav has been running tests on bare metal, maybe he can weigh in on
> > whether or not test_apic_change_mode() passes on bare metal.
>
> These tests pass on bare-metal.

Good to know. In addition, in the Linux APIC driver,
__setup_APIC_LVTT() always writes lapic_timer_period (the number of
clock cycles per jiffy) / APIC_DIVISOR to APIC_TMICT during a mode
switch, which avoids the issue Matt reports. So is it because Windows
does not do this, or because the Windows version Matt is testing is
too old?

Regards,
Wanpeng Li
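
[Editorial sketch.]  The guest-side pattern described above, pairing
every mode change with an initial-count write, can be illustrated
against a mock timer.  The mock_* names are hypothetical, and the mock
assumes (as the patch description says of the unpatched emulation) that
only TMICT writes arm the timer.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical mock in which only TMICT writes arm the timer. */
struct mock_timer {
	bool periodic;
	uint32_t tmict;
	bool armed;
};

static void mock_write_lvtt(struct mock_timer *t, bool periodic)
{
	t->periodic = periodic;
}

static void mock_write_tmict(struct mock_timer *t, uint32_t count)
{
	t->tmict = count;
	t->armed = count != 0;
}

/* Mode change followed by an initial-count write; returns armed state. */
bool switch_mode_and_reload(struct mock_timer *t, bool periodic,
			    uint32_t count)
{
	mock_write_lvtt(t, periodic);
	mock_write_tmict(t, count);
	return t->armed;
}

/* Mode change alone, as in the failing sequence; returns armed state. */
bool switch_mode_only(struct mock_timer *t, bool periodic)
{
	mock_write_lvtt(t, periodic);
	return t->armed;
}
```

A guest that always uses the first helper never depends on what an
LVTT write does to an expired timer; a guest that sometimes uses the
second one does.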


* Re: [PATCH] KVM: lapic: restart counter on change to periodic mode
  2019-08-20  5:08           ` Wanpeng Li
@ 2019-08-20  7:34             ` Matt Delco
  2019-08-21 17:17               ` Sean Christopherson
  2019-08-20 16:33             ` Nadav Amit
  1 sibling, 1 reply; 12+ messages in thread
From: Matt Delco @ 2019-08-20  7:34 UTC (permalink / raw)
  To: Wanpeng Li
  Cc: Nadav Amit, Sean Christopherson, Paolo Bonzini, Radim Krcmar, kvm

On Mon, Aug 19, 2019 at 10:09 PM Wanpeng Li <kernellwp@gmail.com> wrote:
>
> On Tue, 20 Aug 2019 at 12:10, Nadav Amit <nadav.amit@gmail.com> wrote:
> >
> > > On Aug 19, 2019, at 6:56 PM, Sean Christopherson <sean.j.christopherson@intel.com> wrote:
> > >
> > > +Cc Nadav
> > >
> > > On Mon, Aug 19, 2019 at 06:07:01PM -0700, Matt Delco wrote:
> > >> On Mon, Aug 19, 2019 at 5:37 PM Sean Christopherson <
> > >> sean.j.christopherson@intel.com> wrote:
> > >>
> > >>> On Tue, Aug 20, 2019 at 01:42:37AM +0200, Paolo Bonzini wrote:
> > >>>> On 20/08/19 01:04, Matt delco wrote:
> > >>>>> From: Matt Delco <delco@google.com>
> > >>>>>
> > >>>>> Time seems to eventually stop in a Windows VM when using Skype.
> > >>>>> Instrumentation shows that the OS is frequently switching the APIC
> > >>>>> timer between one-shot and periodic mode.  The OS is typically writing
> > >>>>> to both LVTT and TMICT.  When time stops the sequence observed is that
> > >>>>> the APIC was in one-shot mode, the timer expired, and the OS writes to
> > >>>>> LVTT (but not TMICT) to change to periodic mode.  No future timer
> > >>> events
> > >>>>> are received by the OS since the timer is only re-armed on TMICT
> > >>> writes.
> > >>>>> With this change time continues to advance in the VM.  TBD if physical
> > >>>>> hardware will reset the current count if/when the mode is changed to
> > >>>>> period and the current count is zero.
> > >>>>>
> > >>>>> Signed-off-by: Matt Delco <delco@google.com>
> > >>>>> ---
> > >>>>> arch/x86/kvm/lapic.c | 9 +++++++--
> > >>>>> 1 file changed, 7 insertions(+), 2 deletions(-)
> > >>>>>
> > >>>>> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
> > >>>>> index 685d17c11461..fddd810eeca5 100644
> > >>>>> --- a/arch/x86/kvm/lapic.c
> > >>>>> +++ b/arch/x86/kvm/lapic.c
> > >>>>> @@ -1935,14 +1935,19 @@ int kvm_lapic_reg_write(struct kvm_lapic
> > >>> *apic, u32 reg, u32 val)
> > >>>>>            break;
> > >>>>>
> > >>>>> -   case APIC_LVTT:
> > >>>>> +   case APIC_LVTT: {
> > >>>>> +           u32 timer_mode = apic->lapic_timer.timer_mode;
> > >>>>>            if (!kvm_apic_sw_enabled(apic))
> > >>>>>                    val |= APIC_LVT_MASKED;
> > >>>>>            val &= (apic_lvt_mask[0] |
> > >>> apic->lapic_timer.timer_mode_mask);
> > >>>>>            kvm_lapic_set_reg(apic, APIC_LVTT, val);
> > >>>>>            apic_update_lvtt(apic);
> > >>>>> +           if (timer_mode == APIC_LVT_TIMER_ONESHOT &&
> > >>>>> +               apic_lvtt_period(apic) &&
> > >>>>> +               !hrtimer_active(&apic->lapic_timer.timer))
> > >>>>> +                   start_apic_timer(apic);
> > >>>>
> > >>>> Still, this needs some more explanation.  Can you cover this, as well as
> > >>>> the oneshot->periodic transition, in kvm-unit-tests' x86/apic.c
> > >>>> testcase?  Then we could try running it on bare metal and see what
> > >>> happens.
> > >>
> > >> I looked at apic.c and test_apic_change_mode() might already be testing
> > >> this.  It sets oneshot & TMICT, waits for the current value to get
> > >> half-way, changes the mode to periodic, and then tries to test that the
> > >> value wraps back to the upper half.  It then waits again for the half-way
> > >> point, changes the mode back to oneshot, and waits for zero.  After
> > >> reaching zero it does:
> > >>
> > >> /* now tmcct == 0 and tmict != 0 */
> > >> apic_change_mode(APIC_LVT_TIMER_PERIODIC);
> > >> report("TMCCT should stay at zero", !apic_read(APIC_TMCCT));
> > >>
> > >> which seems to be testing that oneshot->periodic won't reset the timer if
> > >> it's already zero.  A possible caveat is there's hardly any delay between
> > >> the mode change and the timer read.  Emulated hardware will react
> > >> instantaneously (at least as seen from within the VM), but hardware might
> > >> need more time to react (though offhand I'd expect HW to be fast enough for
> > >> this particular timer).
> > >>
> > >> So, it looks like the code might already be ready to run on physical
> > >> hardware, and if it has (or does already as part of a regular test), then
> > >> that does raise some doubt on what's the appropriate code change to make
> > >> this work.
> > >
> > > Nadav has been running tests on bare metal, maybe he can weigh in on
> > > whether or not test_apic_change_mode() passes on bare metal.
> >
> > These tests pass on bare-metal.
>
> Good to know this. In addition, in linux apic driver, during mode
> switch __setup_APIC_LVTT() always sets lapic_timer_period(number of
> clock cycles per jiffy)/APIC_DIVISOR to APIC_TMICT which can avoid the
> issue Matt report. So is it because there is no such stuff in windows
> or the windows version which Matt testing is too old?

I'm using Windows 10 (May 2019). Multimedia apps on Windows tend to
request higher frequency clocks, and this in turn can affect how the
kernel configures HW timers.  I may need to examine how Windows
typically interacts with the APIC timer and see if/how this changes
when Skype is used.  The frequent timer mode changes are not something
I'd expect a reasonably behaved kernel to do.


* Re: [PATCH] KVM: lapic: restart counter on change to periodic mode
  2019-08-20  5:08           ` Wanpeng Li
  2019-08-20  7:34             ` Matt Delco
@ 2019-08-20 16:33             ` Nadav Amit
  2019-08-21  0:19               ` Wanpeng Li
  1 sibling, 1 reply; 12+ messages in thread
From: Nadav Amit @ 2019-08-20 16:33 UTC (permalink / raw)
  To: Wanpeng Li
  Cc: Sean Christopherson, Matt Delco, Paolo Bonzini, Radim Krcmar, kvm

> On Aug 19, 2019, at 10:08 PM, Wanpeng Li <kernellwp@gmail.com> wrote:
> 
> On Tue, 20 Aug 2019 at 12:10, Nadav Amit <nadav.amit@gmail.com> wrote:
>>> On Aug 19, 2019, at 6:56 PM, Sean Christopherson <sean.j.christopherson@intel.com> wrote:
>>> 
>>> +Cc Nadav
>>> 
>>> On Mon, Aug 19, 2019 at 06:07:01PM -0700, Matt Delco wrote:
>>>> On Mon, Aug 19, 2019 at 5:37 PM Sean Christopherson <
>>>> sean.j.christopherson@intel.com> wrote:
>>>> 
>>>>> On Tue, Aug 20, 2019 at 01:42:37AM +0200, Paolo Bonzini wrote:
>>>>>> On 20/08/19 01:04, Matt delco wrote:
>>>>>>> From: Matt Delco <delco@google.com>
>>>>>>> 
>>>>>>> Time seems to eventually stop in a Windows VM when using Skype.
>>>>>>> Instrumentation shows that the OS is frequently switching the APIC
>>>>>>> timer between one-shot and periodic mode.  The OS is typically writing
>>>>>>> to both LVTT and TMICT.  When time stops the sequence observed is that
>>>>>>> the APIC was in one-shot mode, the timer expired, and the OS writes to
>>>>>>> LVTT (but not TMICT) to change to periodic mode.  No future timer
>>>>> events
>>>>>>> are received by the OS since the timer is only re-armed on TMICT
>>>>> writes.
>>>>>>> With this change time continues to advance in the VM.  TBD if physical
>>>>>>> hardware will reset the current count if/when the mode is changed to
>>>>>>> period and the current count is zero.
>>>>>>> 
>>>>>>> Signed-off-by: Matt Delco <delco@google.com>
>>>>>>> ---
>>>>>>> arch/x86/kvm/lapic.c | 9 +++++++--
>>>>>>> 1 file changed, 7 insertions(+), 2 deletions(-)
>>>>>>> 
>>>>>>> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
>>>>>>> index 685d17c11461..fddd810eeca5 100644
>>>>>>> --- a/arch/x86/kvm/lapic.c
>>>>>>> +++ b/arch/x86/kvm/lapic.c
>>>>>>> @@ -1935,14 +1935,19 @@ int kvm_lapic_reg_write(struct kvm_lapic
>>>>> *apic, u32 reg, u32 val)
>>>>>>>           break;
>>>>>>> 
>>>>>>> -   case APIC_LVTT:
>>>>>>> +   case APIC_LVTT: {
>>>>>>> +           u32 timer_mode = apic->lapic_timer.timer_mode;
>>>>>>>           if (!kvm_apic_sw_enabled(apic))
>>>>>>>                   val |= APIC_LVT_MASKED;
>>>>>>>           val &= (apic_lvt_mask[0] |
>>>>> apic->lapic_timer.timer_mode_mask);
>>>>>>>           kvm_lapic_set_reg(apic, APIC_LVTT, val);
>>>>>>>           apic_update_lvtt(apic);
>>>>>>> +           if (timer_mode == APIC_LVT_TIMER_ONESHOT &&
>>>>>>> +               apic_lvtt_period(apic) &&
>>>>>>> +               !hrtimer_active(&apic->lapic_timer.timer))
>>>>>>> +                   start_apic_timer(apic);
>>>>>> 
>>>>>> Still, this needs some more explanation.  Can you cover this, as well as
>>>>>> the oneshot->periodic transition, in kvm-unit-tests' x86/apic.c
>>>>>> testcase?  Then we could try running it on bare metal and see what
>>>>> happens.
>>>> 
>>>> I looked at apic.c and test_apic_change_mode() might already be testing
>>>> this.  It sets oneshot & TMICT, waits for the current value to get
>>>> half-way, changes the mode to periodic, and then tries to test that the
>>>> value wraps back to the upper half.  It then waits again for the half-way
>>>> point, changes the mode back to oneshot, and waits for zero.  After
>>>> reaching zero it does:
>>>> 
>>>> /* now tmcct == 0 and tmict != 0 */
>>>> apic_change_mode(APIC_LVT_TIMER_PERIODIC);
>>>> report("TMCCT should stay at zero", !apic_read(APIC_TMCCT));
>>>> 
>>>> which seems to be testing that oneshot->periodic won't reset the timer if
>>>> it's already zero.  A possible caveat is there's hardly any delay between
>>>> the mode change and the timer read.  Emulated hardware will react
>>>> instantaneously (at least as seen from within the VM), but hardware might
>>>> need more time to react (though offhand I'd expect HW to be fast enough for
>>>> this particular timer).
>>>> 
>>>> So, it looks like the code might already be ready to run on physical
>>>> hardware, and if it has (or does already as part of a regular test), then
>>>> that does raise some doubt on what's the appropriate code change to make
>>>> this work.
>>> 
>>> Nadav has been running tests on bare metal, maybe he can weigh in on
>>> whether or not test_apic_change_mode() passes on bare metal.
>> 
>> These tests pass on bare-metal.
> 
> Good to know this. In addition, in linux apic driver, during mode
> switch __setup_APIC_LVTT() always sets lapic_timer_period(number of
> clock cycles per jiffy)/APIC_DIVISOR to APIC_TMICT which can avoid the
> issue Matt report. So is it because there is no such stuff in windows
> or the windows version which Matt testing is too old?

I find it kind of disappointing that you (and others) did not try the
kvm-unit-tests on bare metal. :(

They should work once Paolo (ahem..) applies the one pending patch. You
do need a serial console though (which is usually available through
iLO/iDRAC/etc.). It should also work with UEFI/kexec, although I did not run
such tests.



* Re: [PATCH] KVM: lapic: restart counter on change to periodic mode
  2019-08-20 16:33             ` Nadav Amit
@ 2019-08-21  0:19               ` Wanpeng Li
  2019-08-21  0:26                 ` Nadav Amit
  0 siblings, 1 reply; 12+ messages in thread
From: Wanpeng Li @ 2019-08-21  0:19 UTC (permalink / raw)
  To: Nadav Amit
  Cc: Sean Christopherson, Matt Delco, Paolo Bonzini, Radim Krcmar, kvm

On Wed, 21 Aug 2019 at 00:33, Nadav Amit <nadav.amit@gmail.com> wrote:
>
> > On Aug 19, 2019, at 10:08 PM, Wanpeng Li <kernellwp@gmail.com> wrote:
> >
> > On Tue, 20 Aug 2019 at 12:10, Nadav Amit <nadav.amit@gmail.com> wrote:
> >>> On Aug 19, 2019, at 6:56 PM, Sean Christopherson <sean.j.christopherson@intel.com> wrote:
> >>>
> >>> +Cc Nadav
> >>>
> >>> On Mon, Aug 19, 2019 at 06:07:01PM -0700, Matt Delco wrote:
> >>>> On Mon, Aug 19, 2019 at 5:37 PM Sean Christopherson <
> >>>> sean.j.christopherson@intel.com> wrote:
> >>>>
> >>>>> On Tue, Aug 20, 2019 at 01:42:37AM +0200, Paolo Bonzini wrote:
> >>>>>> On 20/08/19 01:04, Matt delco wrote:
> >>>>>>> From: Matt Delco <delco@google.com>
> >>>>>>>
> >>>>>>> Time seems to eventually stop in a Windows VM when using Skype.
> >>>>>>> Instrumentation shows that the OS is frequently switching the APIC
> >>>>>>> timer between one-shot and periodic mode.  The OS is typically writing
> >>>>>>> to both LVTT and TMICT.  When time stops the sequence observed is that
> >>>>>>> the APIC was in one-shot mode, the timer expired, and the OS writes to
> >>>>>>> LVTT (but not TMICT) to change to periodic mode.  No future timer events
> >>>>>>> are received by the OS since the timer is only re-armed on TMICT writes.
> >>>>>>> With this change time continues to advance in the VM.  TBD if physical
> >>>>>>> hardware will reset the current count if/when the mode is changed to
> >>>>>>> period and the current count is zero.
> >>>>>>>
> >>>>>>> Signed-off-by: Matt Delco <delco@google.com>
> >>>>>>> ---
> >>>>>>> arch/x86/kvm/lapic.c | 9 +++++++--
> >>>>>>> 1 file changed, 7 insertions(+), 2 deletions(-)
> >>>>>>>
> >>>>>>> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
> >>>>>>> index 685d17c11461..fddd810eeca5 100644
> >>>>>>> --- a/arch/x86/kvm/lapic.c
> >>>>>>> +++ b/arch/x86/kvm/lapic.c
> >>>>>>> @@ -1935,14 +1935,19 @@ int kvm_lapic_reg_write(struct kvm_lapic *apic, u32 reg, u32 val)
> >>>>>>>           break;
> >>>>>>>
> >>>>>>> -   case APIC_LVTT:
> >>>>>>> +   case APIC_LVTT: {
> >>>>>>> +           u32 timer_mode = apic->lapic_timer.timer_mode;
> >>>>>>>           if (!kvm_apic_sw_enabled(apic))
> >>>>>>>                   val |= APIC_LVT_MASKED;
> >>>>>>>           val &= (apic_lvt_mask[0] | apic->lapic_timer.timer_mode_mask);
> >>>>>>>           kvm_lapic_set_reg(apic, APIC_LVTT, val);
> >>>>>>>           apic_update_lvtt(apic);
> >>>>>>> +           if (timer_mode == APIC_LVT_TIMER_ONESHOT &&
> >>>>>>> +               apic_lvtt_period(apic) &&
> >>>>>>> +               !hrtimer_active(&apic->lapic_timer.timer))
> >>>>>>> +                   start_apic_timer(apic);
> >>>>>>
> >>>>>> Still, this needs some more explanation.  Can you cover this, as well as
> >>>>>> the oneshot->periodic transition, in kvm-unit-tests' x86/apic.c
> >>>>>> testcase?  Then we could try running it on bare metal and see what
> >>>>>> happens.
> >>>>
> >>>> I looked at apic.c and test_apic_change_mode() might already be testing
> >>>> this.  It sets oneshot & TMICT, waits for the current value to get
> >>>> half-way, changes the mode to periodic, and then tries to test that the
> >>>> value wraps back to the upper half.  It then waits again for the half-way
> >>>> point, changes the mode back to oneshot, and waits for zero.  After
> >>>> reaching zero it does:
> >>>>
> >>>> /* now tmcct == 0 and tmict != 0 */
> >>>> apic_change_mode(APIC_LVT_TIMER_PERIODIC);
> >>>> report("TMCCT should stay at zero", !apic_read(APIC_TMCCT));
> >>>>
> >>>> which seems to be testing that oneshot->periodic won't reset the timer if
> >>>> it's already zero.  A possible caveat is there's hardly any delay between
> >>>> the mode change and the timer read.  Emulated hardware will react
> >>>> instantaneously (at least as seen from within the VM), but hardware might
> >>>> need more time to react (though offhand I'd expect HW to be fast enough for
> >>>> this particular timer).
> >>>>
> >>>> So, it looks like the code might already be ready to run on physical
> >>>> hardware, and if it has (or does already as part of a regular test), then
> >>>> that does raise some doubt on what's the appropriate code change to make
> >>>> this work.
> >>>
> >>> Nadav has been running tests on bare metal, maybe he can weigh in on
> >>> whether or not test_apic_change_mode() passes on bare metal.
> >>
> >> These tests pass on bare-metal.
> >
> > Good to know this. In addition, in linux apic driver, during mode
> > switch __setup_APIC_LVTT() always sets lapic_timer_period(number of
> > clock cycles per jiffy)/APIC_DIVISOR to APIC_TMICT which can avoid the
> > issue Matt report. So is it because there is no such stuff in windows
> > or the windows version which Matt testing is too old?
>
> I find it kind of disappointing that you (and others) did not try the
> kvm-unit-tests of bare-metal. :(

Originally the Xen guys confirmed the testcase on bare metal; thanks for
your double confirmation.

Regards,
Wanpeng Li


* Re: [PATCH] KVM: lapic: restart counter on change to periodic mode
  2019-08-21  0:19               ` Wanpeng Li
@ 2019-08-21  0:26                 ` Nadav Amit
  0 siblings, 0 replies; 12+ messages in thread
From: Nadav Amit @ 2019-08-21  0:26 UTC (permalink / raw)
  To: Wanpeng Li
  Cc: Sean Christopherson, Matt Delco, Paolo Bonzini, Radim Krcmar, kvm

> On Aug 20, 2019, at 5:19 PM, Wanpeng Li <kernellwp@gmail.com> wrote:
> 
> On Wed, 21 Aug 2019 at 00:33, Nadav Amit <nadav.amit@gmail.com> wrote:
>>> On Aug 19, 2019, at 10:08 PM, Wanpeng Li <kernellwp@gmail.com> wrote:
>>> 
>>> On Tue, 20 Aug 2019 at 12:10, Nadav Amit <nadav.amit@gmail.com> wrote:
>>>>> On Aug 19, 2019, at 6:56 PM, Sean Christopherson <sean.j.christopherson@intel.com> wrote:
>>>>> 
>>>>> +Cc Nadav
>>>>> 
>>>>> On Mon, Aug 19, 2019 at 06:07:01PM -0700, Matt Delco wrote:
>>>>>> On Mon, Aug 19, 2019 at 5:37 PM Sean Christopherson <
>>>>>> sean.j.christopherson@intel.com> wrote:
>>>>>> 
>>>>>>> On Tue, Aug 20, 2019 at 01:42:37AM +0200, Paolo Bonzini wrote:
>>>>>>>> On 20/08/19 01:04, Matt delco wrote:
>>>>>>>>> From: Matt Delco <delco@google.com>
>>>>>>>>> 
>>>>>>>>> Time seems to eventually stop in a Windows VM when using Skype.
>>>>>>>>> Instrumentation shows that the OS is frequently switching the APIC
>>>>>>>>> timer between one-shot and periodic mode.  The OS is typically writing
>>>>>>>>> to both LVTT and TMICT.  When time stops the sequence observed is that
>>>>>>>>> the APIC was in one-shot mode, the timer expired, and the OS writes to
>>>>>>>>> LVTT (but not TMICT) to change to periodic mode.  No future timer events
>>>>>>>>> are received by the OS since the timer is only re-armed on TMICT writes.
>>>>>>>>> With this change time continues to advance in the VM.  TBD if physical
>>>>>>>>> hardware will reset the current count if/when the mode is changed to
>>>>>>>>> period and the current count is zero.
>>>>>>>>> 
>>>>>>>>> Signed-off-by: Matt Delco <delco@google.com>
>>>>>>>>> ---
>>>>>>>>> arch/x86/kvm/lapic.c | 9 +++++++--
>>>>>>>>> 1 file changed, 7 insertions(+), 2 deletions(-)
>>>>>>>>> 
>>>>>>>>> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
>>>>>>>>> index 685d17c11461..fddd810eeca5 100644
>>>>>>>>> --- a/arch/x86/kvm/lapic.c
>>>>>>>>> +++ b/arch/x86/kvm/lapic.c
>>>>>>>>> @@ -1935,14 +1935,19 @@ int kvm_lapic_reg_write(struct kvm_lapic *apic, u32 reg, u32 val)
>>>>>>>>>          break;
>>>>>>>>> 
>>>>>>>>> -   case APIC_LVTT:
>>>>>>>>> +   case APIC_LVTT: {
>>>>>>>>> +           u32 timer_mode = apic->lapic_timer.timer_mode;
>>>>>>>>>          if (!kvm_apic_sw_enabled(apic))
>>>>>>>>>                  val |= APIC_LVT_MASKED;
>>>>>>>>>          val &= (apic_lvt_mask[0] | apic->lapic_timer.timer_mode_mask);
>>>>>>>>>          kvm_lapic_set_reg(apic, APIC_LVTT, val);
>>>>>>>>>          apic_update_lvtt(apic);
>>>>>>>>> +           if (timer_mode == APIC_LVT_TIMER_ONESHOT &&
>>>>>>>>> +               apic_lvtt_period(apic) &&
>>>>>>>>> +               !hrtimer_active(&apic->lapic_timer.timer))
>>>>>>>>> +                   start_apic_timer(apic);
>>>>>>>> 
>>>>>>>> Still, this needs some more explanation.  Can you cover this, as well as
>>>>>>>> the oneshot->periodic transition, in kvm-unit-tests' x86/apic.c
>>>>>>>> testcase?  Then we could try running it on bare metal and see what
>>>>>>>> happens.
>>>>>> 
>>>>>> I looked at apic.c and test_apic_change_mode() might already be testing
>>>>>> this.  It sets oneshot & TMICT, waits for the current value to get
>>>>>> half-way, changes the mode to periodic, and then tries to test that the
>>>>>> value wraps back to the upper half.  It then waits again for the half-way
>>>>>> point, changes the mode back to oneshot, and waits for zero.  After
>>>>>> reaching zero it does:
>>>>>> 
>>>>>> /* now tmcct == 0 and tmict != 0 */
>>>>>> apic_change_mode(APIC_LVT_TIMER_PERIODIC);
>>>>>> report("TMCCT should stay at zero", !apic_read(APIC_TMCCT));
>>>>>> 
>>>>>> which seems to be testing that oneshot->periodic won't reset the timer if
>>>>>> it's already zero.  A possible caveat is there's hardly any delay between
>>>>>> the mode change and the timer read.  Emulated hardware will react
>>>>>> instantaneously (at least as seen from within the VM), but hardware might
>>>>>> need more time to react (though offhand I'd expect HW to be fast enough for
>>>>>> this particular timer).
>>>>>> 
>>>>>> So, it looks like the code might already be ready to run on physical
>>>>>> hardware, and if it has (or does already as part of a regular test), then
>>>>>> that does raise some doubt on what's the appropriate code change to make
>>>>>> this work.
>>>>> 
>>>>> Nadav has been running tests on bare metal, maybe he can weigh in on
>>>>> whether or not test_apic_change_mode() passes on bare metal.
>>>> 
>>>> These tests pass on bare-metal.
>>> 
>>> Good to know this. In addition, in linux apic driver, during mode
>>> switch __setup_APIC_LVTT() always sets lapic_timer_period(number of
>>> clock cycles per jiffy)/APIC_DIVISOR to APIC_TMICT which can avoid the
>>> issue Matt report. So is it because there is no such stuff in windows
>>> or the windows version which Matt testing is too old?
>> 
>> I find it kind of disappointing that you (and others) did not try the
>> kvm-unit-tests of bare-metal. :(
> 
> Origianlly xen guys confirm the testcase on bare-metal, thanks for
> your double confirm.

No worries, I don’t look for a “thank you” note. ;-)



* Re: [PATCH] KVM: lapic: restart counter on change to periodic mode
  2019-08-20  7:34             ` Matt Delco
@ 2019-08-21 17:17               ` Sean Christopherson
  2019-08-21 18:03                 ` Matt Delco
  0 siblings, 1 reply; 12+ messages in thread
From: Sean Christopherson @ 2019-08-21 17:17 UTC (permalink / raw)
  To: Matt Delco; +Cc: Wanpeng Li, Nadav Amit, Paolo Bonzini, Radim Krcmar, kvm

On Tue, Aug 20, 2019 at 12:34:20AM -0700, Matt Delco wrote:
> On Mon, Aug 19, 2019 at 10:09 PM Wanpeng Li <kernellwp@gmail.com> wrote:
> >
> > On Tue, 20 Aug 2019 at 12:10, Nadav Amit <nadav.amit@gmail.com> wrote:
> > > These tests pass on bare-metal.
> >
> > Good to know this. In addition, in linux apic driver, during mode
> > switch __setup_APIC_LVTT() always sets lapic_timer_period(number of
> > clock cycles per jiffy)/APIC_DIVISOR to APIC_TMICT which can avoid the
> > issue Matt report. So is it because there is no such stuff in windows
> > or the windows version which Matt testing is too old?
> 
> I'm using Windows 10 (May 2019). Multimedia apps on Windows tend to
> request higher frequency clocks, and this in turn can affect how the
> kernel configures HW timers.  I may need to examine how Windows
> typically interacts with the APIC timer and see if/how this changes
> when Skype is used.  The frequent timer mode changes are not something
> I'd expect a reasonably behaved kernel to do.

Have you tried analyzing the guest code?  If we're lucky, doing so might
provide insight into what's going awry.

E.g.:

  Are the LVTT/TMICT writes coming from a single blob/sequence of code
  in the guest?

  Is the unpaired LVTT coming from the same code sequence or is it a new
  rip entirely?

  Can you dump the relevant asm code sequences?


* Re: [PATCH] KVM: lapic: restart counter on change to periodic mode
  2019-08-21 17:17               ` Sean Christopherson
@ 2019-08-21 18:03                 ` Matt Delco
  0 siblings, 0 replies; 12+ messages in thread
From: Matt Delco @ 2019-08-21 18:03 UTC (permalink / raw)
  To: Sean Christopherson
  Cc: Wanpeng Li, Nadav Amit, Paolo Bonzini, Radim Krcmar, kvm

On Wed, Aug 21, 2019 at 10:17 AM Sean Christopherson
<sean.j.christopherson@intel.com> wrote:
> On Tue, Aug 20, 2019 at 12:34:20AM -0700, Matt Delco wrote:
> > On Mon, Aug 19, 2019 at 10:09 PM Wanpeng Li <kernellwp@gmail.com> wrote:
> > >
> > > On Tue, 20 Aug 2019 at 12:10, Nadav Amit <nadav.amit@gmail.com> wrote:
> > > > These tests pass on bare-metal.
> > >
> > > Good to know this. In addition, in linux apic driver, during mode
> > > switch __setup_APIC_LVTT() always sets lapic_timer_period(number of
> > > clock cycles per jiffy)/APIC_DIVISOR to APIC_TMICT which can avoid the
> > > issue Matt report. So is it because there is no such stuff in windows
> > > or the windows version which Matt testing is too old?
> >
> > I'm using Windows 10 (May 2019). Multimedia apps on Windows tend to
> > request higher frequency clocks, and this in turn can affect how the
> > kernel configures HW timers.  I may need to examine how Windows
> > typically interacts with the APIC timer and see if/how this changes
> > when Skype is used.  The frequent timer mode changes are not something
> > I'd expect a reasonably behaved kernel to do.
>
> Have you tried analyzing the guest code?  If we're lucky, doing so might
> provide insight into what's going awry.
>
> E.g.:
>
>   Are the LVTT/TMICT writes are coming from a single blob/sequence of code
>   in the guest?
>
>   Is the unpaired LVTT coming from the same code sequence or is it a new
>   rip entirely?
>
>   Can you dump the relevant asm code sequences?

I have changed gears to do runtime behavioral analysis, given the
reports that the code change I proposed would deviate from hardware.
The time between writes for TMICT-then-LVTT is typically quite small,
and much smaller than the average for LVTT-then-TMICT.  In the lead-up
to where time stops there are alternating writes to TMICT and LVTT,
where each write to LVTT alternates between setting periodic and
one-shot.  The final write to LVTT (which sets periodic) comes more
than 1.5 ms after the prior TMICT write (which is about 100x the
typical delay), which might mean the kernel opted to not write to TMICT
but did on the next clock tick.

The host kernel & kvm I've been testing with seem to be firing the
timer callbacks sooner than requested, so if the guest kernel has
optimizations based on whether it thinks there's time left on the APIC
timer then this might be causing problems.  I'm going to try to pull in
some of the newer kvm changes that appear to compensate for the early
delivery and see if that also makes the time hang symptom disappear (if
not then I may start to examine things from the guest side).  Thanks.


end of thread, other threads:[~2019-08-21 18:03 UTC | newest]

Thread overview: 12+ messages
2019-08-19 23:04 [PATCH] KVM: lapic: restart counter on change to periodic mode Matt delco
2019-08-19 23:42 ` Paolo Bonzini
2019-08-20  0:37   ` Sean Christopherson
     [not found]     ` <CAHGX9VrZyPQ8OxnYnOWg-ES3=kghSx1LSyzrX8i3=O+o0JAsig@mail.gmail.com>
2019-08-20  1:56       ` Sean Christopherson
2019-08-20  4:08         ` Nadav Amit
2019-08-20  5:08           ` Wanpeng Li
2019-08-20  7:34             ` Matt Delco
2019-08-21 17:17               ` Sean Christopherson
2019-08-21 18:03                 ` Matt Delco
2019-08-20 16:33             ` Nadav Amit
2019-08-21  0:19               ` Wanpeng Li
2019-08-21  0:26                 ` Nadav Amit
