* [PATCH v6 0/3] KVM: LAPIC: Rework lapic timer to behave more like real-hardware
@ 2017-10-06  1:54 Wanpeng Li
  2017-10-06  1:54 ` [PATCH v6 1/3] KVM: LAPIC: Introduce limit_periodic_timer_frequency Wanpeng Li
                   ` (2 more replies)
  0 siblings, 3 replies; 10+ messages in thread
From: Wanpeng Li @ 2017-10-06  1:54 UTC (permalink / raw)
  To: linux-kernel, kvm; +Cc: Paolo Bonzini, Radim Krčmář, Wanpeng Li

The issue was reported in the Xen community.

Anthony PERARD pointed out:

https://www.mail-archive.com/xen-devel@lists.xen.org/msg117283.html#

 | When developing PVH for OVMF, I've used the lapic timer. It turns out that the
 | way it is used by OVMF did not work with Xen [1]. I tried to find out how
 | real hardware behaves, and wrote XTF tests [2]. This patch series tries to fix
 | the behavior of the vlapic timer.
 | 
 | 
 | The OVMF driver for the APIC timer initializes the timer like this:
 | 	write to TMICT (initial counter)
 | 	write to TMDCR (divide configuration)
 | 	enable the timer (this may change timer mode from one-shot to periodic)
 | It turns out that TMICT is set to 0 on the last step, but OVMF expects the
 | timer to run.
 | 
 | Here is a description of the APIC timer, based on observation as well as on
 | reading the Intel SDM. The description is also part of the patch description
 | (reworded).
 | 
 | One way of thinking about how the APIC timer is evaluated is to think of how
 | hardware would do it. There is a counter, TMCCT, which always keeps counting
 | down.
 | 
 | Setting TMICT also sets TMCCT; nothing else matters.
 | Setting LVTT does not change anything right away.
 | Setting TMDCR does not change much.
 | 
 | Now TMCCT keeps counting down, at a rate related to TMDCR.
 | Only once TMCCT reaches 0 is LVTT taken into account:
 | Is there an interrupt to deliver? Should the timer restart counting from the
 | value in TMICT?
 | 
 | The Intel SDM uses the word "disarm" for the timer. I guess the easiest way
 | to disarm the APIC timer (when in periodic or one-shot mode) is to set
 | TMICT to 0. But if we take TSC-Deadline mode out of the picture, there is
 | nothing in the manual that says that the timer is disarmed or stopped when
 | changing timer mode (there are only two modes left, periodic and one-shot).
 | 
 | As for the TSC-deadline timer mode, observation showed that changing to it (or
 | from it) does reset and disarm both timers, so effectively TMICT and the
 | tscdeadline are set to 0.
 | 
 | [1] https://lists.xenproject.org/archives/html/xen-devel/2016-12/msg00959.html
 | [2] v1: 
 | https://lists.xenproject.org/archives/html/xen-devel/2017-03/msg02533.html
 |     v2: look for "[XTF PATCH V2 0/3] Testing vlapic timer"
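
The counter behavior described in the quoted text can be sketched as a small
standalone model. This is only an illustration under stated assumptions, not
KVM or Xen code: the struct and function names are hypothetical, the divide
configuration (TMDCR) is left out so one call to tick() advances already-divided
timer ticks, and TSC-deadline mode is not modeled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model types; the names are illustrative only. */
enum timer_mode { MODE_ONESHOT, MODE_PERIODIC };

struct apic_timer_model {
	uint32_t tmict;         /* initial count register */
	uint32_t tmcct;         /* current count register */
	enum timer_mode lvtt;   /* timer mode bits of LVTT */
	unsigned int interrupts; /* number of interrupts delivered */
};

/* Writing TMICT also reloads TMCCT; writing 0 disarms the timer. */
static void write_tmict(struct apic_timer_model *t, uint32_t val)
{
	assert(t != NULL);
	t->tmict = val;
	t->tmcct = val;
}

/* Writing LVTT only records the mode; the counter is left untouched. */
static void write_lvtt(struct apic_timer_model *t, enum timer_mode mode)
{
	assert(t != NULL);
	t->lvtt = mode;
}

/* Advance the model by `ticks` timer ticks. */
static void tick(struct apic_timer_model *t, uint32_t ticks)
{
	assert(t != NULL);
	while (ticks-- > 0) {
		if (t->tmcct == 0)
			return; /* disarmed: nothing left to count */
		if (--t->tmcct == 0) {
			/* LVTT is consulted only when the counter hits 0. */
			t->interrupts++;
			if (t->lvtt == MODE_PERIODIC)
				t->tmcct = t->tmict;
		}
	}
}
```

In this model, switching from one-shot to periodic in the middle of a countdown
leaves TMCCT alone, and the new mode takes effect only at the next expiry.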

v5 -> v6:
 * rebase against latest kvm/queue
 * extract the function limit_periodic_timer_frequency()
 * rate limit when switching from one-shot to periodic
 * introduce a new function for updating the expiration when TDCR is changed, and apply the rate limit
 
v4 -> v5:
 * reflect the runtime divide/rate update for the remaining timer

v3 -> v4:
 * no need to call start_apic_timer() on a write to LVTT

v2 -> v3:
 * move the "write 0 to APIC_TMICT" logic to apic_update_lvtt()
 * skip hrtimer_cancel() in apic_update_lvtt() when switching from one-shot
   mode to periodic or vice versa

v1 -> v2:
 * add cover-letter and collect recent lapic patches to one patchset

Wanpeng Li (3):
  KVM: LAPIC: Introduce limit_periodic_timer_frequency
  KVM: LAPIC: Keep timer running when switching between one-shot and periodic mode
  KVM: LAPIC: Apply change to TDCR right away to the timer

 arch/x86/kvm/lapic.c | 88 +++++++++++++++++++++++++++++++++++++++-------------
 1 file changed, 67 insertions(+), 21 deletions(-)

-- 
2.7.4


* [PATCH v6 1/3] KVM: LAPIC: Introduce limit_periodic_timer_frequency
  2017-10-06  1:54 [PATCH v6 0/3] KVM: LAPIC: Rework lapic timer to behave more like real-hardware Wanpeng Li
@ 2017-10-06  1:54 ` Wanpeng Li
  2017-10-06  1:54 ` [PATCH v6 2/3] KVM: LAPIC: Keep timer running when switching between one-shot and periodic mode Wanpeng Li
  2017-10-06  1:54 ` [PATCH v6 3/3] KVM: LAPIC: Apply change to TDCR right away to the timer Wanpeng Li
  2 siblings, 0 replies; 10+ messages in thread
From: Wanpeng Li @ 2017-10-06  1:54 UTC (permalink / raw)
  To: linux-kernel, kvm; +Cc: Paolo Bonzini, Radim Krčmář, Wanpeng Li

From: Wanpeng Li <wanpeng.li@hotmail.com>

Extract the logic that limits the lapic periodic timer frequency into a new
function; this function will be used by later patches.
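
As a rough standalone sketch of the clamping this helper performs (not the
kernel code itself): the function name and signature below are hypothetical,
and the minimum period is passed in as a parameter instead of being read from
the min_timer_period_us module parameter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: return the period to use, in nanoseconds.
 * Only periodic timers are clamped; one-shot periods pass through.
 */
static int64_t clamp_periodic_period_ns(int64_t period_ns, bool periodic,
					unsigned int min_period_us)
{
	int64_t min_period_ns = (int64_t)min_period_us * 1000;

	assert(period_ns >= 0);
	if (periodic && period_ns < min_period_ns)
		return min_period_ns;
	return period_ns;
}
```

With an assumed minimum of 500 us, a requested periodic interval of 100000 ns
would be raised to 500000 ns, while the same request in one-shot mode would be
left unchanged.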

Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
---
 arch/x86/kvm/lapic.c | 39 ++++++++++++++++++++++-----------------
 1 file changed, 22 insertions(+), 17 deletions(-)

diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index 6723e2c..8841bb5 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -1301,6 +1301,27 @@ static void update_divide_count(struct kvm_lapic *apic)
 				   apic->divide_count);
 }
 
+static void limit_periodic_timer_frequency(struct kvm_lapic *apic)
+{
+	/*
+	 * Do not allow the guest to program periodic timers with small
+	 * interval, since the hrtimers are not throttled by the host
+	 * scheduler.
+	 */
+	if (apic_lvtt_period(apic)) {
+		s64 min_period = min_timer_period_us * 1000LL;
+
+		if (apic->lapic_timer.period < min_period) {
+			pr_info_ratelimited(
+			    "kvm: vcpu %i: requested %lld ns "
+			    "lapic timer period limited to %lld ns\n",
+			    apic->vcpu->vcpu_id,
+			    apic->lapic_timer.period, min_period);
+			apic->lapic_timer.period = min_period;
+		}
+	}
+}
+
 static void apic_update_lvtt(struct kvm_lapic *apic)
 {
 	u32 timer_mode = kvm_lapic_get_reg(apic, APIC_LVTT) &
@@ -1445,23 +1466,7 @@ static bool set_target_expiration(struct kvm_lapic *apic)
 	if (!apic->lapic_timer.period)
 		return false;
 
-	/*
-	 * Do not allow the guest to program periodic timers with small
-	 * interval, since the hrtimers are not throttled by the host
-	 * scheduler.
-	 */
-	if (apic_lvtt_period(apic)) {
-		s64 min_period = min_timer_period_us * 1000LL;
-
-		if (apic->lapic_timer.period < min_period) {
-			pr_info_ratelimited(
-			    "kvm: vcpu %i: requested %lld ns "
-			    "lapic timer period limited to %lld ns\n",
-			    apic->vcpu->vcpu_id,
-			    apic->lapic_timer.period, min_period);
-			apic->lapic_timer.period = min_period;
-		}
-	}
+	limit_periodic_timer_frequency(apic);
 
 	apic_debug("%s: bus cycle is %" PRId64 "ns, now 0x%016"
 		   PRIx64 ", "
-- 
2.7.4


* [PATCH v6 2/3] KVM: LAPIC: Keep timer running when switching between one-shot and periodic mode
  2017-10-06  1:54 [PATCH v6 0/3] KVM: LAPIC: Rework lapic timer to behave more like real-hardware Wanpeng Li
  2017-10-06  1:54 ` [PATCH v6 1/3] KVM: LAPIC: Introduce limit_periodic_timer_frequency Wanpeng Li
@ 2017-10-06  1:54 ` Wanpeng Li
  2017-10-06 13:17   ` Radim Krčmář
  2017-10-06  1:54 ` [PATCH v6 3/3] KVM: LAPIC: Apply change to TDCR right away to the timer Wanpeng Li
  2 siblings, 1 reply; 10+ messages in thread
From: Wanpeng Li @ 2017-10-06  1:54 UTC (permalink / raw)
  To: linux-kernel, kvm; +Cc: Paolo Bonzini, Radim Krčmář, Wanpeng Li

From: Wanpeng Li <wanpeng.li@hotmail.com>

If we take TSC-deadline mode timer out of the picture, the Intel SDM
does not say that the timer is disabled when the timer mode is changed,
either from one-shot to periodic or vice versa.

After this patch, the timer is no longer disarmed on change of mode, so
the counter (TMCCT) keeps counting down.

So what does a write to LVTT change? On baremetal, the change of mode
is probably taken into account only when the counter reaches 0. When this
happens, LVTT is used to figure out if the counter should restart counting
down from TMICT (so periodic mode) or stop counting (if one-shot mode).

This patch is based on observation of the behavior of the APIC timer on
baremetal, as well as on a check that the observed behavior does not go
against the description written in the Intel SDM.
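
The rule for when a mode change still disarms the timer can be written as a
small predicate. This is a sketch only: the enum and function name are
hypothetical, and it captures just the "crossing the TSC-deadline boundary"
condition from the description, not the rest of apic_update_lvtt().

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical mode names for illustration. */
enum lvtt_mode { LVTT_ONESHOT, LVTT_PERIODIC, LVTT_TSCDEADLINE };

/*
 * A mode change disarms the timer only when it enters or leaves
 * TSC-deadline mode. One-shot <-> periodic keeps the counter running.
 */
static bool mode_switch_disarms(enum lvtt_mode old_mode,
				enum lvtt_mode new_mode)
{
	return (old_mode == LVTT_TSCDEADLINE) !=
	       (new_mode == LVTT_TSCDEADLINE);
}
```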

Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
---
 arch/x86/kvm/lapic.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index 8841bb5..14f63b3 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -1329,10 +1329,14 @@ static void apic_update_lvtt(struct kvm_lapic *apic)
 
 	if (apic->lapic_timer.timer_mode != timer_mode) {
 		if (apic_lvtt_tscdeadline(apic) != (timer_mode ==
-				APIC_LVT_TIMER_TSCDEADLINE))
+				APIC_LVT_TIMER_TSCDEADLINE)) {
 			kvm_lapic_set_reg(apic, APIC_TMICT, 0);
+			hrtimer_cancel(&apic->lapic_timer.timer);
+		}
+		if (apic_lvtt_oneshot(apic) && (timer_mode ==
+				APIC_LVT_TIMER_PERIODIC))
+			limit_periodic_timer_frequency(apic);
 		apic->lapic_timer.timer_mode = timer_mode;
-		hrtimer_cancel(&apic->lapic_timer.timer);
 	}
 }
 
-- 
2.7.4


* [PATCH v6 3/3] KVM: LAPIC: Apply change to TDCR right away to the timer
  2017-10-06  1:54 [PATCH v6 0/3] KVM: LAPIC: Rework lapic timer to behave more like real-hardware Wanpeng Li
  2017-10-06  1:54 ` [PATCH v6 1/3] KVM: LAPIC: Introduce limit_periodic_timer_frequency Wanpeng Li
  2017-10-06  1:54 ` [PATCH v6 2/3] KVM: LAPIC: Keep timer running when switching between one-shot and periodic mode Wanpeng Li
@ 2017-10-06  1:54 ` Wanpeng Li
  2017-10-06 13:14   ` Radim Krčmář
  2 siblings, 1 reply; 10+ messages in thread
From: Wanpeng Li @ 2017-10-06  1:54 UTC (permalink / raw)
  To: linux-kernel, kvm; +Cc: Paolo Bonzini, Radim Krčmář, Wanpeng Li

From: Wanpeng Li <wanpeng.li@hotmail.com>

The Intel SDM describes how the divide configuration register is used:
"The APIC timer frequency will be the processor's bus clock or core
crystal clock frequency divided by the value specified in the divide
configuration register."

Observation of baremetal showed that when the TDCR is changed, the TMCCT
does not change or make a big jump in value, but the rate at which it
counts down changes.

This patch updates the emulation of the APIC timer so that a change to the
divide configuration is reflected in the value of the counter and in
when the next interrupt is triggered.
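
The arithmetic behind this can be sketched in isolation. The helpers below are
hypothetical stand-ins, not the kernel functions: the bus cycle length is
passed as a parameter, and the remaining time is rescaled by the ratio of the
new divisor to the old one, so the number of counts left stays the same while
each count takes longer or shorter.

```c
#include <assert.h>
#include <stdint.h>

/* Length of one full countdown: TMICT counts, each bus_cycle_ns * divisor. */
static uint64_t timer_period_ns(uint32_t tmict, uint32_t bus_cycle_ns,
				uint32_t divisor)
{
	return (uint64_t)tmict * bus_cycle_ns * divisor;
}

/*
 * Time left until expiry after the divisor changes. The multiplication
 * can overflow for very large remaining_ns; this sketch ignores that.
 */
static uint64_t rescale_remaining_ns(uint64_t remaining_ns,
				     uint32_t old_divisor,
				     uint32_t new_divisor)
{
	assert(old_divisor != 0);
	return remaining_ns * new_divisor / old_divisor;
}
```

For example, with 500 ns left at a divisor of 2, switching to a divisor of 8
leaves the same number of counts, which now take 2000 ns to elapse.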

Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
---
 arch/x86/kvm/lapic.c | 41 +++++++++++++++++++++++++++++++++++++++--
 1 file changed, 39 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index 14f63b3..f749b96 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -1458,6 +1458,36 @@ static void start_sw_period(struct kvm_lapic *apic)
 		HRTIMER_MODE_ABS_PINNED);
 }
 
+static bool update_target_expiration(struct kvm_lapic *apic, uint32_t old_divisor)
+{
+	ktime_t now, remaining;
+	u64 tscl = rdtsc(), delta;
+
+	now = ktime_get();
+	remaining = ktime_sub(apic->lapic_timer.target_expiration, now);
+	if (ktime_to_ns(remaining) < 0)
+		remaining = 0;
+	delta = mod_64(ktime_to_ns(remaining), apic->lapic_timer.period);
+
+	if (!delta)
+		return false;
+
+	apic->lapic_timer.period = (u64)kvm_lapic_get_reg(apic, APIC_TMICT)
+		* APIC_BUS_CYCLE_NS * apic->divide_count;
+	delta = delta * apic->divide_count / old_divisor;
+
+	if (!apic->lapic_timer.period)
+		return false;
+
+	limit_periodic_timer_frequency(apic);
+
+	apic->lapic_timer.tscdeadline = kvm_read_l1_tsc(apic->vcpu, tscl) +
+		nsec_to_cycles(apic->vcpu, delta);
+	apic->lapic_timer.target_expiration = ktime_add_ns(now, delta);
+
+	return true;
+}
+
 static bool set_target_expiration(struct kvm_lapic *apic)
 {
 	ktime_t now;
@@ -1750,13 +1780,20 @@ int kvm_lapic_reg_write(struct kvm_lapic *apic, u32 reg, u32 val)
 		start_apic_timer(apic);
 		break;
 
-	case APIC_TDCR:
+	case APIC_TDCR: {
+		uint32_t old_divisor = apic->divide_count;
+
 		if (val & 4)
 			apic_debug("KVM_WRITE:TDCR %x\n", val);
 		kvm_lapic_set_reg(apic, APIC_TDCR, val);
 		update_divide_count(apic);
+		if (apic->divide_count != old_divisor) {
+			hrtimer_cancel(&apic->lapic_timer.timer);
+			if (update_target_expiration(apic, old_divisor))
+				restart_apic_timer(apic);
+		}
 		break;
-
+	}
 	case APIC_ESR:
 		if (apic_x2apic_mode(apic) && val != 0) {
 			apic_debug("KVM_WRITE:ESR not zero %x\n", val);
-- 
2.7.4


* Re: [PATCH v6 3/3] KVM: LAPIC: Apply change to TDCR right away to the timer
  2017-10-06  1:54 ` [PATCH v6 3/3] KVM: LAPIC: Apply change to TDCR right away to the timer Wanpeng Li
@ 2017-10-06 13:14   ` Radim Krčmář
  2017-10-06 13:59     ` Wanpeng Li
  0 siblings, 1 reply; 10+ messages in thread
From: Radim Krčmář @ 2017-10-06 13:14 UTC (permalink / raw)
  To: Wanpeng Li; +Cc: linux-kernel, kvm, Paolo Bonzini, Wanpeng Li

2017-10-05 18:54-0700, Wanpeng Li:
> From: Wanpeng Li <wanpeng.li@hotmail.com>
> 
> The description in the Intel SDM of how the divide configuration
> register is used: "The APIC timer frequency will be the processor's bus
> clock or core crystal clock frequency divided by the value specified in
> the divide configuration register."
> 
> Observation of baremetal shown that when the TDCR is change, the TMCCT
> does not change or make a big jump in value, but the rate at which it
> count down change.
> 
> The patch update the emulation to APIC timer to so that a change to the
> divide configuration would be reflected in the value of the counter and
> when the next interrupt is triggered.
> 
> Cc: Paolo Bonzini <pbonzini@redhat.com>
> Cc: Radim Krčmář <rkrcmar@redhat.com>
> Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
> ---
> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
> @@ -1458,6 +1458,36 @@ static void start_sw_period(struct kvm_lapic *apic)
>  		HRTIMER_MODE_ABS_PINNED);
>  }
>  
> +static bool update_target_expiration(struct kvm_lapic *apic, uint32_t old_divisor)
> +{
> +	ktime_t now, remaining;
> +	u64 tscl = rdtsc(), delta;
> +
> +	now = ktime_get();
> +	remaining = ktime_sub(apic->lapic_timer.target_expiration, now);
> +	if (ktime_to_ns(remaining) < 0)
> +		remaining = 0;
> +	delta = mod_64(ktime_to_ns(remaining), apic->lapic_timer.period);

Hm, can this happen?

> +	if (!delta)
> +		return false;
> +
> +	apic->lapic_timer.period = (u64)kvm_lapic_get_reg(apic, APIC_TMICT)
> +		* APIC_BUS_CYCLE_NS * apic->divide_count;

I think that it would be safer to always modify the period.

> +	delta = delta * apic->divide_count / old_divisor;
> +
> +	if (!apic->lapic_timer.period)
> +		return false;
> +
> +	limit_periodic_timer_frequency(apic);
> +
> +	apic->lapic_timer.tscdeadline = kvm_read_l1_tsc(apic->vcpu, tscl) +
> +		nsec_to_cycles(apic->vcpu, delta);

We could do that without rdtsc() for added precision and maybe
performance:

	apic->lapic_timer.tscdeadline += nsec_to_cycles(apic->vcpu, delta) -
	                                 nsec_to_cycles(apic->vcpu, remaining);

	// not sure how a negative operand would behave:
	// nsec_to_cycles(apic->vcpu, delta - remaining)
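
A standalone sketch of that adjustment, under loud assumptions: ns_to_cycles()
here is a hypothetical stand-in that uses a fixed TSC frequency in kHz, not
KVM's nsec_to_cycles(), and shift_deadline() is an illustrative name. Converting
the two magnitudes separately and combining them in unsigned 64-bit arithmetic
avoids ever handing a negative value to the conversion.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical conversion: nanoseconds to TSC cycles at tsc_khz. */
static uint64_t ns_to_cycles(uint64_t ns, uint64_t tsc_khz)
{
	assert(tsc_khz != 0);
	return ns * tsc_khz / 1000000;
}

/*
 * Move an existing deadline from "old_remaining_ns from now" to
 * "new_remaining_ns from now" without reading the TSC again.
 */
static uint64_t shift_deadline(uint64_t deadline, uint64_t old_remaining_ns,
			       uint64_t new_remaining_ns, uint64_t tsc_khz)
{
	return deadline + ns_to_cycles(new_remaining_ns, tsc_khz) -
	       ns_to_cycles(old_remaining_ns, tsc_khz);
}
```

The deadline can move in either direction: later when the divisor grows,
earlier when it shrinks.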

> +	apic->lapic_timer.target_expiration = ktime_add_ns(now, delta);
> +
> +	return true;
> +}
> +
>  static bool set_target_expiration(struct kvm_lapic *apic)
>  {
>  	ktime_t now;
> @@ -1750,13 +1780,20 @@ int kvm_lapic_reg_write(struct kvm_lapic *apic, u32 reg, u32 val)
>  		start_apic_timer(apic);
>  		break;
>  
> -	case APIC_TDCR:
> +	case APIC_TDCR: {
> +		uint32_t old_divisor = apic->divide_count;
> +
>  		if (val & 4)
>  			apic_debug("KVM_WRITE:TDCR %x\n", val);
>  		kvm_lapic_set_reg(apic, APIC_TDCR, val);
>  		update_divide_count(apic);
> +		if (apic->divide_count != old_divisor) {
> +			hrtimer_cancel(&apic->lapic_timer.timer);
> +			if (update_target_expiration(apic, old_divisor))
> +				restart_apic_timer(apic);

I think we can lose a timer here when we cancel a hrtimer whose
expiration time passes before update_target_expiration(), so it never
gets restarted.

Doing restart_apic_timer() unconditionally seems better.  It behaves
well if we try to restart a timer that has already fired.

Thanks.

> +		}
>  		break;
> -
> +	}
>  	case APIC_ESR:
>  		if (apic_x2apic_mode(apic) && val != 0) {
>  			apic_debug("KVM_WRITE:ESR not zero %x\n", val);
> -- 
> 2.7.4
> 


* Re: [PATCH v6 2/3] KVM: LAPIC: Keep timer running when switching between one-shot and periodic mode
  2017-10-06  1:54 ` [PATCH v6 2/3] KVM: LAPIC: Keep timer running when switching between one-shot and periodic mode Wanpeng Li
@ 2017-10-06 13:17   ` Radim Krčmář
  2017-10-06 15:04     ` Radim Krčmář
  0 siblings, 1 reply; 10+ messages in thread
From: Radim Krčmář @ 2017-10-06 13:17 UTC (permalink / raw)
  To: Wanpeng Li; +Cc: linux-kernel, kvm, Paolo Bonzini, Wanpeng Li

2017-10-05 18:54-0700, Wanpeng Li:
> From: Wanpeng Li <wanpeng.li@hotmail.com>
> 
> If we take TSC-deadline mode timer out of the picture, the Intel SDM
> does not say that the timer is disable when the timer mode is change,
> either from one-shot to periodic or vice versa.
> 
> After this patch, the timer is no longer disarmed on change of mode, so
> the counter (TMCCT) keeps counting down.
> 
> So what does a write to LVTT changes ? On baremetal, the change of mode
> is probably taken into account only when the counter reach 0. When this
> happen, LVTT is use to figure out if the counter should restard counting
> down from TMICT (so periodic mode) or stop counting (if one-shot mode).
> 
> This patch is based on observation of the behavior of the APIC timer on
> baremetal as well as check that they does not go against the description
> written in the Intel SDM.
> 
> Cc: Paolo Bonzini <pbonzini@redhat.com>
> Cc: Radim Krčmář <rkrcmar@redhat.com>
> Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
> ---
>  arch/x86/kvm/lapic.c | 8 ++++++--
>  1 file changed, 6 insertions(+), 2 deletions(-)

Queued the first two patches, thanks.

> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
> index 8841bb5..14f63b3 100644
> --- a/arch/x86/kvm/lapic.c
> +++ b/arch/x86/kvm/lapic.c
> @@ -1329,10 +1329,14 @@ static void apic_update_lvtt(struct kvm_lapic *apic)
>  
>  	if (apic->lapic_timer.timer_mode != timer_mode) {
>  		if (apic_lvtt_tscdeadline(apic) != (timer_mode ==
> -				APIC_LVT_TIMER_TSCDEADLINE))
> +				APIC_LVT_TIMER_TSCDEADLINE)) {
>  			kvm_lapic_set_reg(apic, APIC_TMICT, 0);
> +			hrtimer_cancel(&apic->lapic_timer.timer);
> +		}
> +		if (apic_lvtt_oneshot(apic) && (timer_mode ==
> +				APIC_LVT_TIMER_PERIODIC))
> +			limit_periodic_timer_frequency(apic);
>  		apic->lapic_timer.timer_mode = timer_mode;
> -		hrtimer_cancel(&apic->lapic_timer.timer);
>  	}
>  }
>  
> -- 
> 2.7.4
> 


* Re: [PATCH v6 3/3] KVM: LAPIC: Apply change to TDCR right away to the timer
  2017-10-06 13:14   ` Radim Krčmář
@ 2017-10-06 13:59     ` Wanpeng Li
  2017-10-06 14:14       ` Radim Krčmář
  0 siblings, 1 reply; 10+ messages in thread
From: Wanpeng Li @ 2017-10-06 13:59 UTC (permalink / raw)
  To: Radim Krčmář; +Cc: linux-kernel, kvm, Paolo Bonzini, Wanpeng Li

2017-10-06 21:14 GMT+08:00 Radim Krčmář <rkrcmar@redhat.com>:
> 2017-10-05 18:54-0700, Wanpeng Li:
>> From: Wanpeng Li <wanpeng.li@hotmail.com>
>>
>> The description in the Intel SDM of how the divide configuration
>> register is used: "The APIC timer frequency will be the processor's bus
>> clock or core crystal clock frequency divided by the value specified in
>> the divide configuration register."
>>
>> Observation of baremetal shown that when the TDCR is change, the TMCCT
>> does not change or make a big jump in value, but the rate at which it
>> count down change.
>>
>> The patch update the emulation to APIC timer to so that a change to the
>> divide configuration would be reflected in the value of the counter and
>> when the next interrupt is triggered.
>>
>> Cc: Paolo Bonzini <pbonzini@redhat.com>
>> Cc: Radim Krčmář <rkrcmar@redhat.com>
>> Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
>> ---
>> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
>> @@ -1458,6 +1458,36 @@ static void start_sw_period(struct kvm_lapic *apic)
>>               HRTIMER_MODE_ABS_PINNED);
>>  }
>>
>> +static bool update_target_expiration(struct kvm_lapic *apic, uint32_t old_divisor)
>> +{
>> +     ktime_t now, remaining;
>> +     u64 tscl = rdtsc(), delta;
>> +
>> +     now = ktime_get();
>> +     remaining = ktime_sub(apic->lapic_timer.target_expiration, now);
>> +     if (ktime_to_ns(remaining) < 0)
>> +             remaining = 0;
>> +     delta = mod_64(ktime_to_ns(remaining), apic->lapic_timer.period);
>
> Hm, can this happen?

Yeah, when the hrtimer has already expired. I can catch it during testing.

>
>> +     if (!delta)
>> +             return false;
>> +
>> +     apic->lapic_timer.period = (u64)kvm_lapic_get_reg(apic, APIC_TMICT)
>> +             * APIC_BUS_CYCLE_NS * apic->divide_count;
>
> I think that it would be safer to always modify the period.

Agreed.

>
>> +     delta = delta * apic->divide_count / old_divisor;
>> +
>> +     if (!apic->lapic_timer.period)
>> +             return false;
>> +
>> +     limit_periodic_timer_frequency(apic);
>> +
>> +     apic->lapic_timer.tscdeadline = kvm_read_l1_tsc(apic->vcpu, tscl) +
>> +             nsec_to_cycles(apic->vcpu, delta);
>
> We could do that without rdtsc() for added precision and maybe
> performance:

Agreed.

>
>         apic->lapic_timer.tscdeadline += nsec_to_cycles(apic->vcpu, delta) -
>                                          nsec_to_cycles(apic->vcpu, remaining);
>
>         // not sure how a negative operand would behave:
>         // nsec_to_cycles(apic->vcpu, delta - remaining)
>
>> +     apic->lapic_timer.target_expiration = ktime_add_ns(now, delta);
>> +
>> +     return true;
>> +}
>> +
>>  static bool set_target_expiration(struct kvm_lapic *apic)
>>  {
>>       ktime_t now;
>> @@ -1750,13 +1780,20 @@ int kvm_lapic_reg_write(struct kvm_lapic *apic, u32 reg, u32 val)
>>               start_apic_timer(apic);
>>               break;
>>
>> -     case APIC_TDCR:
>> +     case APIC_TDCR: {
>> +             uint32_t old_divisor = apic->divide_count;
>> +
>>               if (val & 4)
>>                       apic_debug("KVM_WRITE:TDCR %x\n", val);
>>               kvm_lapic_set_reg(apic, APIC_TDCR, val);
>>               update_divide_count(apic);
>> +             if (apic->divide_count != old_divisor) {
>> +                     hrtimer_cancel(&apic->lapic_timer.timer);
>> +                     if (update_target_expiration(apic, old_divisor))
>> +                             restart_apic_timer(apic);
>
> I think we can lose a timer here when we cancel a hrtimer whose
> expiration time passes before update_target_expiration(), so it never
> gets restarted.
>
> Doing restart_apic_timer() unconditionally seems better.  It behaves
> well if we try to restart a timer that has already fired.

Agreed.

Regards,
Wanpeng Li

>
> Thanks.
>
>> +             }
>>               break;
>> -
>> +     }
>>       case APIC_ESR:
>>               if (apic_x2apic_mode(apic) && val != 0) {
>>                       apic_debug("KVM_WRITE:ESR not zero %x\n", val);
>> --
>> 2.7.4
>>


* Re: [PATCH v6 3/3] KVM: LAPIC: Apply change to TDCR right away to the timer
  2017-10-06 13:59     ` Wanpeng Li
@ 2017-10-06 14:14       ` Radim Krčmář
  0 siblings, 0 replies; 10+ messages in thread
From: Radim Krčmář @ 2017-10-06 14:14 UTC (permalink / raw)
  To: Wanpeng Li; +Cc: linux-kernel, kvm, Paolo Bonzini, Wanpeng Li

2017-10-06 21:59+0800, Wanpeng Li:
> 2017-10-06 21:14 GMT+08:00 Radim Krčmář <rkrcmar@redhat.com>:
> > 2017-10-05 18:54-0700, Wanpeng Li:
> >> From: Wanpeng Li <wanpeng.li@hotmail.com>
> >>
> >> The description in the Intel SDM of how the divide configuration
> >> register is used: "The APIC timer frequency will be the processor's bus
> >> clock or core crystal clock frequency divided by the value specified in
> >> the divide configuration register."
> >>
> >> Observation of baremetal shown that when the TDCR is change, the TMCCT
> >> does not change or make a big jump in value, but the rate at which it
> >> count down change.
> >>
> >> The patch update the emulation to APIC timer to so that a change to the
> >> divide configuration would be reflected in the value of the counter and
> >> when the next interrupt is triggered.
> >>
> >> Cc: Paolo Bonzini <pbonzini@redhat.com>
> >> Cc: Radim Krčmář <rkrcmar@redhat.com>
> >> Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
> >> ---
> >> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
> >> @@ -1458,6 +1458,36 @@ static void start_sw_period(struct kvm_lapic *apic)
> >> +static bool update_target_expiration(struct kvm_lapic *apic, uint32_t old_divisor)
> >> +{
> >> +     ktime_t now, remaining;
> >> +     u64 tscl = rdtsc(), delta;
> >> +
> >> +     now = ktime_get();
> >> +     remaining = ktime_sub(apic->lapic_timer.target_expiration, now);
> >> +     if (ktime_to_ns(remaining) < 0)
> >> +             remaining = 0;
> >> +     delta = mod_64(ktime_to_ns(remaining), apic->lapic_timer.period);
> >
> > Hm, can this happen?
> 
> Yeah, when the hrtimer has already expired. I can catch it during testing.

I thought that "if (ktime_to_ns(remaining) < 0)" is there to catch that,
isn't that a bug elsewhere?


* Re: [PATCH v6 2/3] KVM: LAPIC: Keep timer running when switching between one-shot and periodic mode
  2017-10-06 13:17   ` Radim Krčmář
@ 2017-10-06 15:04     ` Radim Krčmář
  2017-10-07  0:50       ` Wanpeng Li
  0 siblings, 1 reply; 10+ messages in thread
From: Radim Krčmář @ 2017-10-06 15:04 UTC (permalink / raw)
  To: Wanpeng Li; +Cc: linux-kernel, kvm, Paolo Bonzini, Wanpeng Li

2017-10-06 15:17+0200, Radim Krčmář:
> 2017-10-05 18:54-0700, Wanpeng Li:
> > From: Wanpeng Li <wanpeng.li@hotmail.com>
> > 
> > If we take TSC-deadline mode timer out of the picture, the Intel SDM
> > does not say that the timer is disable when the timer mode is change,
> > either from one-shot to periodic or vice versa.
> > 
> > After this patch, the timer is no longer disarmed on change of mode, so
> > the counter (TMCCT) keeps counting down.
> > 
> > So what does a write to LVTT changes ? On baremetal, the change of mode
> > is probably taken into account only when the counter reach 0. When this
> > happen, LVTT is use to figure out if the counter should restard counting
> > down from TMICT (so periodic mode) or stop counting (if one-shot mode).
> > 
> > This patch is based on observation of the behavior of the APIC timer on
> > baremetal as well as check that they does not go against the description
> > written in the Intel SDM.
> > 
> > Cc: Paolo Bonzini <pbonzini@redhat.com>
> > Cc: Radim Krčmář <rkrcmar@redhat.com>
> > Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
> > ---
> >  arch/x86/kvm/lapic.c | 8 ++++++--
> >  1 file changed, 6 insertions(+), 2 deletions(-)
> 
> Queued the first two patches, thanks.
> 
> > diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
> > index 8841bb5..14f63b3 100644
> > --- a/arch/x86/kvm/lapic.c
> > +++ b/arch/x86/kvm/lapic.c
> > @@ -1329,10 +1329,14 @@ static void apic_update_lvtt(struct kvm_lapic *apic)
> >  
> >  	if (apic->lapic_timer.timer_mode != timer_mode) {
> >  		if (apic_lvtt_tscdeadline(apic) != (timer_mode ==
> > -				APIC_LVT_TIMER_TSCDEADLINE))
> > +				APIC_LVT_TIMER_TSCDEADLINE)) {
> >  			kvm_lapic_set_reg(apic, APIC_TMICT, 0);
> > +			hrtimer_cancel(&apic->lapic_timer.timer);
> > +		}
> > +		if (apic_lvtt_oneshot(apic) && (timer_mode ==
> > +				APIC_LVT_TIMER_PERIODIC))
> > +			limit_periodic_timer_frequency(apic);

I noticed a problem here that required slight changes (the current code
never actually did the rate limiting), please see kvm/queue.

> >  		apic->lapic_timer.timer_mode = timer_mode;
> > -		hrtimer_cancel(&apic->lapic_timer.timer);
> >  	}
> >  }
> >  
> > -- 
> > 2.7.4
> > 


* Re: [PATCH v6 2/3] KVM: LAPIC: Keep timer running when switching between one-shot and periodic mode
  2017-10-06 15:04     ` Radim Krčmář
@ 2017-10-07  0:50       ` Wanpeng Li
  0 siblings, 0 replies; 10+ messages in thread
From: Wanpeng Li @ 2017-10-07  0:50 UTC (permalink / raw)
  To: Radim Krčmář; +Cc: linux-kernel, kvm, Paolo Bonzini, Wanpeng Li

2017-10-06 23:04 GMT+08:00 Radim Krčmář <rkrcmar@redhat.com>:
> 2017-10-06 15:17+0200, Radim Krčmář:
>> 2017-10-05 18:54-0700, Wanpeng Li:
>> > From: Wanpeng Li <wanpeng.li@hotmail.com>
>> >
>> > If we take TSC-deadline mode timer out of the picture, the Intel SDM
>> > does not say that the timer is disable when the timer mode is change,
>> > either from one-shot to periodic or vice versa.
>> >
>> > After this patch, the timer is no longer disarmed on change of mode, so
>> > the counter (TMCCT) keeps counting down.
>> >
>> > So what does a write to LVTT changes ? On baremetal, the change of mode
>> > is probably taken into account only when the counter reach 0. When this
>> > happen, LVTT is use to figure out if the counter should restard counting
>> > down from TMICT (so periodic mode) or stop counting (if one-shot mode).
>> >
>> > This patch is based on observation of the behavior of the APIC timer on
>> > baremetal as well as check that they does not go against the description
>> > written in the Intel SDM.
>> >
>> > Cc: Paolo Bonzini <pbonzini@redhat.com>
>> > Cc: Radim Krčmář <rkrcmar@redhat.com>
>> > Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
>> > ---
>> >  arch/x86/kvm/lapic.c | 8 ++++++--
>> >  1 file changed, 6 insertions(+), 2 deletions(-)
>>
>> Queued the first two patches, thanks.
>>
>> > diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
>> > index 8841bb5..14f63b3 100644
>> > --- a/arch/x86/kvm/lapic.c
>> > +++ b/arch/x86/kvm/lapic.c
>> > @@ -1329,10 +1329,14 @@ static void apic_update_lvtt(struct kvm_lapic *apic)
>> >
>> >     if (apic->lapic_timer.timer_mode != timer_mode) {
>> >             if (apic_lvtt_tscdeadline(apic) != (timer_mode ==
>> > -                           APIC_LVT_TIMER_TSCDEADLINE))
>> > +                           APIC_LVT_TIMER_TSCDEADLINE)) {
>> >                     kvm_lapic_set_reg(apic, APIC_TMICT, 0);
>> > +                   hrtimer_cancel(&apic->lapic_timer.timer);
>> > +           }
>> > +           if (apic_lvtt_oneshot(apic) && (timer_mode ==
>> > +                           APIC_LVT_TIMER_PERIODIC))
>> > +                   limit_periodic_timer_frequency(apic);
>
> I noticed a problem here that required slight changes (the current code
> never actually did the rate limiting), please see kvm/queue.

I see, thanks for the change.

Regards,
Wanpeng Li

>
>> >             apic->lapic_timer.timer_mode = timer_mode;
>> > -           hrtimer_cancel(&apic->lapic_timer.timer);
>> >     }
>> >  }
>> >
>> > --
>> > 2.7.4
>> >
