linux-kernel.vger.kernel.org archive mirror
* [PATCH v2 0/4] KVM: LAPIC: Rework lapic timer to behave more like real-hardware
@ 2017-09-29  1:04 Wanpeng Li
  2017-09-29  1:04 ` [PATCH v2 1/4] KVM: LAPIC: Fix lapic timer mode transition Wanpeng Li
                   ` (4 more replies)
  0 siblings, 5 replies; 27+ messages in thread
From: Wanpeng Li @ 2017-09-29  1:04 UTC (permalink / raw)
  To: linux-kernel, kvm; +Cc: Paolo Bonzini, Radim Krčmář, Wanpeng Li

This issue was reported in the Xen community.

Anthony PERARD pointed out:

https://www.mail-archive.com/xen-devel@lists.xen.org/msg117283.html#

 | When developing PVH for OVMF, I've used the lapic timer. It turns out that the
 | way it is used by OVMF did not work with Xen [1]. I tried to find out how
 | real hardware behaves, and wrote an XTF test [2]. This patch series tries to
 | fix the behavior of the vlapic timer.
 | 
 | 
 | The OVMF driver for the APIC timer initializes the timer like this:
 | 	write to TMICT (initial counter)
 | 	write to TMDCR (divide configuration)
 | 	enable the timer (this may change timer mode from one-shot to periodic)
 | It turns out that TMICT is set to 0 on the last step, but OVMF expects the
 | timer to run.
 | 
 | Here is a description of the APIC timer, based on observation as well as a
 | reading of the Intel SDM. The description is also part of the patch
 | descriptions (reworded).
 | 
 | Maybe a way of thinking about how the APIC timer is evaluated is to think of
 | how hardware would do it. There is a counter, TMCCT, which always keeps
 | counting down.
 | 
 | Setting TMICT also sets TMCCT; nothing else matters.
 | Setting LVTT does not change anything right away.
 | Setting TMDCR does not change much.
 | 
 | Now TMCCT keeps counting down, at a rate related to TMDCR.
 | Only once TMCCT reaches 0 is LVTT taken into account:
 | Is there an interrupt to deliver? Should the timer restart counting from the
 | value in TMICT?
 | 
 | In the Intel SDM, the word "disarm" is used for the timer. I guess the
 | easiest way to disarm the APIC timer (when in periodic or one-shot mode) is
 | to set TMICT to 0. But if we take TSC-Deadline mode out of the picture, there
 | is nothing in the manual that says that the timer is disarmed or stopped when
 | changing timer mode (there are only two modes left, periodic and one-shot).
 | 
 | As for the TSC-deadline timer mode, observation shows that changing to it (or
 | from it) does reset and disarm both timers, so effectively TMICT and the
 | tscdeadline are set to 0.
 | 
 | [1] https://lists.xenproject.org/archives/html/xen-devel/2016-12/msg00959.html
 | [2] v1: https://lists.xenproject.org/archives/html/xen-devel/2017-03/msg02533.html
 |     v2: look for "[XTF PATCH V2 0/3] Testing vlapic timer"
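
For illustration, the timer model described in the quoted text can be written
as a small state machine. This is only a sketch under stated assumptions: the
toy_* names are hypothetical, TSC-deadline mode is left out, and nothing here
is taken from the KVM or Xen sources.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names; a toy model of the description, not hypervisor code. */
enum toy_timer_mode { TOY_ONESHOT, TOY_PERIODIC };

struct toy_apic_timer {
	uint32_t tmict;            /* initial count */
	uint32_t tmcct;            /* current count, counts down */
	enum toy_timer_mode mode;  /* timer mode field of LVTT */
	bool masked;               /* mask bit of LVTT */
	unsigned int interrupts;   /* interrupts delivered so far */
};

/* Writing TMICT reloads the counter; writing 0 disarms the timer. */
static void toy_write_tmict(struct toy_apic_timer *t, uint32_t val)
{
	t->tmict = val;
	t->tmcct = val;
}

/* Writing LVTT only records the new settings; the counter is untouched. */
static void toy_write_lvtt(struct toy_apic_timer *t,
			   enum toy_timer_mode mode, bool masked)
{
	t->mode = mode;
	t->masked = masked;
}

/* One divided clock tick. LVTT is consulted only when the count hits 0. */
static void toy_tick(struct toy_apic_timer *t)
{
	if (t->tmcct == 0)
		return;		/* timer is stopped */
	if (--t->tmcct != 0)
		return;		/* still counting */
	if (!t->masked)
		t->interrupts++;
	if (t->mode == TOY_PERIODIC)
		t->tmcct = t->tmict;
}
```

With this model, the OVMF sequence (write TMICT, then switch LVTT from one-shot
to periodic) leaves the counter running, which is the behavior OVMF expects.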

In addition, patch 4/4 implements illegal-vector error handling according to
SDM sections 10.5.2 and 10.5.3.

v1 -> v2:
 * add cover letter and collect recent lapic patches into one patchset

Wanpeng Li (4):
  KVM: LAPIC: Fix lapic timer mode transition
  KVM: LAPIC: Keep timer running when switching between one-shot and periodic mode
  KVM: LAPIC: Apply change to TDCR right away to the timer
  KVM: LAPIC: Don't silently accept bad vectors

 arch/x86/include/asm/apicdef.h |  1 +
 arch/x86/kvm/lapic.c           | 90 ++++++++++++++++++++++++++++++++++--------
 2 files changed, 74 insertions(+), 17 deletions(-)

-- 
2.7.4

* [PATCH v2 1/4] KVM: LAPIC: Fix lapic timer mode transition
  2017-09-29  1:04 [PATCH v2 0/4] KVM: LAPIC: Rework lapic timer to behave more like real-hardware Wanpeng Li
@ 2017-09-29  1:04 ` Wanpeng Li
  2017-10-03 17:05   ` Radim Krčmář
  2017-09-29  1:04 ` [PATCH v2 2/4] KVM: LAPIC: Keep timer running when switching between one-shot and periodic mode Wanpeng Li
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 27+ messages in thread
From: Wanpeng Li @ 2017-09-29  1:04 UTC (permalink / raw)
  To: linux-kernel, kvm; +Cc: Paolo Bonzini, Radim Krčmář, Wanpeng Li

From: Wanpeng Li <wanpeng.li@hotmail.com>

SDM 10.5.4.1 (TSC-Deadline Mode) states that "Transitioning between TSC-Deadline
mode and other timer modes also disarms the timer". The APIC Timer Initial Count
Register used by one-shot/periodic mode should therefore be reset on such a
transition. This patch does so.
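
As a standalone illustration of the condition tested in the lapic.c hunk below,
here is a minimal sketch. The macro values are copied from the apicdef.h hunk
of this patch; the two helper functions are hypothetical and exist only for
this example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Values copied from the apicdef.h hunk in this patch. */
#define APIC_LVT_TIMER_MASK		(3 << 17)
#define APIC_LVT_TIMER_ONESHOT		(0 << 17)
#define APIC_LVT_TIMER_PERIODIC		(1 << 17)
#define APIC_LVT_TIMER_TSCDEADLINE	(2 << 17)

/* Hypothetical helper: true if the LVTT value selects TSC-deadline mode. */
static bool lvtt_is_tscdeadline(uint32_t lvtt)
{
	return (lvtt & APIC_LVT_TIMER_MASK) == APIC_LVT_TIMER_TSCDEADLINE;
}

/* Hypothetical helper: true if the write enters or leaves TSC-deadline mode. */
static bool lvtt_write_crosses_tscdeadline(uint32_t old_lvtt, uint32_t new_lvtt)
{
	return lvtt_is_tscdeadline(old_lvtt) != lvtt_is_tscdeadline(new_lvtt);
}
```

A switch between one-shot and periodic does not satisfy this predicate, so
with this patch alone TMICT is cleared only when TSC-deadline mode is involved.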

Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
---
 arch/x86/include/asm/apicdef.h | 1 +
 arch/x86/kvm/lapic.c           | 3 +++
 2 files changed, 4 insertions(+)

diff --git a/arch/x86/include/asm/apicdef.h b/arch/x86/include/asm/apicdef.h
index c46bb99..d8ef1b4 100644
--- a/arch/x86/include/asm/apicdef.h
+++ b/arch/x86/include/asm/apicdef.h
@@ -100,6 +100,7 @@
 #define		APIC_TIMER_BASE_CLKIN		0x0
 #define		APIC_TIMER_BASE_TMBASE		0x1
 #define		APIC_TIMER_BASE_DIV		0x2
+#define		APIC_LVT_TIMER_MASK		(3 << 17)
 #define		APIC_LVT_TIMER_ONESHOT		(0 << 17)
 #define		APIC_LVT_TIMER_PERIODIC		(1 << 17)
 #define		APIC_LVT_TIMER_TSCDEADLINE	(2 << 17)
diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index 69c5612..a739cbb 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -1722,6 +1722,9 @@ int kvm_lapic_reg_write(struct kvm_lapic *apic, u32 reg, u32 val)
 		break;
 
 	case APIC_LVTT:
+		if (apic_lvtt_tscdeadline(apic) != ((val &
+			APIC_LVT_TIMER_MASK) == APIC_LVT_TIMER_TSCDEADLINE))
+			kvm_lapic_set_reg(apic, APIC_TMICT, 0);
 		if (!kvm_apic_sw_enabled(apic))
 			val |= APIC_LVT_MASKED;
 		val &= (apic_lvt_mask[0] | apic->lapic_timer.timer_mode_mask);
-- 
2.7.4

* [PATCH v2 2/4] KVM: LAPIC: Keep timer running when switching between one-shot and periodic mode
  2017-09-29  1:04 [PATCH v2 0/4] KVM: LAPIC: Rework lapic timer to behave more like real-hardware Wanpeng Li
  2017-09-29  1:04 ` [PATCH v2 1/4] KVM: LAPIC: Fix lapic timer mode transition Wanpeng Li
@ 2017-09-29  1:04 ` Wanpeng Li
  2017-10-03 17:06   ` Radim Krčmář
  2017-09-29  1:04 ` [PATCH v2 3/4] KVM: LAPIC: Apply change to TDCR right away to the timer Wanpeng Li
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 27+ messages in thread
From: Wanpeng Li @ 2017-09-29  1:04 UTC (permalink / raw)
  To: linux-kernel, kvm; +Cc: Paolo Bonzini, Radim Krčmář, Wanpeng Li

From: Wanpeng Li <wanpeng.li@hotmail.com>

If we take TSC-deadline mode timer out of the picture, the Intel SDM
does not say that the timer is disabled when the timer mode is changed,
either from one-shot to periodic or vice versa.

After this patch, the timer is no longer disarmed on change of mode, so
the counter (TMCCT) keeps counting down.

So what does a write to LVTT change? On bare metal, the change of mode
is probably taken into account only when the counter reaches 0. When this
happens, LVTT is used to figure out whether the counter should restart
counting down from TMICT (periodic mode) or stop counting (one-shot mode).

This patch is based on observation of the behavior of the APIC timer on
bare metal, as well as a check that the observations do not contradict
the description in the Intel SDM.
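
The arithmetic that the set_target_expiration() hunk below uses to keep the
timer running can be sketched in isolation. The helper name and its flat
nanosecond parameters are hypothetical; the real code works on ktime_t and on
fields of struct kvm_lapic, and returns early when the period is 0.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper mirroring the arithmetic in the hunk. All values are
 * in nanoseconds and period_ns must be non-zero. Returns false when there
 * is nothing left to arm.
 */
static bool toy_next_delta(int64_t target_expiration_ns, int64_t now_ns,
			   uint64_t period_ns, bool timer_update,
			   uint64_t *delta_ns)
{
	int64_t remaining;

	if (!timer_update) {
		/* TMICT write: start a full period from now. */
		*delta_ns = period_ns;
	} else {
		/* Mode switch: keep whatever was left on the old deadline. */
		remaining = target_expiration_ns - now_ns;
		if (remaining < 0)
			remaining = 0;
		*delta_ns = (uint64_t)remaining % period_ns;
	}

	return *delta_ns != 0;
}
```

For example, with a 1000 ns period and a deadline 600 ns away, a mode switch
re-arms the timer for the remaining 600 ns instead of a fresh 1000 ns.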

Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
---
 arch/x86/kvm/lapic.c | 40 ++++++++++++++++++++++++++++------------
 1 file changed, 28 insertions(+), 12 deletions(-)

diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index a739cbb..946c11b 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -1301,7 +1301,7 @@ static void update_divide_count(struct kvm_lapic *apic)
 				   apic->divide_count);
 }
 
-static void apic_update_lvtt(struct kvm_lapic *apic)
+static bool apic_update_lvtt(struct kvm_lapic *apic)
 {
 	u32 timer_mode = kvm_lapic_get_reg(apic, APIC_LVTT) &
 			apic->lapic_timer.timer_mode_mask;
@@ -1309,7 +1309,9 @@ static void apic_update_lvtt(struct kvm_lapic *apic)
 	if (apic->lapic_timer.timer_mode != timer_mode) {
 		apic->lapic_timer.timer_mode = timer_mode;
 		hrtimer_cancel(&apic->lapic_timer.timer);
+		return true;
 	}
+	return false;
 }
 
 static void apic_timer_expired(struct kvm_lapic *apic)
@@ -1430,11 +1432,12 @@ static void start_sw_period(struct kvm_lapic *apic)
 		HRTIMER_MODE_ABS_PINNED);
 }
 
-static bool set_target_expiration(struct kvm_lapic *apic)
+static bool set_target_expiration(struct kvm_lapic *apic, bool timer_update)
 {
-	ktime_t now;
-	u64 tscl = rdtsc();
+	ktime_t now, remaining;
+	u64 tscl = rdtsc(), delta;
 
+	/* Calculate the next time the timer should trigger an interrupt */
 	now = ktime_get();
 	apic->lapic_timer.period = (u64)kvm_lapic_get_reg(apic, APIC_TMICT)
 		* APIC_BUS_CYCLE_NS * apic->divide_count;
@@ -1470,9 +1473,21 @@ static bool set_target_expiration(struct kvm_lapic *apic)
 		   ktime_to_ns(ktime_add_ns(now,
 				apic->lapic_timer.period)));
 
+	if (!timer_update)
+		delta = apic->lapic_timer.period;
+	else {
+		remaining = ktime_sub(apic->lapic_timer.target_expiration, now);
+		if (ktime_to_ns(remaining) < 0)
+			remaining = 0;
+		delta = mod_64(ktime_to_ns(remaining), apic->lapic_timer.period);
+	}
+
+	if (!delta)
+		return false;
+
 	apic->lapic_timer.tscdeadline = kvm_read_l1_tsc(apic->vcpu, tscl) +
-		nsec_to_cycles(apic->vcpu, apic->lapic_timer.period);
-	apic->lapic_timer.target_expiration = ktime_add_ns(now, apic->lapic_timer.period);
+		nsec_to_cycles(apic->vcpu, delta);
+	apic->lapic_timer.target_expiration = ktime_add_ns(now, delta);
 
 	return true;
 }
@@ -1609,12 +1624,12 @@ void kvm_lapic_restart_hv_timer(struct kvm_vcpu *vcpu)
 	restart_apic_timer(apic);
 }
 
-static void start_apic_timer(struct kvm_lapic *apic)
+static void start_apic_timer(struct kvm_lapic *apic, bool timer_update)
 {
 	atomic_set(&apic->lapic_timer.pending, 0);
 
 	if ((apic_lvtt_period(apic) || apic_lvtt_oneshot(apic))
-	    && !set_target_expiration(apic))
+	    && !set_target_expiration(apic, timer_update))
 		return;
 
 	restart_apic_timer(apic);
@@ -1729,7 +1744,8 @@ int kvm_lapic_reg_write(struct kvm_lapic *apic, u32 reg, u32 val)
 			val |= APIC_LVT_MASKED;
 		val &= (apic_lvt_mask[0] | apic->lapic_timer.timer_mode_mask);
 		kvm_lapic_set_reg(apic, APIC_LVTT, val);
-		apic_update_lvtt(apic);
+		if (apic_update_lvtt(apic) && !apic_lvtt_tscdeadline(apic))
+			start_apic_timer(apic, true);
 		break;
 
 	case APIC_TMICT:
@@ -1738,7 +1754,7 @@ int kvm_lapic_reg_write(struct kvm_lapic *apic, u32 reg, u32 val)
 
 		hrtimer_cancel(&apic->lapic_timer.timer);
 		kvm_lapic_set_reg(apic, APIC_TMICT, val);
-		start_apic_timer(apic);
+		start_apic_timer(apic, false);
 		break;
 
 	case APIC_TDCR:
@@ -1872,7 +1888,7 @@ void kvm_set_lapic_tscdeadline_msr(struct kvm_vcpu *vcpu, u64 data)
 
 	hrtimer_cancel(&apic->lapic_timer.timer);
 	apic->lapic_timer.tscdeadline = data;
-	start_apic_timer(apic);
+	start_apic_timer(apic, false);
 }
 
 void kvm_lapic_set_tpr(struct kvm_vcpu *vcpu, unsigned long cr8)
@@ -2238,7 +2254,7 @@ int kvm_apic_set_state(struct kvm_vcpu *vcpu, struct kvm_lapic_state *s)
 	apic_update_lvtt(apic);
 	apic_manage_nmi_watchdog(apic, kvm_lapic_get_reg(apic, APIC_LVT0));
 	update_divide_count(apic);
-	start_apic_timer(apic);
+	start_apic_timer(apic, false);
 	apic->irr_pending = true;
 	apic->isr_count = vcpu->arch.apicv_active ?
 				1 : count_vectors(apic->regs + APIC_ISR);
-- 
2.7.4

* [PATCH v2 3/4] KVM: LAPIC: Apply change to TDCR right away to the timer
  2017-09-29  1:04 [PATCH v2 0/4] KVM: LAPIC: Rework lapic timer to behave more like real-hardware Wanpeng Li
  2017-09-29  1:04 ` [PATCH v2 1/4] KVM: LAPIC: Fix lapic timer mode transition Wanpeng Li
  2017-09-29  1:04 ` [PATCH v2 2/4] KVM: LAPIC: Keep timer running when switching between one-shot and periodic mode Wanpeng Li
@ 2017-09-29  1:04 ` Wanpeng Li
  2017-10-03 17:28   ` Radim Krčmář
  2017-09-29  1:04 ` [PATCH v2 4/4] KVM: LAPIC: Don't silently accept bad vectors Wanpeng Li
  2017-10-05 10:57 ` [PATCH v2 0/4] KVM: LAPIC: Rework lapic timer to behave more like real-hardware Wanpeng Li
  4 siblings, 1 reply; 27+ messages in thread
From: Wanpeng Li @ 2017-09-29  1:04 UTC (permalink / raw)
  To: linux-kernel, kvm; +Cc: Paolo Bonzini, Radim Krčmář, Wanpeng Li

From: Wanpeng Li <wanpeng.li@hotmail.com>

The Intel SDM describes how the divide configuration register is used:
"The APIC timer frequency will be the processor's bus clock or core
crystal clock frequency divided by the value specified in the divide
configuration register."

Observation on bare metal shows that when the TDCR is changed, the TMCCT
does not change or make a big jump in value, but the rate at which it
counts down changes.

This patch updates the APIC timer emulation so that a change to the
divide configuration is reflected in the value of the counter and in
the time at which the next interrupt is triggered.
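
A small worked sketch of the rescaling step (delta * new divisor / old divisor)
shows why the visible count stays put while the deadline moves. The function
names and TOY_BUS_CYCLE_NS are assumptions made for this example only, and
integer division truncates when the values do not divide evenly.

```c
#include <stdint.h>

#define TOY_BUS_CYCLE_NS 1ULL	/* assumed bus cycle length, illustration only */

/* Count still visible in TMCCT when remaining_ns is left at this divisor. */
static uint32_t toy_tmcct(uint64_t remaining_ns, uint32_t divisor)
{
	return (uint32_t)(remaining_ns / (TOY_BUS_CYCLE_NS * divisor));
}

/* Stretch or shrink the time left so the same count remains at the new rate. */
static uint64_t toy_rescale_remaining(uint64_t remaining_ns,
				      uint32_t old_divisor,
				      uint32_t new_divisor)
{
	return remaining_ns * new_divisor / old_divisor;
}
```

With 6000 ns left at divisor 2 the count is 3000. Switching to divisor 16
rescales the remaining time to 48000 ns, and the count is still 3000.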

Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
---
 arch/x86/kvm/lapic.c | 31 +++++++++++++++++++++----------
 1 file changed, 21 insertions(+), 10 deletions(-)

diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index 946c11b..6bafd06 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -1432,7 +1432,7 @@ static void start_sw_period(struct kvm_lapic *apic)
 		HRTIMER_MODE_ABS_PINNED);
 }
 
-static bool set_target_expiration(struct kvm_lapic *apic, bool timer_update)
+static bool set_target_expiration(struct kvm_lapic *apic, bool timer_update, uint32_t old_divisor)
 {
 	ktime_t now, remaining;
 	u64 tscl = rdtsc(), delta;
@@ -1440,7 +1440,7 @@ static bool set_target_expiration(struct kvm_lapic *apic, bool timer_update)
 	/* Calculate the next time the timer should trigger an interrupt */
 	now = ktime_get();
 	apic->lapic_timer.period = (u64)kvm_lapic_get_reg(apic, APIC_TMICT)
-		* APIC_BUS_CYCLE_NS * apic->divide_count;
+		* APIC_BUS_CYCLE_NS * old_divisor;
 
 	if (!apic->lapic_timer.period)
 		return false;
@@ -1485,6 +1485,12 @@ static bool set_target_expiration(struct kvm_lapic *apic, bool timer_update)
 	if (!delta)
 		return false;
 
+	if (apic->divide_count != old_divisor) {
+		apic->lapic_timer.period = (u64)kvm_lapic_get_reg(apic, APIC_TMICT)
+			* APIC_BUS_CYCLE_NS * apic->divide_count;
+		delta = delta * apic->divide_count / old_divisor;
+	}
+
 	apic->lapic_timer.tscdeadline = kvm_read_l1_tsc(apic->vcpu, tscl) +
 		nsec_to_cycles(apic->vcpu, delta);
 	apic->lapic_timer.target_expiration = ktime_add_ns(now, delta);
@@ -1624,12 +1630,13 @@ void kvm_lapic_restart_hv_timer(struct kvm_vcpu *vcpu)
 	restart_apic_timer(apic);
 }
 
-static void start_apic_timer(struct kvm_lapic *apic, bool timer_update)
+static void start_apic_timer(struct kvm_lapic *apic, bool timer_update,
+				uint32_t old_divisor)
 {
 	atomic_set(&apic->lapic_timer.pending, 0);
 
 	if ((apic_lvtt_period(apic) || apic_lvtt_oneshot(apic))
-	    && !set_target_expiration(apic, timer_update))
+	    && !set_target_expiration(apic, timer_update, old_divisor))
 		return;
 
 	restart_apic_timer(apic);
@@ -1745,7 +1752,7 @@ int kvm_lapic_reg_write(struct kvm_lapic *apic, u32 reg, u32 val)
 		val &= (apic_lvt_mask[0] | apic->lapic_timer.timer_mode_mask);
 		kvm_lapic_set_reg(apic, APIC_LVTT, val);
 		if (apic_update_lvtt(apic) && !apic_lvtt_tscdeadline(apic))
-			start_apic_timer(apic, true);
+			start_apic_timer(apic, true, apic->divide_count);
 		break;
 
 	case APIC_TMICT:
@@ -1754,16 +1761,20 @@ int kvm_lapic_reg_write(struct kvm_lapic *apic, u32 reg, u32 val)
 
 		hrtimer_cancel(&apic->lapic_timer.timer);
 		kvm_lapic_set_reg(apic, APIC_TMICT, val);
-		start_apic_timer(apic, false);
+		start_apic_timer(apic, false, apic->divide_count);
 		break;
 
-	case APIC_TDCR:
+	case APIC_TDCR: {
+		uint32_t current_divisor = apic->divide_count;
+
 		if (val & 4)
 			apic_debug("KVM_WRITE:TDCR %x\n", val);
 		kvm_lapic_set_reg(apic, APIC_TDCR, val);
 		update_divide_count(apic);
+		hrtimer_cancel(&apic->lapic_timer.timer);
+		start_apic_timer(apic, true, current_divisor);
 		break;
-
+	}
 	case APIC_ESR:
 		if (apic_x2apic_mode(apic) && val != 0) {
 			apic_debug("KVM_WRITE:ESR not zero %x\n", val);
@@ -1888,7 +1899,7 @@ void kvm_set_lapic_tscdeadline_msr(struct kvm_vcpu *vcpu, u64 data)
 
 	hrtimer_cancel(&apic->lapic_timer.timer);
 	apic->lapic_timer.tscdeadline = data;
-	start_apic_timer(apic, false);
+	start_apic_timer(apic, false, apic->divide_count);
 }
 
 void kvm_lapic_set_tpr(struct kvm_vcpu *vcpu, unsigned long cr8)
@@ -2254,7 +2265,7 @@ int kvm_apic_set_state(struct kvm_vcpu *vcpu, struct kvm_lapic_state *s)
 	apic_update_lvtt(apic);
 	apic_manage_nmi_watchdog(apic, kvm_lapic_get_reg(apic, APIC_LVT0));
 	update_divide_count(apic);
-	start_apic_timer(apic, false);
+	start_apic_timer(apic, false, apic->divide_count);
 	apic->irr_pending = true;
 	apic->isr_count = vcpu->arch.apicv_active ?
 				1 : count_vectors(apic->regs + APIC_ISR);
-- 
2.7.4

* [PATCH v2 4/4] KVM: LAPIC: Don't silently accept bad vectors
  2017-09-29  1:04 [PATCH v2 0/4] KVM: LAPIC: Rework lapic timer to behave more like real-hardware Wanpeng Li
                   ` (2 preceding siblings ...)
  2017-09-29  1:04 ` [PATCH v2 3/4] KVM: LAPIC: Apply change to TDCR right away to the timer Wanpeng Li
@ 2017-09-29  1:04 ` Wanpeng Li
  2017-10-03 17:53   ` Radim Krčmář
  2017-10-05 10:57 ` [PATCH v2 0/4] KVM: LAPIC: Rework lapic timer to behave more like real-hardware Wanpeng Li
  4 siblings, 1 reply; 27+ messages in thread
From: Wanpeng Li @ 2017-09-29  1:04 UTC (permalink / raw)
  To: linux-kernel, kvm; +Cc: Paolo Bonzini, Radim Krčmář, Wanpeng Li

From: Wanpeng Li <wanpeng.li@hotmail.com>

Vectors 0-15 are reserved, and a physical LAPIC - upon sending or
receiving one - would generate an APIC error instead of doing the
requested action. Make our emulation behave similarly.
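
The check added by the hunks below can be sketched on its own. The constants
follow the SDM layout of the ESR (bit 5: send illegal vector, bit 6: received
illegal vector) as defined in apicdef.h; the toy_* helpers are hypothetical,
and only fixed delivery mode is checked, as in this version of the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Values as defined in arch/x86/include/asm/apicdef.h. */
#define APIC_DM_FIXED		0x00000
#define APIC_ESR_SENDILL	0x00020
#define APIC_ESR_RECVILL	0x00040

/* Hypothetical helper: fixed-mode vectors 0-15 are rejected. */
static bool toy_vector_is_illegal(uint32_t delivery_mode, uint32_t vector)
{
	return delivery_mode == APIC_DM_FIXED && vector < 16;
}

/* Hypothetical helper: ESR bit to log, or 0 if the vector is acceptable. */
static uint32_t toy_illegal_vector_error(uint32_t delivery_mode,
					 uint32_t vector, bool sending)
{
	if (!toy_vector_is_illegal(delivery_mode, vector))
		return 0;
	return sending ? APIC_ESR_SENDILL : APIC_ESR_RECVILL;
}
```

Delivery modes that ignore the vector field, such as NMI, fall through this
check untouched.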

Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
---
 arch/x86/kvm/lapic.c | 30 ++++++++++++++++++++++++++++--
 1 file changed, 28 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index 6bafd06..a779ba9 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -935,6 +935,25 @@ bool kvm_intr_is_single_vcpu_fast(struct kvm *kvm, struct kvm_lapic_irq *irq,
 	return ret;
 }
 
+static void apic_error(struct kvm_lapic *apic, unsigned long errmask)
+{
+	uint32_t esr;
+
+	esr = kvm_lapic_get_reg(apic, APIC_ESR);
+
+	if ((esr & errmask) != errmask) {
+		uint32_t lvterr = kvm_lapic_get_reg(apic, APIC_LVTERR);
+
+		kvm_lapic_set_reg(apic, APIC_ESR, esr | errmask);
+		if (!(lvterr & APIC_LVT_MASKED)) {
+			struct kvm_lapic_irq irq;
+
+			irq.vector = lvterr & 0xff;
+			kvm_irq_delivery_to_apic(apic->vcpu->kvm, apic, &irq, NULL);
+		}
+	}
+}
+
 /*
  * Add a pending IRQ into lapic.
  * Return 1 if successfully added and 0 if discarded.
@@ -946,6 +965,11 @@ static int __apic_accept_irq(struct kvm_lapic *apic, int delivery_mode,
 	int result = 0;
 	struct kvm_vcpu *vcpu = apic->vcpu;
 
+	if (unlikely(vector < 16) && delivery_mode == APIC_DM_FIXED) {
+		apic_error(apic, APIC_ESR_RECVILL);
+		return 0;
+	}
+
 	trace_kvm_apic_accept_irq(vcpu->vcpu_id, delivery_mode,
 				  trig_mode, vector);
 	switch (delivery_mode) {
@@ -1146,7 +1170,10 @@ static void apic_send_ipi(struct kvm_lapic *apic)
 		   irq.trig_mode, irq.level, irq.dest_mode, irq.delivery_mode,
 		   irq.vector, irq.msi_redir_hint);
 
-	kvm_irq_delivery_to_apic(apic->vcpu->kvm, apic, &irq, NULL);
+	if (unlikely(irq.vector < 16 && irq.delivery_mode == APIC_DM_FIXED))
+		apic_error(apic, APIC_ESR_SENDILL);
+	else
+		kvm_irq_delivery_to_apic(apic->vcpu->kvm, apic, &irq, NULL);
 }
 
 static u32 apic_get_tmcct(struct kvm_lapic *apic)
@@ -1734,7 +1761,6 @@ int kvm_lapic_reg_write(struct kvm_lapic *apic, u32 reg, u32 val)
 	case APIC_LVTPC:
 	case APIC_LVT1:
 	case APIC_LVTERR:
-		/* TODO: Check vector */
 		if (!kvm_apic_sw_enabled(apic))
 			val |= APIC_LVT_MASKED;
 
-- 
2.7.4

* Re: [PATCH v2 1/4] KVM: LAPIC: Fix lapic timer mode transition
  2017-09-29  1:04 ` [PATCH v2 1/4] KVM: LAPIC: Fix lapic timer mode transition Wanpeng Li
@ 2017-10-03 17:05   ` Radim Krčmář
  2017-10-04  1:45     ` Wanpeng Li
  0 siblings, 1 reply; 27+ messages in thread
From: Radim Krčmář @ 2017-10-03 17:05 UTC (permalink / raw)
  To: Wanpeng Li; +Cc: linux-kernel, kvm, Paolo Bonzini, Wanpeng Li

2017-09-28 18:04-0700, Wanpeng Li:
> From: Wanpeng Li <wanpeng.li@hotmail.com>
> 
> SDM 10.5.4.1 (TSC-Deadline Mode) states that "Transitioning between TSC-Deadline
> mode and other timer modes also disarms the timer". The APIC Timer Initial Count
> Register used by one-shot/periodic mode should therefore be reset on such a
> transition. This patch does so.

The beginning of the section also says:

  A write to the LVT Timer Register that changes the timer mode disarms
  the local APIC timer. The supported timer modes are given in Table
  10-2. The three modes of the local APIC timer are mutually exclusive.

So we should also disarm when switching between one-shot and periodic.

apic_update_lvtt() already has logic to determine whether the timer mode
has changed and is the perfect place to clear APIC_TMICT.
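
A rough sketch of that suggestion, using a stripped-down stand-in for struct
kvm_lapic (the toy_* names and fields are hypothetical, not the real layout):

```c
#include <stdbool.h>
#include <stdint.h>

#define TOY_TIMER_MODE_MASK	(3u << 17)	/* LVTT timer mode field */

/* Hypothetical stand-in for the few fields this example needs. */
struct toy_lapic {
	uint32_t lvtt;		/* LVT Timer register */
	uint32_t tmict;		/* initial count register */
	uint32_t timer_mode;	/* cached copy of the mode field */
};

/* Disarm on any change of timer mode; returns true if the mode changed. */
static bool toy_update_lvtt(struct toy_lapic *apic)
{
	uint32_t mode = apic->lvtt & TOY_TIMER_MODE_MASK;

	if (apic->timer_mode == mode)
		return false;

	apic->timer_mode = mode;
	apic->tmict = 0;	/* a mode change disarms the timer */
	return true;
}
```

Unlike the check quoted below, this also clears TMICT on a switch between
one-shot and periodic.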

Thanks.

> Cc: Paolo Bonzini <pbonzini@redhat.com>
> Cc: Radim Krčmář <rkrcmar@redhat.com>
> Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
> ---
>  arch/x86/include/asm/apicdef.h | 1 +
>  arch/x86/kvm/lapic.c           | 3 +++
>  2 files changed, 4 insertions(+)
> 
> diff --git a/arch/x86/include/asm/apicdef.h b/arch/x86/include/asm/apicdef.h
> index c46bb99..d8ef1b4 100644
> --- a/arch/x86/include/asm/apicdef.h
> +++ b/arch/x86/include/asm/apicdef.h
> @@ -100,6 +100,7 @@
>  #define		APIC_TIMER_BASE_CLKIN		0x0
>  #define		APIC_TIMER_BASE_TMBASE		0x1
>  #define		APIC_TIMER_BASE_DIV		0x2
> +#define		APIC_LVT_TIMER_MASK		(3 << 17)
>  #define		APIC_LVT_TIMER_ONESHOT		(0 << 17)
>  #define		APIC_LVT_TIMER_PERIODIC		(1 << 17)
>  #define		APIC_LVT_TIMER_TSCDEADLINE	(2 << 17)
> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
> index 69c5612..a739cbb 100644
> --- a/arch/x86/kvm/lapic.c
> +++ b/arch/x86/kvm/lapic.c
> @@ -1722,6 +1722,9 @@ int kvm_lapic_reg_write(struct kvm_lapic *apic, u32 reg, u32 val)
>  		break;
>  
>  	case APIC_LVTT:
> +		if (apic_lvtt_tscdeadline(apic) != ((val &
> +			APIC_LVT_TIMER_MASK) == APIC_LVT_TIMER_TSCDEADLINE))
> +			kvm_lapic_set_reg(apic, APIC_TMICT, 0);
>  		if (!kvm_apic_sw_enabled(apic))
>  			val |= APIC_LVT_MASKED;
>  		val &= (apic_lvt_mask[0] | apic->lapic_timer.timer_mode_mask);
> -- 
> 2.7.4
> 

* Re: [PATCH v2 2/4] KVM: LAPIC: Keep timer running when switching between one-shot and periodic mode
  2017-09-29  1:04 ` [PATCH v2 2/4] KVM: LAPIC: Keep timer running when switching between one-shot and periodic mode Wanpeng Li
@ 2017-10-03 17:06   ` Radim Krčmář
  2017-10-04  1:46     ` Wanpeng Li
  0 siblings, 1 reply; 27+ messages in thread
From: Radim Krčmář @ 2017-10-03 17:06 UTC (permalink / raw)
  To: Wanpeng Li; +Cc: linux-kernel, kvm, Paolo Bonzini, Wanpeng Li

2017-09-28 18:04-0700, Wanpeng Li:
> From: Wanpeng Li <wanpeng.li@hotmail.com>
> 
> If we take TSC-deadline mode timer out of the picture, the Intel SDM
> does not say that the timer is disabled when the timer mode is changed,
> either from one-shot to periodic or vice versa.

I think it does, please see comment under [v2 1/4].

> After this patch, the timer is no longer disarmed on change of mode, so
> the counter (TMCCT) keeps counting down.
> 
> So what does a write to LVTT change? On bare metal, the change of mode
> is probably taken into account only when the counter reaches 0. When this
> happens, LVTT is used to figure out whether the counter should restart
> counting down from TMICT (periodic mode) or stop counting (one-shot mode).
> 
> This patch is based on observation of the behavior of the APIC timer on
> bare metal, as well as a check that the observations do not contradict
> the description in the Intel SDM.
> 
> Cc: Paolo Bonzini <pbonzini@redhat.com>
> Cc: Radim Krčmář <rkrcmar@redhat.com>
> Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
> ---
>  arch/x86/kvm/lapic.c | 40 ++++++++++++++++++++++++++++------------
>  1 file changed, 28 insertions(+), 12 deletions(-)
> 
> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
> index a739cbb..946c11b 100644
> --- a/arch/x86/kvm/lapic.c
> +++ b/arch/x86/kvm/lapic.c
> @@ -1301,7 +1301,7 @@ static void update_divide_count(struct kvm_lapic *apic)
>  				   apic->divide_count);
>  }
>  
> -static void apic_update_lvtt(struct kvm_lapic *apic)
> +static bool apic_update_lvtt(struct kvm_lapic *apic)
>  {
>  	u32 timer_mode = kvm_lapic_get_reg(apic, APIC_LVTT) &
>  			apic->lapic_timer.timer_mode_mask;
> @@ -1309,7 +1309,9 @@ static void apic_update_lvtt(struct kvm_lapic *apic)
>  	if (apic->lapic_timer.timer_mode != timer_mode) {
>  		apic->lapic_timer.timer_mode = timer_mode;
>  		hrtimer_cancel(&apic->lapic_timer.timer);
> +		return true;
>  	}
> +	return false;
>  }
>  
>  static void apic_timer_expired(struct kvm_lapic *apic)
> @@ -1430,11 +1432,12 @@ static void start_sw_period(struct kvm_lapic *apic)
>  		HRTIMER_MODE_ABS_PINNED);
>  }
>  
> -static bool set_target_expiration(struct kvm_lapic *apic)
> +static bool set_target_expiration(struct kvm_lapic *apic, bool timer_update)
>  {
> -	ktime_t now;
> -	u64 tscl = rdtsc();
> +	ktime_t now, remaining;
> +	u64 tscl = rdtsc(), delta;
>  
> +	/* Calculate the next time the timer should trigger an interrupt */
>  	now = ktime_get();
>  	apic->lapic_timer.period = (u64)kvm_lapic_get_reg(apic, APIC_TMICT)
>  		* APIC_BUS_CYCLE_NS * apic->divide_count;
> @@ -1470,9 +1473,21 @@ static bool set_target_expiration(struct kvm_lapic *apic)
>  		   ktime_to_ns(ktime_add_ns(now,
>  				apic->lapic_timer.period)));
>  
> +	if (!timer_update)
> +		delta = apic->lapic_timer.period;
> +	else {
> +		remaining = ktime_sub(apic->lapic_timer.target_expiration, now);
> +		if (ktime_to_ns(remaining) < 0)
> +			remaining = 0;
> +		delta = mod_64(ktime_to_ns(remaining), apic->lapic_timer.period);
> +	}
> +
> +	if (!delta)
> +		return false;
> +
>  	apic->lapic_timer.tscdeadline = kvm_read_l1_tsc(apic->vcpu, tscl) +
> -		nsec_to_cycles(apic->vcpu, apic->lapic_timer.period);
> -	apic->lapic_timer.target_expiration = ktime_add_ns(now, apic->lapic_timer.period);
> +		nsec_to_cycles(apic->vcpu, delta);
> +	apic->lapic_timer.target_expiration = ktime_add_ns(now, delta);
>  
>  	return true;
>  }
> @@ -1609,12 +1624,12 @@ void kvm_lapic_restart_hv_timer(struct kvm_vcpu *vcpu)
>  	restart_apic_timer(apic);
>  }
>  
> -static void start_apic_timer(struct kvm_lapic *apic)
> +static void start_apic_timer(struct kvm_lapic *apic, bool timer_update)
>  {
>  	atomic_set(&apic->lapic_timer.pending, 0);
>  
>  	if ((apic_lvtt_period(apic) || apic_lvtt_oneshot(apic))
> -	    && !set_target_expiration(apic))
> +	    && !set_target_expiration(apic, timer_update))
>  		return;
>  
>  	restart_apic_timer(apic);
> @@ -1729,7 +1744,8 @@ int kvm_lapic_reg_write(struct kvm_lapic *apic, u32 reg, u32 val)
>  			val |= APIC_LVT_MASKED;
>  		val &= (apic_lvt_mask[0] | apic->lapic_timer.timer_mode_mask);
>  		kvm_lapic_set_reg(apic, APIC_LVTT, val);
> -		apic_update_lvtt(apic);
> +		if (apic_update_lvtt(apic) && !apic_lvtt_tscdeadline(apic))
> +			start_apic_timer(apic, true);
>  		break;
>  
>  	case APIC_TMICT:
> @@ -1738,7 +1754,7 @@ int kvm_lapic_reg_write(struct kvm_lapic *apic, u32 reg, u32 val)
>  
>  		hrtimer_cancel(&apic->lapic_timer.timer);
>  		kvm_lapic_set_reg(apic, APIC_TMICT, val);
> -		start_apic_timer(apic);
> +		start_apic_timer(apic, false);
>  		break;
>  
>  	case APIC_TDCR:
> @@ -1872,7 +1888,7 @@ void kvm_set_lapic_tscdeadline_msr(struct kvm_vcpu *vcpu, u64 data)
>  
>  	hrtimer_cancel(&apic->lapic_timer.timer);
>  	apic->lapic_timer.tscdeadline = data;
> -	start_apic_timer(apic);
> +	start_apic_timer(apic, false);
>  }
>  
>  void kvm_lapic_set_tpr(struct kvm_vcpu *vcpu, unsigned long cr8)
> @@ -2238,7 +2254,7 @@ int kvm_apic_set_state(struct kvm_vcpu *vcpu, struct kvm_lapic_state *s)
>  	apic_update_lvtt(apic);
>  	apic_manage_nmi_watchdog(apic, kvm_lapic_get_reg(apic, APIC_LVT0));
>  	update_divide_count(apic);
> -	start_apic_timer(apic);
> +	start_apic_timer(apic, false);
>  	apic->irr_pending = true;
>  	apic->isr_count = vcpu->arch.apicv_active ?
>  				1 : count_vectors(apic->regs + APIC_ISR);
> -- 
> 2.7.4
> 

* Re: [PATCH v2 3/4] KVM: LAPIC: Apply change to TDCR right away to the timer
  2017-09-29  1:04 ` [PATCH v2 3/4] KVM: LAPIC: Apply change to TDCR right away to the timer Wanpeng Li
@ 2017-10-03 17:28   ` Radim Krčmář
  2017-10-04  1:59     ` Wanpeng Li
  0 siblings, 1 reply; 27+ messages in thread
From: Radim Krčmář @ 2017-10-03 17:28 UTC (permalink / raw)
  To: Wanpeng Li; +Cc: linux-kernel, kvm, Paolo Bonzini, Wanpeng Li

2017-09-28 18:04-0700, Wanpeng Li:
> From: Wanpeng Li <wanpeng.li@hotmail.com>
> 
> The Intel SDM describes how the divide configuration register is used:
> "The APIC timer frequency will be the processor's bus clock or core
> crystal clock frequency divided by the value specified in the divide
> configuration register."
> 
> Observation on bare metal shows that when the TDCR is changed, the TMCCT
> does not change or make a big jump in value, but the rate at which it
> counts down changes.
> 
> This patch updates the APIC timer emulation so that a change to the
> divide configuration is reflected in the value of the counter and in
> the time at which the next interrupt is triggered.
> 
> Cc: Paolo Bonzini <pbonzini@redhat.com>
> Cc: Radim Krčmář <rkrcmar@redhat.com>
> Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
> ---

Why do we need to do more than just restart the timer?

The TMCCT should remain roughly at the same level -- changing the divide
count modifies target_expiration, and it looks like apic_get_tmcct()
would return the same result as before the divide count was changed.

Thanks.

* Re: [PATCH v2 4/4] KVM: LAPIC: Don't silently accept bad vectors
  2017-09-29  1:04 ` [PATCH v2 4/4] KVM: LAPIC: Don't silently accept bad vectors Wanpeng Li
@ 2017-10-03 17:53   ` Radim Krčmář
  2017-10-04  7:56     ` Wanpeng Li
  0 siblings, 1 reply; 27+ messages in thread
From: Radim Krčmář @ 2017-10-03 17:53 UTC (permalink / raw)
  To: Wanpeng Li; +Cc: linux-kernel, kvm, Paolo Bonzini, Wanpeng Li

2017-09-28 18:04-0700, Wanpeng Li:
> From: Wanpeng Li <wanpeng.li@hotmail.com>
> 
> Vectors 0-15 are reserved, and a physical LAPIC - upon sending or
> receiving one - would generate an APIC error instead of doing the
> requested action. Make our emulation behave similarly.
> 
> Cc: Paolo Bonzini <pbonzini@redhat.com>
> Cc: Radim Krčmář <rkrcmar@redhat.com>
> Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
> ---
>  arch/x86/kvm/lapic.c | 30 ++++++++++++++++++++++++++++--
>  1 file changed, 28 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
> index 6bafd06..a779ba9 100644
> --- a/arch/x86/kvm/lapic.c
> +++ b/arch/x86/kvm/lapic.c
> @@ -935,6 +935,25 @@ bool kvm_intr_is_single_vcpu_fast(struct kvm *kvm, struct kvm_lapic_irq *irq,
>  	return ret;
>  }
>  
> +static void apic_error(struct kvm_lapic *apic, unsigned long errmask)
> +{
> +	uint32_t esr;
> +
> +	esr = kvm_lapic_get_reg(apic, APIC_ESR);
> +
> +	if ((esr & errmask) != errmask) {

The spec makes me think that there is going to be only 1 interrupt
(regardless of the number of errors) until the software writes 0 to
APIC_ESR.  Is there a better description than the following 10.5.3?

  The ESR is a write/read register. Before attempt to read from the ESR,
  software should first write to it. (The value written does not affect
  the values read subsequently; only zero may be written in x2APIC
  mode.) This write clears any previously logged errors and updates the
  ESR with any errors detected since the last write to the ESR. This
  write also rearms the APIC error interrupt triggering mechanism.

This also describes a different handling of APIC_ESR -- APIC_ESR is
updated only on software writes to APIC_ESR.  All errors in between seem
to be logged internally (not sure where to migrate it).

> +		uint32_t lvterr = kvm_lapic_get_reg(apic, APIC_LVTERR);
> +
> +		kvm_lapic_set_reg(apic, APIC_ESR, esr | errmask);
> +		if (!(lvterr & APIC_LVT_MASKED)) {
> +			struct kvm_lapic_irq irq;
> +
> +			irq.vector = lvterr & 0xff;
> +			kvm_irq_delivery_to_apic(apic->vcpu->kvm, apic, &irq, NULL);
> +		}
> +	}
> +}
> +
>  /*
>   * Add a pending IRQ into lapic.
>   * Return 1 if successfully added and 0 if discarded.
> @@ -946,6 +965,11 @@ static int __apic_accept_irq(struct kvm_lapic *apic, int delivery_mode,
>  	int result = 0;
>  	struct kvm_vcpu *vcpu = apic->vcpu;
>  
> +	if (unlikely(vector < 16) && delivery_mode == APIC_DM_FIXED) {
> +		apic_error(apic, APIC_ESR_RECVILL);

The error is also triggered when lowest-priority delivery is supported
and is used to deliver an invalid vector.

> +		return 0;
> +	}
> +
>  	trace_kvm_apic_accept_irq(vcpu->vcpu_id, delivery_mode,
>  				  trig_mode, vector);
>  	switch (delivery_mode) {
> @@ -1146,7 +1170,10 @@ static void apic_send_ipi(struct kvm_lapic *apic)
>  		   irq.trig_mode, irq.level, irq.dest_mode, irq.delivery_mode,
>  		   irq.vector, irq.msi_redir_hint);
>  
> -	kvm_irq_delivery_to_apic(apic->vcpu->kvm, apic, &irq, NULL);
> +	if (unlikely(irq.vector < 16 && irq.delivery_mode == APIC_DM_FIXED))

Please check how APICv self-IPI acceleration behaves, so we're
consistent.

Thanks.

> +		apic_error(apic, APIC_ESR_SENDILL);
> +	else
> +		kvm_irq_delivery_to_apic(apic->vcpu->kvm, apic, &irq, NULL);
>  }
>  
>  static u32 apic_get_tmcct(struct kvm_lapic *apic)
> @@ -1734,7 +1761,6 @@ int kvm_lapic_reg_write(struct kvm_lapic *apic, u32 reg, u32 val)
>  	case APIC_LVTPC:
>  	case APIC_LVT1:
>  	case APIC_LVTERR:
> -		/* TODO: Check vector */
>  		if (!kvm_apic_sw_enabled(apic))
>  			val |= APIC_LVT_MASKED;
>  
> -- 
> 2.7.4
> 

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v2 1/4] KVM: LAPIC: Fix lapic timer mode transition
  2017-10-03 17:05   ` Radim Krčmář
@ 2017-10-04  1:45     ` Wanpeng Li
  2017-10-04 13:21       ` Radim Krčmář
  0 siblings, 1 reply; 27+ messages in thread
From: Wanpeng Li @ 2017-10-04  1:45 UTC (permalink / raw)
  To: Radim Krčmář; +Cc: linux-kernel, kvm, Paolo Bonzini, Wanpeng Li

2017-10-04 1:05 GMT+08:00 Radim Krčmář <rkrcmar@redhat.com>:
> 2017-09-28 18:04-0700, Wanpeng Li:
>> From: Wanpeng Li <wanpeng.li@hotmail.com>
>>
>> SDM 10.5.4.1 TSC-Deadline Mode mentioned that "Transitioning between TSC-Deadline
>> mode and other timer modes also disarms the timer". So the APIC Timer Initial Count
>> Register for one-shot/periodic mode should be reset. This patch do it.
>
> At the beginning of the section is also:
>
>   A write to the LVT Timer Register that changes the timer mode disarms
>   the local APIC timer. The supported timer modes are given in Table
>   10-2. The three modes of the local APIC timer are mutually exclusive.

Yeah, I saw it before sending out the patches, but it appears in the
TSC-deadline section, which looks strange. Was the timer also disarmed
when switching between one-shot and periodic mode before TSC-deadline
was introduced, i.e. without the TSC-deadline section?

>
> So we should also disarm when switching between one-shot and periodic.
>
> apic_update_lvtt() already has logic to determine whether the timer mode
> has changed and is the perfect place to clear APIC_TMICT.

Agreed, thanks for your review, Radim. :)

Regards,
Wanpeng Li

>
> Thanks.
>
>> Cc: Paolo Bonzini <pbonzini@redhat.com>
>> Cc: Radim Krčmář <rkrcmar@redhat.com>
>> Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
>> ---
>>  arch/x86/include/asm/apicdef.h | 1 +
>>  arch/x86/kvm/lapic.c           | 3 +++
>>  2 files changed, 4 insertions(+)
>>
>> diff --git a/arch/x86/include/asm/apicdef.h b/arch/x86/include/asm/apicdef.h
>> index c46bb99..d8ef1b4 100644
>> --- a/arch/x86/include/asm/apicdef.h
>> +++ b/arch/x86/include/asm/apicdef.h
>> @@ -100,6 +100,7 @@
>>  #define              APIC_TIMER_BASE_CLKIN           0x0
>>  #define              APIC_TIMER_BASE_TMBASE          0x1
>>  #define              APIC_TIMER_BASE_DIV             0x2
>> +#define              APIC_LVT_TIMER_MASK             (3 << 17)
>>  #define              APIC_LVT_TIMER_ONESHOT          (0 << 17)
>>  #define              APIC_LVT_TIMER_PERIODIC         (1 << 17)
>>  #define              APIC_LVT_TIMER_TSCDEADLINE      (2 << 17)
>> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
>> index 69c5612..a739cbb 100644
>> --- a/arch/x86/kvm/lapic.c
>> +++ b/arch/x86/kvm/lapic.c
>> @@ -1722,6 +1722,9 @@ int kvm_lapic_reg_write(struct kvm_lapic *apic, u32 reg, u32 val)
>>               break;
>>
>>       case APIC_LVTT:
>> +             if (apic_lvtt_tscdeadline(apic) != ((val &
>> +                     APIC_LVT_TIMER_MASK) == APIC_LVT_TIMER_TSCDEADLINE))
>> +                     kvm_lapic_set_reg(apic, APIC_TMICT, 0);
>>               if (!kvm_apic_sw_enabled(apic))
>>                       val |= APIC_LVT_MASKED;
>>               val &= (apic_lvt_mask[0] | apic->lapic_timer.timer_mode_mask);
>> --
>> 2.7.4
>>

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v2 2/4] KVM: LAPIC: Keep timer running when switching between one-shot and periodic mode
  2017-10-03 17:06   ` Radim Krčmář
@ 2017-10-04  1:46     ` Wanpeng Li
  2017-10-04 13:33       ` Radim Krčmář
  0 siblings, 1 reply; 27+ messages in thread
From: Wanpeng Li @ 2017-10-04  1:46 UTC (permalink / raw)
  To: Radim Krčmář; +Cc: linux-kernel, kvm, Paolo Bonzini, Wanpeng Li

2017-10-04 1:06 GMT+08:00 Radim Krčmář <rkrcmar@redhat.com>:
> 2017-09-28 18:04-0700, Wanpeng Li:
>> From: Wanpeng Li <wanpeng.li@hotmail.com>
>>
>> If we take TSC-deadline mode timer out of the picture, the Intel SDM
>> does not say that the timer is disable when the timer mode is change,
>> either from one-shot to periodic or vice versa.
>
> I think it does, please see comment under [v2 1/4].

As I replied to [v2 1/4].

Regards,
Wanpeng Li

>
>> After this patch, the timer is no longer disarmed on change of mode, so
>> the counter (TMCCT) keeps counting down.
>>
>> So what does a write to LVTT changes ? On baremetal, the change of mode
>> is probably taken into account only when the counter reach 0. When this
>> happen, LVTT is use to figure out if the counter should restard counting
>> down from TMICT (so periodic mode) or stop counting (if one-shot mode).
>>
>> This patch is based on observation of the behavior of the APIC timer on
>> baremetal as well as check that they does not go against the description
>> written in the Intel SDM.
>>
>> Cc: Paolo Bonzini <pbonzini@redhat.com>
>> Cc: Radim Krčmář <rkrcmar@redhat.com>
>> Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
>> ---
>>  arch/x86/kvm/lapic.c | 40 ++++++++++++++++++++++++++++------------
>>  1 file changed, 28 insertions(+), 12 deletions(-)
>>
>> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
>> index a739cbb..946c11b 100644
>> --- a/arch/x86/kvm/lapic.c
>> +++ b/arch/x86/kvm/lapic.c
>> @@ -1301,7 +1301,7 @@ static void update_divide_count(struct kvm_lapic *apic)
>>                                  apic->divide_count);
>>  }
>>
>> -static void apic_update_lvtt(struct kvm_lapic *apic)
>> +static bool apic_update_lvtt(struct kvm_lapic *apic)
>>  {
>>       u32 timer_mode = kvm_lapic_get_reg(apic, APIC_LVTT) &
>>                       apic->lapic_timer.timer_mode_mask;
>> @@ -1309,7 +1309,9 @@ static void apic_update_lvtt(struct kvm_lapic *apic)
>>       if (apic->lapic_timer.timer_mode != timer_mode) {
>>               apic->lapic_timer.timer_mode = timer_mode;
>>               hrtimer_cancel(&apic->lapic_timer.timer);
>> +             return true;
>>       }
>> +     return false;
>>  }
>>
>>  static void apic_timer_expired(struct kvm_lapic *apic)
>> @@ -1430,11 +1432,12 @@ static void start_sw_period(struct kvm_lapic *apic)
>>               HRTIMER_MODE_ABS_PINNED);
>>  }
>>
>> -static bool set_target_expiration(struct kvm_lapic *apic)
>> +static bool set_target_expiration(struct kvm_lapic *apic, bool timer_update)
>>  {
>> -     ktime_t now;
>> -     u64 tscl = rdtsc();
>> +     ktime_t now, remaining;
>> +     u64 tscl = rdtsc(), delta;
>>
>> +     /* Calculate the next time the timer should trigger an interrupt */
>>       now = ktime_get();
>>       apic->lapic_timer.period = (u64)kvm_lapic_get_reg(apic, APIC_TMICT)
>>               * APIC_BUS_CYCLE_NS * apic->divide_count;
>> @@ -1470,9 +1473,21 @@ static bool set_target_expiration(struct kvm_lapic *apic)
>>                  ktime_to_ns(ktime_add_ns(now,
>>                               apic->lapic_timer.period)));
>>
>> +     if (!timer_update)
>> +             delta = apic->lapic_timer.period;
>> +     else {
>> +             remaining = ktime_sub(apic->lapic_timer.target_expiration, now);
>> +             if (ktime_to_ns(remaining) < 0)
>> +                     remaining = 0;
>> +             delta = mod_64(ktime_to_ns(remaining), apic->lapic_timer.period);
>> +     }
>> +
>> +     if (!delta)
>> +             return false;
>> +
>>       apic->lapic_timer.tscdeadline = kvm_read_l1_tsc(apic->vcpu, tscl) +
>> -             nsec_to_cycles(apic->vcpu, apic->lapic_timer.period);
>> -     apic->lapic_timer.target_expiration = ktime_add_ns(now, apic->lapic_timer.period);
>> +             nsec_to_cycles(apic->vcpu, delta);
>> +     apic->lapic_timer.target_expiration = ktime_add_ns(now, delta);
>>
>>       return true;
>>  }
>> @@ -1609,12 +1624,12 @@ void kvm_lapic_restart_hv_timer(struct kvm_vcpu *vcpu)
>>       restart_apic_timer(apic);
>>  }
>>
>> -static void start_apic_timer(struct kvm_lapic *apic)
>> +static void start_apic_timer(struct kvm_lapic *apic, bool timer_update)
>>  {
>>       atomic_set(&apic->lapic_timer.pending, 0);
>>
>>       if ((apic_lvtt_period(apic) || apic_lvtt_oneshot(apic))
>> -         && !set_target_expiration(apic))
>> +         && !set_target_expiration(apic, timer_update))
>>               return;
>>
>>       restart_apic_timer(apic);
>> @@ -1729,7 +1744,8 @@ int kvm_lapic_reg_write(struct kvm_lapic *apic, u32 reg, u32 val)
>>                       val |= APIC_LVT_MASKED;
>>               val &= (apic_lvt_mask[0] | apic->lapic_timer.timer_mode_mask);
>>               kvm_lapic_set_reg(apic, APIC_LVTT, val);
>> -             apic_update_lvtt(apic);
>> +             if (apic_update_lvtt(apic) && !apic_lvtt_tscdeadline(apic))
>> +                     start_apic_timer(apic, true);
>>               break;
>>
>>       case APIC_TMICT:
>> @@ -1738,7 +1754,7 @@ int kvm_lapic_reg_write(struct kvm_lapic *apic, u32 reg, u32 val)
>>
>>               hrtimer_cancel(&apic->lapic_timer.timer);
>>               kvm_lapic_set_reg(apic, APIC_TMICT, val);
>> -             start_apic_timer(apic);
>> +             start_apic_timer(apic, false);
>>               break;
>>
>>       case APIC_TDCR:
>> @@ -1872,7 +1888,7 @@ void kvm_set_lapic_tscdeadline_msr(struct kvm_vcpu *vcpu, u64 data)
>>
>>       hrtimer_cancel(&apic->lapic_timer.timer);
>>       apic->lapic_timer.tscdeadline = data;
>> -     start_apic_timer(apic);
>> +     start_apic_timer(apic, false);
>>  }
>>
>>  void kvm_lapic_set_tpr(struct kvm_vcpu *vcpu, unsigned long cr8)
>> @@ -2238,7 +2254,7 @@ int kvm_apic_set_state(struct kvm_vcpu *vcpu, struct kvm_lapic_state *s)
>>       apic_update_lvtt(apic);
>>       apic_manage_nmi_watchdog(apic, kvm_lapic_get_reg(apic, APIC_LVT0));
>>       update_divide_count(apic);
>> -     start_apic_timer(apic);
>> +     start_apic_timer(apic, false);
>>       apic->irr_pending = true;
>>       apic->isr_count = vcpu->arch.apicv_active ?
>>                               1 : count_vectors(apic->regs + APIC_ISR);
>> --
>> 2.7.4
>>

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v2 3/4] KVM: LAPIC: Apply change to TDCR right away to the timer
  2017-10-03 17:28   ` Radim Krčmář
@ 2017-10-04  1:59     ` Wanpeng Li
  2017-10-04 12:43       ` Radim Krčmář
  0 siblings, 1 reply; 27+ messages in thread
From: Wanpeng Li @ 2017-10-04  1:59 UTC (permalink / raw)
  To: Radim Krčmář; +Cc: linux-kernel, kvm, Paolo Bonzini, Wanpeng Li

2017-10-04 1:28 GMT+08:00 Radim Krčmář <rkrcmar@redhat.com>:
> 2017-09-28 18:04-0700, Wanpeng Li:
>> From: Wanpeng Li <wanpeng.li@hotmail.com>
>>
>> The description in the Intel SDM of how the divide configuration
>> register is used: "The APIC timer frequency will be the processor's bus
>> clock or core crystal clock frequency divided by the value specified in
>> the divide configuration register."
>>
>> Observation of baremetal shown that when the TDCR is change, the TMCCT
>> does not change or make a big jump in value, but the rate at which it
>> count down change.
>>
>> The patch update the emulation to APIC timer to so that a change to the
>> divide configuration would be reflected in the value of the counter and
>> when the next interrupt is triggered.
>>
>> Cc: Paolo Bonzini <pbonzini@redhat.com>
>> Cc: Radim Krčmář <rkrcmar@redhat.com>
>> Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
>> ---
>
> Why do we need to do more than just restart the timer?

Because the current timer (hv or sw) is still running. I think the
goal of this commit is to update the rate of the running timer at
runtime. Our restart_apic_timer() implementation just cancels the
current timer when switching between the preemption timer and the hrtimer.

Regards,
Wanpeng Li

>
> The TMCCT should remain roughly at the same level -- changing divide
> count modifies target_expiration and it looks like apic_get_tmcct()
> would get the same result like before changing divide count.
>
> Thanks.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v2 4/4] KVM: LAPIC: Don't silently accept bad vectors
  2017-10-03 17:53   ` Radim Krčmář
@ 2017-10-04  7:56     ` Wanpeng Li
  2017-10-04 12:01       ` Radim Krčmář
  0 siblings, 1 reply; 27+ messages in thread
From: Wanpeng Li @ 2017-10-04  7:56 UTC (permalink / raw)
  To: Radim Krčmář; +Cc: linux-kernel, kvm, Paolo Bonzini, Wanpeng Li

2017-10-04 1:53 GMT+08:00 Radim Krčmář <rkrcmar@redhat.com>:
> 2017-09-28 18:04-0700, Wanpeng Li:
>> From: Wanpeng Li <wanpeng.li@hotmail.com>
>>
>> Vectors 0-15 are reserved, and a physical LAPIC - upon sending or
>> receiving one - would generate an APIC error instead of doing the
>> requested action. Make our emulation behave similarly.
>>
>> Cc: Paolo Bonzini <pbonzini@redhat.com>
>> Cc: Radim Krčmář <rkrcmar@redhat.com>
>> Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
>> ---
>>  arch/x86/kvm/lapic.c | 30 ++++++++++++++++++++++++++++--
>>  1 file changed, 28 insertions(+), 2 deletions(-)
>>
>> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
>> index 6bafd06..a779ba9 100644
>> --- a/arch/x86/kvm/lapic.c
>> +++ b/arch/x86/kvm/lapic.c
>> @@ -935,6 +935,25 @@ bool kvm_intr_is_single_vcpu_fast(struct kvm *kvm, struct kvm_lapic_irq *irq,
>>       return ret;
>>  }
>>
>> +static void apic_error(struct kvm_lapic *apic, unsigned long errmask)
>> +{
>> +     uint32_t esr;
>> +
>> +     esr = kvm_lapic_get_reg(apic, APIC_ESR);
>> +
>> +     if ((esr & errmask) != errmask) {
>
> The spec makes me think that there is going to be only 1 interrupt
> (regardless of the number errors) until the software writes 0 to
> APIC_ESR.  Is there a better description than the following 10.5.3?
>
>   The ESR is a write/read register. Before attempt to read from the ESR,
>   software should first write to it. (The value written does not affect
>   the values read subsequently; only zero may be written in x2APIC
>   mode.) This write clears any previously logged errors and updates the
>   ESR with any errors detected since the last write to the ESR. This
>   write also rearms the APIC error interrupt triggering mechanism.
>
> This also describes a different handling of APIC_ESR -- APIC_ESR is
> updated only on software writes to APIC_ESR.  All errors in between seem
> to be logged internally (not sure where to migrate it).

Is there anything that needs to be changed in this function?

>
>> +             uint32_t lvterr = kvm_lapic_get_reg(apic, APIC_LVTERR);
>> +
>> +             kvm_lapic_set_reg(apic, APIC_ESR, esr | errmask);
>> +             if (!(lvterr & APIC_LVT_MASKED)) {
>> +                     struct kvm_lapic_irq irq;
>> +
>> +                     irq.vector = lvterr & 0xff;
>> +                     kvm_irq_delivery_to_apic(apic->vcpu->kvm, apic, &irq, NULL);
>> +             }
>> +     }
>> +}
>> +
>>  /*
>>   * Add a pending IRQ into lapic.
>>   * Return 1 if successfully added and 0 if discarded.
>> @@ -946,6 +965,11 @@ static int __apic_accept_irq(struct kvm_lapic *apic, int delivery_mode,
>>       int result = 0;
>>       struct kvm_vcpu *vcpu = apic->vcpu;
>>
>> +     if (unlikely(vector < 16) && delivery_mode == APIC_DM_FIXED) {
>> +             apic_error(apic, APIC_ESR_RECVILL);
>
> The error is also triggered if lowest priority is supported and tries to
> deliver an invalid vector.

Could you point this out in the SDM? :)

>
>> +             return 0;
>> +     }
>> +
>>       trace_kvm_apic_accept_irq(vcpu->vcpu_id, delivery_mode,
>>                                 trig_mode, vector);
>>       switch (delivery_mode) {
>> @@ -1146,7 +1170,10 @@ static void apic_send_ipi(struct kvm_lapic *apic)
>>                  irq.trig_mode, irq.level, irq.dest_mode, irq.delivery_mode,
>>                  irq.vector, irq.msi_redir_hint);
>>
>> -     kvm_irq_delivery_to_apic(apic->vcpu->kvm, apic, &irq, NULL);
>> +     if (unlikely(irq.vector < 16 && irq.delivery_mode == APIC_DM_FIXED))
>
> Please check how APICv self-IPI acceleration behaves, so we're
> consistent.

There is no vmexit for APICv self-IPI, so I think we can't intercept the vector.

Regards,
Wanpeng Li

>
> Thanks.
>
>> +             apic_error(apic, APIC_ESR_SENDILL);
>> +     else
>> +             kvm_irq_delivery_to_apic(apic->vcpu->kvm, apic, &irq, NULL);
>>  }
>>
>>  static u32 apic_get_tmcct(struct kvm_lapic *apic)
>> @@ -1734,7 +1761,6 @@ int kvm_lapic_reg_write(struct kvm_lapic *apic, u32 reg, u32 val)
>>       case APIC_LVTPC:
>>       case APIC_LVT1:
>>       case APIC_LVTERR:
>> -             /* TODO: Check vector */
>>               if (!kvm_apic_sw_enabled(apic))
>>                       val |= APIC_LVT_MASKED;
>>
>> --
>> 2.7.4
>>

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v2 4/4] KVM: LAPIC: Don't silently accept bad vectors
  2017-10-04  7:56     ` Wanpeng Li
@ 2017-10-04 12:01       ` Radim Krčmář
  2017-10-04 14:16         ` Wanpeng Li
  0 siblings, 1 reply; 27+ messages in thread
From: Radim Krčmář @ 2017-10-04 12:01 UTC (permalink / raw)
  To: Wanpeng Li; +Cc: linux-kernel, kvm, Paolo Bonzini, Wanpeng Li

2017-10-04 15:56+0800, Wanpeng Li:
> 2017-10-04 1:53 GMT+08:00 Radim Krčmář <rkrcmar@redhat.com>:
> > 2017-09-28 18:04-0700, Wanpeng Li:
> >> From: Wanpeng Li <wanpeng.li@hotmail.com>
> >>
> >> Vectors 0-15 are reserved, and a physical LAPIC - upon sending or
> >> receiving one - would generate an APIC error instead of doing the
> >> requested action. Make our emulation behave similarly.
> >>
> >> Cc: Paolo Bonzini <pbonzini@redhat.com>
> >> Cc: Radim Krčmář <rkrcmar@redhat.com>
> >> Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
> >> ---
> >>  arch/x86/kvm/lapic.c | 30 ++++++++++++++++++++++++++++--
> >>  1 file changed, 28 insertions(+), 2 deletions(-)
> >>
> >> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
> >> index 6bafd06..a779ba9 100644
> >> --- a/arch/x86/kvm/lapic.c
> >> +++ b/arch/x86/kvm/lapic.c
> >> @@ -935,6 +935,25 @@ bool kvm_intr_is_single_vcpu_fast(struct kvm *kvm, struct kvm_lapic_irq *irq,
> >>       return ret;
> >>  }
> >>
> >> +static void apic_error(struct kvm_lapic *apic, unsigned long errmask)
> >> +{
> >> +     uint32_t esr;
> >> +
> >> +     esr = kvm_lapic_get_reg(apic, APIC_ESR);
> >> +
> >> +     if ((esr & errmask) != errmask) {
> >
> > The spec makes me think that there is going to be only 1 interrupt
> > (regardless of the number errors) until the software writes 0 to
> > APIC_ESR.  Is there a better description than the following 10.5.3?
> >
> >   The ESR is a write/read register. Before attempt to read from the ESR,
> >   software should first write to it. (The value written does not affect
> >   the values read subsequently; only zero may be written in x2APIC
> >   mode.) This write clears any previously logged errors and updates the
> >   ESR with any errors detected since the last write to the ESR. This
> >   write also rearms the APIC error interrupt triggering mechanism.
> >
> > This also describes a different handling of APIC_ESR -- APIC_ESR is
> > updated only on software writes to APIC_ESR.  All errors in between seem
> > to be logged internally (not sure where to migrate it).
> 
> Is there any thing need to be changed in this function?

Yes.  For the first part, it should really be tested on bare-metal and
modelled upon that.  The SDM mentions some kind of rearming and the APM
doesn't, so maybe we could just send the interrupt every time (if unmasked).
And maybe vectors from external interrupts trigger the error too, but
we definitely don't need to sort that out immediately.

For the second part, the LAPIC error doesn't cause a write to APIC_ESR.
We need to add a state for pending errors (and somehow migrate it) that
gets copied to APIC_ESR after a write.
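
A minimal sketch of the pending-error state described above, assuming the
SDM 10.5.3 reading (errors are latched internally, an ESR write publishes
them and rearms the interrupt).  The struct and function names are
hypothetical, and whether a masked LVTERR consumes the arming is a guess
that would need checking on bare-metal:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model for illustration; not KVM's actual state layout. */
struct toy_esr {
	uint32_t esr;      /* value software reads from the ESR */
	uint32_t pending;  /* errors detected since the last ESR write */
	bool armed;        /* error interrupt may fire again */
};

/*
 * Record an error.  Returns true when an error interrupt should be raised.
 * The visible ESR is deliberately left untouched here.
 */
static bool toy_esr_record(struct toy_esr *e, uint32_t errmask,
			   bool lvterr_masked)
{
	e->pending |= errmask;
	if (!e->armed || lvterr_masked)
		return false;
	e->armed = false;  /* only one interrupt until software writes ESR */
	return true;
}

/* Software write to the ESR: publish latched errors and rearm. */
static void toy_esr_write(struct toy_esr *e)
{
	e->esr = e->pending;
	e->pending = 0;
	e->armed = true;
}
```

In this model the pending field is the extra state that would have to be
migrated.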

> >> +             uint32_t lvterr = kvm_lapic_get_reg(apic, APIC_LVTERR);
> >> +
> >> +             kvm_lapic_set_reg(apic, APIC_ESR, esr | errmask);
> >> +             if (!(lvterr & APIC_LVT_MASKED)) {
> >> +                     struct kvm_lapic_irq irq;
> >> +
> >> +                     irq.vector = lvterr & 0xff;
> >> +                     kvm_irq_delivery_to_apic(apic->vcpu->kvm, apic, &irq, NULL);
> >> +             }
> >> +     }
> >> +}
> >> +
> >>  /*
> >>   * Add a pending IRQ into lapic.
> >>   * Return 1 if successfully added and 0 if discarded.
> >> @@ -946,6 +965,11 @@ static int __apic_accept_irq(struct kvm_lapic *apic, int delivery_mode,
> >>       int result = 0;
> >>       struct kvm_vcpu *vcpu = apic->vcpu;
> >>
> >> +     if (unlikely(vector < 16) && delivery_mode == APIC_DM_FIXED) {
> >> +             apic_error(apic, APIC_ESR_RECVILL);
> >
> > The error is also triggered if lowest priority is supported and tries to
> > deliver an invalid vector.
> 
> Could you point out this in SDM? :)

In section 10.5.3 Error Handling:

  If the local APIC does not support the sending of lowest-priority IPIs
  and software writes the ICR to send a lowest-priority IPI with an
  illegal vector, the local APIC sets only the “redirectible IPI” error
  bit.

Hence, if local APIC does support lowest-priority, then it throws the
same error as fixed.  (KVM does support lowest-priority.)
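
As a sketch of the resulting check, the condition in the patch could be
widened to cover both delivery modes.  The helper name is hypothetical, and
the two constants are local stand-ins assumed to mirror APIC_DM_FIXED and
APIC_DM_LOWEST:

```c
#include <stdbool.h>
#include <stdint.h>

/* Local stand-ins, assumed to match APIC_DM_FIXED / APIC_DM_LOWEST. */
#define TOY_DM_FIXED   0x000u
#define TOY_DM_LOWEST  0x100u

/* Vectors 0-15 are reserved for fixed and lowest-priority delivery. */
static bool toy_vector_is_illegal(uint32_t delivery_mode, uint32_t vector)
{
	return vector < 16 &&
	       (delivery_mode == TOY_DM_FIXED ||
		delivery_mode == TOY_DM_LOWEST);
}
```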

> >
> >> +             return 0;
> >> +     }
> >> +
> >>       trace_kvm_apic_accept_irq(vcpu->vcpu_id, delivery_mode,
> >>                                 trig_mode, vector);
> >>       switch (delivery_mode) {
> >> @@ -1146,7 +1170,10 @@ static void apic_send_ipi(struct kvm_lapic *apic)
> >>                  irq.trig_mode, irq.level, irq.dest_mode, irq.delivery_mode,
> >>                  irq.vector, irq.msi_redir_hint);
> >>
> >> -     kvm_irq_delivery_to_apic(apic->vcpu->kvm, apic, &irq, NULL);
> >> +     if (unlikely(irq.vector < 16 && irq.delivery_mode == APIC_DM_FIXED))
> >
> > Please check how APICv self-IPI acceleration behaves, so we're
> > consistent.
> 
> There is no vmexit for APICv self-IPI, so I think we can't intercept the vector.

Right, so does it deliver the 0-15 vector?  If yes, then we should do
that as well.  Otherwise, where does it save the error flag and does it
send an error interrupt?  Or do we get a VM exit after all?

Thanks.

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v2 3/4] KVM: LAPIC: Apply change to TDCR right away to the timer
  2017-10-04  1:59     ` Wanpeng Li
@ 2017-10-04 12:43       ` Radim Krčmář
  0 siblings, 0 replies; 27+ messages in thread
From: Radim Krčmář @ 2017-10-04 12:43 UTC (permalink / raw)
  To: Wanpeng Li; +Cc: linux-kernel, kvm, Paolo Bonzini, Wanpeng Li

2017-10-04 09:59+0800, Wanpeng Li:
> 2017-10-04 1:28 GMT+08:00 Radim Krčmář <rkrcmar@redhat.com>:
> > 2017-09-28 18:04-0700, Wanpeng Li:
> >> From: Wanpeng Li <wanpeng.li@hotmail.com>
> >>
> >> The description in the Intel SDM of how the divide configuration
> >> register is used: "The APIC timer frequency will be the processor's bus
> >> clock or core crystal clock frequency divided by the value specified in
> >> the divide configuration register."
> >>
> >> Observation of baremetal shown that when the TDCR is change, the TMCCT
> >> does not change or make a big jump in value, but the rate at which it
> >> count down change.
> >>
> >> The patch update the emulation to APIC timer to so that a change to the
> >> divide configuration would be reflected in the value of the counter and
> >> when the next interrupt is triggered.
> >>
> >> Cc: Paolo Bonzini <pbonzini@redhat.com>
> >> Cc: Radim Krčmář <rkrcmar@redhat.com>
> >> Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
> >> ---
> >
> > Why do we need to do more than just restart the timer?
> 
> Because the current timer (hv or sw) are still running. I think the
> goal of this commit is to runtime update the rate of the current timer
> which is running. Our restart_apic_timer() implementation just cancels
> the current timer when switch between preemption timer and hrtimer.

I see ... we do need to know both divisors in order to make it work,
thanks.
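
To illustrate why both divisors are needed, here is a minimal sketch with
hypothetical helper names (not the code from the patch): the remaining
count stays the same across a divisor change, so the remaining time has to
be rescaled by new_div / old_div.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers for illustration; not part of the patch. */

/* Current count implied by the time left and the tick length. */
static uint32_t toy_tmcct(uint64_t remaining_ns, uint32_t bus_cycle_ns,
			  uint32_t divide_count)
{
	assert(bus_cycle_ns != 0 && divide_count != 0);
	return (uint32_t)(remaining_ns /
			  ((uint64_t)bus_cycle_ns * divide_count));
}

/* Time left after the divisor changes, keeping the count unchanged. */
static uint64_t toy_rescale_remaining_ns(uint64_t remaining_ns,
					 uint32_t old_div, uint32_t new_div)
{
	assert(old_div != 0 && new_div != 0);
	return remaining_ns * new_div / old_div;
}
```

With only the new divisor, the old tick length is lost and the remaining
count cannot be recovered from the remaining time.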

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v2 1/4] KVM: LAPIC: Fix lapic timer mode transition
  2017-10-04  1:45     ` Wanpeng Li
@ 2017-10-04 13:21       ` Radim Krčmář
  2017-10-04 13:50         ` Wanpeng Li
  0 siblings, 1 reply; 27+ messages in thread
From: Radim Krčmář @ 2017-10-04 13:21 UTC (permalink / raw)
  To: Wanpeng Li; +Cc: linux-kernel, kvm, Paolo Bonzini, Wanpeng Li

2017-10-04 09:45+0800, Wanpeng Li:
> 2017-10-04 1:05 GMT+08:00 Radim Krčmář <rkrcmar@redhat.com>:
> > 2017-09-28 18:04-0700, Wanpeng Li:
> >> From: Wanpeng Li <wanpeng.li@hotmail.com>
> >>
> >> SDM 10.5.4.1 TSC-Deadline Mode mentioned that "Transitioning between TSC-Deadline
> >> mode and other timer modes also disarms the timer". So the APIC Timer Initial Count
> >> Register for one-shot/periodic mode should be reset. This patch do it.
> >
> > At the beginning of the section is also:
> >
> >   A write to the LVT Timer Register that changes the timer mode disarms
> >   the local APIC timer. The supported timer modes are given in Table
> >   10-2. The three modes of the local APIC timer are mutually exclusive.
> 
> Yeah, I saw it before sending out the patches, but it is mentioned in
> TSC-deadline section which looks strange, if the timer is still
> disarmed when switching between one-shot and periodic mode before
> TSC-deadline is introduced and w/o TSC-deadline section?

Yeah, maybe it is only true if the machine has TSC.  APM doesn't mention
disarming at all.  Bochs only disables the timer on a switch from/to
TSC-deadline.

> > So we should also disarm when switching between one-shot and periodic.
> >
> > apic_update_lvtt() already has logic to determine whether the timer mode
> > has changed and is the perfect place to clear APIC_TMICT.
> 
> Agreed, thanks for your review, Radim. :)

Bochs doesn't write 0 to APIC_TMICT, but it seems that Xen guys verified
that on bare-metal, so the behavior is fine.
Please just move it to apic_update_lvtt(),

thanks.
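
A rough sketch of what moving the reset into the mode-change check could
look like, using a toy struct rather than struct kvm_lapic.  It assumes the
reset applies only to transitions that involve TSC-deadline mode, which is
one reading of this thread and not a settled conclusion; the mode constants
follow the values in the quoted patch:

```c
#include <stdbool.h>
#include <stdint.h>

/* Same bit layout as the APIC_LVT_TIMER_* defines in the quoted patch. */
#define TOY_LVT_TIMER_MASK         (3u << 17)
#define TOY_LVT_TIMER_ONESHOT      (0u << 17)
#define TOY_LVT_TIMER_PERIODIC     (1u << 17)
#define TOY_LVT_TIMER_TSCDEADLINE  (2u << 17)

/* Toy stand-in for the timer state; not struct kvm_lapic. */
struct toy_timer {
	uint32_t timer_mode;
	uint32_t tmict;
};

/* Returns true if the timer mode changed. */
static bool toy_update_lvtt(struct toy_timer *t, uint32_t lvtt)
{
	uint32_t new_mode = lvtt & TOY_LVT_TIMER_MASK;

	if (t->timer_mode == new_mode)
		return false;

	/* Assumption: only TSC-deadline transitions reset the initial count. */
	if (t->timer_mode == TOY_LVT_TIMER_TSCDEADLINE ||
	    new_mode == TOY_LVT_TIMER_TSCDEADLINE)
		t->tmict = 0;

	t->timer_mode = new_mode;
	return true;
}
```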

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v2 2/4] KVM: LAPIC: Keep timer running when switching between one-shot and periodic mode
  2017-10-04  1:46     ` Wanpeng Li
@ 2017-10-04 13:33       ` Radim Krčmář
  2017-10-04 13:57         ` Wanpeng Li
  0 siblings, 1 reply; 27+ messages in thread
From: Radim Krčmář @ 2017-10-04 13:33 UTC (permalink / raw)
  To: Wanpeng Li; +Cc: linux-kernel, kvm, Paolo Bonzini, Wanpeng Li

2017-10-04 09:46+0800, Wanpeng Li:
> 2017-10-04 1:06 GMT+08:00 Radim Krčmář <rkrcmar@redhat.com>:
> > 2017-09-28 18:04-0700, Wanpeng Li:
> >> From: Wanpeng Li <wanpeng.li@hotmail.com>
> >>
> >> If we take TSC-deadline mode timer out of the picture, the Intel SDM
> >> does not say that the timer is disable when the timer mode is change,
> >> either from one-shot to periodic or vice versa.
> >
> > I think it does, please see comment under [v2 1/4].
> 
> As I replied to [v2 1/4].

Right, so we probably shouldn't disable the timer.

> >> After this patch, the timer is no longer disarmed on change of mode, so
> >> the counter (TMCCT) keeps counting down.
> >>
> >> So what does a write to LVTT changes ? On baremetal, the change of mode
> >> is probably taken into account only when the counter reach 0. When this
> >> happen, LVTT is use to figure out if the counter should restard counting
> >> down from TMICT (so periodic mode) or stop counting (if one-shot mode).
> >>
> >> This patch is based on observation of the behavior of the APIC timer on
> >> baremetal as well as check that they does not go against the description
> >> written in the Intel SDM.
> >>
> >> Cc: Paolo Bonzini <pbonzini@redhat.com>
> >> Cc: Radim Krčmář <rkrcmar@redhat.com>
> >> Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
> >> ---
> >>  arch/x86/kvm/lapic.c | 40 ++++++++++++++++++++++++++++------------
> >>  1 file changed, 28 insertions(+), 12 deletions(-)
> >>
> >> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
> >> index a739cbb..946c11b 100644
> >> --- a/arch/x86/kvm/lapic.c
> >> +++ b/arch/x86/kvm/lapic.c
> >> @@ -1301,7 +1301,7 @@ static void update_divide_count(struct kvm_lapic *apic)
> >>                                  apic->divide_count);
> >>  }
> >>
> >> -static void apic_update_lvtt(struct kvm_lapic *apic)
> >> +static bool apic_update_lvtt(struct kvm_lapic *apic)
> >>  {
> >>       u32 timer_mode = kvm_lapic_get_reg(apic, APIC_LVTT) &
> >>                       apic->lapic_timer.timer_mode_mask;
> >> @@ -1309,7 +1309,9 @@ static void apic_update_lvtt(struct kvm_lapic *apic)
> >>       if (apic->lapic_timer.timer_mode != timer_mode) {
> >>               apic->lapic_timer.timer_mode = timer_mode;
> >>               hrtimer_cancel(&apic->lapic_timer.timer);
> >> +             return true;
> >>       }
> >> +     return false;
> >>  }
> >>
> >>  static void apic_timer_expired(struct kvm_lapic *apic)
> >> @@ -1729,7 +1744,8 @@ int kvm_lapic_reg_write(struct kvm_lapic *apic, u32 reg, u32 val)
> >>                       val |= APIC_LVT_MASKED;
> >>               val &= (apic_lvt_mask[0] | apic->lapic_timer.timer_mode_mask);
> >>               kvm_lapic_set_reg(apic, APIC_LVTT, val);
> >> -             apic_update_lvtt(apic);
> >> +             if (apic_update_lvtt(apic) && !apic_lvtt_tscdeadline(apic))
> >> +                     start_apic_timer(apic, true);

Changing the timer from one-shot to periodic doesn't change the expected
expiration -- I think we could instead skip hrtimer_cancel() in
apic_update_lvtt().
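
The behavior described in the commit message can be modelled with a toy
counter (hypothetical names, one call per timer tick): the mode is only
consulted when the count reaches 0, so flipping it mid-count leaves the
expected expiration untouched.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy model for illustration; not how KVM tracks the timer. */
struct toy_counter {
	uint32_t tmict;  /* initial count */
	uint32_t tmcct;  /* current count, 0 means stopped */
	bool periodic;   /* mode bit taken from LVTT */
};

/* Advance by one tick.  Returns true if the timer interrupt fires. */
static bool toy_tick(struct toy_counter *c)
{
	if (c->tmcct == 0)
		return false;           /* timer is not running */
	if (--c->tmcct != 0)
		return false;           /* still counting down */
	if (c->periodic)
		c->tmcct = c->tmict;    /* mode checked only at expiry */
	return true;
}
```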

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH v2 1/4] KVM: LAPIC: Fix lapic timer mode transition
  2017-10-04 13:21       ` Radim Krčmář
@ 2017-10-04 13:50         ` Wanpeng Li
  2017-10-04 14:02           ` Wanpeng Li
  0 siblings, 1 reply; 27+ messages in thread
From: Wanpeng Li @ 2017-10-04 13:50 UTC (permalink / raw)
  To: Radim Krčmář; +Cc: linux-kernel, kvm, Paolo Bonzini, Wanpeng Li

2017-10-04 21:21 GMT+08:00 Radim Krčmář <rkrcmar@redhat.com>:
> 2017-10-04 09:45+0800, Wanpeng Li:
>> 2017-10-04 1:05 GMT+08:00 Radim Krčmář <rkrcmar@redhat.com>:
>> > 2017-09-28 18:04-0700, Wanpeng Li:
>> >> From: Wanpeng Li <wanpeng.li@hotmail.com>
>> >>
>> >> SDM 10.5.4.1 TSC-Deadline Mode mentioned that "Transitioning between TSC-Deadline
>> >> mode and other timer modes also disarms the timer". So the APIC Timer Initial Count
>> >> Register for one-shot/periodic mode should be reset. This patch do it.
>> >
>> > At the beginning of the secion is also:
>> >
>> >   A write to the LVT Timer Register that changes the timer mode disarms
>> >   the local APIC timer. The supported timer modes are given in Table
>> >   10-2. The three modes of the local APIC timer are mutually exclusive.
>>
>> Yeah, I saw it before sending out the patches, but it is mentioned in
>> TSC-deadline section which looks strange, if the timer is still
>> disarmed when switching between one-shot and periodic mode before
>> TSC-deadline is introduced and w/o TSC-deadline section?
>
> Yeah, maybe it is only true if the machine has TSC.  APM doesn't mention

Is APM another emulator?

> disarming at all.  Bochs only disables the timer it on switch from/to
> TSC-deadline.
>
>> > So we should also disarm when switching between one-shot and periodic.
>> >
>> > apic_update_lvtt() already has logic to determine whether the timer mode
>> > has changed and is the perfect place to clear APIC_TMICT.
>>
>> Agreed, thanks for your review, Radim. :)
>
> Bochs doesn't write 0 to APIC_TMICT, but it seems that Xen guys verified
> that on bare-metal, so the behavior is fine.
> Please just move it to apic_update_lvtt(),
>
> thanks.

Ok, thanks for the review. :)

Regards,
Wanpeng Li

* Re: [PATCH v2 2/4] KVM: LAPIC: Keep timer running when switching between one-shot and periodic mode
  2017-10-04 13:33       ` Radim Krčmář
@ 2017-10-04 13:57         ` Wanpeng Li
  0 siblings, 0 replies; 27+ messages in thread
From: Wanpeng Li @ 2017-10-04 13:57 UTC (permalink / raw)
  To: Radim Krčmář; +Cc: linux-kernel, kvm, Paolo Bonzini, Wanpeng Li

2017-10-04 21:33 GMT+08:00 Radim Krčmář <rkrcmar@redhat.com>:
> 2017-10-04 09:46+0800, Wanpeng Li:
>> 2017-10-04 1:06 GMT+08:00 Radim Krčmář <rkrcmar@redhat.com>:
>> > 2017-09-28 18:04-0700, Wanpeng Li:
>> >> From: Wanpeng Li <wanpeng.li@hotmail.com>
>> >>
>> >> If we take TSC-deadline mode timer out of the picture, the Intel SDM
>> >> does not say that the timer is disable when the timer mode is change,
>> >> either from one-shot to periodic or vice versa.
>> >
>> > I think it does, please see comment under [v2 1/4].
>>
>> As I replied to [v2 1/4].
>
> Right, so we probably shouldn't disable the timer.
>
>> >> After this patch, the timer is no longer disarmed on change of mode, so
>> >> the counter (TMCCT) keeps counting down.
>> >>
>> >> So what does a write to LVTT changes ? On baremetal, the change of mode
>> >> is probably taken into account only when the counter reach 0. When this
>> >> happen, LVTT is use to figure out if the counter should restard counting
>> >> down from TMICT (so periodic mode) or stop counting (if one-shot mode).
>> >>
>> >> This patch is based on observation of the behavior of the APIC timer on
>> >> baremetal as well as check that they does not go against the description
>> >> written in the Intel SDM.
>> >>
>> >> Cc: Paolo Bonzini <pbonzini@redhat.com>
>> >> Cc: Radim Krčmář <rkrcmar@redhat.com>
>> >> Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
>> >> ---
>> >>  arch/x86/kvm/lapic.c | 40 ++++++++++++++++++++++++++++------------
>> >>  1 file changed, 28 insertions(+), 12 deletions(-)
>> >>
>> >> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
>> >> index a739cbb..946c11b 100644
>> >> --- a/arch/x86/kvm/lapic.c
>> >> +++ b/arch/x86/kvm/lapic.c
>> >> @@ -1301,7 +1301,7 @@ static void update_divide_count(struct kvm_lapic *apic)
>> >>                                  apic->divide_count);
>> >>  }
>> >>
>> >> -static void apic_update_lvtt(struct kvm_lapic *apic)
>> >> +static bool apic_update_lvtt(struct kvm_lapic *apic)
>> >>  {
>> >>       u32 timer_mode = kvm_lapic_get_reg(apic, APIC_LVTT) &
>> >>                       apic->lapic_timer.timer_mode_mask;
>> >> @@ -1309,7 +1309,9 @@ static void apic_update_lvtt(struct kvm_lapic *apic)
>> >>       if (apic->lapic_timer.timer_mode != timer_mode) {
>> >>               apic->lapic_timer.timer_mode = timer_mode;
>> >>               hrtimer_cancel(&apic->lapic_timer.timer);
>> >> +             return true;
>> >>       }
>> >> +     return false;
>> >>  }
>> >>
>> >>  static void apic_timer_expired(struct kvm_lapic *apic)
>> >> @@ -1729,7 +1744,8 @@ int kvm_lapic_reg_write(struct kvm_lapic *apic, u32 reg, u32 val)
>> >>                       val |= APIC_LVT_MASKED;
>> >>               val &= (apic_lvt_mask[0] | apic->lapic_timer.timer_mode_mask);
>> >>               kvm_lapic_set_reg(apic, APIC_LVTT, val);
>> >> -             apic_update_lvtt(apic);
>> >> +             if (apic_update_lvtt(apic) && !apic_lvtt_tscdeadline(apic))
>> >> +                     start_apic_timer(apic, true);
>
> Changing the timer from one-shot to periodic doesn't change the expected
> expiration -- I think we could instead skip hrtimer_cancel() in
> apic_update_lvtt().

Will do, for both one-shot to periodic and vice versa. Thanks. :)

Regards,
Wanpeng Li

* Re: [PATCH v2 1/4] KVM: LAPIC: Fix lapic timer mode transition
  2017-10-04 13:50         ` Wanpeng Li
@ 2017-10-04 14:02           ` Wanpeng Li
  0 siblings, 0 replies; 27+ messages in thread
From: Wanpeng Li @ 2017-10-04 14:02 UTC (permalink / raw)
  To: Radim Krčmář; +Cc: linux-kernel, kvm, Paolo Bonzini, Wanpeng Li

2017-10-04 21:50 GMT+08:00 Wanpeng Li <kernellwp@gmail.com>:
> 2017-10-04 21:21 GMT+08:00 Radim Krčmář <rkrcmar@redhat.com>:
>> 2017-10-04 09:45+0800, Wanpeng Li:
>>> 2017-10-04 1:05 GMT+08:00 Radim Krčmář <rkrcmar@redhat.com>:
>>> > 2017-09-28 18:04-0700, Wanpeng Li:
>>> >> From: Wanpeng Li <wanpeng.li@hotmail.com>
>>> >>
>>> >> SDM 10.5.4.1 TSC-Deadline Mode mentioned that "Transitioning between TSC-Deadline
>>> >> mode and other timer modes also disarms the timer". So the APIC Timer Initial Count
>>> >> Register for one-shot/periodic mode should be reset. This patch do it.
>>> >
>>> > At the beginning of the secion is also:
>>> >
>>> >   A write to the LVT Timer Register that changes the timer mode disarms
>>> >   the local APIC timer. The supported timer modes are given in Table
>>> >   10-2. The three modes of the local APIC timer are mutually exclusive.
>>>
>>> Yeah, I saw it before sending out the patches, but it is mentioned in
>>> TSC-deadline section which looks strange, if the timer is still
>>> disarmed when switching between one-shot and periodic mode before
>>> TSC-deadline is introduced and w/o TSC-deadline section?
>>
>> Yeah, maybe it is only true if the machine has TSC.  APM doesn't mention
>
> If APM is another emulator?

APM is the document from AMD (the AMD64 Architecture Programmer's Manual).

>
>> disarming at all.  Bochs only disables the timer it on switch from/to
>> TSC-deadline.
>>
>>> > So we should also disarm when switching between one-shot and periodic.
>>> >
>>> > apic_update_lvtt() already has logic to determine whether the timer mode
>>> > has changed and is the perfect place to clear APIC_TMICT.
>>>
>>> Agreed, thanks for your review, Radim. :)
>>
>> Bochs doesn't write 0 to APIC_TMICT, but it seems that Xen guys verified
>> that on bare-metal, so the behavior is fine.
>> Please just move it to apic_update_lvtt(),
>>
>> thanks.
>
> Ok, thanks for the review. :)
>
> Regards,
> Wanpeng Li

* Re: [PATCH v2 4/4] KVM: LAPIC: Don't silently accept bad vectors
  2017-10-04 12:01       ` Radim Krčmář
@ 2017-10-04 14:16         ` Wanpeng Li
  2017-10-04 14:44           ` Radim Krčmář
  0 siblings, 1 reply; 27+ messages in thread
From: Wanpeng Li @ 2017-10-04 14:16 UTC (permalink / raw)
  To: Radim Krčmář; +Cc: linux-kernel, kvm, Paolo Bonzini, Wanpeng Li

2017-10-04 20:01 GMT+08:00 Radim Krčmář <rkrcmar@redhat.com>:
> 2017-10-04 15:56+0800, Wanpeng Li:
>> 2017-10-04 1:53 GMT+08:00 Radim Krčmář <rkrcmar@redhat.com>:
>> > 2017-09-28 18:04-0700, Wanpeng Li:
>> >> From: Wanpeng Li <wanpeng.li@hotmail.com>
>> >>
>> >> Vectors 0-15 are reserved, and a physical LAPIC - upon sending or
>> >> receiving one - would generate an APIC error instead of doing the
>> >> requested action. Make our emulation behave similarly.
>> >>
>> >> Cc: Paolo Bonzini <pbonzini@redhat.com>
>> >> Cc: Radim Krčmář <rkrcmar@redhat.com>
>> >> Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
>> >> ---
>> >>  arch/x86/kvm/lapic.c | 30 ++++++++++++++++++++++++++++--
>> >>  1 file changed, 28 insertions(+), 2 deletions(-)
>> >>
>> >> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
>> >> index 6bafd06..a779ba9 100644
>> >> --- a/arch/x86/kvm/lapic.c
>> >> +++ b/arch/x86/kvm/lapic.c
>> >> @@ -935,6 +935,25 @@ bool kvm_intr_is_single_vcpu_fast(struct kvm *kvm, struct kvm_lapic_irq *irq,
>> >>       return ret;
>> >>  }
>> >>
>> >> +static void apic_error(struct kvm_lapic *apic, unsigned long errmask)
>> >> +{
>> >> +     uint32_t esr;
>> >> +
>> >> +     esr = kvm_lapic_get_reg(apic, APIC_ESR);
>> >> +
>> >> +     if ((esr & errmask) != errmask) {
>> >
>> > The spec makes me think that there is going to be only 1 interrupt
>> > (regardless of the number errors) until the software writes 0 to
>> > APIC_ESR.  Is there a better description than the following 10.5.3?
>> >
>> >   The ESR is a write/read register. Before attempt to read from the ESR,
>> >   software should first write to it. (The value written does not affect
>> >   the values read subsequently; only zero may be written in x2APIC
>> >   mode.) This write clears any previously logged errors and updates the
>> >   ESR with any errors detected since the last write to the ESR. This
>> >   write also rearms the APIC error interrupt triggering mechanism.
>> >
>> > This also describes a different handling of APIC_ESR -- APIC_ESR is
>> > updated only on software writes to APIC_ESR.  All errors in between seem
>> > to be logged internally (not sure where to migrate it).
>>
>> Is there any thing need to be changed in this function?
>
> Yes.  For the first part, it should really be tested on bare-metal and
> modelled upon that.  SDM mentions some kind of rearming and APM doesn't
> so we maybe could just send the interrupt every time (if unmasked).
> And maybe vectors from external interrupts trigger the error too, but
> we definitely don't need to sort that out immediately.
>
> For the second part, the LAPIC error doesn't cause a write to APIC_ESR.
> We need to add a state for pending errors (and somehow migrate it) that
> gets copied to APIC_ESR after a write.
>
>> >> +             uint32_t lvterr = kvm_lapic_get_reg(apic, APIC_LVTERR);
>> >> +
>> >> +             kvm_lapic_set_reg(apic, APIC_ESR, esr | errmask);
>> >> +             if (!(lvterr & APIC_LVT_MASKED)) {
>> >> +                     struct kvm_lapic_irq irq;
>> >> +
>> >> +                     irq.vector = lvterr & 0xff;
>> >> +                     kvm_irq_delivery_to_apic(apic->vcpu->kvm, apic, &irq, NULL);
>> >> +             }
>> >> +     }
>> >> +}
>> >> +
>> >>  /*
>> >>   * Add a pending IRQ into lapic.
>> >>   * Return 1 if successfully added and 0 if discarded.
>> >> @@ -946,6 +965,11 @@ static int __apic_accept_irq(struct kvm_lapic *apic, int delivery_mode,
>> >>       int result = 0;
>> >>       struct kvm_vcpu *vcpu = apic->vcpu;
>> >>
>> >> +     if (unlikely(vector < 16) && delivery_mode == APIC_DM_FIXED) {
>> >> +             apic_error(apic, APIC_ESR_RECVILL);
>> >
>> > The error is also triggered if lowest priority is supported and tries to
>> > deliver an invalid vector.
>>
>> Could you point out this in SDM? :)
>
> In section 10.5.3 Error Handling:
>
>   If the local APIC does not support the sending of lowest-priority IPIs
>   and software writes the ICR to send a lowest-priority IPI with an
>   illegal vector, the local APIC sets only the “redirectible IPI” error
>   bit.
>
> Hence, if local APIC does support lowest-priority, then it throws the
> same error as fixed.  (KVM does support lowest-priority.)

Yeah, I read the section before but I misunderstood it. It seems the
section only applies when the local APIC does not support the sending
of lowest-priority IPIs?

Regards,
Wanpeng Li

>
>> >
>> >> +             return 0;
>> >> +     }
>> >> +
>> >>       trace_kvm_apic_accept_irq(vcpu->vcpu_id, delivery_mode,
>> >>                                 trig_mode, vector);
>> >>       switch (delivery_mode) {
>> >> @@ -1146,7 +1170,10 @@ static void apic_send_ipi(struct kvm_lapic *apic)
>> >>                  irq.trig_mode, irq.level, irq.dest_mode, irq.delivery_mode,
>> >>                  irq.vector, irq.msi_redir_hint);
>> >>
>> >> -     kvm_irq_delivery_to_apic(apic->vcpu->kvm, apic, &irq, NULL);
>> >> +     if (unlikely(irq.vector < 16 && irq.delivery_mode == APIC_DM_FIXED))
>> >
>> > Please check how APICv self-IPI acceleration behaves, so we're
>> > consistent.
>>
>> There is no vmexit for APICv self-IPI, so I think we can't intercept the vector.
>
> Right, so does it deliver the 0-15 vector?  If yes, then we should do
> that as well.  Otherwise, where does it save the error flag and does it
> send an error interrupt?  Or do we get a VM exit after all?
>
> Thanks.

* Re: [PATCH v2 4/4] KVM: LAPIC: Don't silently accept bad vectors
  2017-10-04 14:16         ` Wanpeng Li
@ 2017-10-04 14:44           ` Radim Krčmář
  2017-10-13  1:17             ` Wanpeng Li
  0 siblings, 1 reply; 27+ messages in thread
From: Radim Krčmář @ 2017-10-04 14:44 UTC (permalink / raw)
  To: Wanpeng Li; +Cc: linux-kernel, kvm, Paolo Bonzini, Wanpeng Li

2017-10-04 22:16+0800, Wanpeng Li:
> 2017-10-04 20:01 GMT+08:00 Radim Krčmář <rkrcmar@redhat.com>:
> > 2017-10-04 15:56+0800, Wanpeng Li:
> >> 2017-10-04 1:53 GMT+08:00 Radim Krčmář <rkrcmar@redhat.com>:
> >> > 2017-09-28 18:04-0700, Wanpeng Li:
> >> >> @@ -946,6 +965,11 @@ static int __apic_accept_irq(struct kvm_lapic *apic, int delivery_mode,
> >> >>       int result = 0;
> >> >>       struct kvm_vcpu *vcpu = apic->vcpu;
> >> >>
> >> >> +     if (unlikely(vector < 16) && delivery_mode == APIC_DM_FIXED) {
> >> >> +             apic_error(apic, APIC_ESR_RECVILL);
> >> >
> >> > The error is also triggered if lowest priority is supported and tries to
> >> > deliver an invalid vector.
> >>
> >> Could you point out this in SDM? :)
> >
> > In section 10.5.3 Error Handling:
> >
> >   If the local APIC does not support the sending of lowest-priority IPIs
> >   and software writes the ICR to send a lowest-priority IPI with an
> >   illegal vector, the local APIC sets only the “redirectible IPI” error
> >   bit.
> >
> > Hence, if local APIC does support lowest-priority, then it throws the
> > same error as fixed.  (KVM does support lowest-priority.)
> 
> Yeah, I read the section before but I misunderstand it. It seems that
> the section means it just occurs when the local APIC does not support
> the sending of lowest-priority IPIs?

I think so.

* Re: [PATCH v2 0/4] KVM: LAPIC: Rework lapic timer to behave more like real-hardware
  2017-09-29  1:04 [PATCH v2 0/4] KVM: LAPIC: Rework lapic timer to behave more like real-hardware Wanpeng Li
                   ` (3 preceding siblings ...)
  2017-09-29  1:04 ` [PATCH v2 4/4] KVM: LAPIC: Don't silently accept bad vectors Wanpeng Li
@ 2017-10-05 10:57 ` Wanpeng Li
  4 siblings, 0 replies; 27+ messages in thread
From: Wanpeng Li @ 2017-10-05 10:57 UTC (permalink / raw)
  To: linux-kernel, kvm; +Cc: Paolo Bonzini, Radim Krčmář, Wanpeng Li

2017-09-29 9:04 GMT+08:00 Wanpeng Li <kernellwp@gmail.com>:
> The issue is reported in xen community.
>
> Anthony PERARD pointed out:
>
> https://www.mail-archive.com/xen-devel@lists.xen.org/msg117283.html#
>
>  | When developing PVH for OVMF, I've used the lapic timer. It turns out that the
>  | way it is used by OVMF did not work with Xen [1]. I tried to find out how
>  | real-hw behave, and write a XTF tests [2]. And this patch series tries to fix
>  | the behavior of the vlapic timer.
>  |
>  |
>  | The OVMF driver for the APIC timer initialize the timer like this:
>  |      write to TMICT (initial counter)
>  |      write to TMDCR (divide configuration)
>  |      enable the timer (this may change timer mode from one-shot to periodic)
>  | It turns out that TMICT is set to 0 on the last step, but OVMF expect the timer
>  | to run.
>  |
>  | Here is some description of the APIC timer, base on observation as well as read
>  | of the Intel SDM. The description is also patch of patch description
>  | (reworded).
>  |
>  | Maybe a way of thinking how the APIC timer is evaluated, is to think of how
>  | hardward will do it. There is a counter TMCCT which always keeps counting down.
>  |
>  | Setting TMICT also set TMCCT, nothing else matter.
>  | Setting LVTT does not change anything right away.
>  | Setting TMDCR does not change much.
>  |
>  | Now TMCCT keeps counting down, by a value related to TMDCR.
>  | Once, TMCCT reach 0, it is only at this time that LVTT is taken into account.
>  | Is there an interrupt to deliver? Should the timer restart counting from the
>  | value in TMICT?
>  |
>  | In the Intel SDM, there is the word "disarm" of the timer used. I guess the
>  | easier way to disarm the APIC timer (when in periodic or one-shot) is to set
>  | TMICT to 0. But if we take TSC-Deadline mode out of the picture, there is
>  | nothing in the manual that say that the timer is disarm or stopped when
>  | changing timer mode (there is only two modes left, period and one-shot).
>  |
>  | As for the TSC-deadline timer mode, observation shown that changing to it (or
>  | from it) does reset and disarm both timers, so effectively TMICT and the
>  | tscdeadline are set to 0.
>  |
>  | [1] https://lists.xenproject.org/archives/html/xen-devel/2016-12/msg00959.html
>  | [2] v1:
>  | https://lists.xenproject.org/archives/html/xen-devel/2017-03/msg02533.html
>  |     v2: look for "[XTF PATCH V2 0/3] Testing vlapic timer"
>
>  In addition, Patch 3/4 implements the illegal vector error handling according to
>  SDM 10.5.2~10.5.3.
>
> v1 -> v2:
>  * add cover-letter and collect recent lapic patches to one patchset
>
> Wanpeng Li (4):
>   KVM: LAPIC: Fix lapic timer mode transition
>   KVM: LAPIC: Keep timer running when switching between one-shot and periodic mode
>   KVM: LAPIC: Apply change to TDCR right away to the timer
>   KVM: LAPIC: Don't silently accept bad vectors

I just sent out a new version of patches 1-3 as v3; patch 4 needs
more time to verify.

Regards,
Wanpeng Li

>
>  arch/x86/include/asm/apicdef.h |  1 +
>  arch/x86/kvm/lapic.c           | 90 ++++++++++++++++++++++++++++++++++--------
>  2 files changed, 74 insertions(+), 17 deletions(-)
>
> --
> 2.7.4
>

* Re: [PATCH v2 4/4] KVM: LAPIC: Don't silently accept bad vectors
  2017-10-04 14:44           ` Radim Krčmář
@ 2017-10-13  1:17             ` Wanpeng Li
  2017-10-13 17:36               ` Radim Krčmář
  0 siblings, 1 reply; 27+ messages in thread
From: Wanpeng Li @ 2017-10-13  1:17 UTC (permalink / raw)
  To: Radim Krčmář; +Cc: linux-kernel, kvm, Paolo Bonzini, Wanpeng Li

2017-10-04 22:44 GMT+08:00 Radim Krčmář <rkrcmar@redhat.com>:
> 2017-10-04 22:16+0800, Wanpeng Li:
>> 2017-10-04 20:01 GMT+08:00 Radim Krčmář <rkrcmar@redhat.com>:
>> > 2017-10-04 15:56+0800, Wanpeng Li:
>> >> 2017-10-04 1:53 GMT+08:00 Radim Krčmář <rkrcmar@redhat.com>:
>> >> > 2017-09-28 18:04-0700, Wanpeng Li:
>> >> >> @@ -946,6 +965,11 @@ static int __apic_accept_irq(struct kvm_lapic *apic, int delivery_mode,
>> >> >>       int result = 0;
>> >> >>       struct kvm_vcpu *vcpu = apic->vcpu;
>> >> >>
>> >> >> +     if (unlikely(vector < 16) && delivery_mode == APIC_DM_FIXED) {
>> >> >> +             apic_error(apic, APIC_ESR_RECVILL);
>> >> >
>> >> > The error is also triggered if lowest priority is supported and tries to
>> >> > deliver an invalid vector.
>> >>
>> >> Could you point out this in SDM? :)
>> >
>> > In section 10.5.3 Error Handling:
>> >
>> >   If the local APIC does not support the sending of lowest-priority IPIs
>> >   and software writes the ICR to send a lowest-priority IPI with an
>> >   illegal vector, the local APIC sets only the “redirectible IPI” error
>> >   bit.
>> >
>> > Hence, if local APIC does support lowest-priority, then it throws the
>> > same error as fixed.  (KVM does support lowest-priority.)
>>
>> Yeah, I read the section before but I misunderstand it. It seems that
>> the section means it just occurs when the local APIC does not support
>> the sending of lowest-priority IPIs?
>
> I think so.

I see that VirtualBox checks only the Fixed delivery mode for error handling.

Regards,
Wanpeng Li

* Re: [PATCH v2 4/4] KVM: LAPIC: Don't silently accept bad vectors
  2017-10-13  1:17             ` Wanpeng Li
@ 2017-10-13 17:36               ` Radim Krčmář
  2017-10-13 20:31                 ` Radim Krčmář
  0 siblings, 1 reply; 27+ messages in thread
From: Radim Krčmář @ 2017-10-13 17:36 UTC (permalink / raw)
  To: Wanpeng Li; +Cc: linux-kernel, kvm, Paolo Bonzini, Wanpeng Li

2017-10-13 09:17+0800, Wanpeng Li:
> 2017-10-04 22:44 GMT+08:00 Radim Krčmář <rkrcmar@redhat.com>:
> > 2017-10-04 22:16+0800, Wanpeng Li:
> >> 2017-10-04 20:01 GMT+08:00 Radim Krčmář <rkrcmar@redhat.com>:
> >> > 2017-10-04 15:56+0800, Wanpeng Li:
> >> >> 2017-10-04 1:53 GMT+08:00 Radim Krčmář <rkrcmar@redhat.com>:
> >> >> > 2017-09-28 18:04-0700, Wanpeng Li:
> >> >> >> @@ -946,6 +965,11 @@ static int __apic_accept_irq(struct kvm_lapic *apic, int delivery_mode,
> >> >> >>       int result = 0;
> >> >> >>       struct kvm_vcpu *vcpu = apic->vcpu;
> >> >> >>
> >> >> >> +     if (unlikely(vector < 16) && delivery_mode == APIC_DM_FIXED) {
> >> >> >> +             apic_error(apic, APIC_ESR_RECVILL);
> >> >> >
> >> >> > The error is also triggered if lowest priority is supported and tries to
> >> >> > deliver an invalid vector.
> >> >>
> >> >> Could you point out this in SDM? :)
> >> >
> >> > In section 10.5.3 Error Handling:
> >> >
> >> >   If the local APIC does not support the sending of lowest-priority IPIs
> >> >   and software writes the ICR to send a lowest-priority IPI with an
> >> >   illegal vector, the local APIC sets only the “redirectible IPI” error
> >> >   bit.
> >> >
> >> > Hence, if local APIC does support lowest-priority, then it throws the
> >> > same error as fixed.  (KVM does support lowest-priority.)
> >>
> >> Yeah, I read the section before but I misunderstand it. It seems that
> >> the section means it just occurs when the local APIC does not support
> >> the sending of lowest-priority IPIs?
> >
> > I think so.
> 
> I see Virtualbox just captures Fixed delivery mode for error handling.

Hm, so it doesn't even inject an error at the destination of the
lowest-priority interrupt and just drops it?

I can't interpret the SDM in any other way, though:

  When an interrupt vector in the range of 0 to 15 is sent or received
  through the local APIC, the APIC indicates an illegal vector in its
  Error Status Register (see Section 10.5.3, “Error Handling”).

and we support lowest-priority interrupts, because if we didn't, then

  The interrupt is not processed and hence the “Send Illegal Vector” bit
  is not set in the ESR.

I'll go for a quick bare-metal test ...

* Re: [PATCH v2 4/4] KVM: LAPIC: Don't silently accept bad vectors
  2017-10-13 17:36               ` Radim Krčmář
@ 2017-10-13 20:31                 ` Radim Krčmář
  2017-10-15  2:41                   ` Wanpeng Li
  0 siblings, 1 reply; 27+ messages in thread
From: Radim Krčmář @ 2017-10-13 20:31 UTC (permalink / raw)
  To: Wanpeng Li; +Cc: linux-kernel, kvm, Paolo Bonzini, Wanpeng Li

2017-10-13 19:36+0200, Radim Krčmář:
> 2017-10-13 09:17+0800, Wanpeng Li:
> > 2017-10-04 22:44 GMT+08:00 Radim Krčmář <rkrcmar@redhat.com>:
> > > 2017-10-04 22:16+0800, Wanpeng Li:
> > >> 2017-10-04 20:01 GMT+08:00 Radim Krčmář <rkrcmar@redhat.com>:
> > >> > 2017-10-04 15:56+0800, Wanpeng Li:
> > >> >> 2017-10-04 1:53 GMT+08:00 Radim Krčmář <rkrcmar@redhat.com>:
> > >> >> > 2017-09-28 18:04-0700, Wanpeng Li:
> > >> >> >> @@ -946,6 +965,11 @@ static int __apic_accept_irq(struct kvm_lapic *apic, int delivery_mode,
> > >> >> >>       int result = 0;
> > >> >> >>       struct kvm_vcpu *vcpu = apic->vcpu;
> > >> >> >>
> > >> >> >> +     if (unlikely(vector < 16) && delivery_mode == APIC_DM_FIXED) {
> > >> >> >> +             apic_error(apic, APIC_ESR_RECVILL);
> > >> >> >
> > >> >> > The error is also triggered if lowest priority is supported and tries to
> > >> >> > deliver an invalid vector.
> > >> >>
> > >> >> Could you point out this in SDM? :)
> > >> >
> > >> > In section 10.5.3 Error Handling:
> > >> >
> > >> >   If the local APIC does not support the sending of lowest-priority IPIs
> > >> >   and software writes the ICR to send a lowest-priority IPI with an
> > >> >   illegal vector, the local APIC sets only the “redirectible IPI” error
> > >> >   bit.
> > >> >
> > >> > Hence, if local APIC does support lowest-priority, then it throws the
> > >> > same error as fixed.  (KVM does support lowest-priority.)
> > >>
> > >> Yeah, I read the section before but I misunderstand it. It seems that
> > >> the section means it just occurs when the local APIC does not support
> > >> the sending of lowest-priority IPIs?
> > >
> > > I think so.
> > 
> > I see Virtualbox just captures Fixed delivery mode for error handling.
> 
> Hm, it doesn't even inject an error on destination of the
> lowest-priority interrupt and just drop it?
> 
> I can't interpret the SDM in any other way, though:
> 
>   When an interrupt vector in the range of 0 to 15 is sent or received
>   through the local APIC, the APIC indicates an illegal vector in its
>   Error Status Register (see Section 10.5.3, “Error Handling”).
> 
> and we support lowest-priority interrupts, because if we didn't, then
> 
>  The interrupt is not processed and hence the “Send Illegal Vector” bit
>  is not set in the ESR.
> 
> I'll go for a quick bare-metal test ...

Turns out my machine doesn't support lowest-priority IPIs (probably
got killed with the FSB), so all I get is error 0x10.

* Re: [PATCH v2 4/4] KVM: LAPIC: Don't silently accept bad vectors
  2017-10-13 20:31                 ` Radim Krčmář
@ 2017-10-15  2:41                   ` Wanpeng Li
  0 siblings, 0 replies; 27+ messages in thread
From: Wanpeng Li @ 2017-10-15  2:41 UTC (permalink / raw)
  To: Radim Krčmář; +Cc: linux-kernel, kvm, Paolo Bonzini, Wanpeng Li

2017-10-14 4:31 GMT+08:00 Radim Krčmář <rkrcmar@redhat.com>:
> 2017-10-13 19:36+0200, Radim Krčmář:
>> 2017-10-13 09:17+0800, Wanpeng Li:
>> > 2017-10-04 22:44 GMT+08:00 Radim Krčmář <rkrcmar@redhat.com>:
>> > > 2017-10-04 22:16+0800, Wanpeng Li:
>> > >> 2017-10-04 20:01 GMT+08:00 Radim Krčmář <rkrcmar@redhat.com>:
>> > >> > 2017-10-04 15:56+0800, Wanpeng Li:
>> > >> >> 2017-10-04 1:53 GMT+08:00 Radim Krčmář <rkrcmar@redhat.com>:
>> > >> >> > 2017-09-28 18:04-0700, Wanpeng Li:
>> > >> >> >> @@ -946,6 +965,11 @@ static int __apic_accept_irq(struct kvm_lapic *apic, int delivery_mode,
>> > >> >> >>       int result = 0;
>> > >> >> >>       struct kvm_vcpu *vcpu = apic->vcpu;
>> > >> >> >>
>> > >> >> >> +     if (unlikely(vector < 16) && delivery_mode == APIC_DM_FIXED) {
>> > >> >> >> +             apic_error(apic, APIC_ESR_RECVILL);
>> > >> >> >
>> > >> >> > The error is also triggered if lowest priority is supported and tries to
>> > >> >> > deliver an invalid vector.
>> > >> >>
>> > >> >> Could you point out this in SDM? :)
>> > >> >
>> > >> > In section 10.5.3 Error Handling:
>> > >> >
>> > >> >   If the local APIC does not support the sending of lowest-priority IPIs
>> > >> >   and software writes the ICR to send a lowest-priority IPI with an
>> > >> >   illegal vector, the local APIC sets only the “redirectible IPI” error
>> > >> >   bit.
>> > >> >
>> > >> > Hence, if local APIC does support lowest-priority, then it throws the
>> > >> > same error as fixed.  (KVM does support lowest-priority.)
>> > >>
>> > >> Yeah, I read the section before but I misunderstand it. It seems that
>> > >> the section means it just occurs when the local APIC does not support
>> > >> the sending of lowest-priority IPIs?
>> > >
>> > > I think so.
>> >
>> > I see Virtualbox just captures Fixed delivery mode for error handling.
>>
>> Hm, it doesn't even inject an error on destination of the
>> lowest-priority interrupt and just drop it?

I think so.

>>
>> I can't interpret the SDM in any other way, though:
>>
>>   When an interrupt vector in the range of 0 to 15 is sent or received
>>   through the local APIC, the APIC indicates an illegal vector in its
>>   Error Status Register (see Section 10.5.3, “Error Handling”).
>>
>> and we support lowest-priority interrupts, because if we didn't, then
>>
>>  The interrupt is not processed and hence the “Send Illegal Vector” bit
>>  is not set in the ESR.
>>
>> I'll go for a quick bare-metal test ...
>
> Turns out my machine doesn't support for lowest priority IPIs (probably
> got killed with FSB), so all I get is error 0x10.

end of thread, other threads:[~2017-10-15  2:41 UTC | newest]

Thread overview: 27+ messages
2017-09-29  1:04 [PATCH v2 0/4] KVM: LAPIC: Rework lapic timer to behave more like real-hardware Wanpeng Li
2017-09-29  1:04 ` [PATCH v2 1/4] KVM: LAPIC: Fix lapic timer mode transition Wanpeng Li
2017-10-03 17:05   ` Radim Krčmář
2017-10-04  1:45     ` Wanpeng Li
2017-10-04 13:21       ` Radim Krčmář
2017-10-04 13:50         ` Wanpeng Li
2017-10-04 14:02           ` Wanpeng Li
2017-09-29  1:04 ` [PATCH v2 2/4] KVM: LAPIC: Keep timer running when switching between one-shot and periodic mode Wanpeng Li
2017-10-03 17:06   ` Radim Krčmář
2017-10-04  1:46     ` Wanpeng Li
2017-10-04 13:33       ` Radim Krčmář
2017-10-04 13:57         ` Wanpeng Li
2017-09-29  1:04 ` [PATCH v2 3/4] KVM: LAPIC: Apply change to TDCR right away to the timer Wanpeng Li
2017-10-03 17:28   ` Radim Krčmář
2017-10-04  1:59     ` Wanpeng Li
2017-10-04 12:43       ` Radim Krčmář
2017-09-29  1:04 ` [PATCH v2 4/4] KVM: LAPIC: Don't silently accept bad vectors Wanpeng Li
2017-10-03 17:53   ` Radim Krčmář
2017-10-04  7:56     ` Wanpeng Li
2017-10-04 12:01       ` Radim Krčmář
2017-10-04 14:16         ` Wanpeng Li
2017-10-04 14:44           ` Radim Krčmář
2017-10-13  1:17             ` Wanpeng Li
2017-10-13 17:36               ` Radim Krčmář
2017-10-13 20:31                 ` Radim Krčmář
2017-10-15  2:41                   ` Wanpeng Li
2017-10-05 10:57 ` [PATCH v2 0/4] KVM: LAPIC: Rework lapic timer to behave more like real-hardware Wanpeng Li
