All of lore.kernel.org
* [PATCH v6 0/5] KVM: VMX: Add Posted Interrupt supporting
@ 2013-03-15 13:31 Yang Zhang
  2013-03-15 13:31 ` [PATCH v6 1/5] KVM: VMX: Enable acknowledge interrupt on vmexit Yang Zhang
                   ` (6 more replies)
  0 siblings, 7 replies; 31+ messages in thread
From: Yang Zhang @ 2013-03-15 13:31 UTC (permalink / raw)
  To: kvm; +Cc: gleb, mtosatti, xiantao.zhang, Yang Zhang

From: Yang Zhang <yang.z.zhang@Intel.com>

The following patches add Posted Interrupt support to KVM:
The first patch enables the 'acknowledge interrupt on vmexit' feature. Since
it is required by posted interrupts, we need to enable it first.

The subsequent patches add the posted interrupt support itself:
Posted Interrupt allows APIC interrupts to be injected into the guest
directly, without any vmexit.

- When delivering an interrupt to the guest, if the target vcpu is running,
  update the Posted-Interrupt Requests bitmap and send a notification event
  to the vcpu. The vcpu will then handle the interrupt automatically,
  without any software involvement.

- If the target vcpu is not running, or a notification event is already
  pending in the vcpu, do nothing. The interrupt will be handled on the
  next vm entry.

NOTE: We don't turn on Posted Interrupts until the interrupt coalescing
issue is solved.

Changes from v5 to v6:
* Split sync_pir_to_irr into two functions: one to query whether the PIR is
  empty and the other to perform the sync.
* Add comments to explain how vmx_sync_pir_to_irr() works.
* Rebase on top of KVM upstream.
  
Changes from v4 to v5:
* Add a per-cpu count for the posted-interrupt IPI handler.
* Don't check the irr when delivering an interrupt: since we cannot get
  interrupt coalescing info with posted interrupts, there is no need to
  check the irr. A later patch will change the current interrupt coalescing
  logic; before that, we will not turn on Posted Interrupts.
* Clear the outstanding notification bit after calling local_irq_disable(),
  but before checking requests. As Marcelo suggested, this way we can ensure
  the IPI is not lost.
* Remove the spinlock. Same as item 2: if there is no need to get coalescing
  info, then the lock is not needed.
* Rebase on top of KVM upstream.

Yang Zhang (5):
  KVM: VMX: Enable acknowledge interrupt on vmexit
  KVM: VMX: Register a new IPI for posted interrupt
  KVM: VMX: Check the posted interrupt capability
  KVM: VMX: Add the algorithm of deliver posted interrupt
  KVM: VMX: Use posted interrupt to deliver virtual interrupt

 arch/x86/include/asm/entry_arch.h  |    4 +
 arch/x86/include/asm/hardirq.h     |    3 +
 arch/x86/include/asm/hw_irq.h      |    1 +
 arch/x86/include/asm/irq_vectors.h |    5 +
 arch/x86/include/asm/kvm_host.h    |    5 +
 arch/x86/include/asm/vmx.h         |    4 +
 arch/x86/kernel/entry_64.S         |    5 +
 arch/x86/kernel/irq.c              |   22 ++++
 arch/x86/kernel/irqinit.c          |    4 +
 arch/x86/kvm/irq.c                 |    3 +-
 arch/x86/kvm/lapic.c               |   29 ++++-
 arch/x86/kvm/lapic.h               |    2 +
 arch/x86/kvm/svm.c                 |   30 +++++
 arch/x86/kvm/vmx.c                 |  223 ++++++++++++++++++++++++++++++++----
 arch/x86/kvm/x86.c                 |    8 +-
 virt/kvm/kvm_main.c                |    1 +
 16 files changed, 320 insertions(+), 29 deletions(-)


^ permalink raw reply	[flat|nested] 31+ messages in thread

* [PATCH v6 1/5] KVM: VMX: Enable acknowledge interrupt on vmexit
  2013-03-15 13:31 [PATCH v6 0/5] KVM: VMX: Add Posted Interrupt supporting Yang Zhang
@ 2013-03-15 13:31 ` Yang Zhang
  2013-03-15 13:31 ` [PATCH v6 2/5] KVM: VMX: Register a new IPI for posted interrupt Yang Zhang
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 31+ messages in thread
From: Yang Zhang @ 2013-03-15 13:31 UTC (permalink / raw)
  To: kvm; +Cc: gleb, mtosatti, xiantao.zhang, Yang Zhang

From: Yang Zhang <yang.z.zhang@Intel.com>

The "acknowledge interrupt on exit" feature controls processor behavior
for external interrupt acknowledgement. When this control is set, the
processor acknowledges the interrupt controller to acquire the
interrupt vector on VM exit.

After enabling this feature, an interrupt that arrives while the target cpu
is running in vmx non-root mode will be handled by the vmx handler instead
of the handler in the idt. Currently, the vmx handler only fakes an
interrupt stack and jumps to the idt table to let the real handler handle
it. Later, we will recognize the interrupt and deliver through the idt
table only the interrupts that do not belong to the current vcpu; interrupts
that do belong to the current vcpu will be handled inside the vmx handler.
This will reduce KVM's interrupt handling cost.

Also, the interrupt enable logic changes when this feature is turned on:
before this patch, the hypervisor called local_irq_enable() to enable
interrupts directly. Now the IF bit is set on the interrupt stack frame,
and interrupts will be enabled on return from the interrupt handler if an
external interrupt exists. If there is no external interrupt, we still
call local_irq_enable().

Refer to the Intel SDM, Volume 3, Section 33.2.

Signed-off-by: Yang Zhang <yang.z.zhang@Intel.com>
---
 arch/x86/include/asm/kvm_host.h |    1 +
 arch/x86/kvm/svm.c              |    6 ++++
 arch/x86/kvm/vmx.c              |   58 ++++++++++++++++++++++++++++++++++++---
 arch/x86/kvm/x86.c              |    4 ++-
 4 files changed, 64 insertions(+), 5 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 635a74d..b8388e9 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -730,6 +730,7 @@ struct kvm_x86_ops {
 	int (*check_intercept)(struct kvm_vcpu *vcpu,
 			       struct x86_instruction_info *info,
 			       enum x86_intercept_stage stage);
+	void (*handle_external_intr)(struct kvm_vcpu *vcpu);
 };
 
 struct kvm_arch_async_pf {
diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index e1b1ce2..a7d60d7 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -4247,6 +4247,11 @@ out:
 	return ret;
 }
 
+static void svm_handle_external_intr(struct kvm_vcpu *vcpu)
+{
+	local_irq_enable();
+}
+
 static struct kvm_x86_ops svm_x86_ops = {
 	.cpu_has_kvm_support = has_svm,
 	.disabled_by_bios = is_disabled,
@@ -4342,6 +4347,7 @@ static struct kvm_x86_ops svm_x86_ops = {
 	.set_tdp_cr3 = set_tdp_cr3,
 
 	.check_intercept = svm_check_intercept,
+	.handle_external_intr = svm_handle_external_intr,
 };
 
 static int __init svm_init(void)
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 7cc566b..dd881a4 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -377,6 +377,7 @@ struct vcpu_vmx {
 	struct shared_msr_entry *guest_msrs;
 	int                   nmsrs;
 	int                   save_nmsrs;
+	unsigned long	      host_idt_base;
 #ifdef CONFIG_X86_64
 	u64 		      msr_host_kernel_gs_base;
 	u64 		      msr_guest_kernel_gs_base;
@@ -2607,7 +2608,8 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf)
 #ifdef CONFIG_X86_64
 	min |= VM_EXIT_HOST_ADDR_SPACE_SIZE;
 #endif
-	opt = VM_EXIT_SAVE_IA32_PAT | VM_EXIT_LOAD_IA32_PAT;
+	opt = VM_EXIT_SAVE_IA32_PAT | VM_EXIT_LOAD_IA32_PAT |
+		VM_EXIT_ACK_INTR_ON_EXIT;
 	if (adjust_vmx_controls(min, opt, MSR_IA32_VMX_EXIT_CTLS,
 				&_vmexit_control) < 0)
 		return -EIO;
@@ -3877,7 +3879,7 @@ static void vmx_disable_intercept_msr_write_x2apic(u32 msr)
  * Note that host-state that does change is set elsewhere. E.g., host-state
  * that is set differently for each CPU is set in vmx_vcpu_load(), not here.
  */
-static void vmx_set_constant_host_state(void)
+static void vmx_set_constant_host_state(struct vcpu_vmx *vmx)
 {
 	u32 low32, high32;
 	unsigned long tmpl;
@@ -3905,6 +3907,7 @@ static void vmx_set_constant_host_state(void)
 
 	native_store_idt(&dt);
 	vmcs_writel(HOST_IDTR_BASE, dt.address);   /* 22.2.4 */
+	vmx->host_idt_base = dt.address;
 
 	vmcs_writel(HOST_RIP, vmx_return); /* 22.2.5 */
 
@@ -4037,7 +4040,7 @@ static int vmx_vcpu_setup(struct vcpu_vmx *vmx)
 
 	vmcs_write16(HOST_FS_SELECTOR, 0);            /* 22.2.4 */
 	vmcs_write16(HOST_GS_SELECTOR, 0);            /* 22.2.4 */
-	vmx_set_constant_host_state();
+	vmx_set_constant_host_state(vmx);
 #ifdef CONFIG_X86_64
 	rdmsrl(MSR_FS_BASE, a);
 	vmcs_writel(HOST_FS_BASE, a); /* 22.2.4 */
@@ -6394,6 +6397,52 @@ static void vmx_complete_atomic_exit(struct vcpu_vmx *vmx)
 	}
 }
 
+static void vmx_handle_external_intr(struct kvm_vcpu *vcpu)
+{
+	u32 exit_intr_info = vmcs_read32(VM_EXIT_INTR_INFO);
+
+	/*
+	 * If external interrupt exists, IF bit is set in rflags/eflags on the
+	 * interrupt stack frame, and interrupt will be enabled on a return
+	 * from interrupt handler.
+	 */
+	if ((exit_intr_info & (INTR_INFO_VALID_MASK | INTR_INFO_INTR_TYPE_MASK))
+			== (INTR_INFO_VALID_MASK | INTR_TYPE_EXT_INTR)) {
+		unsigned int vector;
+		unsigned long entry;
+		gate_desc *desc;
+		struct vcpu_vmx *vmx = to_vmx(vcpu);
+#ifdef CONFIG_X86_64
+		unsigned long tmp;
+#endif
+
+		vector =  exit_intr_info & INTR_INFO_VECTOR_MASK;
+		desc = (gate_desc *)vmx->host_idt_base + vector;
+		entry = gate_offset(*desc);
+		asm volatile(
+#ifdef CONFIG_X86_64
+			"mov %%" _ASM_SP ", %[sp]\n\t"
+			"and $0xfffffffffffffff0, %%" _ASM_SP "\n\t"
+			"push $%c[ss]\n\t"
+			"push %[sp]\n\t"
+#endif
+			"pushf\n\t"
+			"orl $0x200, (%%" _ASM_SP ")\n\t"
+			__ASM_SIZE(push) " $%c[cs]\n\t"
+			"call *%[entry]\n\t"
+			:
+#ifdef CONFIG_X86_64
+			[sp]"=&r"(tmp)
+#endif
+			:
+			[entry]"r"(entry),
+			[ss]"i"(__KERNEL_DS),
+			[cs]"i"(__KERNEL_CS)
+			);
+	} else
+		local_irq_enable();
+}
+
 static void vmx_recover_nmi_blocking(struct vcpu_vmx *vmx)
 {
 	u32 exit_intr_info;
@@ -7061,7 +7110,7 @@ static void prepare_vmcs02(struct kvm_vcpu *vcpu, struct vmcs12 *vmcs12)
 	 * Other fields are different per CPU, and will be set later when
 	 * vmx_vcpu_load() is called, and when vmx_save_host_state() is called.
 	 */
-	vmx_set_constant_host_state();
+	vmx_set_constant_host_state(vmx);
 
 	/*
 	 * HOST_RSP is normally set correctly in vmx_vcpu_run() just before
@@ -7667,6 +7716,7 @@ static struct kvm_x86_ops vmx_x86_ops = {
 	.set_tdp_cr3 = vmx_set_cr3,
 
 	.check_intercept = vmx_check_intercept,
+	.handle_external_intr = vmx_handle_external_intr,
 };
 
 static int __init vmx_init(void)
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index d0cf737..0baa90d 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -5784,7 +5784,9 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
 
 	vcpu->mode = OUTSIDE_GUEST_MODE;
 	smp_wmb();
-	local_irq_enable();
+
+	/* Interrupt is enabled by handle_external_intr() */
+	kvm_x86_ops->handle_external_intr(vcpu);
 
 	++vcpu->stat.exits;
 
-- 
1.7.1



* [PATCH v6 2/5] KVM: VMX: Register a new IPI for posted interrupt
  2013-03-15 13:31 [PATCH v6 0/5] KVM: VMX: Add Posted Interrupt supporting Yang Zhang
  2013-03-15 13:31 ` [PATCH v6 1/5] KVM: VMX: Enable acknowledge interrupt on vmexit Yang Zhang
@ 2013-03-15 13:31 ` Yang Zhang
  2013-03-15 13:31 ` [PATCH v6 3/5] KVM: VMX: Check the posted interrupt capability Yang Zhang
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 31+ messages in thread
From: Yang Zhang @ 2013-03-15 13:31 UTC (permalink / raw)
  To: kvm; +Cc: gleb, mtosatti, xiantao.zhang, Yang Zhang

From: Yang Zhang <yang.z.zhang@Intel.com>

The posted interrupt feature requires a special IPI to deliver posted
interrupts to the guest, and it should have a high priority so that the
interrupt will not be blocked by others.
Normally, the posted interrupt is consumed by the vcpu while the target
vcpu is running, transparently to the OS. But in some cases the interrupt
will arrive when the target vcpu is scheduled out, and the host will see
it. So we need to register a dummy handler for it.

Signed-off-by: Yang Zhang <yang.z.zhang@Intel.com>
---
 arch/x86/include/asm/entry_arch.h  |    4 ++++
 arch/x86/include/asm/hardirq.h     |    3 +++
 arch/x86/include/asm/hw_irq.h      |    1 +
 arch/x86/include/asm/irq_vectors.h |    5 +++++
 arch/x86/kernel/entry_64.S         |    5 +++++
 arch/x86/kernel/irq.c              |   22 ++++++++++++++++++++++
 arch/x86/kernel/irqinit.c          |    4 ++++
 7 files changed, 44 insertions(+), 0 deletions(-)

diff --git a/arch/x86/include/asm/entry_arch.h b/arch/x86/include/asm/entry_arch.h
index 40afa00..9bd4eca 100644
--- a/arch/x86/include/asm/entry_arch.h
+++ b/arch/x86/include/asm/entry_arch.h
@@ -19,6 +19,10 @@ BUILD_INTERRUPT(reboot_interrupt,REBOOT_VECTOR)
 
 BUILD_INTERRUPT(x86_platform_ipi, X86_PLATFORM_IPI_VECTOR)
 
+#ifdef CONFIG_HAVE_KVM
+BUILD_INTERRUPT(kvm_posted_intr_ipi, POSTED_INTR_VECTOR)
+#endif
+
 /*
  * every pentium local APIC has two 'local interrupts', with a
  * soft-definable vector attached to both interrupts, one of
diff --git a/arch/x86/include/asm/hardirq.h b/arch/x86/include/asm/hardirq.h
index 81f04ce..ab0ae1a 100644
--- a/arch/x86/include/asm/hardirq.h
+++ b/arch/x86/include/asm/hardirq.h
@@ -12,6 +12,9 @@ typedef struct {
 	unsigned int irq_spurious_count;
 	unsigned int icr_read_retry_count;
 #endif
+#ifdef CONFIG_HAVE_KVM
+	unsigned int kvm_posted_intr_ipis;
+#endif
 	unsigned int x86_platform_ipis;	/* arch dependent */
 	unsigned int apic_perf_irqs;
 	unsigned int apic_irq_work_irqs;
diff --git a/arch/x86/include/asm/hw_irq.h b/arch/x86/include/asm/hw_irq.h
index eb92a6e..cebef02 100644
--- a/arch/x86/include/asm/hw_irq.h
+++ b/arch/x86/include/asm/hw_irq.h
@@ -28,6 +28,7 @@
 /* Interrupt handlers registered during init_IRQ */
 extern void apic_timer_interrupt(void);
 extern void x86_platform_ipi(void);
+extern void kvm_posted_intr_ipi(void);
 extern void error_interrupt(void);
 extern void irq_work_interrupt(void);
 
diff --git a/arch/x86/include/asm/irq_vectors.h b/arch/x86/include/asm/irq_vectors.h
index 1508e51..774dc9f 100644
--- a/arch/x86/include/asm/irq_vectors.h
+++ b/arch/x86/include/asm/irq_vectors.h
@@ -102,6 +102,11 @@
  */
 #define X86_PLATFORM_IPI_VECTOR		0xf7
 
+/* Vector for KVM to deliver posted interrupt IPI */
+#ifdef CONFIG_HAVE_KVM
+#define POSTED_INTR_VECTOR		0xf2
+#endif
+
 /*
  * IRQ work vector:
  */
diff --git a/arch/x86/kernel/entry_64.S b/arch/x86/kernel/entry_64.S
index 70641af..781bc06 100644
--- a/arch/x86/kernel/entry_64.S
+++ b/arch/x86/kernel/entry_64.S
@@ -1177,6 +1177,11 @@ apicinterrupt LOCAL_TIMER_VECTOR \
 apicinterrupt X86_PLATFORM_IPI_VECTOR \
 	x86_platform_ipi smp_x86_platform_ipi
 
+#ifdef CONFIG_HAVE_KVM
+apicinterrupt POSTED_INTR_VECTOR \
+	kvm_posted_intr_ipi smp_kvm_posted_intr_ipi
+#endif
+
 apicinterrupt THRESHOLD_APIC_VECTOR \
 	threshold_interrupt smp_threshold_interrupt
 apicinterrupt THERMAL_APIC_VECTOR \
diff --git a/arch/x86/kernel/irq.c b/arch/x86/kernel/irq.c
index e4595f1..6ae6ea1 100644
--- a/arch/x86/kernel/irq.c
+++ b/arch/x86/kernel/irq.c
@@ -228,6 +228,28 @@ void smp_x86_platform_ipi(struct pt_regs *regs)
 	set_irq_regs(old_regs);
 }
 
+#ifdef CONFIG_HAVE_KVM
+/*
+ * Handler for POSTED_INTERRUPT_VECTOR.
+ */
+void smp_kvm_posted_intr_ipi(struct pt_regs *regs)
+{
+	struct pt_regs *old_regs = set_irq_regs(regs);
+
+	ack_APIC_irq();
+
+	irq_enter();
+
+	exit_idle();
+
+	inc_irq_stat(kvm_posted_intr_ipis);
+
+	irq_exit();
+
+	set_irq_regs(old_regs);
+}
+#endif
+
 EXPORT_SYMBOL_GPL(vector_used_by_percpu_irq);
 
 #ifdef CONFIG_HOTPLUG_CPU
diff --git a/arch/x86/kernel/irqinit.c b/arch/x86/kernel/irqinit.c
index 6e03b0d..2329a54 100644
--- a/arch/x86/kernel/irqinit.c
+++ b/arch/x86/kernel/irqinit.c
@@ -205,6 +205,10 @@ static void __init apic_intr_init(void)
 
 	/* IPI for X86 platform specific use */
 	alloc_intr_gate(X86_PLATFORM_IPI_VECTOR, x86_platform_ipi);
+#ifdef CONFIG_HAVE_KVM
+	/* IPI for KVM to deliver posted interrupt */
+	alloc_intr_gate(POSTED_INTR_VECTOR, kvm_posted_intr_ipi);
+#endif
 
 	/* IPI vectors for APIC spurious and error interrupts */
 	alloc_intr_gate(SPURIOUS_APIC_VECTOR, spurious_interrupt);
-- 
1.7.1



* [PATCH v6 3/5] KVM: VMX: Check the posted interrupt capability
  2013-03-15 13:31 [PATCH v6 0/5] KVM: VMX: Add Posted Interrupt supporting Yang Zhang
  2013-03-15 13:31 ` [PATCH v6 1/5] KVM: VMX: Enable acknowledge interrupt on vmexit Yang Zhang
  2013-03-15 13:31 ` [PATCH v6 2/5] KVM: VMX: Register a new IPI for posted interrupt Yang Zhang
@ 2013-03-15 13:31 ` Yang Zhang
  2013-03-15 13:31 ` [PATCH v6 4/5] KVM: VMX: Add the algorithm of deliver posted interrupt Yang Zhang
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 31+ messages in thread
From: Yang Zhang @ 2013-03-15 13:31 UTC (permalink / raw)
  To: kvm; +Cc: gleb, mtosatti, xiantao.zhang, Yang Zhang

From: Yang Zhang <yang.z.zhang@Intel.com>

Detect the posted interrupt feature. If it exists, then set it in vmcs_config.

Signed-off-by: Yang Zhang <yang.z.zhang@Intel.com>
---
 arch/x86/include/asm/vmx.h |    4 ++
 arch/x86/kvm/vmx.c         |   87 ++++++++++++++++++++++++++++++++++----------
 2 files changed, 71 insertions(+), 20 deletions(-)

diff --git a/arch/x86/include/asm/vmx.h b/arch/x86/include/asm/vmx.h
index 5c9dbad..ce8ac80 100644
--- a/arch/x86/include/asm/vmx.h
+++ b/arch/x86/include/asm/vmx.h
@@ -158,6 +158,7 @@
 #define PIN_BASED_EXT_INTR_MASK                 0x00000001
 #define PIN_BASED_NMI_EXITING                   0x00000008
 #define PIN_BASED_VIRTUAL_NMIS                  0x00000020
+#define PIN_BASED_POSTED_INTR                   0x00000080
 
 #define VM_EXIT_SAVE_DEBUG_CONTROLS             0x00000002
 #define VM_EXIT_HOST_ADDR_SPACE_SIZE            0x00000200
@@ -180,6 +181,7 @@
 /* VMCS Encodings */
 enum vmcs_field {
 	VIRTUAL_PROCESSOR_ID            = 0x00000000,
+	POSTED_INTR_NV                  = 0x00000002,
 	GUEST_ES_SELECTOR               = 0x00000800,
 	GUEST_CS_SELECTOR               = 0x00000802,
 	GUEST_SS_SELECTOR               = 0x00000804,
@@ -214,6 +216,8 @@ enum vmcs_field {
 	VIRTUAL_APIC_PAGE_ADDR_HIGH     = 0x00002013,
 	APIC_ACCESS_ADDR		= 0x00002014,
 	APIC_ACCESS_ADDR_HIGH		= 0x00002015,
+	POSTED_INTR_DESC_ADDR           = 0x00002016,
+	POSTED_INTR_DESC_ADDR_HIGH      = 0x00002017,
 	EPT_POINTER                     = 0x0000201a,
 	EPT_POINTER_HIGH                = 0x0000201b,
 	EOI_EXIT_BITMAP0                = 0x0000201c,
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index dd881a4..625fd19 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -84,7 +84,8 @@ module_param(vmm_exclusive, bool, S_IRUGO);
 static bool __read_mostly fasteoi = 1;
 module_param(fasteoi, bool, S_IRUGO);
 
-static bool __read_mostly enable_apicv_reg_vid;
+static bool __read_mostly enable_apicv;
+module_param(enable_apicv, bool, S_IRUGO);
 
 /*
  * If nested=1, nested virtualization is supported, i.e., guests may use
@@ -365,6 +366,19 @@ struct nested_vmx {
 	struct page *apic_access_page;
 };
 
+#define POSTED_INTR_ON  0
+/* Posted-Interrupt Descriptor */
+struct pi_desc {
+	u32 pir[8];     /* Posted interrupt requested */
+	union {
+		struct {
+			u8  on:1,
+			    rsvd:7;
+		} control;
+		u32 rsvd[8];
+	} u;
+} __aligned(64);
+
 struct vcpu_vmx {
 	struct kvm_vcpu       vcpu;
 	unsigned long         host_rsp;
@@ -429,6 +443,9 @@ struct vcpu_vmx {
 
 	bool rdtscp_enabled;
 
+	/* Posted interrupt descriptor */
+	struct pi_desc pi_desc;
+
 	/* Support for a guest hypervisor (nested VMX) */
 	struct nested_vmx nested;
 };
@@ -783,6 +800,18 @@ static inline bool cpu_has_vmx_virtual_intr_delivery(void)
 		SECONDARY_EXEC_VIRTUAL_INTR_DELIVERY;
 }
 
+static inline bool cpu_has_vmx_posted_intr(void)
+{
+	return vmcs_config.pin_based_exec_ctrl & PIN_BASED_POSTED_INTR;
+}
+
+static inline bool cpu_has_vmx_apicv(void)
+{
+	return cpu_has_vmx_apic_register_virt() &&
+		cpu_has_vmx_virtual_intr_delivery() &&
+		cpu_has_vmx_posted_intr();
+}
+
 static inline bool cpu_has_vmx_flexpriority(void)
 {
 	return cpu_has_vmx_tpr_shadow() &&
@@ -2532,12 +2561,6 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf)
 	u32 _vmexit_control = 0;
 	u32 _vmentry_control = 0;
 
-	min = PIN_BASED_EXT_INTR_MASK | PIN_BASED_NMI_EXITING;
-	opt = PIN_BASED_VIRTUAL_NMIS;
-	if (adjust_vmx_controls(min, opt, MSR_IA32_VMX_PINBASED_CTLS,
-				&_pin_based_exec_control) < 0)
-		return -EIO;
-
 	min = CPU_BASED_HLT_EXITING |
 #ifdef CONFIG_X86_64
 	      CPU_BASED_CR8_LOAD_EXITING |
@@ -2614,6 +2637,17 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf)
 				&_vmexit_control) < 0)
 		return -EIO;
 
+	min = PIN_BASED_EXT_INTR_MASK | PIN_BASED_NMI_EXITING;
+	opt = PIN_BASED_VIRTUAL_NMIS | PIN_BASED_POSTED_INTR;
+	if (adjust_vmx_controls(min, opt, MSR_IA32_VMX_PINBASED_CTLS,
+				&_pin_based_exec_control) < 0)
+		return -EIO;
+
+	if (!(_cpu_based_2nd_exec_control &
+		SECONDARY_EXEC_VIRTUAL_INTR_DELIVERY) ||
+		!(_vmexit_control & VM_EXIT_ACK_INTR_ON_EXIT))
+		_pin_based_exec_control &= ~PIN_BASED_POSTED_INTR;
+
 	min = 0;
 	opt = VM_ENTRY_LOAD_IA32_PAT;
 	if (adjust_vmx_controls(min, opt, MSR_IA32_VMX_ENTRY_CTLS,
@@ -2792,11 +2826,10 @@ static __init int hardware_setup(void)
 	if (!cpu_has_vmx_ple())
 		ple_gap = 0;
 
-	if (!cpu_has_vmx_apic_register_virt() ||
-				!cpu_has_vmx_virtual_intr_delivery())
-		enable_apicv_reg_vid = 0;
+	if (!cpu_has_vmx_apicv())
+		enable_apicv = 0;
 
-	if (enable_apicv_reg_vid)
+	if (enable_apicv)
 		kvm_x86_ops->update_cr8_intercept = NULL;
 	else
 		kvm_x86_ops->hwapic_irr_update = NULL;
@@ -3873,6 +3906,11 @@ static void vmx_disable_intercept_msr_write_x2apic(u32 msr)
 			msr, MSR_TYPE_W);
 }
 
+static int vmx_vm_has_apicv(struct kvm *kvm)
+{
+	return enable_apicv && irqchip_in_kernel(kvm);
+}
+
 /*
  * Set up the vmcs's constant host-state fields, i.e., host-state fields that
  * will not change in the lifetime of the guest.
@@ -3933,6 +3971,15 @@ static void set_cr4_guest_host_mask(struct vcpu_vmx *vmx)
 	vmcs_writel(CR4_GUEST_HOST_MASK, ~vmx->vcpu.arch.cr4_guest_owned_bits);
 }
 
+static u32 vmx_pin_based_exec_ctrl(struct vcpu_vmx *vmx)
+{
+	u32 pin_based_exec_ctrl = vmcs_config.pin_based_exec_ctrl;
+
+	if (!vmx_vm_has_apicv(vmx->vcpu.kvm))
+		pin_based_exec_ctrl &= ~PIN_BASED_POSTED_INTR;
+	return pin_based_exec_ctrl;
+}
+
 static u32 vmx_exec_control(struct vcpu_vmx *vmx)
 {
 	u32 exec_control = vmcs_config.cpu_based_exec_ctrl;
@@ -3950,11 +3997,6 @@ static u32 vmx_exec_control(struct vcpu_vmx *vmx)
 	return exec_control;
 }
 
-static int vmx_vm_has_apicv(struct kvm *kvm)
-{
-	return enable_apicv_reg_vid && irqchip_in_kernel(kvm);
-}
-
 static u32 vmx_secondary_exec_control(struct vcpu_vmx *vmx)
 {
 	u32 exec_control = vmcs_config.cpu_based_2nd_exec_ctrl;
@@ -4010,8 +4052,7 @@ static int vmx_vcpu_setup(struct vcpu_vmx *vmx)
 	vmcs_write64(VMCS_LINK_POINTER, -1ull); /* 22.3.1.5 */
 
 	/* Control */
-	vmcs_write32(PIN_BASED_VM_EXEC_CONTROL,
-		vmcs_config.pin_based_exec_ctrl);
+	vmcs_write32(PIN_BASED_VM_EXEC_CONTROL, vmx_pin_based_exec_ctrl(vmx));
 
 	vmcs_write32(CPU_BASED_VM_EXEC_CONTROL, vmx_exec_control(vmx));
 
@@ -4020,13 +4061,16 @@ static int vmx_vcpu_setup(struct vcpu_vmx *vmx)
 				vmx_secondary_exec_control(vmx));
 	}
 
-	if (enable_apicv_reg_vid) {
+	if (vmx_vm_has_apicv(vmx->vcpu.kvm)) {
 		vmcs_write64(EOI_EXIT_BITMAP0, 0);
 		vmcs_write64(EOI_EXIT_BITMAP1, 0);
 		vmcs_write64(EOI_EXIT_BITMAP2, 0);
 		vmcs_write64(EOI_EXIT_BITMAP3, 0);
 
 		vmcs_write16(GUEST_INTR_STATUS, 0);
+
+		vmcs_write64(POSTED_INTR_NV, POSTED_INTR_VECTOR);
+		vmcs_write64(POSTED_INTR_DESC_ADDR, __pa((&vmx->pi_desc)));
 	}
 
 	if (ple_gap) {
@@ -4176,6 +4220,9 @@ static int vmx_vcpu_reset(struct kvm_vcpu *vcpu)
 		vmcs_write64(APIC_ACCESS_ADDR,
 			     page_to_phys(vmx->vcpu.kvm->arch.apic_access_page));
 
+	if (vmx_vm_has_apicv(vcpu->kvm))
+		memset(&vmx->pi_desc, 0, sizeof(struct pi_desc));
+
 	if (vmx->vpid != 0)
 		vmcs_write16(VIRTUAL_PROCESSOR_ID, vmx->vpid);
 
@@ -7791,7 +7838,7 @@ static int __init vmx_init(void)
 	memcpy(vmx_msr_bitmap_longmode_x2apic,
 			vmx_msr_bitmap_longmode, PAGE_SIZE);
 
-	if (enable_apicv_reg_vid) {
+	if (enable_apicv) {
 		for (msr = 0x800; msr <= 0x8ff; msr++)
 			vmx_disable_intercept_msr_read_x2apic(msr);
 
-- 
1.7.1



* [PATCH v6 4/5] KVM: VMX: Add the algorithm of deliver posted interrupt
  2013-03-15 13:31 [PATCH v6 0/5] KVM: VMX: Add Posted Interrupt supporting Yang Zhang
                   ` (2 preceding siblings ...)
  2013-03-15 13:31 ` [PATCH v6 3/5] KVM: VMX: Check the posted interrupt capability Yang Zhang
@ 2013-03-15 13:31 ` Yang Zhang
  2013-03-15 13:31 ` [PATCH v6 5/5] KVM: VMX: Use posted interrupt to deliver virtual interrupt Yang Zhang
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 31+ messages in thread
From: Yang Zhang @ 2013-03-15 13:31 UTC (permalink / raw)
  To: kvm; +Cc: gleb, mtosatti, xiantao.zhang, Yang Zhang

From: Yang Zhang <yang.z.zhang@Intel.com>

Only deliver the posted interrupt when the target vcpu is running
and there is no previous interrupt pending in the pir.

Signed-off-by: Yang Zhang <yang.z.zhang@Intel.com>
---
 arch/x86/include/asm/kvm_host.h |    4 ++
 arch/x86/kvm/lapic.c            |   13 +++++++
 arch/x86/kvm/lapic.h            |    1 +
 arch/x86/kvm/svm.c              |   24 ++++++++++++++
 arch/x86/kvm/vmx.c              |   67 +++++++++++++++++++++++++++++++++++++++
 virt/kvm/kvm_main.c             |    1 +
 6 files changed, 110 insertions(+), 0 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index b8388e9..eac50d3 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -704,6 +704,10 @@ struct kvm_x86_ops {
 	void (*hwapic_isr_update)(struct kvm *kvm, int isr);
 	void (*load_eoi_exitmap)(struct kvm_vcpu *vcpu, u64 *eoi_exit_bitmap);
 	void (*set_virtual_x2apic_mode)(struct kvm_vcpu *vcpu, bool set);
+	bool (*deliver_posted_interrupt)(struct kvm_vcpu *vcpu, int vector);
+	void (*posted_intr_clear_on)(struct kvm_vcpu *vcpu);
+	void (*sync_pir_to_irr)(struct kvm_vcpu *vcpu);
+	bool (*hwapic_has_interrupt)(struct kvm_vcpu *vcpu);
 	int (*set_tss_addr)(struct kvm *kvm, unsigned int addr);
 	int (*get_tdp_level)(void);
 	u64 (*get_mt_mask)(struct kvm_vcpu *vcpu, gfn_t gfn, bool is_mmio);
diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index 02b51dd..b3ea50e 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -357,6 +357,19 @@ static u8 count_vectors(void *bitmap)
 	return count;
 }
 
+void kvm_apic_update_irr(struct kvm_vcpu *vcpu, u32 *pir)
+{
+	u32 i, pir_val;
+	struct kvm_lapic *apic = vcpu->arch.apic;
+
+	for (i = 0; i <= 7; i++) {
+		pir_val = xchg(&pir[i], 0);
+		if (pir_val)
+			*((u32 *)(apic->regs + APIC_IRR + i * 0x10)) |= pir_val;
+	}
+}
+EXPORT_SYMBOL_GPL(kvm_apic_update_irr);
+
 static inline int apic_test_and_set_irr(int vec, struct kvm_lapic *apic)
 {
 	apic->irr_pending = true;
diff --git a/arch/x86/kvm/lapic.h b/arch/x86/kvm/lapic.h
index 1676d34..e5327be 100644
--- a/arch/x86/kvm/lapic.h
+++ b/arch/x86/kvm/lapic.h
@@ -157,5 +157,6 @@ static inline u16 apic_logical_id(struct kvm_apic_map *map, u32 ldr)
 void kvm_calculate_eoi_exitmap(struct kvm_vcpu *vcpu,
 				struct kvm_lapic_irq *irq,
 				u64 *eoi_bitmap);
+void kvm_apic_update_irr(struct kvm_vcpu *vcpu, u32 *pir);
 
 #endif
diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index a7d60d7..6382d09 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -3591,6 +3591,26 @@ static void svm_hwapic_isr_update(struct kvm *kvm, int isr)
 	return;
 }
 
+static bool svm_hwapic_has_interrupt(struct kvm_vcpu *vcpu)
+{
+	return false;
+}
+
+static void svm_sync_pir_to_irr(struct kvm_vcpu *vcpu)
+{
+	return;
+}
+
+static bool svm_deliver_posted_interrupt(struct kvm_vcpu *vcpu, int vector)
+{
+	return false;
+}
+
+static void svm_posted_intr_clear_on(struct kvm_vcpu *vcpu)
+{
+	return;
+}
+
 static int svm_nmi_allowed(struct kvm_vcpu *vcpu)
 {
 	struct vcpu_svm *svm = to_svm(vcpu);
@@ -4319,6 +4339,10 @@ static struct kvm_x86_ops svm_x86_ops = {
 	.vm_has_apicv = svm_vm_has_apicv,
 	.load_eoi_exitmap = svm_load_eoi_exitmap,
 	.hwapic_isr_update = svm_hwapic_isr_update,
+	.sync_pir_to_irr = svm_sync_pir_to_irr,
+	.hwapic_has_interrupt = svm_hwapic_has_interrupt,
+	.deliver_posted_interrupt = svm_deliver_posted_interrupt,
+	.posted_intr_clear_on = svm_posted_intr_clear_on,
 
 	.set_tss_addr = svm_set_tss_addr,
 	.get_tdp_level = get_npt_level,
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 625fd19..0b5a8ae 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -379,6 +379,17 @@ struct pi_desc {
 	} u;
 } __aligned(64);
 
+static bool pi_test_and_set_on(struct pi_desc *pi_desc)
+{
+	return test_and_set_bit(POSTED_INTR_ON,
+			(unsigned long *)&pi_desc->u.control);
+}
+
+static int pi_test_and_set_pir(int vector, struct pi_desc *pi_desc)
+{
+	return test_and_set_bit(vector, (unsigned long *)pi_desc->pir);
+}
+
 struct vcpu_vmx {
 	struct kvm_vcpu       vcpu;
 	unsigned long         host_rsp;
@@ -3911,6 +3922,58 @@ static int vmx_vm_has_apicv(struct kvm *kvm)
 	return enable_apicv && irqchip_in_kernel(kvm);
 }
 
+static void vmx_posted_intr_clear_on(struct kvm_vcpu *vcpu)
+{
+	struct vcpu_vmx *vmx = to_vmx(vcpu);
+
+	if (!vmx_vm_has_apicv(vcpu->kvm))
+		return;
+
+	clear_bit(POSTED_INTR_ON, (unsigned long *)&vmx->pi_desc.u.control);
+}
+
+static bool vmx_deliver_posted_interrupt(struct kvm_vcpu *vcpu, int vector)
+{
+	struct vcpu_vmx *vmx = to_vmx(vcpu);
+
+	if (!vmx_vm_has_apicv(vcpu->kvm))
+		return false;
+
+	if (pi_test_and_set_pir(vector, &vmx->pi_desc))
+		return true;
+
+	kvm_make_request(KVM_REQ_EVENT, vcpu);
+	if ((vcpu->mode == IN_GUEST_MODE)) {
+		if (!pi_test_and_set_on(&vmx->pi_desc))
+			apic->send_IPI_mask(get_cpu_mask(vcpu->cpu),
+					POSTED_INTR_VECTOR);
+	} else
+		kvm_vcpu_kick(vcpu);
+
+	return true;
+}
+
+static bool vmx_hwapic_has_interrupt(struct kvm_vcpu *vcpu)
+{
+	struct vcpu_vmx *vmx = to_vmx(vcpu);
+
+	if (!vmx_vm_has_apicv(vcpu->kvm))
+		return false;
+
+	if (bitmap_empty((unsigned long *)vmx->pi_desc.pir, 256))
+		return false;
+
+	return true;
+}
+
+static void vmx_sync_pir_to_irr(struct kvm_vcpu *vcpu)
+{
+	struct vcpu_vmx *vmx = to_vmx(vcpu);
+
+	if (vmx_hwapic_has_interrupt(vcpu))
+		kvm_apic_update_irr(vcpu, vmx->pi_desc.pir);
+}
+
 /*
  * Set up the vmcs's constant host-state fields, i.e., host-state fields that
  * will not change in the lifetime of the guest.
@@ -7735,6 +7798,10 @@ static struct kvm_x86_ops vmx_x86_ops = {
 	.load_eoi_exitmap = vmx_load_eoi_exitmap,
 	.hwapic_irr_update = vmx_hwapic_irr_update,
 	.hwapic_isr_update = vmx_hwapic_isr_update,
+	.sync_pir_to_irr = vmx_sync_pir_to_irr,
+	.hwapic_has_interrupt = vmx_hwapic_has_interrupt,
+	.deliver_posted_interrupt = vmx_deliver_posted_interrupt,
+	.posted_intr_clear_on = vmx_posted_intr_clear_on,
 
 	.set_tss_addr = vmx_set_tss_addr,
 	.get_tdp_level = get_ept_level,
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index adc68fe..cd73e0e 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -1692,6 +1692,7 @@ void kvm_vcpu_kick(struct kvm_vcpu *vcpu)
 			smp_send_reschedule(cpu);
 	put_cpu();
 }
+EXPORT_SYMBOL_GPL(kvm_vcpu_kick);
 #endif /* !CONFIG_S390 */
 
 void kvm_resched(struct kvm_vcpu *vcpu)
-- 
1.7.1



* [PATCH v6 5/5] KVM: VMX: Use posted interrupt to deliver virtual interrupt
  2013-03-15 13:31 [PATCH v6 0/5] KVM: VMX: Add Posted Interrupt supporting Yang Zhang
                   ` (3 preceding siblings ...)
  2013-03-15 13:31 ` [PATCH v6 4/5] KVM: VMX: Add the algorithm of deliver posted interrupt Yang Zhang
@ 2013-03-15 13:31 ` Yang Zhang
  2013-03-19  8:54   ` Gleb Natapov
  2013-03-18  2:49 ` [PATCH v6 0/5] KVM: VMX: Add Posted Interrupt supporting Zhang, Yang Z
  2013-03-18 22:20 ` Marcelo Tosatti
  6 siblings, 1 reply; 31+ messages in thread
From: Yang Zhang @ 2013-03-15 13:31 UTC (permalink / raw)
  To: kvm; +Cc: gleb, mtosatti, xiantao.zhang, Yang Zhang

From: Yang Zhang <yang.z.zhang@Intel.com>

If posted interrupts are available, use them to inject virtual
interrupts into the guest.

Signed-off-by: Yang Zhang <yang.z.zhang@Intel.com>
---
 arch/x86/kvm/irq.c   |    3 ++-
 arch/x86/kvm/lapic.c |   16 +++++++++++++---
 arch/x86/kvm/lapic.h |    1 +
 arch/x86/kvm/vmx.c   |   11 +++++++++++
 arch/x86/kvm/x86.c   |    4 ++++
 5 files changed, 31 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kvm/irq.c b/arch/x86/kvm/irq.c
index 484bc87..5179988 100644
--- a/arch/x86/kvm/irq.c
+++ b/arch/x86/kvm/irq.c
@@ -81,7 +81,8 @@ int kvm_cpu_has_interrupt(struct kvm_vcpu *v)
 	if (kvm_cpu_has_extint(v))
 		return 1;
 
-	return kvm_apic_has_interrupt(v) != -1;	/* LAPIC */
+	return (kvm_apic_has_interrupt(v) != -1) ||
+		kvm_hwapic_has_interrupt(v);
 }
 EXPORT_SYMBOL_GPL(kvm_cpu_has_interrupt);
 
diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index b3ea50e..46c7310 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -713,7 +713,10 @@ static int __apic_accept_irq(struct kvm_lapic *apic, int delivery_mode,
 		} else
 			apic_clear_vector(vector, apic->regs + APIC_TMR);
 
-		result = !apic_test_and_set_irr(vector, apic);
+		result = 1;
+		if (!kvm_x86_ops->deliver_posted_interrupt(vcpu, vector))
+			result = !apic_test_and_set_irr(vector, apic);
+
 		trace_kvm_apic_accept_irq(vcpu->vcpu_id, delivery_mode,
 					  trig_mode, vector, !result);
 		if (!result) {
@@ -723,8 +726,10 @@ static int __apic_accept_irq(struct kvm_lapic *apic, int delivery_mode,
 			break;
 		}
 
-		kvm_make_request(KVM_REQ_EVENT, vcpu);
-		kvm_vcpu_kick(vcpu);
+		if (!kvm_x86_ops->vm_has_apicv(vcpu->kvm)) {
+			kvm_make_request(KVM_REQ_EVENT, vcpu);
+			kvm_vcpu_kick(vcpu);
+		}
 		break;
 
 	case APIC_DM_REMRD:
@@ -1604,6 +1609,11 @@ int kvm_apic_has_interrupt(struct kvm_vcpu *vcpu)
 	return highest_irr;
 }
 
+bool kvm_hwapic_has_interrupt(struct kvm_vcpu *vcpu)
+{
+	return kvm_x86_ops->hwapic_has_interrupt(vcpu);
+}
+
 int kvm_apic_accept_pic_intr(struct kvm_vcpu *vcpu)
 {
 	u32 lvt0 = kvm_apic_get_reg(vcpu->arch.apic, APIC_LVT0);
diff --git a/arch/x86/kvm/lapic.h b/arch/x86/kvm/lapic.h
index e5327be..c6abc63 100644
--- a/arch/x86/kvm/lapic.h
+++ b/arch/x86/kvm/lapic.h
@@ -37,6 +37,7 @@ int kvm_create_lapic(struct kvm_vcpu *vcpu);
 void kvm_free_lapic(struct kvm_vcpu *vcpu);
 
 int kvm_apic_has_interrupt(struct kvm_vcpu *vcpu);
+bool kvm_hwapic_has_interrupt(struct kvm_vcpu *vcpu);
 int kvm_apic_accept_pic_intr(struct kvm_vcpu *vcpu);
 int kvm_get_apic_interrupt(struct kvm_vcpu *vcpu);
 void kvm_lapic_reset(struct kvm_vcpu *vcpu);
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 0b5a8ae..48a2239 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -3932,6 +3932,17 @@ static void vmx_posted_intr_clear_on(struct kvm_vcpu *vcpu)
 	clear_bit(POSTED_INTR_ON, (unsigned long *)&vmx->pi_desc.u.control);
 }
 
+/*
+ * Send an interrupt to the vcpu via posted interrupt.
+ * Return false if posted interrupts are not supported; the caller
+ * will fall back to the old way (setting vIRR directly).
+ * Return true if posted interrupts are available; the interrupt is
+ * then set in the PIR (posted interrupt requests) bitmap:
+ * 1. If the target vcpu is running (non-root mode), send a posted
+ * interrupt notification; hardware will sync PIR to vIRR atomically.
+ * 2. If the target vcpu is not running (root mode), kick it to pick
+ * up the interrupt from the PIR on the next vmentry.
+ */
 static bool vmx_deliver_posted_interrupt(struct kvm_vcpu *vcpu, int vector)
 {
 	struct vcpu_vmx *vmx = to_vmx(vcpu);
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 0baa90d..0981100 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -2679,6 +2679,7 @@ void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu)
 static int kvm_vcpu_ioctl_get_lapic(struct kvm_vcpu *vcpu,
 				    struct kvm_lapic_state *s)
 {
+	kvm_x86_ops->sync_pir_to_irr(vcpu);
 	memcpy(s->regs, vcpu->arch.apic->regs, sizeof *s);
 
 	return 0;
@@ -5699,6 +5700,7 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
 	}
 
 	if (kvm_check_request(KVM_REQ_EVENT, vcpu) || req_int_win) {
+		kvm_x86_ops->sync_pir_to_irr(vcpu);
 		inject_pending_event(vcpu);
 
 		/* enable NMI/IRQ window open exits if needed */
@@ -5741,6 +5743,8 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
 
 	local_irq_disable();
 
+	kvm_x86_ops->posted_intr_clear_on(vcpu);
+
 	if (vcpu->mode == EXITING_GUEST_MODE || vcpu->requests
 	    || need_resched() || signal_pending(current)) {
 		vcpu->mode = OUTSIDE_GUEST_MODE;
-- 
1.7.1



* RE: [PATCH v6 0/5] KVM: VMX: Add Posted Interrupt supporting
  2013-03-15 13:31 [PATCH v6 0/5] KVM: VMX: Add Posted Interrupt supporting Yang Zhang
                   ` (4 preceding siblings ...)
  2013-03-15 13:31 ` [PATCH v6 5/5] KVM : VMX: Use posted interrupt to deliver virtual interrupt Yang Zhang
@ 2013-03-18  2:49 ` Zhang, Yang Z
  2013-03-18  9:16   ` Gleb Natapov
  2013-03-18 22:20 ` Marcelo Tosatti
  6 siblings, 1 reply; 31+ messages in thread
From: Zhang, Yang Z @ 2013-03-18  2:49 UTC (permalink / raw)
  To: Zhang, Yang Z, kvm; +Cc: gleb, mtosatti, Zhang, Xiantao

Zhang, Yang Z wrote on 2013-03-15:
> From: Yang Zhang <yang.z.zhang@Intel.com>
> 
> The follwoing patches are adding the Posted Interrupt supporting to KVM:
> The first patch enables the feature 'acknowledge interrupt on vmexit'.Since
> it is required by Posted interrupt, we need to enable it firstly.
> 
> And the subsequent patches are adding the posted interrupt supporting:
> Posted Interrupt allows APIC interrupts to inject into guest directly
> without any vmexit.
> 
> - When delivering a interrupt to guest, if target vcpu is running,
>   update Posted-interrupt requests bitmap and send a notification event
>   to the vcpu. Then the vcpu will handle this interrupt automatically,
>   without any software involvemnt.
> - If target vcpu is not running or there already a notification event
>   pending in the vcpu, do nothing. The interrupt will be handled by
>   next vm entry
> NOTE: We don't turn on the Posted Interrupt until the coalesced issue is
> solved.
> 
> Changes from v5 to v6:
> * Split sync_pir_to_irr into two functions one to query whether PIR is empty
>   and the other to perform the sync.
> * Add comments to explain how vmx_sync_pir_to_irr() work.
> * Rebase on top of KVM upstream.
> 
> Changes from v4 to v5:
> * Add per cpu count for posted IPI handler.
> * Dont' check irr when delivering interrupt. Since we can not get interrupt
>   coalesced info with Posted Interrupt. So there is no need to check the
>   irr. There is another patch will changed current interrupt coalesced
>   logic. Before it, we will not turn on Posted Interrupt.
> * Clear outstanding notification bit after call local_irq_disable, but
>   before check request. As Marcelo suggested, we can ensure the IPI not
>   lost in this way.
> * Remove the spinlock. Same as item 2, if not need to get coalesced
>   info, then no need the lock.
> * Rebase on top of KVM upstream.
> 
> Yang Zhang (5):
>   KVM: VMX: Enable acknowledge interupt on vmexit
>   KVM: VMX: Register a new IPI for posted interrupt
>   KVM: VMX: Check the posted interrupt capability
>   KVM: VMX: Add the algorithm of deliver posted interrupt
>   KVM : VMX: Use posted interrupt to deliver virtual interrupt
>  arch/x86/include/asm/entry_arch.h  |    4 +
>  arch/x86/include/asm/hardirq.h     |    3 +
>  arch/x86/include/asm/hw_irq.h      |    1 +
>  arch/x86/include/asm/irq_vectors.h |    5 +
>  arch/x86/include/asm/kvm_host.h    |    5 +
>  arch/x86/include/asm/vmx.h         |    4 +
>  arch/x86/kernel/entry_64.S         |    5 +
>  arch/x86/kernel/irq.c              |   22 ++++
>  arch/x86/kernel/irqinit.c          |    4 +
>  arch/x86/kvm/irq.c                 |    3 +-
>  arch/x86/kvm/lapic.c               |   29 ++++-
>  arch/x86/kvm/lapic.h               |    2 +
>  arch/x86/kvm/svm.c                 |   30 +++++
>  arch/x86/kvm/vmx.c                 |  223 ++++++++++++++++++++++++++++++++----
>  arch/x86/kvm/x86.c                 |    8 +-
>  virt/kvm/kvm_main.c                |    1 +
>  16 files changed, 320 insertions(+), 29 deletions(-)
Are there any other comments on this patch series?

As for the TMR issue: since it has nothing to do with APICv, if we really need to handle it later, we may need a separate patch to fix it. For now, we should focus on APICv only.

Best regards,
Yang



* Re: [PATCH v6 0/5] KVM: VMX: Add Posted Interrupt supporting
  2013-03-18  2:49 ` [PATCH v6 0/5] KVM: VMX: Add Posted Interrupt supporting Zhang, Yang Z
@ 2013-03-18  9:16   ` Gleb Natapov
  2013-03-18 10:43     ` Zhang, Yang Z
  0 siblings, 1 reply; 31+ messages in thread
From: Gleb Natapov @ 2013-03-18  9:16 UTC (permalink / raw)
  To: Zhang, Yang Z; +Cc: kvm, mtosatti, Zhang, Xiantao

On Mon, Mar 18, 2013 at 02:49:25AM +0000, Zhang, Yang Z wrote:
> Zhang, Yang Z wrote on 2013-03-15:
> > From: Yang Zhang <yang.z.zhang@Intel.com>
> > 
> > The follwoing patches are adding the Posted Interrupt supporting to KVM:
> > The first patch enables the feature 'acknowledge interrupt on vmexit'.Since
> > it is required by Posted interrupt, we need to enable it firstly.
> > 
> > And the subsequent patches are adding the posted interrupt supporting:
> > Posted Interrupt allows APIC interrupts to inject into guest directly
> > without any vmexit.
> > 
> > - When delivering a interrupt to guest, if target vcpu is running,
> >   update Posted-interrupt requests bitmap and send a notification event
> >   to the vcpu. Then the vcpu will handle this interrupt automatically,
> >   without any software involvemnt.
> > - If target vcpu is not running or there already a notification event
> >   pending in the vcpu, do nothing. The interrupt will be handled by
> >   next vm entry
> > NOTE: We don't turn on the Posted Interrupt until the coalesced issue is
> > solved.
> > 
> > Changes from v5 to v6:
> > * Split sync_pir_to_irr into two functions one to query whether PIR is empty
> >   and the other to perform the sync.
> > * Add comments to explain how vmx_sync_pir_to_irr() work.
> > * Rebase on top of KVM upstream.
> > 
> > Changes from v4 to v5:
> > * Add per cpu count for posted IPI handler.
> > * Dont' check irr when delivering interrupt. Since we can not get interrupt
> >   coalesced info with Posted Interrupt. So there is no need to check the
> >   irr. There is another patch will changed current interrupt coalesced
> >   logic. Before it, we will not turn on Posted Interrupt.
> > * Clear outstanding notification bit after call local_irq_disable, but
> >   before check request. As Marcelo suggested, we can ensure the IPI not
> >   lost in this way.
> > * Remove the spinlock. Same as item 2, if not need to get coalesced
> >   info, then no need the lock.
> > * Rebase on top of KVM upstream.
> > 
> > Yang Zhang (5):
> >   KVM: VMX: Enable acknowledge interupt on vmexit
> >   KVM: VMX: Register a new IPI for posted interrupt
> >   KVM: VMX: Check the posted interrupt capability
> >   KVM: VMX: Add the algorithm of deliver posted interrupt
> >   KVM : VMX: Use posted interrupt to deliver virtual interrupt
> >  arch/x86/include/asm/entry_arch.h  |    4 +
> >  arch/x86/include/asm/hardirq.h     |    3 +
> >  arch/x86/include/asm/hw_irq.h      |    1 +
> >  arch/x86/include/asm/irq_vectors.h |    5 +
> >  arch/x86/include/asm/kvm_host.h    |    5 +
> >  arch/x86/include/asm/vmx.h         |    4 +
> >  arch/x86/kernel/entry_64.S         |    5 +
> >  arch/x86/kernel/irq.c              |   22 ++++
> >  arch/x86/kernel/irqinit.c          |    4 +
> >  arch/x86/kvm/irq.c                 |    3 +-
> >  arch/x86/kvm/lapic.c               |   29 ++++-
> >  arch/x86/kvm/lapic.h               |    2 +
> >  arch/x86/kvm/svm.c                 |   30 +++++
> >  arch/x86/kvm/vmx.c                 |  223 ++++++++++++++++++++++++++++++++----
> >  arch/x86/kvm/x86.c                 |    8 +-
> >  virt/kvm/kvm_main.c                |    1 +
> >  16 files changed, 320 insertions(+), 29 deletions(-)
> Are there any other comments with this patch?
> 
Haven't reviewed it yet, will do ASAP. The "Use eoi to track RTC interrupt
delivery status" series is a prerequisite for this one.

> For TMR issue, since it has nothing to do with APICv, if we really need to handle it later, then we may need a separate patch to fix it. But currently, we may focused on APICv only.
> 
What do you mean by "TMR has nothing to do with APICv"?

--
			Gleb.


* RE: [PATCH v6 0/5] KVM: VMX: Add Posted Interrupt supporting
  2013-03-18  9:16   ` Gleb Natapov
@ 2013-03-18 10:43     ` Zhang, Yang Z
  2013-03-18 11:28       ` Gleb Natapov
  0 siblings, 1 reply; 31+ messages in thread
From: Zhang, Yang Z @ 2013-03-18 10:43 UTC (permalink / raw)
  To: Gleb Natapov; +Cc: kvm, mtosatti, Zhang, Xiantao

Gleb Natapov wrote on 2013-03-18:
> On Mon, Mar 18, 2013 at 02:49:25AM +0000, Zhang, Yang Z wrote:
>> Zhang, Yang Z wrote on 2013-03-15:
>>> From: Yang Zhang <yang.z.zhang@Intel.com>
>>> 
>>> The follwoing patches are adding the Posted Interrupt supporting to KVM:
>>> The first patch enables the feature 'acknowledge interrupt on vmexit'.Since
>>> it is required by Posted interrupt, we need to enable it firstly.
>>> 
>>> And the subsequent patches are adding the posted interrupt supporting:
>>> Posted Interrupt allows APIC interrupts to inject into guest directly
>>> without any vmexit.
>>> 
>>> - When delivering a interrupt to guest, if target vcpu is running,
>>>   update Posted-interrupt requests bitmap and send a notification
>>>   event to the vcpu. Then the vcpu will handle this interrupt
>>>   automatically, without any software involvemnt.
>>> - If target vcpu is not running or there already a notification event
>>>   pending in the vcpu, do nothing. The interrupt will be handled by
>>>   next vm entry.
>>> NOTE: We don't turn on the Posted Interrupt until the coalesced issue is
>>> solved.
>>> 
>>> Changes from v5 to v6:
>>> * Split sync_pir_to_irr into two functions one to query whether PIR is empty
>>>   and the other to perform the sync.
>>> * Add comments to explain how vmx_sync_pir_to_irr() work.
>>> * Rebase on top of KVM upstream.
>>> 
>>> Changes from v4 to v5:
>>> * Add per cpu count for posted IPI handler.
>>> * Dont' check irr when delivering interrupt. Since we can not get interrupt
>>>   coalesced info with Posted Interrupt. So there is no need to check the
>>>   irr. There is another patch will changed current interrupt coalesced
>>>   logic. Before it, we will not turn on Posted Interrupt.
>>> * Clear outstanding notification bit after call local_irq_disable, but
>>>   before check request. As Marcelo suggested, we can ensure the IPI not
>>>   lost in this way.
>>> * Remove the spinlock. Same as item 2, if not need to get coalesced
>>>   info, then no need the lock.
>>> * Rebase on top of KVM upstream.
>>> 
>>> Yang Zhang (5):
>>>   KVM: VMX: Enable acknowledge interupt on vmexit
>>>   KVM: VMX: Register a new IPI for posted interrupt
>>>   KVM: VMX: Check the posted interrupt capability
>>>   KVM: VMX: Add the algorithm of deliver posted interrupt
>>>   KVM : VMX: Use posted interrupt to deliver virtual interrupt
>>>  arch/x86/include/asm/entry_arch.h  |    4 +
>>>  arch/x86/include/asm/hardirq.h     |    3 +
>>>  arch/x86/include/asm/hw_irq.h      |    1 +
>>>  arch/x86/include/asm/irq_vectors.h |    5 +
>>>  arch/x86/include/asm/kvm_host.h    |    5 +
>>>  arch/x86/include/asm/vmx.h         |    4 +
>>>  arch/x86/kernel/entry_64.S         |    5 +
>>>  arch/x86/kernel/irq.c              |   22 ++++
>>>  arch/x86/kernel/irqinit.c          |    4 +
>>>  arch/x86/kvm/irq.c                 |    3 +-
>>>  arch/x86/kvm/lapic.c               |   29 ++++-
>>>  arch/x86/kvm/lapic.h               |    2 +
>>>  arch/x86/kvm/svm.c                 |   30 +++++
>>>  arch/x86/kvm/vmx.c                 |  223 ++++++++++++++++++++++++++++++++----
>>>  arch/x86/kvm/x86.c                 |    8 +-
>>>  virt/kvm/kvm_main.c                |    1 +
>>>  16 files changed, 320 insertions(+), 29 deletions(-)
>> Are there any other comments with this patch?
>> 
> Haven't reviewed it yet, will do ASAP. "Use eoi to track RTC interrupt
> delivery status" series is pre-request for this one.
> 
>> For TMR issue, since it has nothing to do with APICv, if we really need to handle
> it later, then we may need a separate patch to fix it. But currently, we may
> focused on APICv only.
>> 
> What do you mean by "TMR has nothing to do with APICv"?
Just ignore it. I will send out the fix with the APICv patch.

Best regards,
Yang




* Re: [PATCH v6 0/5] KVM: VMX: Add Posted Interrupt supporting
  2013-03-18 10:43     ` Zhang, Yang Z
@ 2013-03-18 11:28       ` Gleb Natapov
  2013-03-18 11:44         ` Zhang, Yang Z
  0 siblings, 1 reply; 31+ messages in thread
From: Gleb Natapov @ 2013-03-18 11:28 UTC (permalink / raw)
  To: Zhang, Yang Z; +Cc: kvm, mtosatti, Zhang, Xiantao

On Mon, Mar 18, 2013 at 10:43:16AM +0000, Zhang, Yang Z wrote:
> >> For TMR issue, since it has nothing to do with APICv, if we really need to handle
> > it later, then we may need a separate patch to fix it. But currently, we may
> > focused on APICv only.
> >> 
> > What do you mean by "TMR has nothing to do with APICv"?
> Just ignore it. I will send out the fixing with APICv patch.
> 
Does this mean we cannot get away with updating the TMR like we do now?
Because if we can, I'd prefer not to complicate the code further.

--
			Gleb.


* RE: [PATCH v6 0/5] KVM: VMX: Add Posted Interrupt supporting
  2013-03-18 11:28       ` Gleb Natapov
@ 2013-03-18 11:44         ` Zhang, Yang Z
  0 siblings, 0 replies; 31+ messages in thread
From: Zhang, Yang Z @ 2013-03-18 11:44 UTC (permalink / raw)
  To: Gleb Natapov; +Cc: kvm, mtosatti, Zhang, Xiantao

Gleb Natapov wrote on 2013-03-18:
> On Mon, Mar 18, 2013 at 10:43:16AM +0000, Zhang, Yang Z wrote:
>>>> For TMR issue, since it has nothing to do with APICv, if we really need to
> handle
>>> it later, then we may need a separate patch to fix it. But currently, we may
>>> focused on APICv only.
>>>> 
>>> What do you mean by "TMR has nothing to do with APICv"?
>> Just ignore it. I will send out the fixing with APICv patch.
>> 
> Does this mean we cannot get away with updating TMR like we do now?
> Because if we can I prefer to not complicate the code further.
No, we still need to update the TMR, but only in vcpu context.

Best regards,
Yang




* Re: [PATCH v6 0/5] KVM: VMX: Add Posted Interrupt supporting
  2013-03-15 13:31 [PATCH v6 0/5] KVM: VMX: Add Posted Interrupt supporting Yang Zhang
                   ` (5 preceding siblings ...)
  2013-03-18  2:49 ` [PATCH v6 0/5] KVM: VMX: Add Posted Interrupt supporting Zhang, Yang Z
@ 2013-03-18 22:20 ` Marcelo Tosatti
  6 siblings, 0 replies; 31+ messages in thread
From: Marcelo Tosatti @ 2013-03-18 22:20 UTC (permalink / raw)
  To: Yang Zhang; +Cc: kvm, gleb, xiantao.zhang

On Fri, Mar 15, 2013 at 09:31:06PM +0800, Yang Zhang wrote:
> From: Yang Zhang <yang.z.zhang@Intel.com>
> 
> The follwoing patches are adding the Posted Interrupt supporting to KVM:
> The first patch enables the feature 'acknowledge interrupt on vmexit'.Since
> it is required by Posted interrupt, we need to enable it firstly.
> 
> And the subsequent patches are adding the posted interrupt supporting:
> Posted Interrupt allows APIC interrupts to inject into guest directly
> without any vmexit.
> 
> - When delivering a interrupt to guest, if target vcpu is running,
>   update Posted-interrupt requests bitmap and send a notification event
>   to the vcpu. Then the vcpu will handle this interrupt automatically,
>   without any software involvemnt.
> 
> - If target vcpu is not running or there already a notification event
>   pending in the vcpu, do nothing. The interrupt will be handled by
>   next vm entry
> 
> NOTE: We don't turn on the Posted Interrupt until the coalesced issue is
> solved.

Looks good, but the RTC issue should be resolved.



* Re: [PATCH v6 5/5] KVM : VMX: Use posted interrupt to deliver virtual interrupt
  2013-03-15 13:31 ` [PATCH v6 5/5] KVM : VMX: Use posted interrupt to deliver virtual interrupt Yang Zhang
@ 2013-03-19  8:54   ` Gleb Natapov
  2013-03-19 12:11     ` Zhang, Yang Z
  0 siblings, 1 reply; 31+ messages in thread
From: Gleb Natapov @ 2013-03-19  8:54 UTC (permalink / raw)
  To: Yang Zhang; +Cc: kvm, mtosatti, xiantao.zhang

On Fri, Mar 15, 2013 at 09:31:11PM +0800, Yang Zhang wrote:
> From: Yang Zhang <yang.z.zhang@Intel.com>
> 
> If posted interrupt is avaliable, then uses it to inject virtual
> interrupt to guest.
> 
> Signed-off-by: Yang Zhang <yang.z.zhang@Intel.com>
> ---
>  arch/x86/kvm/irq.c   |    3 ++-
>  arch/x86/kvm/lapic.c |   16 +++++++++++++---
>  arch/x86/kvm/lapic.h |    1 +
>  arch/x86/kvm/vmx.c   |   11 +++++++++++
>  arch/x86/kvm/x86.c   |    4 ++++
>  5 files changed, 31 insertions(+), 4 deletions(-)
> 
> diff --git a/arch/x86/kvm/irq.c b/arch/x86/kvm/irq.c
> index 484bc87..5179988 100644
> --- a/arch/x86/kvm/irq.c
> +++ b/arch/x86/kvm/irq.c
> @@ -81,7 +81,8 @@ int kvm_cpu_has_interrupt(struct kvm_vcpu *v)
>  	if (kvm_cpu_has_extint(v))
>  		return 1;
>  
> -	return kvm_apic_has_interrupt(v) != -1;	/* LAPIC */
> +	return (kvm_apic_has_interrupt(v) != -1) ||
> +		kvm_hwapic_has_interrupt(v);
That's incorrect. kvm_cpu_has_interrupt() should return true only if
there is an IRR bit suitable for injection, not just any IRR bit.
kvm_apic_has_interrupt() should call kvm_apic_update_irr().

>  }
>  EXPORT_SYMBOL_GPL(kvm_cpu_has_interrupt);
>  
> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
> index b3ea50e..46c7310 100644
> --- a/arch/x86/kvm/lapic.c
> +++ b/arch/x86/kvm/lapic.c
> @@ -713,7 +713,10 @@ static int __apic_accept_irq(struct kvm_lapic *apic, int delivery_mode,
>  		} else
>  			apic_clear_vector(vector, apic->regs + APIC_TMR);
>  
> -		result = !apic_test_and_set_irr(vector, apic);
> +		result = 1;
> +		if (!kvm_x86_ops->deliver_posted_interrupt(vcpu, vector))
> +			result = !apic_test_and_set_irr(vector, apic);
> +
>  		trace_kvm_apic_accept_irq(vcpu->vcpu_id, delivery_mode,
>  					  trig_mode, vector, !result);
>  		if (!result) {
> @@ -723,8 +726,10 @@ static int __apic_accept_irq(struct kvm_lapic *apic, int delivery_mode,
>  			break;
>  		}
>  
> -		kvm_make_request(KVM_REQ_EVENT, vcpu);
> -		kvm_vcpu_kick(vcpu);
> +		if (!kvm_x86_ops->vm_has_apicv(vcpu->kvm)) {
> +			kvm_make_request(KVM_REQ_EVENT, vcpu);
> +			kvm_vcpu_kick(vcpu);
> +		}
>  		break;
The apicv code and the non-apicv code are completely different. What's the
point of checking for apicv twice here?
Just do:

if (kvm_x86_ops->deliver_posted_interrupt)
	kvm_x86_ops->deliver_posted_interrupt(vcpu, vector)
else {
	result = !apic_test_and_set_irr(vector, apic);
	kvm_make_request(KVM_REQ_EVENT, vcpu);
	kvm_vcpu_kick(vcpu);
}

And set kvm_x86_ops->deliver_posted_interrupt only if apicv is enabled.

Also rearrange patches so that APIC_TMR handling goes before posted
interrupt series.

>  
>  	case APIC_DM_REMRD:
> @@ -1604,6 +1609,11 @@ int kvm_apic_has_interrupt(struct kvm_vcpu *vcpu)
>  	return highest_irr;
>  }
>  
> +bool kvm_hwapic_has_interrupt(struct kvm_vcpu *vcpu)
> +{
> +	return kvm_x86_ops->hwapic_has_interrupt(vcpu);
> +}
> +
>  int kvm_apic_accept_pic_intr(struct kvm_vcpu *vcpu)
>  {
>  	u32 lvt0 = kvm_apic_get_reg(vcpu->arch.apic, APIC_LVT0);
> diff --git a/arch/x86/kvm/lapic.h b/arch/x86/kvm/lapic.h
> index e5327be..c6abc63 100644
> --- a/arch/x86/kvm/lapic.h
> +++ b/arch/x86/kvm/lapic.h
> @@ -37,6 +37,7 @@ int kvm_create_lapic(struct kvm_vcpu *vcpu);
>  void kvm_free_lapic(struct kvm_vcpu *vcpu);
>  
>  int kvm_apic_has_interrupt(struct kvm_vcpu *vcpu);
> +bool kvm_hwapic_has_interrupt(struct kvm_vcpu *vcpu);
>  int kvm_apic_accept_pic_intr(struct kvm_vcpu *vcpu);
>  int kvm_get_apic_interrupt(struct kvm_vcpu *vcpu);
>  void kvm_lapic_reset(struct kvm_vcpu *vcpu);
> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
> index 0b5a8ae..48a2239 100644
> --- a/arch/x86/kvm/vmx.c
> +++ b/arch/x86/kvm/vmx.c
> @@ -3932,6 +3932,17 @@ static void vmx_posted_intr_clear_on(struct kvm_vcpu *vcpu)
>  	clear_bit(POSTED_INTR_ON, (unsigned long *)&vmx->pi_desc.u.control);
>  }
>  
> +/*
> + * Send interrupt to vcpu via posted interrupt way.
> + * Return false if posted interrupt is not supported and the caller will
> + * roll back to old way(via set vIRR).
> + * Return true if posted interrupt is avalialbe, the interrupt is set
> + * in pir(posted interrupt requests):
> + * 1. If target vcpu is running(non-root mode), send posted interrupt
> + * notification to vcpu and hardware will sync pir to vIRR atomically.
> + * 2. If target vcpu isn't running(root mode), kick it to pick up the
> + * interrupt from pir in next vmentry.
> + */
The comment should go into the previous patch. Also, I prefer not to check
for posted interrupts inside the callback, but to set the callback to NULL
instead. This way we avoid needlessly calling a callback on a hot path.

>  static bool vmx_deliver_posted_interrupt(struct kvm_vcpu *vcpu, int vector)
>  {
>  	struct vcpu_vmx *vmx = to_vmx(vcpu);
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 0baa90d..0981100 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -2679,6 +2679,7 @@ void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu)
>  static int kvm_vcpu_ioctl_get_lapic(struct kvm_vcpu *vcpu,
>  				    struct kvm_lapic_state *s)
>  {
> +	kvm_x86_ops->sync_pir_to_irr(vcpu);
>  	memcpy(s->regs, vcpu->arch.apic->regs, sizeof *s);
>  
>  	return 0;
> @@ -5699,6 +5700,7 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
>  	}
>  
>  	if (kvm_check_request(KVM_REQ_EVENT, vcpu) || req_int_win) {
> +		kvm_x86_ops->sync_pir_to_irr(vcpu);
>  		inject_pending_event(vcpu);
>  
>  		/* enable NMI/IRQ window open exits if needed */
> @@ -5741,6 +5743,8 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
>  
>  	local_irq_disable();
>  
> +	kvm_x86_ops->posted_intr_clear_on(vcpu);
> +
Why is this separate from pir_to_irr syncing?

>  	if (vcpu->mode == EXITING_GUEST_MODE || vcpu->requests
>  	    || need_resched() || signal_pending(current)) {
>  		vcpu->mode = OUTSIDE_GUEST_MODE;
> -- 
> 1.7.1

--
			Gleb.


* RE: [PATCH v6 5/5] KVM : VMX: Use posted interrupt to deliver virtual interrupt
  2013-03-19  8:54   ` Gleb Natapov
@ 2013-03-19 12:11     ` Zhang, Yang Z
  2013-03-19 12:23       ` Gleb Natapov
  0 siblings, 1 reply; 31+ messages in thread
From: Zhang, Yang Z @ 2013-03-19 12:11 UTC (permalink / raw)
  To: Gleb Natapov; +Cc: kvm, mtosatti, Zhang, Xiantao

Gleb Natapov wrote on 2013-03-19:
> On Fri, Mar 15, 2013 at 09:31:11PM +0800, Yang Zhang wrote:
>> From: Yang Zhang <yang.z.zhang@Intel.com>
>> 
>> If posted interrupt is avaliable, then uses it to inject virtual
>> interrupt to guest.
>> 
>> Signed-off-by: Yang Zhang <yang.z.zhang@Intel.com>
>> ---
>>  arch/x86/kvm/irq.c   |    3 ++-
>>  arch/x86/kvm/lapic.c |   16 +++++++++++++---
>>  arch/x86/kvm/lapic.h |    1 +
>>  arch/x86/kvm/vmx.c   |   11 +++++++++++
>>  arch/x86/kvm/x86.c   |    4 ++++
>>  5 files changed, 31 insertions(+), 4 deletions(-)
>> diff --git a/arch/x86/kvm/irq.c b/arch/x86/kvm/irq.c
>> index 484bc87..5179988 100644
>> --- a/arch/x86/kvm/irq.c
>> +++ b/arch/x86/kvm/irq.c
>> @@ -81,7 +81,8 @@ int kvm_cpu_has_interrupt(struct kvm_vcpu *v)
>>  	if (kvm_cpu_has_extint(v))
>>  		return 1;
>> -	return kvm_apic_has_interrupt(v) != -1;	/* LAPIC */
>> +	return (kvm_apic_has_interrupt(v) != -1) ||
>> +		kvm_hwapic_has_interrupt(v);
> That's incorrect. kvm_cpu_has_interrupt() should return true only it
> there is IRR suitable to be injected, not just any IRR.
> kvm_apic_has_interrupt() should call kvm_apic_update_irr().
You are right.

>>  }
>>  EXPORT_SYMBOL_GPL(kvm_cpu_has_interrupt);
>> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
>> index b3ea50e..46c7310 100644
>> --- a/arch/x86/kvm/lapic.c
>> +++ b/arch/x86/kvm/lapic.c
>> @@ -713,7 +713,10 @@ static int __apic_accept_irq(struct kvm_lapic *apic, int delivery_mode,
>>  		} else
>>  			apic_clear_vector(vector, apic->regs + APIC_TMR);
>> -		result = !apic_test_and_set_irr(vector, apic);
>> +		result = 1;
>> +		if (!kvm_x86_ops->deliver_posted_interrupt(vcpu, vector))
>> +			result = !apic_test_and_set_irr(vector, apic);
>> +
>>  		trace_kvm_apic_accept_irq(vcpu->vcpu_id, delivery_mode,
>>  					  trig_mode, vector, !result);
>>  		if (!result) {
>> @@ -723,8 +726,10 @@ static int __apic_accept_irq(struct kvm_lapic *apic, int delivery_mode,
>>  			break;
>>  		}
>> -		kvm_make_request(KVM_REQ_EVENT, vcpu);
>> -		kvm_vcpu_kick(vcpu);
>> +		if (!kvm_x86_ops->vm_has_apicv(vcpu->kvm)) {
>> +			kvm_make_request(KVM_REQ_EVENT, vcpu);
>> +			kvm_vcpu_kick(vcpu);
>> +		}
>>  		break;
> apicv code and non apicv code are completely different. What's the point
> checking for apicv twice here?
> Just do:
> 
> if (kvm_x86_ops->deliver_posted_interrupt)
> 	kvm_x86_ops->deliver_posted_interrupt(vcpu, vector)
> else {
> 	result = !apic_test_and_set_irr(vector, apic);
> 	kvm_make_request(KVM_REQ_EVENT, vcpu);
> 	kvm_vcpu_kick(vcpu);
> }
> 
> And set kvm_x86_ops->deliver_posted_interrupt only if apicv is enabled.
> 
> Also rearrange patches so that APIC_TMR handling goes before posted
> interrupt series.
Sure. 

>> 
>>  	case APIC_DM_REMRD:
>> @@ -1604,6 +1609,11 @@ int kvm_apic_has_interrupt(struct kvm_vcpu *vcpu)
>>  	return highest_irr;
>>  }
>> +bool kvm_hwapic_has_interrupt(struct kvm_vcpu *vcpu)
>> +{
>> +	return kvm_x86_ops->hwapic_has_interrupt(vcpu);
>> +}
>> +
>>  int kvm_apic_accept_pic_intr(struct kvm_vcpu *vcpu)
>>  {
>>  	u32 lvt0 = kvm_apic_get_reg(vcpu->arch.apic, APIC_LVT0);
>> diff --git a/arch/x86/kvm/lapic.h b/arch/x86/kvm/lapic.h
>> index e5327be..c6abc63 100644
>> --- a/arch/x86/kvm/lapic.h
>> +++ b/arch/x86/kvm/lapic.h
>> @@ -37,6 +37,7 @@ int kvm_create_lapic(struct kvm_vcpu *vcpu);
>>  void kvm_free_lapic(struct kvm_vcpu *vcpu);
>>  
>>  int kvm_apic_has_interrupt(struct kvm_vcpu *vcpu);
>> +bool kvm_hwapic_has_interrupt(struct kvm_vcpu *vcpu);
>>  int kvm_apic_accept_pic_intr(struct kvm_vcpu *vcpu);
>>  int kvm_get_apic_interrupt(struct kvm_vcpu *vcpu);
>>  void kvm_lapic_reset(struct kvm_vcpu *vcpu);
>> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
>> index 0b5a8ae..48a2239 100644
>> --- a/arch/x86/kvm/vmx.c
>> +++ b/arch/x86/kvm/vmx.c
>> @@ -3932,6 +3932,17 @@ static void vmx_posted_intr_clear_on(struct kvm_vcpu *vcpu)
>>  	clear_bit(POSTED_INTR_ON, (unsigned long *)&vmx->pi_desc.u.control);
>>  }
>> +/*
>> + * Send interrupt to vcpu via posted interrupt way.
>> + * Return false if posted interrupt is not supported and the caller will
>> + * roll back to old way(via set vIRR).
>> + * Return true if posted interrupt is avalialbe, the interrupt is set
>> + * in pir(posted interrupt requests):
>> + * 1. If target vcpu is running(non-root mode), send posted interrupt
>> + * notification to vcpu and hardware will sync pir to vIRR atomically.
>> + * 2. If target vcpu isn't running(root mode), kick it to pick up the
>> + * interrupt from pir in next vmentry.
>> + */
> The comment should go into previous patch. Also I prefer to not check
> for posted interrupt inside the callback, but set it to NULL instead.
> This way we avoid calling a callback on a hot path needlessly.
That makes sense. So should I just follow the logic you mentioned above?
 
>>  static bool vmx_deliver_posted_interrupt(struct kvm_vcpu *vcpu, int vector)
>>  {
>>  	struct vcpu_vmx *vmx = to_vmx(vcpu);
>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>> index 0baa90d..0981100 100644
>> --- a/arch/x86/kvm/x86.c
>> +++ b/arch/x86/kvm/x86.c
>> @@ -2679,6 +2679,7 @@ void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu)
>>  static int kvm_vcpu_ioctl_get_lapic(struct kvm_vcpu *vcpu,
>>  				    struct kvm_lapic_state *s)
>>  {
>> +	kvm_x86_ops->sync_pir_to_irr(vcpu);
>>  	memcpy(s->regs, vcpu->arch.apic->regs, sizeof *s);
>>  
>>  	return 0;
>> @@ -5699,6 +5700,7 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
>>  	}
>>  
>>  	if (kvm_check_request(KVM_REQ_EVENT, vcpu) || req_int_win) {
>> +		kvm_x86_ops->sync_pir_to_irr(vcpu);
>>  		inject_pending_event(vcpu);
>>  
>>  		/* enable NMI/IRQ window open exits if needed */
>> @@ -5741,6 +5743,8 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
>> 
>>  	local_irq_disable();
>> +	kvm_x86_ops->posted_intr_clear_on(vcpu);
>> +
> Why is this separate from pir_to_irr syncing?
This is the result of a discussion with Marcelo.
It is more reasonable to put it here to avoid an unnecessary posted interrupt in the window between:

vcpu->mode = IN_GUEST_MODE;

<-- an interrupt may arrive here, and handling it would be unnecessary.

local_irq_disable();

>>  	if (vcpu->mode == EXITING_GUEST_MODE || vcpu->requests
>>  	    || need_resched() || signal_pending(current)) {
>>  		vcpu->mode = OUTSIDE_GUEST_MODE;
>> --
>> 1.7.1
> 
> --
> 			Gleb.


Best regards,
Yang


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v6 5/5] KVM : VMX: Use posted interrupt to deliver virtual interrupt
  2013-03-19 12:11     ` Zhang, Yang Z
@ 2013-03-19 12:23       ` Gleb Natapov
  2013-03-19 12:42         ` Zhang, Yang Z
  2013-03-19 15:03         ` Marcelo Tosatti
  0 siblings, 2 replies; 31+ messages in thread
From: Gleb Natapov @ 2013-03-19 12:23 UTC (permalink / raw)
  To: Zhang, Yang Z; +Cc: kvm, mtosatti, Zhang, Xiantao

On Tue, Mar 19, 2013 at 12:11:47PM +0000, Zhang, Yang Z wrote:
> Gleb Natapov wrote on 2013-03-19:
> > On Fri, Mar 15, 2013 at 09:31:11PM +0800, Yang Zhang wrote:
> >> From: Yang Zhang <yang.z.zhang@Intel.com>
> >> 
> >> If posted interrupt is avaliable, then uses it to inject virtual
> >> interrupt to guest.
> >> 
> >> Signed-off-by: Yang Zhang <yang.z.zhang@Intel.com>
> >> ---
> >>  arch/x86/kvm/irq.c   |    3 ++-
> >>  arch/x86/kvm/lapic.c |   16 +++++++++++++---
> >>  arch/x86/kvm/lapic.h |    1 +
> >>  arch/x86/kvm/vmx.c   |   11 +++++++++++
> >>  arch/x86/kvm/x86.c   |    4 ++++
> >>  5 files changed, 31 insertions(+), 4 deletions(-)
> >> diff --git a/arch/x86/kvm/irq.c b/arch/x86/kvm/irq.c
> >> index 484bc87..5179988 100644
> >> --- a/arch/x86/kvm/irq.c
> >> +++ b/arch/x86/kvm/irq.c
> >> @@ -81,7 +81,8 @@ int kvm_cpu_has_interrupt(struct kvm_vcpu *v)
> >>  	if (kvm_cpu_has_extint(v))
> >>  		return 1;
> >> -	return kvm_apic_has_interrupt(v) != -1;	/* LAPIC */
> >> +	return (kvm_apic_has_interrupt(v) != -1) ||
> >> +		kvm_hwapic_has_interrupt(v);
> > That's incorrect. kvm_cpu_has_interrupt() should return true only it
> > there is IRR suitable to be injected, not just any IRR.
> > kvm_apic_has_interrupt() should call kvm_apic_update_irr().
> You are right.
> 
> >>  }
> >>  EXPORT_SYMBOL_GPL(kvm_cpu_has_interrupt);
> >> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
> >> index b3ea50e..46c7310 100644
> >> --- a/arch/x86/kvm/lapic.c
> >> +++ b/arch/x86/kvm/lapic.c
> >> @@ -713,7 +713,10 @@ static int __apic_accept_irq(struct kvm_lapic *apic, int delivery_mode,
> >>  		} else
> >>  			apic_clear_vector(vector, apic->regs + APIC_TMR);
> >> -		result = !apic_test_and_set_irr(vector, apic);
> >> +		result = 1;
> >> +		if (!kvm_x86_ops->deliver_posted_interrupt(vcpu, vector))
> >> +			result = !apic_test_and_set_irr(vector, apic);
> >> +
> >>  		trace_kvm_apic_accept_irq(vcpu->vcpu_id, delivery_mode,
> >>  					  trig_mode, vector, !result);
> >>  		if (!result) {
> >> @@ -723,8 +726,10 @@ static int __apic_accept_irq(struct kvm_lapic *apic, int delivery_mode,
> >>  			break;
> >>  		}
> >> -		kvm_make_request(KVM_REQ_EVENT, vcpu);
> >> -		kvm_vcpu_kick(vcpu);
> >> +		if (!kvm_x86_ops->vm_has_apicv(vcpu->kvm)) {
> >> +			kvm_make_request(KVM_REQ_EVENT, vcpu);
> >> +			kvm_vcpu_kick(vcpu);
> >> +		}
> >>  		break;
> > apicv code and non apicv code are completely different. What's the point
> > checking for apicv twice here?
> > Just do:
> > 
> > if (kvm_x86_ops->deliver_posted_interrupt)
> > 	kvm_x86_ops->deliver_posted_interrupt(vcpu, vector)
> > else {
> > 	result = !apic_test_and_set_irr(vector, apic);
> > 	kvm_make_request(KVM_REQ_EVENT, vcpu);
> > 	kvm_vcpu_kick(vcpu);
> > }
> > 
> > And set kvm_x86_ops->deliver_posted_interrupt only if apicv is enabled.
> > 
> > Also rearrange patches so that APIC_TMR handling goes before posted
> > interrupt series.
> Sure. 
> 
> >> 
> >>  	case APIC_DM_REMRD:
> >> @@ -1604,6 +1609,11 @@ int kvm_apic_has_interrupt(struct kvm_vcpu *vcpu)
> >>  	return highest_irr;
> >>  }
> >> +bool kvm_hwapic_has_interrupt(struct kvm_vcpu *vcpu)
> >> +{
> >> +	return kvm_x86_ops->hwapic_has_interrupt(vcpu);
> >> +}
> >> +
> >>  int kvm_apic_accept_pic_intr(struct kvm_vcpu *vcpu)
> >>  {
> >>  	u32 lvt0 = kvm_apic_get_reg(vcpu->arch.apic, APIC_LVT0);
> >> diff --git a/arch/x86/kvm/lapic.h b/arch/x86/kvm/lapic.h
> >> index e5327be..c6abc63 100644
> >> --- a/arch/x86/kvm/lapic.h
> >> +++ b/arch/x86/kvm/lapic.h
> >> @@ -37,6 +37,7 @@ int kvm_create_lapic(struct kvm_vcpu *vcpu);
> >>  void kvm_free_lapic(struct kvm_vcpu *vcpu);
> >>  
> >>  int kvm_apic_has_interrupt(struct kvm_vcpu *vcpu);
> >> +bool kvm_hwapic_has_interrupt(struct kvm_vcpu *vcpu);
> >>  int kvm_apic_accept_pic_intr(struct kvm_vcpu *vcpu);
> >>  int kvm_get_apic_interrupt(struct kvm_vcpu *vcpu);
> >>  void kvm_lapic_reset(struct kvm_vcpu *vcpu);
> >> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
> >> index 0b5a8ae..48a2239 100644
> >> --- a/arch/x86/kvm/vmx.c
> >> +++ b/arch/x86/kvm/vmx.c
> >> @@ -3932,6 +3932,17 @@ static void vmx_posted_intr_clear_on(struct kvm_vcpu *vcpu)
> >>  	clear_bit(POSTED_INTR_ON, (unsigned long *)&vmx->pi_desc.u.control);
> >>  }
> >> +/*
> >> + * Send interrupt to vcpu via posted interrupt way.
> >> + * Return false if posted interrupt is not supported and the caller will
> >> + * roll back to old way(via set vIRR).
> >> + * Return true if posted interrupt is avalialbe, the interrupt is set
> >> + * in pir(posted interrupt requests):
> >> + * 1. If target vcpu is running(non-root mode), send posted interrupt
> >> + * notification to vcpu and hardware will sync pir to vIRR atomically.
> >> + * 2. If target vcpu isn't running(root mode), kick it to pick up the
> >> + * interrupt from pir in next vmentry.
> >> + */
> > The comment should go into previous patch. Also I prefer to not check
> > for posted interrupt inside the callback, but set it to NULL instead.
> > This way we avoid calling a callback on a hot path needlessly.
> It's make sense. So just follow the logic you mentioned above?
>  
Yes.

> >>  static bool vmx_deliver_posted_interrupt(struct kvm_vcpu *vcpu, int vector)
> >>  {
> >>  	struct vcpu_vmx *vmx = to_vmx(vcpu);
> >> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> >> index 0baa90d..0981100 100644
> >> --- a/arch/x86/kvm/x86.c
> >> +++ b/arch/x86/kvm/x86.c
> >> @@ -2679,6 +2679,7 @@ void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu)
> >>  static int kvm_vcpu_ioctl_get_lapic(struct kvm_vcpu *vcpu,
> >>  				    struct kvm_lapic_state *s)
> >>  {
> >> +	kvm_x86_ops->sync_pir_to_irr(vcpu);
> >>  	memcpy(s->regs, vcpu->arch.apic->regs, sizeof *s);
> >>  
> >>  	return 0;
> >> @@ -5699,6 +5700,7 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
> >>  	}
> >>  
> >>  	if (kvm_check_request(KVM_REQ_EVENT, vcpu) || req_int_win) {
> >> +		kvm_x86_ops->sync_pir_to_irr(vcpu);
> >>  		inject_pending_event(vcpu);
> >>  
> >>  		/* enable NMI/IRQ window open exits if needed */
> >> @@ -5741,6 +5743,8 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
> >> 
> >>  	local_irq_disable();
> >> +	kvm_x86_ops->posted_intr_clear_on(vcpu);
> >> +
> > Why is this separate from pir_to_irr syncing?
> This is the result of discussion with Marcelo.
> It is more reasonable to put it here to avoid unnecessary posted interrupt between:
> 
> vcpu->mode = IN_GUEST_MODE;
> 
> <--interrupt may arrived here and this is unnecessary.
> 
> local_irq_disable();
> 

But this still can happen as far as I see:

vcpu0                                         vcpu1: 
pi_test_and_set_pir()
kvm_make_request(KVM_REQ_EVENT)
                                            if (KVM_REQ_EVENT)
                                                   sync_pir_to_irr()
                                            vcpu->mode = IN_GUEST_MODE;
if (vcpu->mode == IN_GUEST_MODE)
  if (!pi_test_and_set_on())
    apic->send_IPI_mask()
                                            --> IPI arrives here
                                            local_irq_disable();
                                            posted_intr_clear_on()


May be move vcpu->mode = IN_GUEST_MODE after local_irq_disable()?

--
			Gleb.


* RE: [PATCH v6 5/5] KVM : VMX: Use posted interrupt to deliver virtual interrupt
  2013-03-19 12:23       ` Gleb Natapov
@ 2013-03-19 12:42         ` Zhang, Yang Z
  2013-03-19 13:29           ` Gleb Natapov
  2013-03-19 15:03         ` Marcelo Tosatti
  1 sibling, 1 reply; 31+ messages in thread
From: Zhang, Yang Z @ 2013-03-19 12:42 UTC (permalink / raw)
  To: Gleb Natapov; +Cc: kvm, mtosatti, Zhang, Xiantao

Gleb Natapov wrote on 2013-03-19:
> On Tue, Mar 19, 2013 at 12:11:47PM +0000, Zhang, Yang Z wrote:
>> Gleb Natapov wrote on 2013-03-19:
>>> On Fri, Mar 15, 2013 at 09:31:11PM +0800, Yang Zhang wrote:
>>>> From: Yang Zhang <yang.z.zhang@Intel.com>
>>>> 
>>>> If posted interrupt is avaliable, then uses it to inject virtual
>>>> interrupt to guest.
>>>> 
>>>> Signed-off-by: Yang Zhang <yang.z.zhang@Intel.com>
>>>> ---
>>>>  arch/x86/kvm/irq.c   |    3 ++-
>>>>  arch/x86/kvm/lapic.c |   16 +++++++++++++---
>>>>  arch/x86/kvm/lapic.h |    1 +
>>>>  arch/x86/kvm/vmx.c   |   11 +++++++++++
>>>>  arch/x86/kvm/x86.c   |    4 ++++
>>>>  5 files changed, 31 insertions(+), 4 deletions(-)
>>>> diff --git a/arch/x86/kvm/irq.c b/arch/x86/kvm/irq.c
>>>> index 484bc87..5179988 100644
>>>> --- a/arch/x86/kvm/irq.c
>>>> +++ b/arch/x86/kvm/irq.c
>>>> @@ -81,7 +81,8 @@ int kvm_cpu_has_interrupt(struct kvm_vcpu *v)
>>>>  	if (kvm_cpu_has_extint(v))
>>>>  		return 1;
>>>> -	return kvm_apic_has_interrupt(v) != -1;	/* LAPIC */
>>>> +	return (kvm_apic_has_interrupt(v) != -1) ||
>>>> +		kvm_hwapic_has_interrupt(v);
>>> That's incorrect. kvm_cpu_has_interrupt() should return true only it
>>> there is IRR suitable to be injected, not just any IRR.
>>> kvm_apic_has_interrupt() should call kvm_apic_update_irr().
>> You are right.
>> 
>>>>  }
>>>>  EXPORT_SYMBOL_GPL(kvm_cpu_has_interrupt);
>>>> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
>>>> index b3ea50e..46c7310 100644
>>>> --- a/arch/x86/kvm/lapic.c
>>>> +++ b/arch/x86/kvm/lapic.c
>>>> @@ -713,7 +713,10 @@ static int __apic_accept_irq(struct kvm_lapic *apic, int delivery_mode,
>>>>  		} else
>>>>  			apic_clear_vector(vector, apic->regs + APIC_TMR);
>>>> -		result = !apic_test_and_set_irr(vector, apic);
>>>> +		result = 1;
>>>> +		if (!kvm_x86_ops->deliver_posted_interrupt(vcpu, vector))
>>>> +			result = !apic_test_and_set_irr(vector, apic);
>>>> +
>>>>  		trace_kvm_apic_accept_irq(vcpu->vcpu_id, delivery_mode,
>>>>  					  trig_mode, vector, !result);
>>>>  		if (!result) {
>>>> @@ -723,8 +726,10 @@ static int __apic_accept_irq(struct kvm_lapic *apic, int delivery_mode,
>>>>  			break;
>>>>  		}
>>>> -		kvm_make_request(KVM_REQ_EVENT, vcpu);
>>>> -		kvm_vcpu_kick(vcpu);
>>>> +		if (!kvm_x86_ops->vm_has_apicv(vcpu->kvm)) {
>>>> +			kvm_make_request(KVM_REQ_EVENT, vcpu);
>>>> +			kvm_vcpu_kick(vcpu);
>>>> +		}
>>>>  		break;
>>> apicv code and non apicv code are completely different. What's the point
>>> checking for apicv twice here?
>>> Just do:
>>> 
>>> if (kvm_x86_ops->deliver_posted_interrupt)
>>> 	kvm_x86_ops->deliver_posted_interrupt(vcpu, vector)
>>> else {
>>> 	result = !apic_test_and_set_irr(vector, apic);
>>> 	kvm_make_request(KVM_REQ_EVENT, vcpu);
>>> 	kvm_vcpu_kick(vcpu);
>>> }
>>> 
>>> And set kvm_x86_ops->deliver_posted_interrupt only if apicv is enabled.
>>> 
>>> Also rearrange patches so that APIC_TMR handling goes before posted
>>> interrupt series.
>> Sure.
>> 
>>>> 
>>>>  	case APIC_DM_REMRD:
>>>> @@ -1604,6 +1609,11 @@ int kvm_apic_has_interrupt(struct kvm_vcpu *vcpu)
>>>>  	return highest_irr;
>>>>  }
>>>> +bool kvm_hwapic_has_interrupt(struct kvm_vcpu *vcpu)
>>>> +{
>>>> +	return kvm_x86_ops->hwapic_has_interrupt(vcpu);
>>>> +}
>>>> +
>>>>  int kvm_apic_accept_pic_intr(struct kvm_vcpu *vcpu)
>>>>  {
>>>>  	u32 lvt0 = kvm_apic_get_reg(vcpu->arch.apic, APIC_LVT0);
>>>> diff --git a/arch/x86/kvm/lapic.h b/arch/x86/kvm/lapic.h
>>>> index e5327be..c6abc63 100644
>>>> --- a/arch/x86/kvm/lapic.h
>>>> +++ b/arch/x86/kvm/lapic.h
>>>> @@ -37,6 +37,7 @@ int kvm_create_lapic(struct kvm_vcpu *vcpu);
>>>>  void kvm_free_lapic(struct kvm_vcpu *vcpu);
>>>>  
>>>>  int kvm_apic_has_interrupt(struct kvm_vcpu *vcpu);
>>>> +bool kvm_hwapic_has_interrupt(struct kvm_vcpu *vcpu);
>>>>  int kvm_apic_accept_pic_intr(struct kvm_vcpu *vcpu);
>>>>  int kvm_get_apic_interrupt(struct kvm_vcpu *vcpu);
>>>>  void kvm_lapic_reset(struct kvm_vcpu *vcpu);
>>>> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
>>>> index 0b5a8ae..48a2239 100644
>>>> --- a/arch/x86/kvm/vmx.c
>>>> +++ b/arch/x86/kvm/vmx.c
>>>> @@ -3932,6 +3932,17 @@ static void vmx_posted_intr_clear_on(struct kvm_vcpu *vcpu)
>>>>  	clear_bit(POSTED_INTR_ON, (unsigned long *)&vmx->pi_desc.u.control);
>>>>  }
>>>> +/*
>>>> + * Send interrupt to vcpu via posted interrupt way.
>>>> + * Return false if posted interrupt is not supported and the caller will
>>>> + * roll back to old way(via set vIRR).
>>>> + * Return true if posted interrupt is avalialbe, the interrupt is set
>>>> + * in pir(posted interrupt requests):
>>>> + * 1. If target vcpu is running(non-root mode), send posted interrupt
>>>> + * notification to vcpu and hardware will sync pir to vIRR atomically.
>>>> + * 2. If target vcpu isn't running(root mode), kick it to pick up the
>>>> + * interrupt from pir in next vmentry.
>>>> + */
>>> The comment should go into previous patch. Also I prefer to not check
>>> for posted interrupt inside the callback, but set it to NULL instead.
>>> This way we avoid calling a callback on a hot path needlessly.
>> It's make sense. So just follow the logic you mentioned above?
>> 
> Yes.
> 
>>>>  static bool vmx_deliver_posted_interrupt(struct kvm_vcpu *vcpu, int vector)
>>>>  {
>>>>  	struct vcpu_vmx *vmx = to_vmx(vcpu);
>>>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>>>> index 0baa90d..0981100 100644
>>>> --- a/arch/x86/kvm/x86.c
>>>> +++ b/arch/x86/kvm/x86.c
>>>> @@ -2679,6 +2679,7 @@ void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu)
>>>>  static int kvm_vcpu_ioctl_get_lapic(struct kvm_vcpu *vcpu,
>>>>  				    struct kvm_lapic_state *s)
>>>>  {
>>>> +	kvm_x86_ops->sync_pir_to_irr(vcpu);
>>>>  	memcpy(s->regs, vcpu->arch.apic->regs, sizeof *s);
>>>>  
>>>>  	return 0;
>>>> @@ -5699,6 +5700,7 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
>>>>  	}
>>>>  
>>>>  	if (kvm_check_request(KVM_REQ_EVENT, vcpu) || req_int_win) {
>>>> +		kvm_x86_ops->sync_pir_to_irr(vcpu);
>>>>  		inject_pending_event(vcpu);
>>>> 
>>>>  		/* enable NMI/IRQ window open exits if needed */
>>>> @@ -5741,6 +5743,8 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
>>>> 
>>>>  	local_irq_disable();
>>>> +	kvm_x86_ops->posted_intr_clear_on(vcpu);
>>>> +
>>> Why is this separate from pir_to_irr syncing?
>> This is the result of discussion with Marcelo. It is more reasonable to
>> put it here to avoid unnecessary posted interrupt between:
>> 
>> vcpu->mode = IN_GUEST_MODE;
>> 
>> <--interrupt may arrived here and this is unnecessary.
>> 
>> local_irq_disable();
>> 
> 
> But this still can happen as far as I see:
> 
> vcpu0                                         vcpu1:
> pi_test_and_set_pir()
> kvm_make_request(KVM_REQ_EVENT)
>                                             if (KVM_REQ_EVENT)
>                                                    sync_pir_to_irr()
>                                             vcpu->mode = IN_GUEST_MODE;
> if (vcpu->mode == IN_GUEST_MODE)
>   if (!pi_test_and_set_on())
>     apic->send_IPI_mask()
>                                             --> IPI arrives here
>                                             local_irq_disable();
>                                             posted_intr_clear_on()
The current solution tries to block further posted interrupts from other vCPUs during that window. It only mitigates the race but cannot eliminate it: the case you mentioned still exists, though it should be rare.

> May be move vcpu->mode = IN_GUEST_MODE after local_irq_disable()?
Yes, this will solve it. But I am not sure whether it will introduce any regressions. Does any check rely on this sequence?
 	
> --
> 			Gleb.


Best regards,
Yang




* Re: [PATCH v6 5/5] KVM : VMX: Use posted interrupt to deliver virtual interrupt
  2013-03-19 12:42         ` Zhang, Yang Z
@ 2013-03-19 13:29           ` Gleb Natapov
  2013-03-19 13:59             ` Zhang, Yang Z
  2013-03-19 15:13             ` Marcelo Tosatti
  0 siblings, 2 replies; 31+ messages in thread
From: Gleb Natapov @ 2013-03-19 13:29 UTC (permalink / raw)
  To: Zhang, Yang Z; +Cc: kvm, mtosatti, Zhang, Xiantao

On Tue, Mar 19, 2013 at 12:42:01PM +0000, Zhang, Yang Z wrote:
> >>>>  	local_irq_disable();
> >>>> +	kvm_x86_ops->posted_intr_clear_on(vcpu);
> >>>> +
> >>> Why is this separate from pir_to_irr syncing?
> >> This is the result of discussion with Marcelo. It is more reasonable to
> >> put it here to avoid unnecessary posted interrupt between:
> >> 
> >> vcpu->mode = IN_GUEST_MODE;
> >> 
> >> <--interrupt may arrived here and this is unnecessary.
> >> 
> >> local_irq_disable();
> >> 
> > 
> > But this still can happen as far as I see:
> > 
> > vcpu0                                         vcpu1:
> > pi_test_and_set_pir()
> > kvm_make_request(KVM_REQ_EVENT)
> >                                             if (KVM_REQ_EVENT)
> >                                                    sync_pir_to_irr()
> >                                             vcpu->mode = IN_GUEST_MODE;
> > if (vcpu->mode == IN_GUEST_MODE)
> >   if (!pi_test_and_set_on())
> >     apic->send_IPI_mask()
> >                                             --> IPI arrives here
> >                                             local_irq_disable();
> >                                             posted_intr_clear_on()
> Current solution is trying to block other Posted Interrupt from other VCPUs at same time. It only mitigates it but cannot solve it. The case you mentioned still exists but it should be rare.
> 
I am not sure I follow. What scenario exactly are you talking about? I
looked over the past discussion and saw that Marcelo gives an
example of how an IPI can be lost, but I think that's because we set the "on" bit
after KVM_REQ_EVENT:

cpu0                                    cpu1            vcpu0
test_and_set_bit(PIR-A)
set KVM_REQ_EVENT
                                                        process REQ_EVENT
                                                        PIR-A->IRR

                                                        vcpu->mode=IN_GUEST

if (vcpu0->guest_mode)
        if (!t_a_s_bit(PIR notif))
                send IPI
                                                        linux_pir_handler

                                        t_a_s_b(PIR-B)=1
                                        no PIR IPI sent


But what if on delivery we do:
pi_test_and_set_pir()
r = pi_test_and_set_on()
kvm_make_request(KVM_REQ_EVENT)
if (!r)
   send_IPI_mask()
else
   kvm_vcpu_kick()

And on vcpu entry we do:
if (kvm_check_request(KVM_REQ_EVENT)
 if (test_and_clear_bit(on))
   kvm_apic_update_irr()

What are the downsides? Can we lose interrupts this way?

> > May be move vcpu->mode = IN_GUEST_MODE after local_irq_disable()?
> Yes, this will solve it. But I am not sure whether it will introduce any regressions. Is there any check relies on this sequence?
>  	
Do not think so.

--
			Gleb.


* RE: [PATCH v6 5/5] KVM : VMX: Use posted interrupt to deliver virtual interrupt
  2013-03-19 13:29           ` Gleb Natapov
@ 2013-03-19 13:59             ` Zhang, Yang Z
  2013-03-19 14:51               ` Gleb Natapov
  2013-03-19 15:13             ` Marcelo Tosatti
  1 sibling, 1 reply; 31+ messages in thread
From: Zhang, Yang Z @ 2013-03-19 13:59 UTC (permalink / raw)
  To: Gleb Natapov; +Cc: kvm, mtosatti, Zhang, Xiantao

Gleb Natapov wrote on 2013-03-19:
> On Tue, Mar 19, 2013 at 12:42:01PM +0000, Zhang, Yang Z wrote:
>>>>>>  	local_irq_disable();
>>>>>> +	kvm_x86_ops->posted_intr_clear_on(vcpu);
>>>>>> +
>>>>> Why is this separate from pir_to_irr syncing?
>>>> This is the result of discussion with Marcelo. It is more reasonable to
>>>> put it here to avoid unnecessary posted interrupt between:
>>>> 
>>>> vcpu->mode = IN_GUEST_MODE;
>>>> 
>>>> <--interrupt may arrived here and this is unnecessary.
>>>> 
>>>> local_irq_disable();
>>>> 
>>> 
>>> But this still can happen as far as I see:
>>> 
>>> vcpu0                                         vcpu1:
>>> pi_test_and_set_pir()
>>> kvm_make_request(KVM_REQ_EVENT)
>>>                                             if (KVM_REQ_EVENT)
>>>                                                    sync_pir_to_irr()
>>>                                             vcpu->mode = IN_GUEST_MODE;
>>> if (vcpu->mode == IN_GUEST_MODE)
>>>   if (!pi_test_and_set_on())
>>>     apic->send_IPI_mask()
>>>                                             --> IPI arrives here
>>>                                             local_irq_disable();
>>>                                             posted_intr_clear_on()
>> Current solution is trying to block other Posted Interrupt from other VCPUs at
> same time. It only mitigates it but cannot solve it. The case you mentioned still
> exists but it should be rare.
>> 
> I am not sure I follow. What scenario exactly are you talking about. I
> looked over past discussion about it and saw that Marcelo gives an
> example how IPI can be lost, but I think that's because we set "on" bit
> after KVM_REQ_EVENT:
The IPI will not be lost in his example (he misread the patch).

> cpu0                                    cpu1            vcpu0
> test_and_set_bit(PIR-A)
> set KVM_REQ_EVENT
>                                                         process REQ_EVENT
>                                                         PIR-A->IRR
> 
>                                                         vcpu->mode=IN_GUEST
> 
> if (vcpu0->guest_mode)
>         if (!t_a_s_bit(PIR notif))
>                 send IPI
>                                                         linux_pir_handler
> 
>                                         t_a_s_b(PIR-B)=1
>                                         no PIR IPI sent
> 
> But what if on delivery we do:
> pi_test_and_set_pir()
> r = pi_test_and_set_on()
> kvm_make_request(KVM_REQ_EVENT)
> if (!r)
>    send_IPI_mask()
> else
>    kvm_vcpu_kick()
> And on vcpu entry we do:
> if (kvm_check_request(KVM_REQ_EVENT)
>  if (test_and_clear_bit(on))
>    kvm_apic_update_irr()
> What are the downsides? Can we lost interrupts this way?
We need to check guest mode before sending the IPI; otherwise the hypervisor may receive it.
I think the current logic is OK. The only open question is when to clear the outstanding-notification bit. Actually, I prefer your suggestion to clear it before syncing PIR to IRR, but Marcelo prefers to clear the ON bit after disabling irqs.
 
>>> May be move vcpu->mode = IN_GUEST_MODE after local_irq_disable()?
>> Yes, this will solve it. But I am not sure whether it will introduce
>> any regressions. Is there any check relies on this sequence?
>> 
> Do not think so.
> 
> --
> 			Gleb.


Best regards,
Yang




* Re: [PATCH v6 5/5] KVM : VMX: Use posted interrupt to deliver virtual interrupt
  2013-03-19 13:59             ` Zhang, Yang Z
@ 2013-03-19 14:51               ` Gleb Natapov
  2013-03-19 15:12                 ` Gleb Natapov
  0 siblings, 1 reply; 31+ messages in thread
From: Gleb Natapov @ 2013-03-19 14:51 UTC (permalink / raw)
  To: Zhang, Yang Z; +Cc: kvm, mtosatti, Zhang, Xiantao

On Tue, Mar 19, 2013 at 01:59:21PM +0000, Zhang, Yang Z wrote:
> Gleb Natapov wrote on 2013-03-19:
> > On Tue, Mar 19, 2013 at 12:42:01PM +0000, Zhang, Yang Z wrote:
> >>>>>>  	local_irq_disable();
> >>>>>> +	kvm_x86_ops->posted_intr_clear_on(vcpu);
> >>>>>> +
> >>>>> Why is this separate from pir_to_irr syncing?
> >>>> This is the result of discussion with Marcelo. It is more reasonable to
> >>>> put it here to avoid unnecessary posted interrupt between:
> >>>> 
> >>>> vcpu->mode = IN_GUEST_MODE;
> >>>> 
> >>>> <--interrupt may arrived here and this is unnecessary.
> >>>> 
> >>>> local_irq_disable();
> >>>> 
> >>> 
> >>> But this still can happen as far as I see:
> >>> 
> >>> vcpu0                                         vcpu1:
> >>> pi_test_and_set_pir()
> >>> kvm_make_request(KVM_REQ_EVENT)
> >>>                                             if (KVM_REQ_EVENT)
> >>>                                                    sync_pir_to_irr()
> >>>                                             vcpu->mode = IN_GUEST_MODE;
> >>> if (vcpu->mode == IN_GUEST_MODE)
> >>>   if (!pi_test_and_set_on())
> >>>     apic->send_IPI_mask()
> >>>                                             --> IPI arrives here
> >>>                                             local_irq_disable();
> >>>                                             posted_intr_clear_on()
> >> Current solution is trying to block other Posted Interrupt from other VCPUs at
> > same time. It only mitigates it but cannot solve it. The case you mentioned still
> > exists but it should be rare.
> >> 
> > I am not sure I follow. What scenario exactly are you talking about. I
> > looked over past discussion about it and saw that Marcelo gives an
> > example how IPI can be lost, but I think that's because we set "on" bit
> > after KVM_REQ_EVENT:
> The IPI will not lost in his example(he misread the patch). 
> 
> > cpu0                                    cpu1            vcpu0
> > test_and_set_bit(PIR-A)
> > set KVM_REQ_EVENT
> >                                                         process REQ_EVENT
> >                                                         PIR-A->IRR
> > 
> >                                                         vcpu->mode=IN_GUEST
> > 
> > if (vcpu0->guest_mode)
> >         if (!t_a_s_bit(PIR notif))
> >                 send IPI
> >                                                         linux_pir_handler
> > 
> >                                         t_a_s_b(PIR-B)=1
> >                                         no PIR IPI sent
> > 
> > But what if on delivery we do:
> > pi_test_and_set_pir()
> > r = pi_test_and_set_on()
> > kvm_make_request(KVM_REQ_EVENT)
> > if (!r)
> >    send_IPI_mask()
> > else
> >    kvm_vcpu_kick()
> > And on vcpu entry we do:
> > if (kvm_check_request(KVM_REQ_EVENT)
> >  if (test_and_clear_bit(on))
> >    kvm_apic_update_irr()
> > What are the downsides? Can we lost interrupts this way?
> Need to check guest mode before sending IPI. Otherwise hypervisor may receive IPI.
Of course, forget it. So if (!r) should be if (!r && mode == IN_GUEST)

> I think current logic is ok. Only problem is that when to clear Outstanding Notification bit. Actually I prefer your suggestion to clear it before sync_pir_irr. But Marcelo prefer to clear ON bit after disabling irq.
Marcelo, what is the problem with the logic above?

--
			Gleb.


* Re: [PATCH v6 5/5] KVM : VMX: Use posted interrupt to deliver virtual interrupt
  2013-03-19 12:23       ` Gleb Natapov
  2013-03-19 12:42         ` Zhang, Yang Z
@ 2013-03-19 15:03         ` Marcelo Tosatti
  2013-03-19 15:18           ` Gleb Natapov
  1 sibling, 1 reply; 31+ messages in thread
From: Marcelo Tosatti @ 2013-03-19 15:03 UTC (permalink / raw)
  To: Gleb Natapov; +Cc: Zhang, Yang Z, kvm, Zhang, Xiantao

On Tue, Mar 19, 2013 at 02:23:59PM +0200, Gleb Natapov wrote:
> On Tue, Mar 19, 2013 at 12:11:47PM +0000, Zhang, Yang Z wrote:
> > Gleb Natapov wrote on 2013-03-19:
> > > On Fri, Mar 15, 2013 at 09:31:11PM +0800, Yang Zhang wrote:
> > >> From: Yang Zhang <yang.z.zhang@Intel.com>
> > >> 
> > >> If posted interrupt is avaliable, then uses it to inject virtual
> > >> interrupt to guest.
> > >> 
> > >> Signed-off-by: Yang Zhang <yang.z.zhang@Intel.com>
> > >> ---
> > >>  arch/x86/kvm/irq.c   |    3 ++-
> > >>  arch/x86/kvm/lapic.c |   16 +++++++++++++---
> > >>  arch/x86/kvm/lapic.h |    1 +
> > >>  arch/x86/kvm/vmx.c   |   11 +++++++++++
> > >>  arch/x86/kvm/x86.c   |    4 ++++
> > >>  5 files changed, 31 insertions(+), 4 deletions(-)
> > >> diff --git a/arch/x86/kvm/irq.c b/arch/x86/kvm/irq.c
> > >> index 484bc87..5179988 100644
> > >> --- a/arch/x86/kvm/irq.c
> > >> +++ b/arch/x86/kvm/irq.c
> > >> @@ -81,7 +81,8 @@ int kvm_cpu_has_interrupt(struct kvm_vcpu *v)
> > >>  	if (kvm_cpu_has_extint(v))
> > >>  		return 1;
> > >> -	return kvm_apic_has_interrupt(v) != -1;	/* LAPIC */
> > >> +	return (kvm_apic_has_interrupt(v) != -1) ||
> > >> +		kvm_hwapic_has_interrupt(v);
> > > That's incorrect. kvm_cpu_has_interrupt() should return true only it
> > > there is IRR suitable to be injected, not just any IRR.
> > > kvm_apic_has_interrupt() should call kvm_apic_update_irr().
> > You are right.
> > 
> > >>  }
> > >>  EXPORT_SYMBOL_GPL(kvm_cpu_has_interrupt);
> > >> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
> > >> index b3ea50e..46c7310 100644
> > >> --- a/arch/x86/kvm/lapic.c
> > >> +++ b/arch/x86/kvm/lapic.c
> > >> @@ -713,7 +713,10 @@ static int __apic_accept_irq(struct kvm_lapic *apic, int delivery_mode,
> > >>  		} else
> > >>  			apic_clear_vector(vector, apic->regs + APIC_TMR);
> > >> -		result = !apic_test_and_set_irr(vector, apic);
> > >> +		result = 1;
> > >> +		if (!kvm_x86_ops->deliver_posted_interrupt(vcpu, vector))
> > >> +			result = !apic_test_and_set_irr(vector, apic);
> > >> +
> > >>  		trace_kvm_apic_accept_irq(vcpu->vcpu_id, delivery_mode,
> > >>  					  trig_mode, vector, !result);
> > >>  		if (!result) {
> > >> @@ -723,8 +726,10 @@ static int __apic_accept_irq(struct kvm_lapic *apic, int delivery_mode,
> > >>  			break;
> > >>  		}
> > >> -		kvm_make_request(KVM_REQ_EVENT, vcpu);
> > >> -		kvm_vcpu_kick(vcpu);
> > >> +		if (!kvm_x86_ops->vm_has_apicv(vcpu->kvm)) {
> > >> +			kvm_make_request(KVM_REQ_EVENT, vcpu);
> > >> +			kvm_vcpu_kick(vcpu);
> > >> +		}
> > >>  		break;
> > > apicv code and non apicv code are completely different. What's the point
> > > checking for apicv twice here?
> > > Just do:
> > > 
> > > if (kvm_x86_ops->deliver_posted_interrupt)
> > > 	kvm_x86_ops->deliver_posted_interrupt(vcpu, vector)
> > > else {
> > > 	result = !apic_test_and_set_irr(vector, apic);
> > > 	kvm_make_request(KVM_REQ_EVENT, vcpu);
> > > 	kvm_vcpu_kick(vcpu);
> > > }
> > > 
> > > And set kvm_x86_ops->deliver_posted_interrupt only if apicv is enabled.
> > > 
> > > Also rearrange patches so that APIC_TMR handling goes before posted
> > > interrupt series.
> > Sure. 
> > 
> > >> 
> > >>  	case APIC_DM_REMRD:
> > >> @@ -1604,6 +1609,11 @@ int kvm_apic_has_interrupt(struct kvm_vcpu *vcpu)
> > >>  	return highest_irr;
> > >>  }
> > >> +bool kvm_hwapic_has_interrupt(struct kvm_vcpu *vcpu)
> > >> +{
> > >> +	return kvm_x86_ops->hwapic_has_interrupt(vcpu);
> > >> +}
> > >> +
> > >>  int kvm_apic_accept_pic_intr(struct kvm_vcpu *vcpu)
> > >>  {
> > >>  	u32 lvt0 = kvm_apic_get_reg(vcpu->arch.apic, APIC_LVT0);
> > >> diff --git a/arch/x86/kvm/lapic.h b/arch/x86/kvm/lapic.h
> > >> index e5327be..c6abc63 100644
> > >> --- a/arch/x86/kvm/lapic.h
> > >> +++ b/arch/x86/kvm/lapic.h
> > >> @@ -37,6 +37,7 @@ int kvm_create_lapic(struct kvm_vcpu *vcpu);
> > >>  void kvm_free_lapic(struct kvm_vcpu *vcpu);
> > >>  
> > >>  int kvm_apic_has_interrupt(struct kvm_vcpu *vcpu);
> > >> +bool kvm_hwapic_has_interrupt(struct kvm_vcpu *vcpu);
> > >>  int kvm_apic_accept_pic_intr(struct kvm_vcpu *vcpu);
> > >>  int kvm_get_apic_interrupt(struct kvm_vcpu *vcpu);
> > >>  void kvm_lapic_reset(struct kvm_vcpu *vcpu);
> > >> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
> > >> index 0b5a8ae..48a2239 100644
> > >> --- a/arch/x86/kvm/vmx.c
> > >> +++ b/arch/x86/kvm/vmx.c
> > >> @@ -3932,6 +3932,17 @@ static void vmx_posted_intr_clear_on(struct kvm_vcpu *vcpu)
> > >>  	clear_bit(POSTED_INTR_ON, (unsigned long *)&vmx->pi_desc.u.control);
> > >>  }
> > >> +/*
> > >> + * Send an interrupt to a vcpu via posted interrupt.
> > >> + * Return false if posted interrupt is not supported and the caller must
> > >> + * fall back to the old way (setting vIRR).
> > >> + * Return true if posted interrupt is available; the interrupt is set
> > >> + * in pir (posted interrupt requests):
> > >> + * 1. If the target vcpu is running (non-root mode), send a posted interrupt
> > >> + * notification to the vcpu and hardware will sync pir to vIRR atomically.
> > >> + * 2. If the target vcpu isn't running (root mode), kick it to pick up the
> > >> + * interrupt from pir on the next vmentry.
> > >> + */
> > > The comment should go into previous patch. Also I prefer to not check
> > > for posted interrupt inside the callback, but set it to NULL instead.
> > > This way we avoid calling a callback on a hot path needlessly.
> > It makes sense. So just follow the logic you mentioned above?
> >  
> Yes.
> 
> > >>  static bool vmx_deliver_posted_interrupt(struct kvm_vcpu *vcpu, int vector)
> > >>  {
> > >>  	struct vcpu_vmx *vmx = to_vmx(vcpu);
> > >> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> > >> index 0baa90d..0981100 100644
> > >> --- a/arch/x86/kvm/x86.c
> > >> +++ b/arch/x86/kvm/x86.c
> > >> @@ -2679,6 +2679,7 @@ void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu)
> > >>  static int kvm_vcpu_ioctl_get_lapic(struct kvm_vcpu *vcpu,
> > >>  				    struct kvm_lapic_state *s)
> > >>  {
> > >> +	kvm_x86_ops->sync_pir_to_irr(vcpu);
> > >>  	memcpy(s->regs, vcpu->arch.apic->regs, sizeof *s);
> > >>  
> > >>  	return 0;
> > >> @@ -5699,6 +5700,7 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
> > >>  	}
> > >>  
> > >>  	if (kvm_check_request(KVM_REQ_EVENT, vcpu) || req_int_win) {
> > >> +		kvm_x86_ops->sync_pir_to_irr(vcpu);
> > >>  		inject_pending_event(vcpu);
> > >>  
> > >>  		/* enable NMI/IRQ window open exits if needed */
> > >> @@ -5741,6 +5743,8 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
> > >> 
> > >>  	local_irq_disable();
> > >> +	kvm_x86_ops->posted_intr_clear_on(vcpu);
> > >> +
> > > Why is this separate from pir_to_irr syncing?
> > This is the result of discussion with Marcelo.
> > It is more reasonable to put it here to avoid unnecessary posted interrupt between:
> > 
> > vcpu->mode = IN_GUEST_MODE;
> > 
> > <--interrupt may arrive here and this is unnecessary.
> > 
> > local_irq_disable();
> > 
> 
> But this still can happen as far as I see:
> 
> vcpu0                                         vcpu1: 
> pi_test_and_set_pir()
> kvm_make_request(KVM_REQ_EVENT)
>                                             if (KVM_REQ_EVENT)
>                                                    sync_pir_to_irr()
						^^^^^^^^^^^^^^^^^^
>                                             vcpu->mode = IN_GUEST_MODE;
> if (vcpu->mode == IN_GUEST_MODE)
>   if (!pi_test_and_set_on())
>     apic->send_IPI_mask()
>                                             --> IPI arrives here
>                                             local_irq_disable();
>                                             posted_intr_clear_on()
> 
> 
> May be move vcpu->mode = IN_GUEST_MODE after local_irq_disable()?

The scenario is correct: the injected PIR has been synced to IRR before
guest entry.


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v6 5/5] KVM : VMX: Use posted interrupt to deliver virtual interrupt
  2013-03-19 14:51               ` Gleb Natapov
@ 2013-03-19 15:12                 ` Gleb Natapov
  2013-03-19 15:19                   ` Marcelo Tosatti
  0 siblings, 1 reply; 31+ messages in thread
From: Gleb Natapov @ 2013-03-19 15:12 UTC (permalink / raw)
  To: Zhang, Yang Z; +Cc: kvm, mtosatti, Zhang, Xiantao

On Tue, Mar 19, 2013 at 04:51:04PM +0200, Gleb Natapov wrote:
> On Tue, Mar 19, 2013 at 01:59:21PM +0000, Zhang, Yang Z wrote:
> > Gleb Natapov wrote on 2013-03-19:
> > > On Tue, Mar 19, 2013 at 12:42:01PM +0000, Zhang, Yang Z wrote:
> > >>>>>>  	local_irq_disable();
> > >>>>>> +	kvm_x86_ops->posted_intr_clear_on(vcpu);
> > >>>>>> +
> > >>>>> Why is this separate from pir_to_irr syncing?
> > >>>> This is the result of discussion with Marcelo. It is more reasonable to
> > >>>> put it here to avoid unnecessary posted interrupt between:
> > >>>> 
> > >>>> vcpu->mode = IN_GUEST_MODE;
> > >>>> 
> > >>>> <--interrupt may arrive here and this is unnecessary.
> > >>>> 
> > >>>> local_irq_disable();
> > >>>> 
> > >>> 
> > >>> But this still can happen as far as I see:
> > >>> 
> > >>> vcpu0                                         vcpu1:
> > >>> pi_test_and_set_pir() kvm_make_request(KVM_REQ_EVENT)
> > >>>                                             if (KVM_REQ_EVENT)
> > >>>                                                    sync_pir_to_irr()
> > >>>                                             vcpu->mode =
> > >>> IN_GUEST_MODE;
> > >>> if (vcpu->mode == IN_GUEST_MODE)
> > >>>   if (!pi_test_and_set_on())
> > >>>     apic->send_IPI_mask()
> > >>>                                             --> IPI arrives here
> > >>>                                             local_irq_disable();
> > >>>                                             posted_intr_clear_on()
> > >> The current solution tries to block posted interrupts from other vCPUs at the
> > >> same time. It only mitigates the race but cannot eliminate it. The case you
> > >> mentioned still exists, but it should be rare.
> > >> 
> > > I am not sure I follow. What scenario exactly are you talking about. I
> > > looked over past discussion about it and saw that Marcelo gives an
> > > example how IPI can be lost, but I think that's because we set "on" bit
> > > after KVM_REQ_EVENT:
> > The IPI will not be lost in his example (he misread the patch). 
> > 
> > > cpu0                                    cpu1            vcpu0
> > > test_and_set_bit(PIR-A) set KVM_REQ_EVENT
> > >                                                         process
> > >                                                         REQ_EVENT
> > >                                                         PIR-A->IRR
> > > 
> > > vcpu->mode=IN_GUEST
> > > 
> > > if (vcpu0->guest_mode)
> > >         if (!t_a_s_bit(PIR notif))
> > >                 send IPI
> > > linux_pir_handler
> > > 
> > >                                         t_a_s_b(PIR-B)=1
> > >                                         no PIR IPI sent
> > > 
> > > But what if on delivery we do:
> > > pi_test_and_set_pir()
> > > r = pi_test_and_set_on()
> > > kvm_make_request(KVM_REQ_EVENT)
> > > if (!r)
> > >    send_IPI_mask() else kvm_vcpu_kick()
> > > And on vcpu entry we do:
> > > if (kvm_check_request(KVM_REQ_EVENT)
> > >  if (test_and_clear_bit(on))
> > >    kvm_apic_update_irr()
> > > What are the downsides? Can we lost interrupts this way?
> > Need to check guest mode before sending IPI. Otherwise hypervisor may receive IPI.
> Of course, forget it. So if (!r) should be if (!r && mode == IN_GUEST)
> 
> > I think the current logic is ok. The only problem is when to clear the Outstanding Notification bit. Actually I prefer your suggestion to clear it before sync_pir_to_irr, but Marcelo prefers to clear the ON bit after disabling irqs.
> Marcelo what is the problem with the logic above?
> 
Just to clarify the advantages that I see are: one less callback, no
need to sync pir to irr on each event and, arguably, a little bit
simpler logic.

--
			Gleb.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v6 5/5] KVM : VMX: Use posted interrupt to deliver virtual interrupt
  2013-03-19 13:29           ` Gleb Natapov
  2013-03-19 13:59             ` Zhang, Yang Z
@ 2013-03-19 15:13             ` Marcelo Tosatti
  2013-03-19 15:21               ` Gleb Natapov
  1 sibling, 1 reply; 31+ messages in thread
From: Marcelo Tosatti @ 2013-03-19 15:13 UTC (permalink / raw)
  To: Gleb Natapov; +Cc: Zhang, Yang Z, kvm, Zhang, Xiantao

On Tue, Mar 19, 2013 at 03:29:24PM +0200, Gleb Natapov wrote:
> On Tue, Mar 19, 2013 at 12:42:01PM +0000, Zhang, Yang Z wrote:
> > >>>>  	local_irq_disable();
> > >>>> +	kvm_x86_ops->posted_intr_clear_on(vcpu);
> > >>>> +
> > >>> Why is this separate from pir_to_irr syncing?
> > >> This is the result of discussion with Marcelo. It is more reasonable to
> > >> put it here to avoid unnecessary posted interrupt between:
> > >> 
> > >> vcpu->mode = IN_GUEST_MODE;
> > >> 
> > >> <--interrupt may arrive here and this is unnecessary.
> > >> 
> > >> local_irq_disable();
> > >> 
> > > 
> > > But this still can happen as far as I see:
> > > 
> > > vcpu0                                         vcpu1:
> > > pi_test_and_set_pir() kvm_make_request(KVM_REQ_EVENT)
> > >                                             if (KVM_REQ_EVENT)
> > >                                                    sync_pir_to_irr()
> > >                                             vcpu->mode =
> > > IN_GUEST_MODE;
> > > if (vcpu->mode == IN_GUEST_MODE)
> > >   if (!pi_test_and_set_on())
> > >     apic->send_IPI_mask()
> > >                                             --> IPI arrives here
> > >                                             local_irq_disable();
> > >                                             posted_intr_clear_on()
> > The current solution tries to block posted interrupts from other vCPUs at the same time. It only mitigates the race but cannot eliminate it. The case you mentioned still exists, but it should be rare.
> > 
> I am not sure I follow. What scenario exactly are you talking about. I
> looked over past discussion about it and saw that Marcelo gives an
> example how IPI can be lost, but I think that's because we set "on" bit
> after KVM_REQ_EVENT:
> 
> cpu0                                    cpu1            vcpu0
> test_and_set_bit(PIR-A)
> set KVM_REQ_EVENT
>                                                         process REQ_EVENT
>                                                         PIR-A->IRR
> 
>                                                         vcpu->mode=IN_GUEST
> 
> if (vcpu0->guest_mode)
>         if (!t_a_s_bit(PIR notif))
>                 send IPI
>                                                         linux_pir_handler
> 
>                                         t_a_s_b(PIR-B)=1
>                                         no PIR IPI sent
> 
> 
> But what if on delivery we do:
> pi_test_and_set_pir()
> r = pi_test_and_set_on()
> kvm_make_request(KVM_REQ_EVENT)
> if (!r)
>    send_IPI_mask()
> else
>    kvm_vcpu_kick()
> 
> And on vcpu entry we do:
> if (kvm_check_request(KVM_REQ_EVENT)
>  if (test_and_clear_bit(on))
>    kvm_apic_update_irr()
> 
> What are the downsides? Can we lost interrupts this way?

You should not ever enter guest mode with ON bit set on PIR (because that will
prevent PIR IPI from waking up interrupt injection logic). This is why
the ON bit should be cleared on VM entry.

> > > May be move vcpu->mode = IN_GUEST_MODE after local_irq_disable()?
> > > Yes, this will solve it. But I am not sure whether it will introduce any regressions. Is there any check that relies on this sequence?
> >  	
> Do not think so.

I can't see what the problem is with the code as it is.



^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v6 5/5] KVM : VMX: Use posted interrupt to deliver virtual interrupt
  2013-03-19 15:03         ` Marcelo Tosatti
@ 2013-03-19 15:18           ` Gleb Natapov
  0 siblings, 0 replies; 31+ messages in thread
From: Gleb Natapov @ 2013-03-19 15:18 UTC (permalink / raw)
  To: Marcelo Tosatti; +Cc: Zhang, Yang Z, kvm, Zhang, Xiantao

On Tue, Mar 19, 2013 at 12:03:22PM -0300, Marcelo Tosatti wrote:
> On Tue, Mar 19, 2013 at 02:23:59PM +0200, Gleb Natapov wrote:
> > On Tue, Mar 19, 2013 at 12:11:47PM +0000, Zhang, Yang Z wrote:
> > > Gleb Natapov wrote on 2013-03-19:
> > > > On Fri, Mar 15, 2013 at 09:31:11PM +0800, Yang Zhang wrote:
> > > >> From: Yang Zhang <yang.z.zhang@Intel.com>
> > > >> 
> > > >> If posted interrupt is available, use it to inject virtual
> > > >> interrupts into the guest.
> > > >> 
> > > >> Signed-off-by: Yang Zhang <yang.z.zhang@Intel.com>
> > > >> ---
> > > >>  arch/x86/kvm/irq.c   |    3 ++-
> > > >>  arch/x86/kvm/lapic.c |   16 +++++++++++++---
> > > >>  arch/x86/kvm/lapic.h |    1 +
> > > >>  arch/x86/kvm/vmx.c   |   11 +++++++++++
> > > >>  arch/x86/kvm/x86.c   |    4 ++++
> > > >>  5 files changed, 31 insertions(+), 4 deletions(-)
> > > >> diff --git a/arch/x86/kvm/irq.c b/arch/x86/kvm/irq.c
> > > >> index 484bc87..5179988 100644
> > > >> --- a/arch/x86/kvm/irq.c
> > > >> +++ b/arch/x86/kvm/irq.c
> > > >> @@ -81,7 +81,8 @@ int kvm_cpu_has_interrupt(struct kvm_vcpu *v)
> > > >>  	if (kvm_cpu_has_extint(v))
> > > >>  		return 1;
> > > >> -	return kvm_apic_has_interrupt(v) != -1;	/* LAPIC */
> > > >> +	return (kvm_apic_has_interrupt(v) != -1) ||
> > > >> +		kvm_hwapic_has_interrupt(v);
> > > > That's incorrect. kvm_cpu_has_interrupt() should return true only it
> > > > there is IRR suitable to be injected, not just any IRR.
> > > > kvm_apic_has_interrupt() should call kvm_apic_update_irr().
> > > You are right.
> > > 
> > > >>  }
> > > >>  EXPORT_SYMBOL_GPL(kvm_cpu_has_interrupt);
> > > >> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
> > > >> index b3ea50e..46c7310 100644
> > > >> --- a/arch/x86/kvm/lapic.c
> > > >> +++ b/arch/x86/kvm/lapic.c
> > > >> @@ -713,7 +716,10 @@ static int __apic_accept_irq(struct kvm_lapic *apic, int delivery_mode,
> > > >>  		} else
> > > >>  			apic_clear_vector(vector, apic->regs + APIC_TMR);
> > > >> -		result = !apic_test_and_set_irr(vector, apic);
> > > >> +		result = 1;
> > > >> +		if (!kvm_x86_ops->deliver_posted_interrupt(vcpu, vector))
> > > >> +			result = !apic_test_and_set_irr(vector, apic);
> > > >> +
> > > >>  		trace_kvm_apic_accept_irq(vcpu->vcpu_id, delivery_mode,
> > > >>  					  trig_mode, vector, !result);
> > > >>  		if (!result) {
> > > >> @@ -723,8 +726,10 @@ static int __apic_accept_irq(struct kvm_lapic *apic, int delivery_mode,
> > > >>  			break;
> > > >>  		}
> > > >> -		kvm_make_request(KVM_REQ_EVENT, vcpu);
> > > >> -		kvm_vcpu_kick(vcpu);
> > > >> +		if (!kvm_x86_ops->vm_has_apicv(vcpu->kvm)) {
> > > >> +			kvm_make_request(KVM_REQ_EVENT, vcpu);
> > > >> +			kvm_vcpu_kick(vcpu);
> > > >> +		}
> > > >>  		break;
> > > > apicv code and non-apicv code are completely different. What's the point of
> > > > checking for apicv twice here?
> > > > Just do:
> > > > 
> > > > if (kvm_x86_ops->deliver_posted_interrupt)
> > > > 	kvm_x86_ops->deliver_posted_interrupt(vcpu, vector)
> > > > else {
> > > > 	result = !apic_test_and_set_irr(vector, apic);
> > > > 	kvm_make_request(KVM_REQ_EVENT, vcpu);
> > > > 	kvm_vcpu_kick(vcpu);
> > > > }
> > > > 
> > > > And set kvm_x86_ops->deliver_posted_interrupt only if apicv is enabled.
> > > > 
> > > > Also rearrange patches so that APIC_TMR handling goes before posted
> > > > interrupt series.
> > > Sure. 
> > > 
> > > >> 
> > > >>  	case APIC_DM_REMRD:
> > > >> @@ -1604,6 +1609,11 @@ int kvm_apic_has_interrupt(struct kvm_vcpu *vcpu)
> > > >>  	return highest_irr;
> > > >>  }
> > > >> +bool kvm_hwapic_has_interrupt(struct kvm_vcpu *vcpu)
> > > >> +{
> > > >> +	return kvm_x86_ops->hwapic_has_interrupt(vcpu);
> > > >> +}
> > > >> +
> > > >>  int kvm_apic_accept_pic_intr(struct kvm_vcpu *vcpu)
> > > >>  {
> > > >>  	u32 lvt0 = kvm_apic_get_reg(vcpu->arch.apic, APIC_LVT0);
> > > >> diff --git a/arch/x86/kvm/lapic.h b/arch/x86/kvm/lapic.h
> > > >> index e5327be..c6abc63 100644
> > > >> --- a/arch/x86/kvm/lapic.h
> > > >> +++ b/arch/x86/kvm/lapic.h
> > > >> @@ -37,6 +37,7 @@ int kvm_create_lapic(struct kvm_vcpu *vcpu);
> > > >>  void kvm_free_lapic(struct kvm_vcpu *vcpu);
> > > >>  
> > > >>  int kvm_apic_has_interrupt(struct kvm_vcpu *vcpu);
> > > >> +bool kvm_hwapic_has_interrupt(struct kvm_vcpu *vcpu);
> > > >>  int kvm_apic_accept_pic_intr(struct kvm_vcpu *vcpu);
> > > >>  int kvm_get_apic_interrupt(struct kvm_vcpu *vcpu);
> > > >>  void kvm_lapic_reset(struct kvm_vcpu *vcpu);
> > > >> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
> > > >> index 0b5a8ae..48a2239 100644
> > > >> --- a/arch/x86/kvm/vmx.c
> > > >> +++ b/arch/x86/kvm/vmx.c
> > > >> @@ -3932,6 +3932,17 @@ static void vmx_posted_intr_clear_on(struct kvm_vcpu *vcpu)
> > > >>  	clear_bit(POSTED_INTR_ON, (unsigned long *)&vmx->pi_desc.u.control);
> > > >>  }
> > > >> +/*
> > > >> + * Send an interrupt to a vcpu via posted interrupt.
> > > >> + * Return false if posted interrupt is not supported and the caller must
> > > >> + * fall back to the old way (setting vIRR).
> > > >> + * Return true if posted interrupt is available; the interrupt is set
> > > >> + * in pir (posted interrupt requests):
> > > >> + * 1. If the target vcpu is running (non-root mode), send a posted interrupt
> > > >> + * notification to the vcpu and hardware will sync pir to vIRR atomically.
> > > >> + * 2. If the target vcpu isn't running (root mode), kick it to pick up the
> > > >> + * interrupt from pir on the next vmentry.
> > > >> + */
> > > > The comment should go into previous patch. Also I prefer to not check
> > > > for posted interrupt inside the callback, but set it to NULL instead.
> > > > This way we avoid calling a callback on a hot path needlessly.
> > > It makes sense. So just follow the logic you mentioned above?
> > >  
> > Yes.
> > 
> > > >>  static bool vmx_deliver_posted_interrupt(struct kvm_vcpu *vcpu, int vector)
> > > >>  {
> > > >>  	struct vcpu_vmx *vmx = to_vmx(vcpu);
> > > >> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> > > >> index 0baa90d..0981100 100644
> > > >> --- a/arch/x86/kvm/x86.c
> > > >> +++ b/arch/x86/kvm/x86.c
> > > >> @@ -2679,6 +2679,7 @@ void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu)
> > > >>  static int kvm_vcpu_ioctl_get_lapic(struct kvm_vcpu *vcpu,
> > > >>  				    struct kvm_lapic_state *s)
> > > >>  {
> > > >> +	kvm_x86_ops->sync_pir_to_irr(vcpu);
> > > >>  	memcpy(s->regs, vcpu->arch.apic->regs, sizeof *s);
> > > >>  
> > > >>  	return 0;
> > > >> @@ -5699,6 +5700,7 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
> > > >>  	}
> > > >>  
> > > >>  	if (kvm_check_request(KVM_REQ_EVENT, vcpu) || req_int_win) {
> > > >> +		kvm_x86_ops->sync_pir_to_irr(vcpu);
> > > >>  		inject_pending_event(vcpu);
> > > >>  
> > > >>  		/* enable NMI/IRQ window open exits if needed */
> > > >> @@ -5741,6 +5743,8 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
> > > >> 
> > > >>  	local_irq_disable();
> > > >> +	kvm_x86_ops->posted_intr_clear_on(vcpu);
> > > >> +
> > > > Why is this separate from pir_to_irr syncing?
> > > This is the result of discussion with Marcelo.
> > > It is more reasonable to put it here to avoid unnecessary posted interrupt between:
> > > 
> > > vcpu->mode = IN_GUEST_MODE;
> > > 
> > > <--interrupt may arrive here and this is unnecessary.
> > > 
> > > local_irq_disable();
> > > 
> > 
> > But this still can happen as far as I see:
> > 
> > vcpu0                                         vcpu1: 
> > pi_test_and_set_pir()
> > kvm_make_request(KVM_REQ_EVENT)
> >                                             if (KVM_REQ_EVENT)
> >                                                    sync_pir_to_irr()
> 						^^^^^^^^^^^^^^^^^^
> >                                             vcpu->mode = IN_GUEST_MODE;
> > if (vcpu->mode == IN_GUEST_MODE)
> >   if (!pi_test_and_set_on())
> >     apic->send_IPI_mask()
> >                                             --> IPI arrives here
> >                                             local_irq_disable();
> >                                             posted_intr_clear_on()
> > 
> > 
> > May be move vcpu->mode = IN_GUEST_MODE after local_irq_disable()?
> 
> The scenario is correct: the injected PIR has been synced to IRR before
> guest entry.
I do not say it is not. Yang said that the code tries to prevent this
from happening; I showed that it can still happen. There is no harm,
just a useless IPI.

--
			Gleb.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v6 5/5] KVM : VMX: Use posted interrupt to deliver virtual interrupt
  2013-03-19 15:12                 ` Gleb Natapov
@ 2013-03-19 15:19                   ` Marcelo Tosatti
  2013-03-19 15:27                     ` Marcelo Tosatti
  2013-03-19 15:30                     ` Gleb Natapov
  0 siblings, 2 replies; 31+ messages in thread
From: Marcelo Tosatti @ 2013-03-19 15:19 UTC (permalink / raw)
  To: Gleb Natapov; +Cc: Zhang, Yang Z, kvm, Zhang, Xiantao

On Tue, Mar 19, 2013 at 05:12:55PM +0200, Gleb Natapov wrote:
> On Tue, Mar 19, 2013 at 04:51:04PM +0200, Gleb Natapov wrote:
> > On Tue, Mar 19, 2013 at 01:59:21PM +0000, Zhang, Yang Z wrote:
> > > Gleb Natapov wrote on 2013-03-19:
> > > > On Tue, Mar 19, 2013 at 12:42:01PM +0000, Zhang, Yang Z wrote:
> > > >>>>>>  	local_irq_disable();
> > > >>>>>> +	kvm_x86_ops->posted_intr_clear_on(vcpu);
> > > >>>>>> +
> > > >>>>> Why is this separate from pir_to_irr syncing?
> > > >>>> This is the result of discussion with Marcelo. It is more reasonable to
> > > >>>> put it here to avoid unnecessary posted interrupt between:
> > > >>>> 
> > > >>>> vcpu->mode = IN_GUEST_MODE;
> > > >>>> 
> > > >>>> <--interrupt may arrive here and this is unnecessary.
> > > >>>> 
> > > >>>> local_irq_disable();
> > > >>>> 
> > > >>> 
> > > >>> But this still can happen as far as I see:
> > > >>> 
> > > >>> vcpu0                                         vcpu1:
> > > >>> pi_test_and_set_pir() kvm_make_request(KVM_REQ_EVENT)
> > > >>>                                             if (KVM_REQ_EVENT)
> > > >>>                                                    sync_pir_to_irr()
> > > >>>                                             vcpu->mode =
> > > >>> IN_GUEST_MODE;
> > > >>> if (vcpu->mode == IN_GUEST_MODE)
> > > >>>   if (!pi_test_and_set_on())
> > > >>>     apic->send_IPI_mask()
> > > >>>                                             --> IPI arrives here
> > > >>>                                             local_irq_disable();
> > > >>>                                             posted_intr_clear_on()
> > > >> The current solution tries to block posted interrupts from other vCPUs at the
> > > >> same time. It only mitigates the race but cannot eliminate it. The case you
> > > >> mentioned still exists, but it should be rare.
> > > >> 
> > > > I am not sure I follow. What scenario exactly are you talking about. I
> > > > looked over past discussion about it and saw that Marcelo gives an
> > > > example how IPI can be lost, but I think that's because we set "on" bit
> > > > after KVM_REQ_EVENT:
> > > The IPI will not be lost in his example (he misread the patch). 
> > > 
> > > > cpu0                                    cpu1            vcpu0
> > > > test_and_set_bit(PIR-A) set KVM_REQ_EVENT
> > > >                                                         process
> > > >                                                         REQ_EVENT
> > > >                                                         PIR-A->IRR
> > > > 
> > > > vcpu->mode=IN_GUEST
> > > > 
> > > > if (vcpu0->guest_mode)
> > > >         if (!t_a_s_bit(PIR notif))
> > > >                 send IPI
> > > > linux_pir_handler
> > > > 
> > > >                                         t_a_s_b(PIR-B)=1
> > > >                                         no PIR IPI sent
> > > > 
> > > > But what if on delivery we do:
> > > > pi_test_and_set_pir()
> > > > r = pi_test_and_set_on()
> > > > kvm_make_request(KVM_REQ_EVENT)
> > > > if (!r)
> > > >    send_IPI_mask() else kvm_vcpu_kick()
> > > > And on vcpu entry we do:
> > > > if (kvm_check_request(KVM_REQ_EVENT)
> > > >  if (test_and_clear_bit(on))
> > > >    kvm_apic_update_irr()
> > > > What are the downsides? Can we lost interrupts this way?
> > > Need to check guest mode before sending IPI. Otherwise hypervisor may receive IPI.
> > Of course, forget it. So if (!r) should be if (!r && mode == IN_GUEST)
> > 
> > > I think the current logic is ok. The only problem is when to clear the Outstanding Notification bit. Actually I prefer your suggestion to clear it before sync_pir_to_irr, but Marcelo prefers to clear the ON bit after disabling irqs.
> > Marcelo what is the problem with the logic above?
> > 
> Just to clarify the advantages that I see are: one less callback, no
> need to sync pir to irr on each event and, arguably, a little bit
> simpler logic.

See the previous argument: we should never enter guest mode with the PIR ON bit
set. With the logic above:

context1				context2                      context3
					set_bit(PIR-1)                   
					r = pi_test_and_set_on()	set_bit(PIR-40)
					set_bit(KVM_REQ_EVENT)
if (kvm_check_request(KVM_REQ_EVENT)	
 if (test_and_clear_bit(on))

   kvm_apic_update_irr()						r = pi_test_and_set_on()

guest entry with PIR ON=1


That's the reason for unconditional clearing on guest entry: it is easy
to verify it is correct. I understand and agree that the callback (and VMWRITE)
is not nice.


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v6 5/5] KVM : VMX: Use posted interrupt to deliver virtual interrupt
  2013-03-19 15:13             ` Marcelo Tosatti
@ 2013-03-19 15:21               ` Gleb Natapov
  0 siblings, 0 replies; 31+ messages in thread
From: Gleb Natapov @ 2013-03-19 15:21 UTC (permalink / raw)
  To: Marcelo Tosatti; +Cc: Zhang, Yang Z, kvm, Zhang, Xiantao

On Tue, Mar 19, 2013 at 12:13:11PM -0300, Marcelo Tosatti wrote:
> On Tue, Mar 19, 2013 at 03:29:24PM +0200, Gleb Natapov wrote:
> > On Tue, Mar 19, 2013 at 12:42:01PM +0000, Zhang, Yang Z wrote:
> > > >>>>  	local_irq_disable();
> > > >>>> +	kvm_x86_ops->posted_intr_clear_on(vcpu);
> > > >>>> +
> > > >>> Why is this separate from pir_to_irr syncing?
> > > >> This is the result of discussion with Marcelo. It is more reasonable to
> > > >> put it here to avoid unnecessary posted interrupt between:
> > > >> 
> > > >> vcpu->mode = IN_GUEST_MODE;
> > > >> 
> > > >> <--interrupt may arrive here and this is unnecessary.
> > > >> 
> > > >> local_irq_disable();
> > > >> 
> > > > 
> > > > But this still can happen as far as I see:
> > > > 
> > > > vcpu0                                         vcpu1:
> > > > pi_test_and_set_pir() kvm_make_request(KVM_REQ_EVENT)
> > > >                                             if (KVM_REQ_EVENT)
> > > >                                                    sync_pir_to_irr()
> > > >                                             vcpu->mode =
> > > > IN_GUEST_MODE;
> > > > if (vcpu->mode == IN_GUEST_MODE)
> > > >   if (!pi_test_and_set_on())
> > > >     apic->send_IPI_mask()
> > > >                                             --> IPI arrives here
> > > >                                             local_irq_disable();
> > > >                                             posted_intr_clear_on()
> > > The current solution tries to block posted interrupts from other vCPUs at the same time. It only mitigates the race but cannot eliminate it. The case you mentioned still exists, but it should be rare.
> > > 
> > I am not sure I follow. What scenario exactly are you talking about. I
> > looked over past discussion about it and saw that Marcelo gives an
> > example how IPI can be lost, but I think that's because we set "on" bit
> > after KVM_REQ_EVENT:
> > 
> > cpu0                                    cpu1            vcpu0
> > test_and_set_bit(PIR-A)
> > set KVM_REQ_EVENT
> >                                                         process REQ_EVENT
> >                                                         PIR-A->IRR
> > 
> >                                                         vcpu->mode=IN_GUEST
> > 
> > if (vcpu0->guest_mode)
> >         if (!t_a_s_bit(PIR notif))
> >                 send IPI
> >                                                         linux_pir_handler
> > 
> >                                         t_a_s_b(PIR-B)=1
> >                                         no PIR IPI sent
> > 
> > 
> > But what if on delivery we do:
> > pi_test_and_set_pir()
> > r = pi_test_and_set_on()
> > kvm_make_request(KVM_REQ_EVENT)
> > if (!r)
> >    send_IPI_mask()
> > else
> >    kvm_vcpu_kick()
> > 
> > And on vcpu entry we do:
> > if (kvm_check_request(KVM_REQ_EVENT)
> >  if (test_and_clear_bit(on))
> >    kvm_apic_update_irr()
> > 
> > What are the downsides? Can we lost interrupts this way?
> 
> You should not ever enter guest mode with ON bit set on PIR (because that will
> prevent PIR IPI from waking up interrupt injection logic). This is why
> the ON bit should be cleared on VM entry.
I agree that we should not, but I claim that this will not happen since
KVM_REQ_EVENT will prevent guest entry with ON bit set.

> 
> > > > May be move vcpu->mode = IN_GUEST_MODE after local_irq_disable()?
> > > Yes, this will solve it. But I am not sure whether it will introduce any regressions. Is there any check that relies on this sequence?
> > >  	
> > Do not think so.
> 
> I can't see what the problem is with the code as it is.
> 
Just the useless IPI that can be prevented. I agree that this is not a
big deal unless shown otherwise.

--
			Gleb.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v6 5/5] KVM : VMX: Use posted interrupt to deliver virtual interrupt
  2013-03-19 15:19                   ` Marcelo Tosatti
@ 2013-03-19 15:27                     ` Marcelo Tosatti
  2013-03-19 16:10                       ` Gleb Natapov
  2013-03-19 15:30                     ` Gleb Natapov
  1 sibling, 1 reply; 31+ messages in thread
From: Marcelo Tosatti @ 2013-03-19 15:27 UTC (permalink / raw)
  To: Gleb Natapov; +Cc: Zhang, Yang Z, kvm, Zhang, Xiantao

On Tue, Mar 19, 2013 at 12:19:55PM -0300, Marcelo Tosatti wrote:
> See the previous argument: we should never enter guest mode with the PIR ON
> bit set. With the logic above:
> 
> context1				context2                      context3
> 					set_bit(PIR-1)                   
> 					r = pi_test_and_set_on()	set_bit(PIR-40)
> 					set_bit(KVM_REQ_EVENT)
> if (kvm_check_request(KVM_REQ_EVENT)	
>  if (test_and_clear_bit(on))
> 
>    kvm_apic_update_irr()						r = pi_test_and_set_on()
> 
> guest entry with PIR ON=1
> 
> 
> That's the reason for unconditional clearing on guest entry: it is easy
> to verify it is correct. I understand and agree that the callback (and VMWRITE)
> is not nice.

Re: setting KVM_REQ_EVENT after the ON bit assures no guest entry with PIR
ON=1.

It might be; we would have to verify. It's trickier, though. Maybe add a FIXME
to the callback and remove it later.


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [PATCH v6 5/5] KVM : VMX: Use posted interrupt to deliver virtual interrupt
  2013-03-19 15:19                   ` Marcelo Tosatti
  2013-03-19 15:27                     ` Marcelo Tosatti
@ 2013-03-19 15:30                     ` Gleb Natapov
  1 sibling, 0 replies; 31+ messages in thread
From: Gleb Natapov @ 2013-03-19 15:30 UTC (permalink / raw)
  To: Marcelo Tosatti; +Cc: Zhang, Yang Z, kvm, Zhang, Xiantao

On Tue, Mar 19, 2013 at 12:19:55PM -0300, Marcelo Tosatti wrote:
> On Tue, Mar 19, 2013 at 05:12:55PM +0200, Gleb Natapov wrote:
> > On Tue, Mar 19, 2013 at 04:51:04PM +0200, Gleb Natapov wrote:
> > > On Tue, Mar 19, 2013 at 01:59:21PM +0000, Zhang, Yang Z wrote:
> > > > Gleb Natapov wrote on 2013-03-19:
> > > > > On Tue, Mar 19, 2013 at 12:42:01PM +0000, Zhang, Yang Z wrote:
> > > > >>>>>>  	local_irq_disable();
> > > > >>>>>> +	kvm_x86_ops->posted_intr_clear_on(vcpu);
> > > > >>>>>> +
> > > > >>>>> Why is this separate from pir_to_irr syncing?
> > > > >>>> This is the result of discussion with Marcelo. It is more reasonable to
> > > > >>>> put it here to avoid unnecessary posted interrupt between:
> > > > >>>> 
> > > > >>>> vcpu->mode = IN_GUEST_MODE;
> > > > >>>> 
> > > > >>>> <--interrupt may arrived here and this is unnecessary.
> > > > >>>> 
> > > > >>>> local_irq_disable();
> > > > >>>> 
> > > > >>> 
> > > > >>> But this still can happen as far as I see:
> > > > >>> 
> > > > >>> vcpu0                                         vcpu1:
> > > > >>> pi_test_and_set_pir() kvm_make_request(KVM_REQ_EVENT)
> > > > >>>                                             if (KVM_REQ_EVENT)
> > > > >>>                                                    sync_pir_to_irr()
> > > > >>>                                             vcpu->mode =
> > > > >>> IN_GUEST_MODE;
> > > > >>> if (vcpu->mode == IN_GUEST_MODE)
> > > > >>>   if (!pi_test_and_set_on())
> > > > >>>     apic->send_IPI_mask()
> > > > >>>                                             --> IPI arrives here
> > > > >>>                                             local_irq_disable();
> > > > >>>                                             posted_intr_clear_on()
> > > > >> The current solution tries to block posted interrupts arriving from other
> > > > > VCPUs at the same time. It only mitigates the problem but cannot solve it; the
> > > > > case you mentioned still exists, but it should be rare.
> > > > >> 
> > > > > I am not sure I follow. What scenario exactly are you talking about? I
> > > > > looked over the past discussion about it and saw that Marcelo gives an
> > > > > example of how an IPI can be lost, but I think that's because we set the "on"
> > > > > bit after KVM_REQ_EVENT:
> > > > The IPI will not be lost in his example (he misread the patch).
> > > > 
> > > > > cpu0                                    cpu1            vcpu0
> > > > > test_and_set_bit(PIR-A) set KVM_REQ_EVENT
> > > > >                                                         process
> > > > >                                                         REQ_EVENT
> > > > >                                                         PIR-A->IRR
> > > > > 
> > > > > vcpu->mode=IN_GUEST
> > > > > 
> > > > > if (vcpu0->guest_mode)
> > > > >         if (!t_a_s_bit(PIR notif))
> > > > >                 send IPI
> > > > > linux_pir_handler
> > > > > 
> > > > >                                         t_a_s_b(PIR-B)=1
> > > > >                                         no PIR IPI sent
> > > > > 
> > > > > But what if on delivery we do:
> > > > > pi_test_and_set_pir()
> > > > > r = pi_test_and_set_on()
> > > > > kvm_make_request(KVM_REQ_EVENT)
> > > > > if (!r)
> > > > >    send_IPI_mask() else kvm_vcpu_kick()
> > > > > And on vcpu entry we do:
> > > > > if (kvm_check_request(KVM_REQ_EVENT)
> > > > >  if (test_and_clear_bit(on))
> > > > >    kvm_apic_update_irr()
> > > > > What are the downsides? Can we lose interrupts this way?
> > > > We need to check guest mode before sending the IPI; otherwise the hypervisor may receive it.
> > > Of course, forget it. So if (!r) should be if (!r && mode == IN_GUEST)
> > > 
> > > > I think the current logic is OK. The only problem is when to clear the Outstanding Notification bit. Actually, I prefer your suggestion to clear it before sync_pir_irr, but Marcelo prefers to clear the ON bit after disabling IRQs.
> > > Marcelo what is the problem with the logic above?
> > > 
> > Just to clarify, the advantages that I see are: one less callback, no
> > need to sync PIR to IRR on each event, and, arguably, slightly
> > simpler logic.
> 
> See the previous argument: should never enter guest mode with PIR ON bit
> set. With logic above:  
> 
> context1				context2                      context3
> 					set_bit(PIR-1)                   
> 					r = pi_test_and_set_on()	set_bit(PIR-40)
> 					set_bit(KVM_REQ_EVENT)
> if (kvm_check_request(KVM_REQ_EVENT)	
>  if (test_and_clear_bit(on))
> 
>    kvm_apic_update_irr()						r = pi_test_and_set_on()
> 
> guest entry with PIR ON=1
> 
> 
But that's OK, since context3 will send an IPI and the PIR will be processed by
the CPU. Let me amend my statement: we should not enter guest mode with the ON
bit set and no PI IPI pending in the LAPIC.

> That's the reason for unconditional clearing on guest entry: it is easy
> to verify it's correct. I understand and agree the callback (and VMWRITE)
> is not nice.

--
			Gleb.


* Re: [PATCH v6 5/5] KVM : VMX: Use posted interrupt to deliver virtual interrupt
  2013-03-19 15:27                     ` Marcelo Tosatti
@ 2013-03-19 16:10                       ` Gleb Natapov
  2013-03-20 11:47                         ` Zhang, Yang Z
  0 siblings, 1 reply; 31+ messages in thread
From: Gleb Natapov @ 2013-03-19 16:10 UTC (permalink / raw)
  To: Marcelo Tosatti; +Cc: Zhang, Yang Z, kvm, Zhang, Xiantao

On Tue, Mar 19, 2013 at 12:27:38PM -0300, Marcelo Tosatti wrote:
> On Tue, Mar 19, 2013 at 12:19:55PM -0300, Marcelo Tosatti wrote:
> > See the previous argument: should never enter guest mode with PIR ON bit
> > set. With logic above:  
> > 
> > context1				context2                      context3
> > 					set_bit(PIR-1)                   
> > 					r = pi_test_and_set_on()	set_bit(PIR-40)
> > 					set_bit(KVM_REQ_EVENT)
> > if (kvm_check_request(KVM_REQ_EVENT)	
> >  if (test_and_clear_bit(on))
> > 
> >    kvm_apic_update_irr()						r = pi_test_and_set_on()
> > 
> > guest entry with PIR ON=1
> > 
> > 
> > Thats the reason for unconditional clearing on guest entry: it is easy
> > to verify its correct. I understand and agree the callback (and VMWRITE)
> > is not nice.
> 
> Re: KVM_REQ_EVENT setting after set_bit(KVM_REQ_EVENT) assures no guest
> entry with PIR ON=1.
> 
> Might be, would have to verify. Its trickier though. Maybe add a FIXME:
> to the callback and remove it later.
We still have time; the RTC series is not ready yet. I'll think hard and try
to poke holes in the logic in this patch, and you do the same for what I
propose.

--
			Gleb.


* RE: [PATCH v6 5/5] KVM : VMX: Use posted interrupt to deliver virtual interrupt
  2013-03-19 16:10                       ` Gleb Natapov
@ 2013-03-20 11:47                         ` Zhang, Yang Z
  2013-03-20 11:49                           ` Gleb Natapov
  0 siblings, 1 reply; 31+ messages in thread
From: Zhang, Yang Z @ 2013-03-20 11:47 UTC (permalink / raw)
  To: Gleb Natapov, Marcelo Tosatti; +Cc: kvm, Zhang, Xiantao

Gleb Natapov wrote on 2013-03-20:
> On Tue, Mar 19, 2013 at 12:27:38PM -0300, Marcelo Tosatti wrote:
>> On Tue, Mar 19, 2013 at 12:19:55PM -0300, Marcelo Tosatti wrote:
>>> See the previous argument: should never enter guest mode with PIR ON bit
>>> set. With logic above:
>>> 
>>> context1				context2                      context3
>>> 					set_bit(PIR-1)
>>> 					r = pi_test_and_set_on()	set_bit(PIR-40)
>>> 					set_bit(KVM_REQ_EVENT)
>>> if (kvm_check_request(KVM_REQ_EVENT)
>>>  if (test_and_clear_bit(on))
>>>    kvm_apic_update_irr()						r =
> pi_test_and_set_on()
>>> 
>>> guest entry with PIR ON=1
>>> 
>>> 
>>> Thats the reason for unconditional clearing on guest entry: it is easy
>>> to verify its correct. I understand and agree the callback (and VMWRITE)
>>> is not nice.
>> 
>> Re: KVM_REQ_EVENT setting after set_bit(KVM_REQ_EVENT) assures no guest
>> entry with PIR ON=1.
>> 
>> Might be, would have to verify. Its trickier though. Maybe add a FIXME:
>> to the callback and remove it later.
> We have time still. RTC series is not ready yet. I'll think hard and try
> to poke holes in the logic in this patch and you do the same for what I
> propose.
Any thoughts? As far as I can see, both solutions are OK, and it's hard to say which is better. But clearing the ON bit when syncing PIR to IRR seems clearer and closer to the hardware's behavior.

Best regards,
Yang




* Re: [PATCH v6 5/5] KVM : VMX: Use posted interrupt to deliver virtual interrupt
  2013-03-20 11:47                         ` Zhang, Yang Z
@ 2013-03-20 11:49                           ` Gleb Natapov
  2013-03-20 11:52                             ` Zhang, Yang Z
  0 siblings, 1 reply; 31+ messages in thread
From: Gleb Natapov @ 2013-03-20 11:49 UTC (permalink / raw)
  To: Zhang, Yang Z; +Cc: Marcelo Tosatti, kvm, Zhang, Xiantao

On Wed, Mar 20, 2013 at 11:47:49AM +0000, Zhang, Yang Z wrote:
> Gleb Natapov wrote on 2013-03-20:
> > On Tue, Mar 19, 2013 at 12:27:38PM -0300, Marcelo Tosatti wrote:
> >> On Tue, Mar 19, 2013 at 12:19:55PM -0300, Marcelo Tosatti wrote:
> >>> See the previous argument: should never enter guest mode with PIR ON bit
> >>> set. With logic above:
> >>> 
> >>> context1				context2                      context3
> >>> 					set_bit(PIR-1)
> >>> 					r = pi_test_and_set_on()	set_bit(PIR-40)
> >>> 					set_bit(KVM_REQ_EVENT)
> >>> if (kvm_check_request(KVM_REQ_EVENT)
> >>>  if (test_and_clear_bit(on))
> >>>    kvm_apic_update_irr()						r =
> > pi_test_and_set_on()
> >>> 
> >>> guest entry with PIR ON=1
> >>> 
> >>> 
> >>> Thats the reason for unconditional clearing on guest entry: it is easy
> >>> to verify its correct. I understand and agree the callback (and VMWRITE)
> >>> is not nice.
> >> 
> >> Re: KVM_REQ_EVENT setting after set_bit(KVM_REQ_EVENT) assures no guest
> >> entry with PIR ON=1.
> >> 
> >> Might be, would have to verify. Its trickier though. Maybe add a FIXME:
> >> to the callback and remove it later.
> > We have time still. RTC series is not ready yet. I'll think hard and try
> > to poke holes in the logic in this patch and you do the same for what I
> > propose.
> Any thoughts? As far as I can see, both solutions are OK, and it's hard to say which is better. But clearing the ON bit when syncing PIR to IRR seems clearer and closer to the hardware's behavior.
> 
Let's go with it unless we see why it will not work.

--
			Gleb.


* RE: [PATCH v6 5/5] KVM : VMX: Use posted interrupt to deliver virtual interrupt
  2013-03-20 11:49                           ` Gleb Natapov
@ 2013-03-20 11:52                             ` Zhang, Yang Z
  0 siblings, 0 replies; 31+ messages in thread
From: Zhang, Yang Z @ 2013-03-20 11:52 UTC (permalink / raw)
  To: Gleb Natapov; +Cc: Marcelo Tosatti, kvm, Zhang, Xiantao

Gleb Natapov wrote on 2013-03-20:
> On Wed, Mar 20, 2013 at 11:47:49AM +0000, Zhang, Yang Z wrote:
>> Gleb Natapov wrote on 2013-03-20:
>>> On Tue, Mar 19, 2013 at 12:27:38PM -0300, Marcelo Tosatti wrote:
>>>> On Tue, Mar 19, 2013 at 12:19:55PM -0300, Marcelo Tosatti wrote:
>>>>> See the previous argument: should never enter guest mode with PIR ON
>>>>> bit set. With logic above:
>>>>> 
>>>>> context1				context2                      context3
>>>>> 					set_bit(PIR-1)
>>>>> 					r = pi_test_and_set_on()	set_bit(PIR-40)
>>>>> 					set_bit(KVM_REQ_EVENT)
>>>>> if (kvm_check_request(KVM_REQ_EVENT)
>>>>>  if (test_and_clear_bit(on))
>>>>>    kvm_apic_update_irr()						r =
>>> pi_test_and_set_on()
>>>>> 
>>>>> guest entry with PIR ON=1
>>>>> 
>>>>> 
>>>>> Thats the reason for unconditional clearing on guest entry: it is easy
>>>>> to verify its correct. I understand and agree the callback (and VMWRITE)
>>>>> is not nice.
>>>> 
>>>> Re: KVM_REQ_EVENT setting after set_bit(KVM_REQ_EVENT) assures no
>>>> guest entry with PIR ON=1.
>>>> 
>>>> Might be, would have to verify. Its trickier though. Maybe add a FIXME:
>>>> to the callback and remove it later.
>>> We have time still. RTC series is not ready yet. I'll think hard and try
>>> to poke holes in the logic in this patch and you do the same for what I
>>> propose.
>> Any thought? As far as I see, the two solutions are ok. It's hard to say which is
> better. But clear ON bit when sync_pir_irr should be more clear and close to
> hardware's behavior.
>> 
> Let's go with it unless we see why it will not work.
Sure.

Best regards,
Yang




end of thread, other threads:[~2013-03-20 11:53 UTC | newest]

Thread overview: 31+ messages
2013-03-15 13:31 [PATCH v6 0/5] KVM: VMX: Add Posted Interrupt supporting Yang Zhang
2013-03-15 13:31 ` [PATCH v6 1/5] KVM: VMX: Enable acknowledge interupt on vmexit Yang Zhang
2013-03-15 13:31 ` [PATCH v6 2/5] KVM: VMX: Register a new IPI for posted interrupt Yang Zhang
2013-03-15 13:31 ` [PATCH v6 3/5] KVM: VMX: Check the posted interrupt capability Yang Zhang
2013-03-15 13:31 ` [PATCH v6 4/5] KVM: VMX: Add the algorithm of deliver posted interrupt Yang Zhang
2013-03-15 13:31 ` [PATCH v6 5/5] KVM : VMX: Use posted interrupt to deliver virtual interrupt Yang Zhang
2013-03-19  8:54   ` Gleb Natapov
2013-03-19 12:11     ` Zhang, Yang Z
2013-03-19 12:23       ` Gleb Natapov
2013-03-19 12:42         ` Zhang, Yang Z
2013-03-19 13:29           ` Gleb Natapov
2013-03-19 13:59             ` Zhang, Yang Z
2013-03-19 14:51               ` Gleb Natapov
2013-03-19 15:12                 ` Gleb Natapov
2013-03-19 15:19                   ` Marcelo Tosatti
2013-03-19 15:27                     ` Marcelo Tosatti
2013-03-19 16:10                       ` Gleb Natapov
2013-03-20 11:47                         ` Zhang, Yang Z
2013-03-20 11:49                           ` Gleb Natapov
2013-03-20 11:52                             ` Zhang, Yang Z
2013-03-19 15:30                     ` Gleb Natapov
2013-03-19 15:13             ` Marcelo Tosatti
2013-03-19 15:21               ` Gleb Natapov
2013-03-19 15:03         ` Marcelo Tosatti
2013-03-19 15:18           ` Gleb Natapov
2013-03-18  2:49 ` [PATCH v6 0/5] KVM: VMX: Add Posted Interrupt supporting Zhang, Yang Z
2013-03-18  9:16   ` Gleb Natapov
2013-03-18 10:43     ` Zhang, Yang Z
2013-03-18 11:28       ` Gleb Natapov
2013-03-18 11:44         ` Zhang, Yang Z
2013-03-18 22:20 ` Marcelo Tosatti
