All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v4 0/5] KVM: Hyper-V synthetic interrupt controller
@ 2015-11-10 12:36 ` Andrey Smetanin
  0 siblings, 0 replies; 38+ messages in thread
From: Andrey Smetanin @ 2015-11-10 12:36 UTC (permalink / raw)
  To: kvm; +Cc: Denis V. Lunev, Gleb Natapov, Paolo Bonzini, Roman Kagan, qemu-devel

This patchset implements the KVM part of the synthetic interrupt
controller (SynIC) which is a building block of the Hyper-V
paravirtualized device bus (vmbus).

SynIC is a lapic extension, which is controlled via MSRs and maintains
for each vCPU
 - 16 synthetic interrupt "lines" (SINT's); each can be configured to
   trigger a specific interrupt vector optionally with auto-EOI
   semantics
 - a message page in the guest memory with 16 256-byte per-SINT message
   slots
 - an event flag page in the guest memory with 16 2048-bit per-SINT
   event flag areas

The host triggers a SINT whenever it delivers a new message to the
corresponding slot or flips an event flag bit in the corresponding area.
The guest informs the host that it can try delivering a message by
explicitly asserting EOI in lapic or writing to End-Of-Message (EOM)
MSR.

The userspace (qemu) triggers interrupts and receives EOM notifications
via irqfd with resampler; for that, a GSI is allocated for each
configured SINT, and irq_routing api is extended to support GSI-SINT
mapping.

Besides, a new vcpu exit is introduced to notify the userspace of the
changes in SynIC configuraion triggered by guest writing to the
corresponding MSRs.

Since auto-EOI behavior of SynIC cannot be made compatible with APIC
hardware virtualization, the latter is disabled using a newly
introduced flag, when SynIC is activated.

This patches seria has been tested by running of kvm-unit-tests
(which also includes previosly sent 'hyperv_synic' test) with
host CPU which supports APICv (Intel(R) Xeon(R) CPU E5-2407 v2 @ 2.40GHz)

Signed-off-by: Andrey Smetanin <asmetanin@virtuozzo.com>
Reviewed-by: Roman Kagan <rkagan@virtuozzo.com>
Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Gleb Natapov <gleb@kernel.org>
CC: Paolo Bonzini <pbonzini@redhat.com>
CC: Roman Kagan <rkagan@virtuozzo.com>
CC: Denis V. Lunev <den@openvz.org>
CC: qemu-devel@nongnu.org

Changes v4:
* disable APICv in case Hyper-V SynIC enabled
* patchset rebase into latest kvm/queue (10 Nov 2015)
* do Hyper-V SynIC exit only at !host(guest) msr's writes

Changes v3:
* Hyper-V SynIC KVM API documentation fixes

Changes v2:
* irqchip/eventfd preparation improvements to support
arch specific routing entries like Hyper-V SynIC.
* add Hyper-V SynIC vectors into EOI exit bitmap.
* do not use posted interrupts in case of Hyper-V SynIC
AutoEOI vectors

Andrey Smetanin (5):
  kvm/irqchip: kvm_arch_irq_routing_update renaming split
  kvm/x86: split ioapic-handled and EOI exit bitmaps
  kvm/x86: per-vcpu apicv deactivation support
  kvm/x86: Hyper-V synthetic interrupt controller
  kvm/x86: Hyper-V kvm exit

 Documentation/virtual/kvm/api.txt |  41 +++++
 arch/x86/include/asm/kvm_host.h   |  26 ++-
 arch/x86/kvm/hyperv.c             | 335 ++++++++++++++++++++++++++++++++++++++
 arch/x86/kvm/hyperv.h             |  23 +++
 arch/x86/kvm/ioapic.c             |   4 +-
 arch/x86/kvm/ioapic.h             |   7 +-
 arch/x86/kvm/irq.c                |   2 +-
 arch/x86/kvm/irq_comm.c           |  41 ++++-
 arch/x86/kvm/lapic.c              |  40 +++--
 arch/x86/kvm/lapic.h              |   9 +-
 arch/x86/kvm/svm.c                |  13 +-
 arch/x86/kvm/vmx.c                |  48 +++---
 arch/x86/kvm/x86.c                |  66 +++++++-
 include/linux/kvm_host.h          |  12 +-
 include/uapi/linux/kvm.h          |  25 +++
 virt/kvm/irqchip.c                |   7 +-
 16 files changed, 625 insertions(+), 74 deletions(-)

-- 
2.4.3


^ permalink raw reply	[flat|nested] 38+ messages in thread

* [Qemu-devel] [PATCH v4 0/5] KVM: Hyper-V synthetic interrupt controller
@ 2015-11-10 12:36 ` Andrey Smetanin
  0 siblings, 0 replies; 38+ messages in thread
From: Andrey Smetanin @ 2015-11-10 12:36 UTC (permalink / raw)
  To: kvm; +Cc: Gleb Natapov, Denis V. Lunev, qemu-devel, Roman Kagan, Paolo Bonzini

This patchset implements the KVM part of the synthetic interrupt
controller (SynIC) which is a building block of the Hyper-V
paravirtualized device bus (vmbus).

SynIC is a lapic extension, which is controlled via MSRs and maintains
for each vCPU
 - 16 synthetic interrupt "lines" (SINT's); each can be configured to
   trigger a specific interrupt vector optionally with auto-EOI
   semantics
 - a message page in the guest memory with 16 256-byte per-SINT message
   slots
 - an event flag page in the guest memory with 16 2048-bit per-SINT
   event flag areas

The host triggers a SINT whenever it delivers a new message to the
corresponding slot or flips an event flag bit in the corresponding area.
The guest informs the host that it can try delivering a message by
explicitly asserting EOI in lapic or writing to End-Of-Message (EOM)
MSR.

The userspace (qemu) triggers interrupts and receives EOM notifications
via irqfd with resampler; for that, a GSI is allocated for each
configured SINT, and irq_routing api is extended to support GSI-SINT
mapping.

Besides, a new vcpu exit is introduced to notify the userspace of the
changes in SynIC configuraion triggered by guest writing to the
corresponding MSRs.

Since auto-EOI behavior of SynIC cannot be made compatible with APIC
hardware virtualization, the latter is disabled using a newly
introduced flag, when SynIC is activated.

This patches seria has been tested by running of kvm-unit-tests
(which also includes previosly sent 'hyperv_synic' test) with
host CPU which supports APICv (Intel(R) Xeon(R) CPU E5-2407 v2 @ 2.40GHz)

Signed-off-by: Andrey Smetanin <asmetanin@virtuozzo.com>
Reviewed-by: Roman Kagan <rkagan@virtuozzo.com>
Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Gleb Natapov <gleb@kernel.org>
CC: Paolo Bonzini <pbonzini@redhat.com>
CC: Roman Kagan <rkagan@virtuozzo.com>
CC: Denis V. Lunev <den@openvz.org>
CC: qemu-devel@nongnu.org

Changes v4:
* disable APICv in case Hyper-V SynIC enabled
* patchset rebase into latest kvm/queue (10 Nov 2015)
* do Hyper-V SynIC exit only at !host(guest) msr's writes

Changes v3:
* Hyper-V SynIC KVM API documentation fixes

Changes v2:
* irqchip/eventfd preparation improvements to support
arch specific routing entries like Hyper-V SynIC.
* add Hyper-V SynIC vectors into EOI exit bitmap.
* do not use posted interrupts in case of Hyper-V SynIC
AutoEOI vectors

Andrey Smetanin (5):
  kvm/irqchip: kvm_arch_irq_routing_update renaming split
  kvm/x86: split ioapic-handled and EOI exit bitmaps
  kvm/x86: per-vcpu apicv deactivation support
  kvm/x86: Hyper-V synthetic interrupt controller
  kvm/x86: Hyper-V kvm exit

 Documentation/virtual/kvm/api.txt |  41 +++++
 arch/x86/include/asm/kvm_host.h   |  26 ++-
 arch/x86/kvm/hyperv.c             | 335 ++++++++++++++++++++++++++++++++++++++
 arch/x86/kvm/hyperv.h             |  23 +++
 arch/x86/kvm/ioapic.c             |   4 +-
 arch/x86/kvm/ioapic.h             |   7 +-
 arch/x86/kvm/irq.c                |   2 +-
 arch/x86/kvm/irq_comm.c           |  41 ++++-
 arch/x86/kvm/lapic.c              |  40 +++--
 arch/x86/kvm/lapic.h              |   9 +-
 arch/x86/kvm/svm.c                |  13 +-
 arch/x86/kvm/vmx.c                |  48 +++---
 arch/x86/kvm/x86.c                |  66 +++++++-
 include/linux/kvm_host.h          |  12 +-
 include/uapi/linux/kvm.h          |  25 +++
 virt/kvm/irqchip.c                |   7 +-
 16 files changed, 625 insertions(+), 74 deletions(-)

-- 
2.4.3

^ permalink raw reply	[flat|nested] 38+ messages in thread

* [PATCH v4 1/5] kvm/irqchip: kvm_arch_irq_routing_update renaming split
  2015-11-10 12:36 ` [Qemu-devel] " Andrey Smetanin
@ 2015-11-10 12:36   ` Andrey Smetanin
  -1 siblings, 0 replies; 38+ messages in thread
From: Andrey Smetanin @ 2015-11-10 12:36 UTC (permalink / raw)
  To: kvm; +Cc: Denis V. Lunev, Gleb Natapov, Paolo Bonzini, Roman Kagan, qemu-devel

Actually kvm_arch_irq_routing_update() should be
kvm_arch_post_irq_routing_update() as it's called at the end
of irq routing update.

This renaming frees kvm_arch_irq_routing_update function name.
kvm_arch_irq_routing_update() weak function which will be used
to update mappings for arch-specific irq routing entries
(in particular, the upcoming Hyper-V synthetic interrupts).

Signed-off-by: Andrey Smetanin <asmetanin@virtuozzo.com>
Reviewed-by: Roman Kagan <rkagan@virtuozzo.com>
Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Gleb Natapov <gleb@kernel.org>
CC: Paolo Bonzini <pbonzini@redhat.com>
CC: Roman Kagan <rkagan@virtuozzo.com>
CC: Denis V. Lunev <den@openvz.org>
CC: qemu-devel@nongnu.org

---
 arch/x86/kvm/irq_comm.c  | 2 +-
 include/linux/kvm_host.h | 5 +++--
 virt/kvm/irqchip.c       | 7 ++++++-
 3 files changed, 10 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kvm/irq_comm.c b/arch/x86/kvm/irq_comm.c
index 84b96d3..e39768c 100644
--- a/arch/x86/kvm/irq_comm.c
+++ b/arch/x86/kvm/irq_comm.c
@@ -332,7 +332,7 @@ int kvm_setup_empty_irq_routing(struct kvm *kvm)
 	return kvm_set_irq_routing(kvm, empty_routing, 0, 0);
 }
 
-void kvm_arch_irq_routing_update(struct kvm *kvm)
+void kvm_arch_post_irq_routing_update(struct kvm *kvm)
 {
 	if (ioapic_in_kernel(kvm) || !irqchip_in_kernel(kvm))
 		return;
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 242a6d2..dbe2a2f 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -473,12 +473,12 @@ void vcpu_put(struct kvm_vcpu *vcpu);
 
 #ifdef __KVM_HAVE_IOAPIC
 void kvm_vcpu_request_scan_ioapic(struct kvm *kvm);
-void kvm_arch_irq_routing_update(struct kvm *kvm);
+void kvm_arch_post_irq_routing_update(struct kvm *kvm);
 #else
 static inline void kvm_vcpu_request_scan_ioapic(struct kvm *kvm)
 {
 }
-static inline void kvm_arch_irq_routing_update(struct kvm *kvm)
+static inline void kvm_arch_post_irq_routing_update(struct kvm *kvm)
 {
 }
 #endif
@@ -1080,6 +1080,7 @@ static inline void kvm_irq_routing_update(struct kvm *kvm)
 {
 }
 #endif
+void kvm_arch_irq_routing_update(struct kvm *kvm);
 
 static inline int kvm_ioeventfd(struct kvm *kvm, struct kvm_ioeventfd *args)
 {
diff --git a/virt/kvm/irqchip.c b/virt/kvm/irqchip.c
index f0b08a2..fe84e1a 100644
--- a/virt/kvm/irqchip.c
+++ b/virt/kvm/irqchip.c
@@ -166,6 +166,10 @@ out:
 	return r;
 }
 
+void __attribute__((weak)) kvm_arch_irq_routing_update(struct kvm *kvm)
+{
+}
+
 int kvm_set_irq_routing(struct kvm *kvm,
 			const struct kvm_irq_routing_entry *ue,
 			unsigned nr,
@@ -219,9 +223,10 @@ int kvm_set_irq_routing(struct kvm *kvm,
 	old = kvm->irq_routing;
 	rcu_assign_pointer(kvm->irq_routing, new);
 	kvm_irq_routing_update(kvm);
+	kvm_arch_irq_routing_update(kvm);
 	mutex_unlock(&kvm->irq_lock);
 
-	kvm_arch_irq_routing_update(kvm);
+	kvm_arch_post_irq_routing_update(kvm);
 
 	synchronize_srcu_expedited(&kvm->irq_srcu);
 
-- 
2.4.3


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [Qemu-devel] [PATCH v4 1/5] kvm/irqchip: kvm_arch_irq_routing_update renaming split
@ 2015-11-10 12:36   ` Andrey Smetanin
  0 siblings, 0 replies; 38+ messages in thread
From: Andrey Smetanin @ 2015-11-10 12:36 UTC (permalink / raw)
  To: kvm; +Cc: Gleb Natapov, Denis V. Lunev, qemu-devel, Roman Kagan, Paolo Bonzini

Actually kvm_arch_irq_routing_update() should be
kvm_arch_post_irq_routing_update() as it's called at the end
of irq routing update.

This renaming frees kvm_arch_irq_routing_update function name.
kvm_arch_irq_routing_update() weak function which will be used
to update mappings for arch-specific irq routing entries
(in particular, the upcoming Hyper-V synthetic interrupts).

Signed-off-by: Andrey Smetanin <asmetanin@virtuozzo.com>
Reviewed-by: Roman Kagan <rkagan@virtuozzo.com>
Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Gleb Natapov <gleb@kernel.org>
CC: Paolo Bonzini <pbonzini@redhat.com>
CC: Roman Kagan <rkagan@virtuozzo.com>
CC: Denis V. Lunev <den@openvz.org>
CC: qemu-devel@nongnu.org

---
 arch/x86/kvm/irq_comm.c  | 2 +-
 include/linux/kvm_host.h | 5 +++--
 virt/kvm/irqchip.c       | 7 ++++++-
 3 files changed, 10 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kvm/irq_comm.c b/arch/x86/kvm/irq_comm.c
index 84b96d3..e39768c 100644
--- a/arch/x86/kvm/irq_comm.c
+++ b/arch/x86/kvm/irq_comm.c
@@ -332,7 +332,7 @@ int kvm_setup_empty_irq_routing(struct kvm *kvm)
 	return kvm_set_irq_routing(kvm, empty_routing, 0, 0);
 }
 
-void kvm_arch_irq_routing_update(struct kvm *kvm)
+void kvm_arch_post_irq_routing_update(struct kvm *kvm)
 {
 	if (ioapic_in_kernel(kvm) || !irqchip_in_kernel(kvm))
 		return;
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 242a6d2..dbe2a2f 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -473,12 +473,12 @@ void vcpu_put(struct kvm_vcpu *vcpu);
 
 #ifdef __KVM_HAVE_IOAPIC
 void kvm_vcpu_request_scan_ioapic(struct kvm *kvm);
-void kvm_arch_irq_routing_update(struct kvm *kvm);
+void kvm_arch_post_irq_routing_update(struct kvm *kvm);
 #else
 static inline void kvm_vcpu_request_scan_ioapic(struct kvm *kvm)
 {
 }
-static inline void kvm_arch_irq_routing_update(struct kvm *kvm)
+static inline void kvm_arch_post_irq_routing_update(struct kvm *kvm)
 {
 }
 #endif
@@ -1080,6 +1080,7 @@ static inline void kvm_irq_routing_update(struct kvm *kvm)
 {
 }
 #endif
+void kvm_arch_irq_routing_update(struct kvm *kvm);
 
 static inline int kvm_ioeventfd(struct kvm *kvm, struct kvm_ioeventfd *args)
 {
diff --git a/virt/kvm/irqchip.c b/virt/kvm/irqchip.c
index f0b08a2..fe84e1a 100644
--- a/virt/kvm/irqchip.c
+++ b/virt/kvm/irqchip.c
@@ -166,6 +166,10 @@ out:
 	return r;
 }
 
+void __attribute__((weak)) kvm_arch_irq_routing_update(struct kvm *kvm)
+{
+}
+
 int kvm_set_irq_routing(struct kvm *kvm,
 			const struct kvm_irq_routing_entry *ue,
 			unsigned nr,
@@ -219,9 +223,10 @@ int kvm_set_irq_routing(struct kvm *kvm,
 	old = kvm->irq_routing;
 	rcu_assign_pointer(kvm->irq_routing, new);
 	kvm_irq_routing_update(kvm);
+	kvm_arch_irq_routing_update(kvm);
 	mutex_unlock(&kvm->irq_lock);
 
-	kvm_arch_irq_routing_update(kvm);
+	kvm_arch_post_irq_routing_update(kvm);
 
 	synchronize_srcu_expedited(&kvm->irq_srcu);
 
-- 
2.4.3

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH v4 2/5] kvm/x86: split ioapic-handled and EOI exit bitmaps
  2015-11-10 12:36 ` [Qemu-devel] " Andrey Smetanin
@ 2015-11-10 12:36   ` Andrey Smetanin
  -1 siblings, 0 replies; 38+ messages in thread
From: Andrey Smetanin @ 2015-11-10 12:36 UTC (permalink / raw)
  To: kvm; +Cc: Denis V. Lunev, Gleb Natapov, Paolo Bonzini, Roman Kagan, qemu-devel

The function to determine if the vector is handled by ioapic used to
rely on the fact that only ioapic-handled vectors were set up to
cause vmexits when virtual apic was in use.

We're going to break this assumption when introducing Hyper-V
synthetic interrupts: they may need to cause vmexits too.

To achieve that, introduce a new bitmap dedicated specifically for
ioapic-handled vectors, and populate EOI exit bitmap from it for now.

Signed-off-by: Andrey Smetanin <asmetanin@virtuozzo.com>
Reviewed-by: Roman Kagan <rkagan@virtuozzo.com>
Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Gleb Natapov <gleb@kernel.org>
CC: Paolo Bonzini <pbonzini@redhat.com>
CC: Roman Kagan <rkagan@virtuozzo.com>
CC: Denis V. Lunev <den@openvz.org>
CC: qemu-devel@nongnu.org

---
 arch/x86/include/asm/kvm_host.h |  4 ++--
 arch/x86/kvm/ioapic.c           |  4 ++--
 arch/x86/kvm/ioapic.h           |  7 ++++---
 arch/x86/kvm/irq_comm.c         |  5 +++--
 arch/x86/kvm/lapic.c            |  2 +-
 arch/x86/kvm/svm.c              |  2 +-
 arch/x86/kvm/vmx.c              |  3 +--
 arch/x86/kvm/x86.c              | 11 ++++++-----
 8 files changed, 20 insertions(+), 18 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 9265196..d51a7e1d 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -400,7 +400,7 @@ struct kvm_vcpu_arch {
 	u64 efer;
 	u64 apic_base;
 	struct kvm_lapic *apic;    /* kernel irqchip context */
-	u64 eoi_exit_bitmap[4];
+	DECLARE_BITMAP(ioapic_handled_vectors, 256);
 	unsigned long apic_attention;
 	int32_t apic_arb_prio;
 	int mp_state;
@@ -833,7 +833,7 @@ struct kvm_x86_ops {
 	int (*cpu_uses_apicv)(struct kvm_vcpu *vcpu);
 	void (*hwapic_irr_update)(struct kvm_vcpu *vcpu, int max_irr);
 	void (*hwapic_isr_update)(struct kvm *kvm, int isr);
-	void (*load_eoi_exitmap)(struct kvm_vcpu *vcpu);
+	void (*load_eoi_exitmap)(struct kvm_vcpu *vcpu, u64 *eoi_exit_bitmap);
 	void (*set_virtual_x2apic_mode)(struct kvm_vcpu *vcpu, bool set);
 	void (*set_apic_access_page_addr)(struct kvm_vcpu *vcpu, hpa_t hpa);
 	void (*deliver_posted_interrupt)(struct kvm_vcpu *vcpu, int vector);
diff --git a/arch/x86/kvm/ioapic.c b/arch/x86/kvm/ioapic.c
index 88d0a92..1facfd6 100644
--- a/arch/x86/kvm/ioapic.c
+++ b/arch/x86/kvm/ioapic.c
@@ -233,7 +233,7 @@ static void kvm_ioapic_inject_all(struct kvm_ioapic *ioapic, unsigned long irr)
 }
 
 
-void kvm_ioapic_scan_entry(struct kvm_vcpu *vcpu, u64 *eoi_exit_bitmap)
+void kvm_ioapic_scan_entry(struct kvm_vcpu *vcpu, ulong *ioapic_handled_vectors)
 {
 	struct kvm_ioapic *ioapic = vcpu->kvm->arch.vioapic;
 	union kvm_ioapic_redirect_entry *e;
@@ -250,7 +250,7 @@ void kvm_ioapic_scan_entry(struct kvm_vcpu *vcpu, u64 *eoi_exit_bitmap)
 			    (e->fields.trig_mode == IOAPIC_EDGE_TRIG &&
 			     kvm_apic_pending_eoi(vcpu, e->fields.vector)))
 				__set_bit(e->fields.vector,
-					(unsigned long *)eoi_exit_bitmap);
+					  ioapic_handled_vectors);
 		}
 	}
 	spin_unlock(&ioapic->lock);
diff --git a/arch/x86/kvm/ioapic.h b/arch/x86/kvm/ioapic.h
index 084617d..2d16dc2 100644
--- a/arch/x86/kvm/ioapic.h
+++ b/arch/x86/kvm/ioapic.h
@@ -121,7 +121,8 @@ int kvm_irq_delivery_to_apic(struct kvm *kvm, struct kvm_lapic *src,
 		struct kvm_lapic_irq *irq, unsigned long *dest_map);
 int kvm_get_ioapic(struct kvm *kvm, struct kvm_ioapic_state *state);
 int kvm_set_ioapic(struct kvm *kvm, struct kvm_ioapic_state *state);
-void kvm_ioapic_scan_entry(struct kvm_vcpu *vcpu, u64 *eoi_exit_bitmap);
-void kvm_scan_ioapic_routes(struct kvm_vcpu *vcpu, u64 *eoi_exit_bitmap);
-
+void kvm_ioapic_scan_entry(struct kvm_vcpu *vcpu,
+			   ulong *ioapic_handled_vectors);
+void kvm_scan_ioapic_routes(struct kvm_vcpu *vcpu,
+			    ulong *ioapic_handled_vectors);
 #endif
diff --git a/arch/x86/kvm/irq_comm.c b/arch/x86/kvm/irq_comm.c
index e39768c..ece901c 100644
--- a/arch/x86/kvm/irq_comm.c
+++ b/arch/x86/kvm/irq_comm.c
@@ -339,7 +339,8 @@ void kvm_arch_post_irq_routing_update(struct kvm *kvm)
 	kvm_make_scan_ioapic_request(kvm);
 }
 
-void kvm_scan_ioapic_routes(struct kvm_vcpu *vcpu, u64 *eoi_exit_bitmap)
+void kvm_scan_ioapic_routes(struct kvm_vcpu *vcpu,
+			    ulong *ioapic_handled_vectors)
 {
 	struct kvm *kvm = vcpu->kvm;
 	struct kvm_kernel_irq_routing_entry *entry;
@@ -369,7 +370,7 @@ void kvm_scan_ioapic_routes(struct kvm_vcpu *vcpu, u64 *eoi_exit_bitmap)
 				u32 vector = entry->msi.data & 0xff;
 
 				__set_bit(vector,
-					  (unsigned long *) eoi_exit_bitmap);
+					  ioapic_handled_vectors);
 			}
 		}
 	}
diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index ecd4ea1..b14436d 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -932,7 +932,7 @@ int kvm_apic_compare_prio(struct kvm_vcpu *vcpu1, struct kvm_vcpu *vcpu2)
 
 static bool kvm_ioapic_handles_vector(struct kvm_lapic *apic, int vector)
 {
-	return test_bit(vector, (ulong *)apic->vcpu->arch.eoi_exit_bitmap);
+	return test_bit(vector, apic->vcpu->arch.ioapic_handled_vectors);
 }
 
 static void kvm_ioapic_send_eoi(struct kvm_lapic *apic, int vector)
diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index f2c8e49..38d3adf 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -3655,7 +3655,7 @@ static int svm_cpu_uses_apicv(struct kvm_vcpu *vcpu)
 	return 0;
 }
 
-static void svm_load_eoi_exitmap(struct kvm_vcpu *vcpu)
+static void svm_load_eoi_exitmap(struct kvm_vcpu *vcpu, u64 *eoi_exit_bitmap)
 {
 	return;
 }
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 5eb56ed..9562542 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -8255,9 +8255,8 @@ static void vmx_hwapic_irr_update(struct kvm_vcpu *vcpu, int max_irr)
 	}
 }
 
-static void vmx_load_eoi_exitmap(struct kvm_vcpu *vcpu)
+static void vmx_load_eoi_exitmap(struct kvm_vcpu *vcpu, u64 *eoi_exit_bitmap)
 {
-	u64 *eoi_exit_bitmap = vcpu->arch.eoi_exit_bitmap;
 	if (!vmx_cpu_uses_apicv(vcpu))
 		return;
 
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 30723a4..4cf2cb9 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -6198,15 +6198,16 @@ static void vcpu_scan_ioapic(struct kvm_vcpu *vcpu)
 	if (!kvm_apic_hw_enabled(vcpu->arch.apic))
 		return;
 
-	memset(vcpu->arch.eoi_exit_bitmap, 0, 256 / 8);
+	bitmap_zero(vcpu->arch.ioapic_handled_vectors, 256);
 
 	if (irqchip_split(vcpu->kvm))
-		kvm_scan_ioapic_routes(vcpu, vcpu->arch.eoi_exit_bitmap);
+		kvm_scan_ioapic_routes(vcpu, vcpu->arch.ioapic_handled_vectors);
 	else {
 		kvm_x86_ops->sync_pir_to_irr(vcpu);
-		kvm_ioapic_scan_entry(vcpu, vcpu->arch.eoi_exit_bitmap);
+		kvm_ioapic_scan_entry(vcpu, vcpu->arch.ioapic_handled_vectors);
 	}
-	kvm_x86_ops->load_eoi_exitmap(vcpu);
+	kvm_x86_ops->load_eoi_exitmap(vcpu,
+				      (u64 *)vcpu->arch.ioapic_handled_vectors);
 }
 
 static void kvm_vcpu_flush_tlb(struct kvm_vcpu *vcpu)
@@ -6312,7 +6313,7 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
 		if (kvm_check_request(KVM_REQ_IOAPIC_EOI_EXIT, vcpu)) {
 			BUG_ON(vcpu->arch.pending_ioapic_eoi > 255);
 			if (test_bit(vcpu->arch.pending_ioapic_eoi,
-				     (void *) vcpu->arch.eoi_exit_bitmap)) {
+				     vcpu->arch.ioapic_handled_vectors)) {
 				vcpu->run->exit_reason = KVM_EXIT_IOAPIC_EOI;
 				vcpu->run->eoi.vector =
 						vcpu->arch.pending_ioapic_eoi;
-- 
2.4.3


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [Qemu-devel] [PATCH v4 2/5] kvm/x86: split ioapic-handled and EOI exit bitmaps
@ 2015-11-10 12:36   ` Andrey Smetanin
  0 siblings, 0 replies; 38+ messages in thread
From: Andrey Smetanin @ 2015-11-10 12:36 UTC (permalink / raw)
  To: kvm; +Cc: Gleb Natapov, Denis V. Lunev, qemu-devel, Roman Kagan, Paolo Bonzini

The function to determine if the vector is handled by ioapic used to
rely on the fact that only ioapic-handled vectors were set up to
cause vmexits when virtual apic was in use.

We're going to break this assumption when introducing Hyper-V
synthetic interrupts: they may need to cause vmexits too.

To achieve that, introduce a new bitmap dedicated specifically for
ioapic-handled vectors, and populate EOI exit bitmap from it for now.

Signed-off-by: Andrey Smetanin <asmetanin@virtuozzo.com>
Reviewed-by: Roman Kagan <rkagan@virtuozzo.com>
Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Gleb Natapov <gleb@kernel.org>
CC: Paolo Bonzini <pbonzini@redhat.com>
CC: Roman Kagan <rkagan@virtuozzo.com>
CC: Denis V. Lunev <den@openvz.org>
CC: qemu-devel@nongnu.org

---
 arch/x86/include/asm/kvm_host.h |  4 ++--
 arch/x86/kvm/ioapic.c           |  4 ++--
 arch/x86/kvm/ioapic.h           |  7 ++++---
 arch/x86/kvm/irq_comm.c         |  5 +++--
 arch/x86/kvm/lapic.c            |  2 +-
 arch/x86/kvm/svm.c              |  2 +-
 arch/x86/kvm/vmx.c              |  3 +--
 arch/x86/kvm/x86.c              | 11 ++++++-----
 8 files changed, 20 insertions(+), 18 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 9265196..d51a7e1d 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -400,7 +400,7 @@ struct kvm_vcpu_arch {
 	u64 efer;
 	u64 apic_base;
 	struct kvm_lapic *apic;    /* kernel irqchip context */
-	u64 eoi_exit_bitmap[4];
+	DECLARE_BITMAP(ioapic_handled_vectors, 256);
 	unsigned long apic_attention;
 	int32_t apic_arb_prio;
 	int mp_state;
@@ -833,7 +833,7 @@ struct kvm_x86_ops {
 	int (*cpu_uses_apicv)(struct kvm_vcpu *vcpu);
 	void (*hwapic_irr_update)(struct kvm_vcpu *vcpu, int max_irr);
 	void (*hwapic_isr_update)(struct kvm *kvm, int isr);
-	void (*load_eoi_exitmap)(struct kvm_vcpu *vcpu);
+	void (*load_eoi_exitmap)(struct kvm_vcpu *vcpu, u64 *eoi_exit_bitmap);
 	void (*set_virtual_x2apic_mode)(struct kvm_vcpu *vcpu, bool set);
 	void (*set_apic_access_page_addr)(struct kvm_vcpu *vcpu, hpa_t hpa);
 	void (*deliver_posted_interrupt)(struct kvm_vcpu *vcpu, int vector);
diff --git a/arch/x86/kvm/ioapic.c b/arch/x86/kvm/ioapic.c
index 88d0a92..1facfd6 100644
--- a/arch/x86/kvm/ioapic.c
+++ b/arch/x86/kvm/ioapic.c
@@ -233,7 +233,7 @@ static void kvm_ioapic_inject_all(struct kvm_ioapic *ioapic, unsigned long irr)
 }
 
 
-void kvm_ioapic_scan_entry(struct kvm_vcpu *vcpu, u64 *eoi_exit_bitmap)
+void kvm_ioapic_scan_entry(struct kvm_vcpu *vcpu, ulong *ioapic_handled_vectors)
 {
 	struct kvm_ioapic *ioapic = vcpu->kvm->arch.vioapic;
 	union kvm_ioapic_redirect_entry *e;
@@ -250,7 +250,7 @@ void kvm_ioapic_scan_entry(struct kvm_vcpu *vcpu, u64 *eoi_exit_bitmap)
 			    (e->fields.trig_mode == IOAPIC_EDGE_TRIG &&
 			     kvm_apic_pending_eoi(vcpu, e->fields.vector)))
 				__set_bit(e->fields.vector,
-					(unsigned long *)eoi_exit_bitmap);
+					  ioapic_handled_vectors);
 		}
 	}
 	spin_unlock(&ioapic->lock);
diff --git a/arch/x86/kvm/ioapic.h b/arch/x86/kvm/ioapic.h
index 084617d..2d16dc2 100644
--- a/arch/x86/kvm/ioapic.h
+++ b/arch/x86/kvm/ioapic.h
@@ -121,7 +121,8 @@ int kvm_irq_delivery_to_apic(struct kvm *kvm, struct kvm_lapic *src,
 		struct kvm_lapic_irq *irq, unsigned long *dest_map);
 int kvm_get_ioapic(struct kvm *kvm, struct kvm_ioapic_state *state);
 int kvm_set_ioapic(struct kvm *kvm, struct kvm_ioapic_state *state);
-void kvm_ioapic_scan_entry(struct kvm_vcpu *vcpu, u64 *eoi_exit_bitmap);
-void kvm_scan_ioapic_routes(struct kvm_vcpu *vcpu, u64 *eoi_exit_bitmap);
-
+void kvm_ioapic_scan_entry(struct kvm_vcpu *vcpu,
+			   ulong *ioapic_handled_vectors);
+void kvm_scan_ioapic_routes(struct kvm_vcpu *vcpu,
+			    ulong *ioapic_handled_vectors);
 #endif
diff --git a/arch/x86/kvm/irq_comm.c b/arch/x86/kvm/irq_comm.c
index e39768c..ece901c 100644
--- a/arch/x86/kvm/irq_comm.c
+++ b/arch/x86/kvm/irq_comm.c
@@ -339,7 +339,8 @@ void kvm_arch_post_irq_routing_update(struct kvm *kvm)
 	kvm_make_scan_ioapic_request(kvm);
 }
 
-void kvm_scan_ioapic_routes(struct kvm_vcpu *vcpu, u64 *eoi_exit_bitmap)
+void kvm_scan_ioapic_routes(struct kvm_vcpu *vcpu,
+			    ulong *ioapic_handled_vectors)
 {
 	struct kvm *kvm = vcpu->kvm;
 	struct kvm_kernel_irq_routing_entry *entry;
@@ -369,7 +370,7 @@ void kvm_scan_ioapic_routes(struct kvm_vcpu *vcpu, u64 *eoi_exit_bitmap)
 				u32 vector = entry->msi.data & 0xff;
 
 				__set_bit(vector,
-					  (unsigned long *) eoi_exit_bitmap);
+					  ioapic_handled_vectors);
 			}
 		}
 	}
diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index ecd4ea1..b14436d 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -932,7 +932,7 @@ int kvm_apic_compare_prio(struct kvm_vcpu *vcpu1, struct kvm_vcpu *vcpu2)
 
 static bool kvm_ioapic_handles_vector(struct kvm_lapic *apic, int vector)
 {
-	return test_bit(vector, (ulong *)apic->vcpu->arch.eoi_exit_bitmap);
+	return test_bit(vector, apic->vcpu->arch.ioapic_handled_vectors);
 }
 
 static void kvm_ioapic_send_eoi(struct kvm_lapic *apic, int vector)
diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index f2c8e49..38d3adf 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -3655,7 +3655,7 @@ static int svm_cpu_uses_apicv(struct kvm_vcpu *vcpu)
 	return 0;
 }
 
-static void svm_load_eoi_exitmap(struct kvm_vcpu *vcpu)
+static void svm_load_eoi_exitmap(struct kvm_vcpu *vcpu, u64 *eoi_exit_bitmap)
 {
 	return;
 }
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 5eb56ed..9562542 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -8255,9 +8255,8 @@ static void vmx_hwapic_irr_update(struct kvm_vcpu *vcpu, int max_irr)
 	}
 }
 
-static void vmx_load_eoi_exitmap(struct kvm_vcpu *vcpu)
+static void vmx_load_eoi_exitmap(struct kvm_vcpu *vcpu, u64 *eoi_exit_bitmap)
 {
-	u64 *eoi_exit_bitmap = vcpu->arch.eoi_exit_bitmap;
 	if (!vmx_cpu_uses_apicv(vcpu))
 		return;
 
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 30723a4..4cf2cb9 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -6198,15 +6198,16 @@ static void vcpu_scan_ioapic(struct kvm_vcpu *vcpu)
 	if (!kvm_apic_hw_enabled(vcpu->arch.apic))
 		return;
 
-	memset(vcpu->arch.eoi_exit_bitmap, 0, 256 / 8);
+	bitmap_zero(vcpu->arch.ioapic_handled_vectors, 256);
 
 	if (irqchip_split(vcpu->kvm))
-		kvm_scan_ioapic_routes(vcpu, vcpu->arch.eoi_exit_bitmap);
+		kvm_scan_ioapic_routes(vcpu, vcpu->arch.ioapic_handled_vectors);
 	else {
 		kvm_x86_ops->sync_pir_to_irr(vcpu);
-		kvm_ioapic_scan_entry(vcpu, vcpu->arch.eoi_exit_bitmap);
+		kvm_ioapic_scan_entry(vcpu, vcpu->arch.ioapic_handled_vectors);
 	}
-	kvm_x86_ops->load_eoi_exitmap(vcpu);
+	kvm_x86_ops->load_eoi_exitmap(vcpu,
+				      (u64 *)vcpu->arch.ioapic_handled_vectors);
 }
 
 static void kvm_vcpu_flush_tlb(struct kvm_vcpu *vcpu)
@@ -6312,7 +6313,7 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
 		if (kvm_check_request(KVM_REQ_IOAPIC_EOI_EXIT, vcpu)) {
 			BUG_ON(vcpu->arch.pending_ioapic_eoi > 255);
 			if (test_bit(vcpu->arch.pending_ioapic_eoi,
-				     (void *) vcpu->arch.eoi_exit_bitmap)) {
+				     vcpu->arch.ioapic_handled_vectors)) {
 				vcpu->run->exit_reason = KVM_EXIT_IOAPIC_EOI;
 				vcpu->run->eoi.vector =
 						vcpu->arch.pending_ioapic_eoi;
-- 
2.4.3

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH v4 3/5] kvm/x86: per-vcpu apicv deactivation support
  2015-11-10 12:36 ` [Qemu-devel] " Andrey Smetanin
@ 2015-11-10 12:36   ` Andrey Smetanin
  -1 siblings, 0 replies; 38+ messages in thread
From: Andrey Smetanin @ 2015-11-10 12:36 UTC (permalink / raw)
  To: kvm; +Cc: Denis V. Lunev, Gleb Natapov, Paolo Bonzini, Roman Kagan, qemu-devel

The decision on whether to use hardware APIC virtualization used to be
taken globally, based on the availability of the feature in the CPU
and the value of a module parameter.

However, under certain circumstances we want to control it on per-vcpu
basis.  In particular, when the userspace activates HyperV synthetic
interrupt controller (SynIC), APICv has to be disabled as it's
incompatible with SynIC auto-EOI behavior.

To achieve that, introduce 'apicv_active' flag on struct
kvm_vcpu_arch, and kvm_vcpu_deactivate_apicv() function to turn APICv
off.  The flag is initialized based on the module parameter and CPU
capability, and consulted whenever an APICv-specific action is
performed.

Signed-off-by: Andrey Smetanin <asmetanin@virtuozzo.com>
Reviewed-by: Roman Kagan <rkagan@virtuozzo.com>
Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Gleb Natapov <gleb@kernel.org>
CC: Paolo Bonzini <pbonzini@redhat.com>
CC: Roman Kagan <rkagan@virtuozzo.com>
CC: Denis V. Lunev <den@openvz.org>
CC: qemu-devel@nongnu.org

---
 arch/x86/include/asm/kvm_host.h |  6 +++++-
 arch/x86/kvm/irq.c              |  2 +-
 arch/x86/kvm/lapic.c            | 23 +++++++++++----------
 arch/x86/kvm/lapic.h            |  4 ++--
 arch/x86/kvm/svm.c              | 11 +++++++---
 arch/x86/kvm/vmx.c              | 45 +++++++++++++++++------------------------
 arch/x86/kvm/x86.c              | 19 ++++++++++++++---
 7 files changed, 63 insertions(+), 47 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index d51a7e1d..a60a461 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -400,6 +400,7 @@ struct kvm_vcpu_arch {
 	u64 efer;
 	u64 apic_base;
 	struct kvm_lapic *apic;    /* kernel irqchip context */
+	bool apicv_active;
 	DECLARE_BITMAP(ioapic_handled_vectors, 256);
 	unsigned long apic_attention;
 	int32_t apic_arb_prio;
@@ -830,7 +831,8 @@ struct kvm_x86_ops {
 	void (*enable_nmi_window)(struct kvm_vcpu *vcpu);
 	void (*enable_irq_window)(struct kvm_vcpu *vcpu);
 	void (*update_cr8_intercept)(struct kvm_vcpu *vcpu, int tpr, int irr);
-	int (*cpu_uses_apicv)(struct kvm_vcpu *vcpu);
+	bool (*get_enable_apicv)(void);
+	void (*refresh_apicv_exec_ctrl)(struct kvm_vcpu *vcpu);
 	void (*hwapic_irr_update)(struct kvm_vcpu *vcpu, int max_irr);
 	void (*hwapic_isr_update)(struct kvm *kvm, int isr);
 	void (*load_eoi_exitmap)(struct kvm_vcpu *vcpu, u64 *eoi_exit_bitmap);
@@ -1096,6 +1098,8 @@ gpa_t kvm_mmu_gva_to_gpa_write(struct kvm_vcpu *vcpu, gva_t gva,
 gpa_t kvm_mmu_gva_to_gpa_system(struct kvm_vcpu *vcpu, gva_t gva,
 				struct x86_exception *exception);
 
+void kvm_vcpu_deactivate_apicv(struct kvm_vcpu *vcpu);
+
 int kvm_emulate_hypercall(struct kvm_vcpu *vcpu);
 
 int kvm_mmu_page_fault(struct kvm_vcpu *vcpu, gva_t gva, u32 error_code,
diff --git a/arch/x86/kvm/irq.c b/arch/x86/kvm/irq.c
index 097060e..3982b47 100644
--- a/arch/x86/kvm/irq.c
+++ b/arch/x86/kvm/irq.c
@@ -76,7 +76,7 @@ int kvm_cpu_has_injectable_intr(struct kvm_vcpu *v)
 	if (kvm_cpu_has_extint(v))
 		return 1;
 
-	if (kvm_vcpu_apic_vid_enabled(v))
+	if (kvm_vcpu_apicv_active(v))
 		return 0;
 
 	return kvm_apic_has_interrupt(v) != -1; /* LAPIC */
diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index b14436d..14d6fcc 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -379,7 +379,8 @@ static inline int apic_find_highest_irr(struct kvm_lapic *apic)
 	if (!apic->irr_pending)
 		return -1;
 
-	kvm_x86_ops->sync_pir_to_irr(apic->vcpu);
+	if (apic->vcpu->arch.apicv_active)
+		kvm_x86_ops->sync_pir_to_irr(apic->vcpu);
 	result = apic_search_irr(apic);
 	ASSERT(result == -1 || result >= 16);
 
@@ -392,7 +393,7 @@ static inline void apic_clear_irr(int vec, struct kvm_lapic *apic)
 
 	vcpu = apic->vcpu;
 
-	if (unlikely(kvm_vcpu_apic_vid_enabled(vcpu))) {
+	if (unlikely(vcpu->arch.apicv_active)) {
 		/* try to update RVI */
 		apic_clear_vector(vec, apic->regs + APIC_IRR);
 		kvm_make_request(KVM_REQ_EVENT, vcpu);
@@ -418,7 +419,7 @@ static inline void apic_set_isr(int vec, struct kvm_lapic *apic)
 	 * because the processor can modify ISR under the hood.  Instead
 	 * just set SVI.
 	 */
-	if (unlikely(kvm_x86_ops->hwapic_isr_update))
+	if (unlikely(vcpu->arch.apicv_active))
 		kvm_x86_ops->hwapic_isr_update(vcpu->kvm, vec);
 	else {
 		++apic->isr_count;
@@ -466,7 +467,7 @@ static inline void apic_clear_isr(int vec, struct kvm_lapic *apic)
 	 * on the other hand isr_count and highest_isr_cache are unused
 	 * and must be left alone.
 	 */
-	if (unlikely(kvm_x86_ops->hwapic_isr_update))
+	if (unlikely(vcpu->arch.apicv_active))
 		kvm_x86_ops->hwapic_isr_update(vcpu->kvm,
 					       apic_find_highest_isr(apic));
 	else {
@@ -852,7 +853,7 @@ static int __apic_accept_irq(struct kvm_lapic *apic, int delivery_mode,
 				apic_clear_vector(vector, apic->regs + APIC_TMR);
 		}
 
-		if (kvm_x86_ops->deliver_posted_interrupt)
+		if (vcpu->arch.apicv_active)
 			kvm_x86_ops->deliver_posted_interrupt(vcpu, vector);
 		else {
 			apic_set_irr(vector, apic);
@@ -1225,7 +1226,7 @@ static bool lapic_timer_int_injected(struct kvm_vcpu *vcpu)
 		int vec = reg & APIC_VECTOR_MASK;
 		void *bitmap = apic->regs + APIC_ISR;
 
-		if (kvm_x86_ops->deliver_posted_interrupt)
+		if (vcpu->arch.apicv_active)
 			bitmap = apic->regs + APIC_IRR;
 
 		if (apic_test_vector(vec, bitmap))
@@ -1693,8 +1694,8 @@ void kvm_lapic_reset(struct kvm_vcpu *vcpu, bool init_event)
 		apic_set_reg(apic, APIC_ISR + 0x10 * i, 0);
 		apic_set_reg(apic, APIC_TMR + 0x10 * i, 0);
 	}
-	apic->irr_pending = kvm_vcpu_apic_vid_enabled(vcpu);
-	apic->isr_count = kvm_x86_ops->hwapic_isr_update ? 1 : 0;
+	apic->irr_pending = vcpu->arch.apicv_active;
+	apic->isr_count = vcpu->arch.apicv_active ? 1 : 0;
 	apic->highest_isr_cache = -1;
 	update_divide_count(apic);
 	atomic_set(&apic->lapic_timer.pending, 0);
@@ -1906,15 +1907,15 @@ void kvm_apic_post_state_restore(struct kvm_vcpu *vcpu,
 	update_divide_count(apic);
 	start_apic_timer(apic);
 	apic->irr_pending = true;
-	apic->isr_count = kvm_x86_ops->hwapic_isr_update ?
+	apic->isr_count = vcpu->arch.apicv_active ?
 				1 : count_vectors(apic->regs + APIC_ISR);
 	apic->highest_isr_cache = -1;
-	if (kvm_x86_ops->hwapic_irr_update)
+	if (vcpu->arch.apicv_active) {
 		kvm_x86_ops->hwapic_irr_update(vcpu,
 				apic_find_highest_irr(apic));
-	if (unlikely(kvm_x86_ops->hwapic_isr_update))
 		kvm_x86_ops->hwapic_isr_update(vcpu->kvm,
 				apic_find_highest_isr(apic));
+	}
 	kvm_make_request(KVM_REQ_EVENT, vcpu);
 	if (ioapic_in_kernel(vcpu->kvm))
 		kvm_rtc_eoi_tracking_restore_one(vcpu);
diff --git a/arch/x86/kvm/lapic.h b/arch/x86/kvm/lapic.h
index fde8e35d..5fc60e4 100644
--- a/arch/x86/kvm/lapic.h
+++ b/arch/x86/kvm/lapic.h
@@ -143,9 +143,9 @@ static inline int apic_x2apic_mode(struct kvm_lapic *apic)
 	return apic->vcpu->arch.apic_base & X2APIC_ENABLE;
 }
 
-static inline bool kvm_vcpu_apic_vid_enabled(struct kvm_vcpu *vcpu)
+static inline bool kvm_vcpu_apicv_active(struct kvm_vcpu *vcpu)
 {
-	return kvm_x86_ops->cpu_uses_apicv(vcpu);
+	return vcpu->arch.apic && vcpu->arch.apicv_active;
 }
 
 static inline bool kvm_apic_has_events(struct kvm_vcpu *vcpu)
diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index 38d3adf..07b19cf 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -3650,9 +3650,13 @@ static void svm_set_virtual_x2apic_mode(struct kvm_vcpu *vcpu, bool set)
 	return;
 }
 
-static int svm_cpu_uses_apicv(struct kvm_vcpu *vcpu)
+static bool svm_get_enable_apicv(void)
+{
+	return false;
+}
+
+static void svm_refresh_apicv_exec_ctrl(struct kvm_vcpu *vcpu)
 {
-	return 0;
 }
 
 static void svm_load_eoi_exitmap(struct kvm_vcpu *vcpu, u64 *eoi_exit_bitmap)
@@ -4420,7 +4424,8 @@ static struct kvm_x86_ops svm_x86_ops = {
 	.enable_irq_window = enable_irq_window,
 	.update_cr8_intercept = update_cr8_intercept,
 	.set_virtual_x2apic_mode = svm_set_virtual_x2apic_mode,
-	.cpu_uses_apicv = svm_cpu_uses_apicv,
+	.get_enable_apicv = svm_get_enable_apicv,
+	.refresh_apicv_exec_ctrl = svm_refresh_apicv_exec_ctrl,
 	.load_eoi_exitmap = svm_load_eoi_exitmap,
 	.sync_pir_to_irr = svm_sync_pir_to_irr,
 
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 9562542..bf7488e 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -19,6 +19,7 @@
 #include "irq.h"
 #include "mmu.h"
 #include "cpuid.h"
+#include "lapic.h"
 
 #include <linux/kvm_host.h>
 #include <linux/module.h>
@@ -860,7 +861,6 @@ static void kvm_cpu_vmxon(u64 addr);
 static void kvm_cpu_vmxoff(void);
 static bool vmx_mpx_supported(void);
 static bool vmx_xsaves_supported(void);
-static int vmx_cpu_uses_apicv(struct kvm_vcpu *vcpu);
 static int vmx_set_tss_addr(struct kvm *kvm, unsigned int addr);
 static void vmx_set_segment(struct kvm_vcpu *vcpu,
 			    struct kvm_segment *var, int seg);
@@ -868,7 +868,6 @@ static void vmx_get_segment(struct kvm_vcpu *vcpu,
 			    struct kvm_segment *var, int seg);
 static bool guest_state_valid(struct kvm_vcpu *vcpu);
 static u32 vmx_segment_access_rights(struct kvm_segment *var);
-static void vmx_sync_pir_to_irr_dummy(struct kvm_vcpu *vcpu);
 static void copy_vmcs12_to_shadow(struct vcpu_vmx *vmx);
 static void copy_shadow_to_vmcs12(struct vcpu_vmx *vmx);
 static int alloc_identity_pagetable(struct kvm *kvm);
@@ -2504,7 +2503,7 @@ static void nested_vmx_setup_ctls_msrs(struct vcpu_vmx *vmx)
 	vmx->nested.nested_vmx_pinbased_ctls_high |=
 		PIN_BASED_ALWAYSON_WITHOUT_TRUE_MSR |
 		PIN_BASED_VMX_PREEMPTION_TIMER;
-	if (vmx_cpu_uses_apicv(&vmx->vcpu))
+	if (kvm_vcpu_apicv_active(&vmx->vcpu))
 		vmx->nested.nested_vmx_pinbased_ctls_high |=
 			PIN_BASED_POSTED_INTR;
 
@@ -4467,9 +4466,9 @@ static void vmx_disable_intercept_msr_write_x2apic(u32 msr)
 			msr, MSR_TYPE_W);
 }
 
-static int vmx_cpu_uses_apicv(struct kvm_vcpu *vcpu)
+static bool vmx_get_enable_apicv(void)
 {
-	return enable_apicv && lapic_in_kernel(vcpu);
+	return enable_apicv;
 }
 
 static int vmx_complete_nested_posted_interrupt(struct kvm_vcpu *vcpu)
@@ -4591,11 +4590,6 @@ static void vmx_sync_pir_to_irr(struct kvm_vcpu *vcpu)
 	kvm_apic_update_irr(vcpu, vmx->pi_desc.pir);
 }
 
-static void vmx_sync_pir_to_irr_dummy(struct kvm_vcpu *vcpu)
-{
-	return;
-}
-
 /*
  * Set up the vmcs's constant host-state fields, i.e., host-state fields that
  * will not change in the lifetime of the guest.
@@ -4665,11 +4659,18 @@ static u32 vmx_pin_based_exec_ctrl(struct vcpu_vmx *vmx)
 {
 	u32 pin_based_exec_ctrl = vmcs_config.pin_based_exec_ctrl;
 
-	if (!vmx_cpu_uses_apicv(&vmx->vcpu))
+	if (!kvm_vcpu_apicv_active(&vmx->vcpu))
 		pin_based_exec_ctrl &= ~PIN_BASED_POSTED_INTR;
 	return pin_based_exec_ctrl;
 }
 
+static void vmx_refresh_apicv_exec_ctrl(struct kvm_vcpu *vcpu)
+{
+	struct vcpu_vmx *vmx = to_vmx(vcpu);
+
+	vmcs_write32(PIN_BASED_VM_EXEC_CONTROL, vmx_pin_based_exec_ctrl(vmx));
+}
+
 static u32 vmx_exec_control(struct vcpu_vmx *vmx)
 {
 	u32 exec_control = vmcs_config.cpu_based_exec_ctrl;
@@ -4708,7 +4709,7 @@ static u32 vmx_secondary_exec_control(struct vcpu_vmx *vmx)
 		exec_control &= ~SECONDARY_EXEC_UNRESTRICTED_GUEST;
 	if (!ple_gap)
 		exec_control &= ~SECONDARY_EXEC_PAUSE_LOOP_EXITING;
-	if (!vmx_cpu_uses_apicv(&vmx->vcpu))
+	if (!kvm_vcpu_apicv_active(&vmx->vcpu))
 		exec_control &= ~(SECONDARY_EXEC_APIC_REGISTER_VIRT |
 				  SECONDARY_EXEC_VIRTUAL_INTR_DELIVERY);
 	exec_control &= ~SECONDARY_EXEC_VIRTUALIZE_X2APIC_MODE;
@@ -4772,7 +4773,7 @@ static int vmx_vcpu_setup(struct vcpu_vmx *vmx)
 		vmcs_write32(SECONDARY_VM_EXEC_CONTROL,
 				vmx_secondary_exec_control(vmx));
 
-	if (vmx_cpu_uses_apicv(&vmx->vcpu)) {
+	if (kvm_vcpu_apicv_active(&vmx->vcpu)) {
 		vmcs_write64(EOI_EXIT_BITMAP0, 0);
 		vmcs_write64(EOI_EXIT_BITMAP1, 0);
 		vmcs_write64(EOI_EXIT_BITMAP2, 0);
@@ -4924,7 +4925,7 @@ static void vmx_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event)
 
 	kvm_make_request(KVM_REQ_APIC_PAGE_RELOAD, vcpu);
 
-	if (vmx_cpu_uses_apicv(vcpu))
+	if (kvm_vcpu_apicv_active(vcpu))
 		memset(&vmx->pi_desc, 0, sizeof(struct pi_desc));
 
 	if (vmx->vpid != 0)
@@ -6199,15 +6200,6 @@ static __init int hardware_setup(void)
 	if (!cpu_has_vmx_apicv())
 		enable_apicv = 0;
 
-	if (enable_apicv)
-		kvm_x86_ops->update_cr8_intercept = NULL;
-	else {
-		kvm_x86_ops->hwapic_irr_update = NULL;
-		kvm_x86_ops->hwapic_isr_update = NULL;
-		kvm_x86_ops->deliver_posted_interrupt = NULL;
-		kvm_x86_ops->sync_pir_to_irr = vmx_sync_pir_to_irr_dummy;
-	}
-
 	vmx_disable_intercept_for_msr(MSR_FS_BASE, false);
 	vmx_disable_intercept_for_msr(MSR_GS_BASE, false);
 	vmx_disable_intercept_for_msr(MSR_KERNEL_GS_BASE, true);
@@ -8150,7 +8142,7 @@ static void vmx_set_virtual_x2apic_mode(struct kvm_vcpu *vcpu, bool set)
 	 * apicv
 	 */
 	if (!cpu_has_vmx_virtualize_x2apic_mode() ||
-				!vmx_cpu_uses_apicv(vcpu))
+				!kvm_vcpu_apicv_active(vcpu))
 		return;
 
 	if (!cpu_need_tpr_shadow(vcpu))
@@ -8257,7 +8249,7 @@ static void vmx_hwapic_irr_update(struct kvm_vcpu *vcpu, int max_irr)
 
 static void vmx_load_eoi_exitmap(struct kvm_vcpu *vcpu, u64 *eoi_exit_bitmap)
 {
-	if (!vmx_cpu_uses_apicv(vcpu))
+	if (!kvm_vcpu_apicv_active(vcpu))
 		return;
 
 	vmcs_write64(EOI_EXIT_BITMAP0, eoi_exit_bitmap[0]);
@@ -10801,7 +10793,8 @@ static struct kvm_x86_ops vmx_x86_ops = {
 	.update_cr8_intercept = update_cr8_intercept,
 	.set_virtual_x2apic_mode = vmx_set_virtual_x2apic_mode,
 	.set_apic_access_page_addr = vmx_set_apic_access_page_addr,
-	.cpu_uses_apicv = vmx_cpu_uses_apicv,
+	.get_enable_apicv = vmx_get_enable_apicv,
+	.refresh_apicv_exec_ctrl = vmx_refresh_apicv_exec_ctrl,
 	.load_eoi_exitmap = vmx_load_eoi_exitmap,
 	.hwapic_irr_update = vmx_hwapic_irr_update,
 	.hwapic_isr_update = vmx_hwapic_isr_update,
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 4cf2cb9..afb0a60 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -2645,7 +2645,9 @@ void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu)
 static int kvm_vcpu_ioctl_get_lapic(struct kvm_vcpu *vcpu,
 				    struct kvm_lapic_state *s)
 {
-	kvm_x86_ops->sync_pir_to_irr(vcpu);
+	if (vcpu->arch.apicv_active)
+		kvm_x86_ops->sync_pir_to_irr(vcpu);
+
 	memcpy(s->regs, vcpu->arch.apic->regs, sizeof *s);
 
 	return 0;
@@ -5743,6 +5745,12 @@ static void kvm_pv_kick_cpu_op(struct kvm *kvm, unsigned long flags, int apicid)
 	kvm_irq_delivery_to_apic(kvm, NULL, &lapic_irq, NULL);
 }
 
+void kvm_vcpu_deactivate_apicv(struct kvm_vcpu *vcpu)
+{
+	vcpu->arch.apicv_active = false;
+	kvm_x86_ops->refresh_apicv_exec_ctrl(vcpu);
+}
+
 int kvm_emulate_hypercall(struct kvm_vcpu *vcpu)
 {
 	unsigned long nr, a0, a1, a2, a3, ret;
@@ -5857,6 +5865,9 @@ static void update_cr8_intercept(struct kvm_vcpu *vcpu)
 	if (!vcpu->arch.apic)
 		return;
 
+	if (vcpu->arch.apicv_active)
+		return;
+
 	if (!vcpu->arch.apic->vapic_addr)
 		max_irr = kvm_lapic_find_highest_irr(vcpu);
 	else
@@ -6203,7 +6214,8 @@ static void vcpu_scan_ioapic(struct kvm_vcpu *vcpu)
 	if (irqchip_split(vcpu->kvm))
 		kvm_scan_ioapic_routes(vcpu, vcpu->arch.ioapic_handled_vectors);
 	else {
-		kvm_x86_ops->sync_pir_to_irr(vcpu);
+		if (vcpu->arch.apicv_active)
+			kvm_x86_ops->sync_pir_to_irr(vcpu);
 		kvm_ioapic_scan_entry(vcpu, vcpu->arch.ioapic_handled_vectors);
 	}
 	kvm_x86_ops->load_eoi_exitmap(vcpu,
@@ -6348,7 +6360,7 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
 		 * Update architecture specific hints for APIC
 		 * virtual interrupt delivery.
 		 */
-		if (kvm_x86_ops->hwapic_irr_update)
+		if (vcpu->arch.apicv_active)
 			kvm_x86_ops->hwapic_irr_update(vcpu,
 				kvm_lapic_find_highest_irr(vcpu));
 	}
@@ -7405,6 +7417,7 @@ int kvm_arch_vcpu_init(struct kvm_vcpu *vcpu)
 	BUG_ON(vcpu->kvm == NULL);
 	kvm = vcpu->kvm;
 
+	vcpu->arch.apicv_active = kvm_x86_ops->get_enable_apicv();
 	vcpu->arch.pv.pv_unhalted = false;
 	vcpu->arch.emulate_ctxt.ops = &emulate_ops;
 	if (!irqchip_in_kernel(kvm) || kvm_vcpu_is_reset_bsp(vcpu))
-- 
2.4.3


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [Qemu-devel] [PATCH v4 3/5] kvm/x86: per-vcpu apicv deactivation support
@ 2015-11-10 12:36   ` Andrey Smetanin
  0 siblings, 0 replies; 38+ messages in thread
From: Andrey Smetanin @ 2015-11-10 12:36 UTC (permalink / raw)
  To: kvm; +Cc: Gleb Natapov, Denis V. Lunev, qemu-devel, Roman Kagan, Paolo Bonzini

The decision on whether to use hardware APIC virtualization used to be
taken globally, based on the availability of the feature in the CPU
and the value of a module parameter.

However, under certain circumstances we want to control it on per-vcpu
basis.  In particular, when the userspace activates HyperV synthetic
interrupt controller (SynIC), APICv has to be disabled as it's
incompatible with SynIC auto-EOI behavior.

To achieve that, introduce 'apicv_active' flag on struct
kvm_vcpu_arch, and kvm_vcpu_deactivate_apicv() function to turn APICv
off.  The flag is initialized based on the module parameter and CPU
capability, and consulted whenever an APICv-specific action is
performed.

Signed-off-by: Andrey Smetanin <asmetanin@virtuozzo.com>
Reviewed-by: Roman Kagan <rkagan@virtuozzo.com>
Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Gleb Natapov <gleb@kernel.org>
CC: Paolo Bonzini <pbonzini@redhat.com>
CC: Roman Kagan <rkagan@virtuozzo.com>
CC: Denis V. Lunev <den@openvz.org>
CC: qemu-devel@nongnu.org

---
 arch/x86/include/asm/kvm_host.h |  6 +++++-
 arch/x86/kvm/irq.c              |  2 +-
 arch/x86/kvm/lapic.c            | 23 +++++++++++----------
 arch/x86/kvm/lapic.h            |  4 ++--
 arch/x86/kvm/svm.c              | 11 +++++++---
 arch/x86/kvm/vmx.c              | 45 +++++++++++++++++------------------------
 arch/x86/kvm/x86.c              | 19 ++++++++++++++---
 7 files changed, 63 insertions(+), 47 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index d51a7e1d..a60a461 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -400,6 +400,7 @@ struct kvm_vcpu_arch {
 	u64 efer;
 	u64 apic_base;
 	struct kvm_lapic *apic;    /* kernel irqchip context */
+	bool apicv_active;
 	DECLARE_BITMAP(ioapic_handled_vectors, 256);
 	unsigned long apic_attention;
 	int32_t apic_arb_prio;
@@ -830,7 +831,8 @@ struct kvm_x86_ops {
 	void (*enable_nmi_window)(struct kvm_vcpu *vcpu);
 	void (*enable_irq_window)(struct kvm_vcpu *vcpu);
 	void (*update_cr8_intercept)(struct kvm_vcpu *vcpu, int tpr, int irr);
-	int (*cpu_uses_apicv)(struct kvm_vcpu *vcpu);
+	bool (*get_enable_apicv)(void);
+	void (*refresh_apicv_exec_ctrl)(struct kvm_vcpu *vcpu);
 	void (*hwapic_irr_update)(struct kvm_vcpu *vcpu, int max_irr);
 	void (*hwapic_isr_update)(struct kvm *kvm, int isr);
 	void (*load_eoi_exitmap)(struct kvm_vcpu *vcpu, u64 *eoi_exit_bitmap);
@@ -1096,6 +1098,8 @@ gpa_t kvm_mmu_gva_to_gpa_write(struct kvm_vcpu *vcpu, gva_t gva,
 gpa_t kvm_mmu_gva_to_gpa_system(struct kvm_vcpu *vcpu, gva_t gva,
 				struct x86_exception *exception);
 
+void kvm_vcpu_deactivate_apicv(struct kvm_vcpu *vcpu);
+
 int kvm_emulate_hypercall(struct kvm_vcpu *vcpu);
 
 int kvm_mmu_page_fault(struct kvm_vcpu *vcpu, gva_t gva, u32 error_code,
diff --git a/arch/x86/kvm/irq.c b/arch/x86/kvm/irq.c
index 097060e..3982b47 100644
--- a/arch/x86/kvm/irq.c
+++ b/arch/x86/kvm/irq.c
@@ -76,7 +76,7 @@ int kvm_cpu_has_injectable_intr(struct kvm_vcpu *v)
 	if (kvm_cpu_has_extint(v))
 		return 1;
 
-	if (kvm_vcpu_apic_vid_enabled(v))
+	if (kvm_vcpu_apicv_active(v))
 		return 0;
 
 	return kvm_apic_has_interrupt(v) != -1; /* LAPIC */
diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index b14436d..14d6fcc 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -379,7 +379,8 @@ static inline int apic_find_highest_irr(struct kvm_lapic *apic)
 	if (!apic->irr_pending)
 		return -1;
 
-	kvm_x86_ops->sync_pir_to_irr(apic->vcpu);
+	if (apic->vcpu->arch.apicv_active)
+		kvm_x86_ops->sync_pir_to_irr(apic->vcpu);
 	result = apic_search_irr(apic);
 	ASSERT(result == -1 || result >= 16);
 
@@ -392,7 +393,7 @@ static inline void apic_clear_irr(int vec, struct kvm_lapic *apic)
 
 	vcpu = apic->vcpu;
 
-	if (unlikely(kvm_vcpu_apic_vid_enabled(vcpu))) {
+	if (unlikely(vcpu->arch.apicv_active)) {
 		/* try to update RVI */
 		apic_clear_vector(vec, apic->regs + APIC_IRR);
 		kvm_make_request(KVM_REQ_EVENT, vcpu);
@@ -418,7 +419,7 @@ static inline void apic_set_isr(int vec, struct kvm_lapic *apic)
 	 * because the processor can modify ISR under the hood.  Instead
 	 * just set SVI.
 	 */
-	if (unlikely(kvm_x86_ops->hwapic_isr_update))
+	if (unlikely(vcpu->arch.apicv_active))
 		kvm_x86_ops->hwapic_isr_update(vcpu->kvm, vec);
 	else {
 		++apic->isr_count;
@@ -466,7 +467,7 @@ static inline void apic_clear_isr(int vec, struct kvm_lapic *apic)
 	 * on the other hand isr_count and highest_isr_cache are unused
 	 * and must be left alone.
 	 */
-	if (unlikely(kvm_x86_ops->hwapic_isr_update))
+	if (unlikely(vcpu->arch.apicv_active))
 		kvm_x86_ops->hwapic_isr_update(vcpu->kvm,
 					       apic_find_highest_isr(apic));
 	else {
@@ -852,7 +853,7 @@ static int __apic_accept_irq(struct kvm_lapic *apic, int delivery_mode,
 				apic_clear_vector(vector, apic->regs + APIC_TMR);
 		}
 
-		if (kvm_x86_ops->deliver_posted_interrupt)
+		if (vcpu->arch.apicv_active)
 			kvm_x86_ops->deliver_posted_interrupt(vcpu, vector);
 		else {
 			apic_set_irr(vector, apic);
@@ -1225,7 +1226,7 @@ static bool lapic_timer_int_injected(struct kvm_vcpu *vcpu)
 		int vec = reg & APIC_VECTOR_MASK;
 		void *bitmap = apic->regs + APIC_ISR;
 
-		if (kvm_x86_ops->deliver_posted_interrupt)
+		if (vcpu->arch.apicv_active)
 			bitmap = apic->regs + APIC_IRR;
 
 		if (apic_test_vector(vec, bitmap))
@@ -1693,8 +1694,8 @@ void kvm_lapic_reset(struct kvm_vcpu *vcpu, bool init_event)
 		apic_set_reg(apic, APIC_ISR + 0x10 * i, 0);
 		apic_set_reg(apic, APIC_TMR + 0x10 * i, 0);
 	}
-	apic->irr_pending = kvm_vcpu_apic_vid_enabled(vcpu);
-	apic->isr_count = kvm_x86_ops->hwapic_isr_update ? 1 : 0;
+	apic->irr_pending = vcpu->arch.apicv_active;
+	apic->isr_count = vcpu->arch.apicv_active ? 1 : 0;
 	apic->highest_isr_cache = -1;
 	update_divide_count(apic);
 	atomic_set(&apic->lapic_timer.pending, 0);
@@ -1906,15 +1907,15 @@ void kvm_apic_post_state_restore(struct kvm_vcpu *vcpu,
 	update_divide_count(apic);
 	start_apic_timer(apic);
 	apic->irr_pending = true;
-	apic->isr_count = kvm_x86_ops->hwapic_isr_update ?
+	apic->isr_count = vcpu->arch.apicv_active ?
 				1 : count_vectors(apic->regs + APIC_ISR);
 	apic->highest_isr_cache = -1;
-	if (kvm_x86_ops->hwapic_irr_update)
+	if (vcpu->arch.apicv_active) {
 		kvm_x86_ops->hwapic_irr_update(vcpu,
 				apic_find_highest_irr(apic));
-	if (unlikely(kvm_x86_ops->hwapic_isr_update))
 		kvm_x86_ops->hwapic_isr_update(vcpu->kvm,
 				apic_find_highest_isr(apic));
+	}
 	kvm_make_request(KVM_REQ_EVENT, vcpu);
 	if (ioapic_in_kernel(vcpu->kvm))
 		kvm_rtc_eoi_tracking_restore_one(vcpu);
diff --git a/arch/x86/kvm/lapic.h b/arch/x86/kvm/lapic.h
index fde8e35d..5fc60e4 100644
--- a/arch/x86/kvm/lapic.h
+++ b/arch/x86/kvm/lapic.h
@@ -143,9 +143,9 @@ static inline int apic_x2apic_mode(struct kvm_lapic *apic)
 	return apic->vcpu->arch.apic_base & X2APIC_ENABLE;
 }
 
-static inline bool kvm_vcpu_apic_vid_enabled(struct kvm_vcpu *vcpu)
+static inline bool kvm_vcpu_apicv_active(struct kvm_vcpu *vcpu)
 {
-	return kvm_x86_ops->cpu_uses_apicv(vcpu);
+	return vcpu->arch.apic && vcpu->arch.apicv_active;
 }
 
 static inline bool kvm_apic_has_events(struct kvm_vcpu *vcpu)
diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index 38d3adf..07b19cf 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -3650,9 +3650,13 @@ static void svm_set_virtual_x2apic_mode(struct kvm_vcpu *vcpu, bool set)
 	return;
 }
 
-static int svm_cpu_uses_apicv(struct kvm_vcpu *vcpu)
+static bool svm_get_enable_apicv(void)
+{
+	return false;
+}
+
+static void svm_refresh_apicv_exec_ctrl(struct kvm_vcpu *vcpu)
 {
-	return 0;
 }
 
 static void svm_load_eoi_exitmap(struct kvm_vcpu *vcpu, u64 *eoi_exit_bitmap)
@@ -4420,7 +4424,8 @@ static struct kvm_x86_ops svm_x86_ops = {
 	.enable_irq_window = enable_irq_window,
 	.update_cr8_intercept = update_cr8_intercept,
 	.set_virtual_x2apic_mode = svm_set_virtual_x2apic_mode,
-	.cpu_uses_apicv = svm_cpu_uses_apicv,
+	.get_enable_apicv = svm_get_enable_apicv,
+	.refresh_apicv_exec_ctrl = svm_refresh_apicv_exec_ctrl,
 	.load_eoi_exitmap = svm_load_eoi_exitmap,
 	.sync_pir_to_irr = svm_sync_pir_to_irr,
 
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 9562542..bf7488e 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -19,6 +19,7 @@
 #include "irq.h"
 #include "mmu.h"
 #include "cpuid.h"
+#include "lapic.h"
 
 #include <linux/kvm_host.h>
 #include <linux/module.h>
@@ -860,7 +861,6 @@ static void kvm_cpu_vmxon(u64 addr);
 static void kvm_cpu_vmxoff(void);
 static bool vmx_mpx_supported(void);
 static bool vmx_xsaves_supported(void);
-static int vmx_cpu_uses_apicv(struct kvm_vcpu *vcpu);
 static int vmx_set_tss_addr(struct kvm *kvm, unsigned int addr);
 static void vmx_set_segment(struct kvm_vcpu *vcpu,
 			    struct kvm_segment *var, int seg);
@@ -868,7 +868,6 @@ static void vmx_get_segment(struct kvm_vcpu *vcpu,
 			    struct kvm_segment *var, int seg);
 static bool guest_state_valid(struct kvm_vcpu *vcpu);
 static u32 vmx_segment_access_rights(struct kvm_segment *var);
-static void vmx_sync_pir_to_irr_dummy(struct kvm_vcpu *vcpu);
 static void copy_vmcs12_to_shadow(struct vcpu_vmx *vmx);
 static void copy_shadow_to_vmcs12(struct vcpu_vmx *vmx);
 static int alloc_identity_pagetable(struct kvm *kvm);
@@ -2504,7 +2503,7 @@ static void nested_vmx_setup_ctls_msrs(struct vcpu_vmx *vmx)
 	vmx->nested.nested_vmx_pinbased_ctls_high |=
 		PIN_BASED_ALWAYSON_WITHOUT_TRUE_MSR |
 		PIN_BASED_VMX_PREEMPTION_TIMER;
-	if (vmx_cpu_uses_apicv(&vmx->vcpu))
+	if (kvm_vcpu_apicv_active(&vmx->vcpu))
 		vmx->nested.nested_vmx_pinbased_ctls_high |=
 			PIN_BASED_POSTED_INTR;
 
@@ -4467,9 +4466,9 @@ static void vmx_disable_intercept_msr_write_x2apic(u32 msr)
 			msr, MSR_TYPE_W);
 }
 
-static int vmx_cpu_uses_apicv(struct kvm_vcpu *vcpu)
+static bool vmx_get_enable_apicv(void)
 {
-	return enable_apicv && lapic_in_kernel(vcpu);
+	return enable_apicv;
 }
 
 static int vmx_complete_nested_posted_interrupt(struct kvm_vcpu *vcpu)
@@ -4591,11 +4590,6 @@ static void vmx_sync_pir_to_irr(struct kvm_vcpu *vcpu)
 	kvm_apic_update_irr(vcpu, vmx->pi_desc.pir);
 }
 
-static void vmx_sync_pir_to_irr_dummy(struct kvm_vcpu *vcpu)
-{
-	return;
-}
-
 /*
  * Set up the vmcs's constant host-state fields, i.e., host-state fields that
  * will not change in the lifetime of the guest.
@@ -4665,11 +4659,18 @@ static u32 vmx_pin_based_exec_ctrl(struct vcpu_vmx *vmx)
 {
 	u32 pin_based_exec_ctrl = vmcs_config.pin_based_exec_ctrl;
 
-	if (!vmx_cpu_uses_apicv(&vmx->vcpu))
+	if (!kvm_vcpu_apicv_active(&vmx->vcpu))
 		pin_based_exec_ctrl &= ~PIN_BASED_POSTED_INTR;
 	return pin_based_exec_ctrl;
 }
 
+static void vmx_refresh_apicv_exec_ctrl(struct kvm_vcpu *vcpu)
+{
+	struct vcpu_vmx *vmx = to_vmx(vcpu);
+
+	vmcs_write32(PIN_BASED_VM_EXEC_CONTROL, vmx_pin_based_exec_ctrl(vmx));
+}
+
 static u32 vmx_exec_control(struct vcpu_vmx *vmx)
 {
 	u32 exec_control = vmcs_config.cpu_based_exec_ctrl;
@@ -4708,7 +4709,7 @@ static u32 vmx_secondary_exec_control(struct vcpu_vmx *vmx)
 		exec_control &= ~SECONDARY_EXEC_UNRESTRICTED_GUEST;
 	if (!ple_gap)
 		exec_control &= ~SECONDARY_EXEC_PAUSE_LOOP_EXITING;
-	if (!vmx_cpu_uses_apicv(&vmx->vcpu))
+	if (!kvm_vcpu_apicv_active(&vmx->vcpu))
 		exec_control &= ~(SECONDARY_EXEC_APIC_REGISTER_VIRT |
 				  SECONDARY_EXEC_VIRTUAL_INTR_DELIVERY);
 	exec_control &= ~SECONDARY_EXEC_VIRTUALIZE_X2APIC_MODE;
@@ -4772,7 +4773,7 @@ static int vmx_vcpu_setup(struct vcpu_vmx *vmx)
 		vmcs_write32(SECONDARY_VM_EXEC_CONTROL,
 				vmx_secondary_exec_control(vmx));
 
-	if (vmx_cpu_uses_apicv(&vmx->vcpu)) {
+	if (kvm_vcpu_apicv_active(&vmx->vcpu)) {
 		vmcs_write64(EOI_EXIT_BITMAP0, 0);
 		vmcs_write64(EOI_EXIT_BITMAP1, 0);
 		vmcs_write64(EOI_EXIT_BITMAP2, 0);
@@ -4924,7 +4925,7 @@ static void vmx_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event)
 
 	kvm_make_request(KVM_REQ_APIC_PAGE_RELOAD, vcpu);
 
-	if (vmx_cpu_uses_apicv(vcpu))
+	if (kvm_vcpu_apicv_active(vcpu))
 		memset(&vmx->pi_desc, 0, sizeof(struct pi_desc));
 
 	if (vmx->vpid != 0)
@@ -6199,15 +6200,6 @@ static __init int hardware_setup(void)
 	if (!cpu_has_vmx_apicv())
 		enable_apicv = 0;
 
-	if (enable_apicv)
-		kvm_x86_ops->update_cr8_intercept = NULL;
-	else {
-		kvm_x86_ops->hwapic_irr_update = NULL;
-		kvm_x86_ops->hwapic_isr_update = NULL;
-		kvm_x86_ops->deliver_posted_interrupt = NULL;
-		kvm_x86_ops->sync_pir_to_irr = vmx_sync_pir_to_irr_dummy;
-	}
-
 	vmx_disable_intercept_for_msr(MSR_FS_BASE, false);
 	vmx_disable_intercept_for_msr(MSR_GS_BASE, false);
 	vmx_disable_intercept_for_msr(MSR_KERNEL_GS_BASE, true);
@@ -8150,7 +8142,7 @@ static void vmx_set_virtual_x2apic_mode(struct kvm_vcpu *vcpu, bool set)
 	 * apicv
 	 */
 	if (!cpu_has_vmx_virtualize_x2apic_mode() ||
-				!vmx_cpu_uses_apicv(vcpu))
+				!kvm_vcpu_apicv_active(vcpu))
 		return;
 
 	if (!cpu_need_tpr_shadow(vcpu))
@@ -8257,7 +8249,7 @@ static void vmx_hwapic_irr_update(struct kvm_vcpu *vcpu, int max_irr)
 
 static void vmx_load_eoi_exitmap(struct kvm_vcpu *vcpu, u64 *eoi_exit_bitmap)
 {
-	if (!vmx_cpu_uses_apicv(vcpu))
+	if (!kvm_vcpu_apicv_active(vcpu))
 		return;
 
 	vmcs_write64(EOI_EXIT_BITMAP0, eoi_exit_bitmap[0]);
@@ -10801,7 +10793,8 @@ static struct kvm_x86_ops vmx_x86_ops = {
 	.update_cr8_intercept = update_cr8_intercept,
 	.set_virtual_x2apic_mode = vmx_set_virtual_x2apic_mode,
 	.set_apic_access_page_addr = vmx_set_apic_access_page_addr,
-	.cpu_uses_apicv = vmx_cpu_uses_apicv,
+	.get_enable_apicv = vmx_get_enable_apicv,
+	.refresh_apicv_exec_ctrl = vmx_refresh_apicv_exec_ctrl,
 	.load_eoi_exitmap = vmx_load_eoi_exitmap,
 	.hwapic_irr_update = vmx_hwapic_irr_update,
 	.hwapic_isr_update = vmx_hwapic_isr_update,
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 4cf2cb9..afb0a60 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -2645,7 +2645,9 @@ void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu)
 static int kvm_vcpu_ioctl_get_lapic(struct kvm_vcpu *vcpu,
 				    struct kvm_lapic_state *s)
 {
-	kvm_x86_ops->sync_pir_to_irr(vcpu);
+	if (vcpu->arch.apicv_active)
+		kvm_x86_ops->sync_pir_to_irr(vcpu);
+
 	memcpy(s->regs, vcpu->arch.apic->regs, sizeof *s);
 
 	return 0;
@@ -5743,6 +5745,12 @@ static void kvm_pv_kick_cpu_op(struct kvm *kvm, unsigned long flags, int apicid)
 	kvm_irq_delivery_to_apic(kvm, NULL, &lapic_irq, NULL);
 }
 
+void kvm_vcpu_deactivate_apicv(struct kvm_vcpu *vcpu)
+{
+	vcpu->arch.apicv_active = false;
+	kvm_x86_ops->refresh_apicv_exec_ctrl(vcpu);
+}
+
 int kvm_emulate_hypercall(struct kvm_vcpu *vcpu)
 {
 	unsigned long nr, a0, a1, a2, a3, ret;
@@ -5857,6 +5865,9 @@ static void update_cr8_intercept(struct kvm_vcpu *vcpu)
 	if (!vcpu->arch.apic)
 		return;
 
+	if (vcpu->arch.apicv_active)
+		return;
+
 	if (!vcpu->arch.apic->vapic_addr)
 		max_irr = kvm_lapic_find_highest_irr(vcpu);
 	else
@@ -6203,7 +6214,8 @@ static void vcpu_scan_ioapic(struct kvm_vcpu *vcpu)
 	if (irqchip_split(vcpu->kvm))
 		kvm_scan_ioapic_routes(vcpu, vcpu->arch.ioapic_handled_vectors);
 	else {
-		kvm_x86_ops->sync_pir_to_irr(vcpu);
+		if (vcpu->arch.apicv_active)
+			kvm_x86_ops->sync_pir_to_irr(vcpu);
 		kvm_ioapic_scan_entry(vcpu, vcpu->arch.ioapic_handled_vectors);
 	}
 	kvm_x86_ops->load_eoi_exitmap(vcpu,
@@ -6348,7 +6360,7 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
 		 * Update architecture specific hints for APIC
 		 * virtual interrupt delivery.
 		 */
-		if (kvm_x86_ops->hwapic_irr_update)
+		if (vcpu->arch.apicv_active)
 			kvm_x86_ops->hwapic_irr_update(vcpu,
 				kvm_lapic_find_highest_irr(vcpu));
 	}
@@ -7405,6 +7417,7 @@ int kvm_arch_vcpu_init(struct kvm_vcpu *vcpu)
 	BUG_ON(vcpu->kvm == NULL);
 	kvm = vcpu->kvm;
 
+	vcpu->arch.apicv_active = kvm_x86_ops->get_enable_apicv();
 	vcpu->arch.pv.pv_unhalted = false;
 	vcpu->arch.emulate_ctxt.ops = &emulate_ops;
 	if (!irqchip_in_kernel(kvm) || kvm_vcpu_is_reset_bsp(vcpu))
-- 
2.4.3

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH v4 4/5] kvm/x86: Hyper-V synthetic interrupt controller
  2015-11-10 12:36 ` [Qemu-devel] " Andrey Smetanin
@ 2015-11-10 12:36   ` Andrey Smetanin
  -1 siblings, 0 replies; 38+ messages in thread
From: Andrey Smetanin @ 2015-11-10 12:36 UTC (permalink / raw)
  To: kvm; +Cc: Denis V. Lunev, Gleb Natapov, Paolo Bonzini, Roman Kagan, qemu-devel

SynIC (synthetic interrupt controller) is a lapic extension,
which is controlled via MSRs and maintains for each vCPU
 - 16 synthetic interrupt "lines" (SINT's); each can be configured to
   trigger a specific interrupt vector optionally with auto-EOI
   semantics
 - a message page in the guest memory with 16 256-byte per-SINT message
   slots
 - an event flag page in the guest memory with 16 2048-bit per-SINT
   event flag areas

The host triggers a SINT whenever it delivers a new message to the
corresponding slot or flips an event flag bit in the corresponding area.
The guest informs the host that it can try delivering a message by
explicitly asserting EOI in lapic or writing to End-Of-Message (EOM)
MSR.

The userspace (qemu) triggers interrupts and receives EOM notifications
via irqfd with resampler; for that, a GSI is allocated for each
configured SINT, and irq_routing api is extended to support GSI-SINT
mapping.

Changes v4:
* added activation of SynIC by vcpu KVM_ENABLE_CAP
* added per SynIC active flag
* added deactivation of APICv upon SynIC activation

Changes v3:
* added KVM_CAP_HYPERV_SYNIC and KVM_IRQ_ROUTING_HV_SINT notes into
docs

Changes v2:
* do not use posted interrupts for Hyper-V SynIC AutoEOI vectors
* add Hyper-V SynIC vectors into EOI exit bitmap
* Hyper-V SyniIC SINT msr write logic simplified

Signed-off-by: Andrey Smetanin <asmetanin@virtuozzo.com>
Reviewed-by: Roman Kagan <rkagan@virtuozzo.com>
Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Gleb Natapov <gleb@kernel.org>
CC: Paolo Bonzini <pbonzini@redhat.com>
CC: Roman Kagan <rkagan@virtuozzo.com>
CC: Denis V. Lunev <den@openvz.org>
CC: qemu-devel@nongnu.org

---
 Documentation/virtual/kvm/api.txt |  19 +++
 arch/x86/include/asm/kvm_host.h   |  15 ++
 arch/x86/kvm/hyperv.c             | 315 ++++++++++++++++++++++++++++++++++++++
 arch/x86/kvm/hyperv.h             |  23 +++
 arch/x86/kvm/irq_comm.c           |  34 ++++
 arch/x86/kvm/lapic.c              |  15 +-
 arch/x86/kvm/lapic.h              |   5 +
 arch/x86/kvm/x86.c                |  34 +++-
 include/linux/kvm_host.h          |   6 +
 include/uapi/linux/kvm.h          |   8 +
 10 files changed, 467 insertions(+), 7 deletions(-)

diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt
index 34cc068..16096a2 100644
--- a/Documentation/virtual/kvm/api.txt
+++ b/Documentation/virtual/kvm/api.txt
@@ -1451,6 +1451,7 @@ struct kvm_irq_routing_entry {
 		struct kvm_irq_routing_irqchip irqchip;
 		struct kvm_irq_routing_msi msi;
 		struct kvm_irq_routing_s390_adapter adapter;
+		struct kvm_irq_routing_hv_sint hv_sint;
 		__u32 pad[8];
 	} u;
 };
@@ -1459,6 +1460,7 @@ struct kvm_irq_routing_entry {
 #define KVM_IRQ_ROUTING_IRQCHIP 1
 #define KVM_IRQ_ROUTING_MSI 2
 #define KVM_IRQ_ROUTING_S390_ADAPTER 3
+#define KVM_IRQ_ROUTING_HV_SINT 4
 
 No flags are specified so far, the corresponding field must be set to zero.
 
@@ -1482,6 +1484,10 @@ struct kvm_irq_routing_s390_adapter {
 	__u32 adapter_id;
 };
 
+struct kvm_irq_routing_hv_sint {
+	__u32 vcpu;
+	__u32 sint;
+};
 
 4.53 KVM_ASSIGN_SET_MSIX_NR (deprecated)
 
@@ -3685,3 +3691,16 @@ available, means that that the kernel has an implementation of the
 H_RANDOM hypercall backed by a hardware random-number generator.
 If present, the kernel H_RANDOM handler can be enabled for guest use
 with the KVM_CAP_PPC_ENABLE_HCALL capability.
+
+8.2 KVM_CAP_HYPERV_SYNIC
+
+Architectures: x86
+This capability, if KVM_CHECK_EXTENSION indicates that it is
+available, means that that the kernel has an implementation of the
+Hyper-V Synthetic interrupt controller(SynIC). Hyper-V SynIC is
+used to support Windows Hyper-V based guest paravirt drivers(VMBus).
+
+In order to use SynIC, it has to be activated by setting this
+capability via KVM_ENABLE_CAP ioctl on the vcpu fd. Note that this
+will disable the use of APIC hardware virtualization even if supported
+by the CPU, as it's incompatible with SynIC auto-EOI behavior.
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index a60a461..ad29e89 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -25,6 +25,7 @@
 #include <linux/pvclock_gtod.h>
 #include <linux/clocksource.h>
 #include <linux/irqbypass.h>
+#include <linux/hyperv.h>
 
 #include <asm/pvclock-abi.h>
 #include <asm/desc.h>
@@ -374,10 +375,24 @@ struct kvm_mtrr {
 	struct list_head head;
 };
 
+/* Hyper-V synthetic interrupt controller (SynIC)*/
+struct kvm_vcpu_hv_synic {
+	u64 version;
+	u64 control;
+	u64 msg_page;
+	u64 evt_page;
+	atomic64_t sint[HV_SYNIC_SINT_COUNT];
+	atomic_t sint_to_gsi[HV_SYNIC_SINT_COUNT];
+	DECLARE_BITMAP(auto_eoi_bitmap, 256);
+	DECLARE_BITMAP(vec_bitmap, 256);
+	bool active;
+};
+
 /* Hyper-V per vcpu emulation context */
 struct kvm_vcpu_hv {
 	u64 hv_vapic;
 	s64 runtime_offset;
+	struct kvm_vcpu_hv_synic synic;
 };
 
 struct kvm_vcpu_arch {
diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
index 62cf8c9..83a3c0c 100644
--- a/arch/x86/kvm/hyperv.c
+++ b/arch/x86/kvm/hyperv.c
@@ -23,13 +23,314 @@
 
 #include "x86.h"
 #include "lapic.h"
+#include "ioapic.h"
 #include "hyperv.h"
 
 #include <linux/kvm_host.h>
+#include <asm/apicdef.h>
 #include <trace/events/kvm.h>
 
 #include "trace.h"
 
+static inline u64 synic_read_sint(struct kvm_vcpu_hv_synic *synic, int sint)
+{
+	return atomic64_read(&synic->sint[sint]);
+}
+
+static inline int synic_get_sint_vector(u64 sint_value)
+{
+	if (sint_value & HV_SYNIC_SINT_MASKED)
+		return -1;
+	return sint_value & HV_SYNIC_SINT_VECTOR_MASK;
+}
+
+static bool synic_has_vector_connected(struct kvm_vcpu_hv_synic *synic,
+				      int vector)
+{
+	int i;
+
+	for (i = 0; i < ARRAY_SIZE(synic->sint); i++) {
+		if (synic_get_sint_vector(synic_read_sint(synic, i)) == vector)
+			return true;
+	}
+	return false;
+}
+
+static bool synic_has_vector_auto_eoi(struct kvm_vcpu_hv_synic *synic,
+				     int vector)
+{
+	int i;
+	u64 sint_value;
+
+	for (i = 0; i < ARRAY_SIZE(synic->sint); i++) {
+		sint_value = synic_read_sint(synic, i);
+		if (synic_get_sint_vector(sint_value) == vector &&
+		    sint_value & HV_SYNIC_SINT_AUTO_EOI)
+			return true;
+	}
+	return false;
+}
+
+static int synic_set_sint(struct kvm_vcpu_hv_synic *synic, int sint, u64 data)
+{
+	int vector;
+
+	vector = data & HV_SYNIC_SINT_VECTOR_MASK;
+	if (vector < 16)
+		return 1;
+	/*
+	 * Guest may configure multiple SINTs to use the same vector, so
+	 * we maintain a bitmap of vectors handled by synic, and a
+	 * bitmap of vectors with auto-eoi behavior.  The bitmaps are
+	 * updated here, and atomically queried on fast paths.
+	 */
+
+	atomic64_set(&synic->sint[sint], data);
+
+	if (synic_has_vector_connected(synic, vector))
+		__set_bit(vector, synic->vec_bitmap);
+	else
+		__clear_bit(vector, synic->vec_bitmap);
+
+	if (synic_has_vector_auto_eoi(synic, vector))
+		__set_bit(vector, synic->auto_eoi_bitmap);
+	else
+		__clear_bit(vector, synic->auto_eoi_bitmap);
+
+	/* Load SynIC vectors into EOI exit bitmap */
+	kvm_make_request(KVM_REQ_SCAN_IOAPIC, synic_to_vcpu(synic));
+	return 0;
+}
+
+static struct kvm_vcpu_hv_synic *synic_get(struct kvm *kvm, u32 vcpu_id)
+{
+	struct kvm_vcpu *vcpu;
+	struct kvm_vcpu_hv_synic *synic;
+
+	if (vcpu_id >= atomic_read(&kvm->online_vcpus))
+		return NULL;
+	vcpu = kvm_get_vcpu(kvm, vcpu_id);
+	if (!vcpu)
+		return NULL;
+	synic = vcpu_to_synic(vcpu);
+	return (synic->active) ? synic : NULL;
+}
+
+static void kvm_hv_notify_acked_sint(struct kvm_vcpu *vcpu, u32 sint)
+{
+	struct kvm *kvm = vcpu->kvm;
+	int gsi, idx;
+
+	vcpu_debug(vcpu, "Hyper-V SynIC acked sint %d\n", sint);
+
+	idx = srcu_read_lock(&kvm->irq_srcu);
+	gsi = atomic_read(&vcpu_to_synic(vcpu)->sint_to_gsi[sint]);
+	if (gsi != -1)
+		kvm_notify_acked_gsi(kvm, gsi);
+	srcu_read_unlock(&kvm->irq_srcu, idx);
+}
+
+static int synic_set_msr(struct kvm_vcpu_hv_synic *synic,
+			 u32 msr, u64 data, bool host)
+{
+	struct kvm_vcpu *vcpu = synic_to_vcpu(synic);
+	int ret;
+
+	if (!synic->active)
+		return 1;
+
+	vcpu_debug(vcpu, "Hyper-V SynIC set msr 0x%x 0x%llx host %d\n",
+		   msr, data, host);
+	ret = 0;
+	switch (msr) {
+	case HV_X64_MSR_SCONTROL:
+		synic->control = data;
+		break;
+	case HV_X64_MSR_SVERSION:
+		if (!host) {
+			ret = 1;
+			break;
+		}
+		synic->version = data;
+		break;
+	case HV_X64_MSR_SIEFP:
+		if (data & HV_SYNIC_SIEFP_ENABLE)
+			if (kvm_clear_guest(vcpu->kvm,
+					    data & PAGE_MASK, PAGE_SIZE)) {
+				ret = 1;
+				break;
+			}
+		synic->evt_page = data;
+		break;
+	case HV_X64_MSR_SIMP:
+		if (data & HV_SYNIC_SIMP_ENABLE)
+			if (kvm_clear_guest(vcpu->kvm,
+					    data & PAGE_MASK, PAGE_SIZE)) {
+				ret = 1;
+				break;
+			}
+		synic->msg_page = data;
+		break;
+	case HV_X64_MSR_EOM: {
+		int i;
+
+		for (i = 0; i < ARRAY_SIZE(synic->sint); i++)
+			kvm_hv_notify_acked_sint(vcpu, i);
+		break;
+	}
+	case HV_X64_MSR_SINT0 ... HV_X64_MSR_SINT15:
+		ret = synic_set_sint(synic, msr - HV_X64_MSR_SINT0, data);
+		break;
+	default:
+		ret = 1;
+		break;
+	}
+	return ret;
+}
+
+static int synic_get_msr(struct kvm_vcpu_hv_synic *synic, u32 msr, u64 *pdata)
+{
+	int ret;
+
+	if (!synic->active)
+		return 1;
+
+	ret = 0;
+	switch (msr) {
+	case HV_X64_MSR_SCONTROL:
+		*pdata = synic->control;
+		break;
+	case HV_X64_MSR_SVERSION:
+		*pdata = synic->version;
+		break;
+	case HV_X64_MSR_SIEFP:
+		*pdata = synic->evt_page;
+		break;
+	case HV_X64_MSR_SIMP:
+		*pdata = synic->msg_page;
+		break;
+	case HV_X64_MSR_EOM:
+		*pdata = 0;
+		break;
+	case HV_X64_MSR_SINT0 ... HV_X64_MSR_SINT15:
+		*pdata = atomic64_read(&synic->sint[msr - HV_X64_MSR_SINT0]);
+		break;
+	default:
+		ret = 1;
+		break;
+	}
+	return ret;
+}
+
+int synic_set_irq(struct kvm_vcpu_hv_synic *synic, u32 sint)
+{
+	struct kvm_vcpu *vcpu = synic_to_vcpu(synic);
+	struct kvm_lapic_irq irq;
+	int ret, vector;
+
+	if (sint >= ARRAY_SIZE(synic->sint))
+		return -EINVAL;
+
+	vector = synic_get_sint_vector(synic_read_sint(synic, sint));
+	if (vector < 0)
+		return -ENOENT;
+
+	memset(&irq, 0, sizeof(irq));
+	irq.dest_id = kvm_apic_id(vcpu->arch.apic);
+	irq.dest_mode = APIC_DEST_PHYSICAL;
+	irq.delivery_mode = APIC_DM_FIXED;
+	irq.vector = vector;
+	irq.level = 1;
+
+	ret = kvm_irq_delivery_to_apic(vcpu->kvm, NULL, &irq, NULL);
+	vcpu_debug(vcpu, "Hyper-V SynIC set irq ret %d\n", ret);
+	return ret;
+}
+
+int kvm_hv_synic_set_irq(struct kvm *kvm, u32 vcpu_id, u32 sint)
+{
+	struct kvm_vcpu_hv_synic *synic;
+
+	synic = synic_get(kvm, vcpu_id);
+	if (!synic)
+		return -EINVAL;
+
+	return synic_set_irq(synic, sint);
+}
+
+void kvm_hv_synic_send_eoi(struct kvm_vcpu *vcpu, int vector)
+{
+	struct kvm_vcpu_hv_synic *synic = vcpu_to_synic(vcpu);
+	int i;
+
+	vcpu_debug(vcpu, "Hyper-V SynIC send eoi vec %d\n", vector);
+
+	for (i = 0; i < ARRAY_SIZE(synic->sint); i++)
+		if (synic_get_sint_vector(synic_read_sint(synic, i)) == vector)
+			kvm_hv_notify_acked_sint(vcpu, i);
+}
+
+static int kvm_hv_set_sint_gsi(struct kvm *kvm, u32 vcpu_id, u32 sint, int gsi)
+{
+	struct kvm_vcpu_hv_synic *synic;
+
+	synic = synic_get(kvm, vcpu_id);
+	if (!synic)
+		return -EINVAL;
+
+	if (sint >= ARRAY_SIZE(synic->sint_to_gsi))
+		return -EINVAL;
+
+	atomic_set(&synic->sint_to_gsi[sint], gsi);
+	return 0;
+}
+
+void kvm_hv_irq_routing_update(struct kvm *kvm)
+{
+	struct kvm_irq_routing_table *irq_rt;
+	struct kvm_kernel_irq_routing_entry *e;
+	u32 gsi;
+
+	irq_rt = srcu_dereference_check(kvm->irq_routing, &kvm->irq_srcu,
+					lockdep_is_held(&kvm->irq_lock));
+
+	for (gsi = 0; gsi < irq_rt->nr_rt_entries; gsi++) {
+		hlist_for_each_entry(e, &irq_rt->map[gsi], link) {
+			if (e->type == KVM_IRQ_ROUTING_HV_SINT)
+				kvm_hv_set_sint_gsi(kvm, e->hv_sint.vcpu,
+						    e->hv_sint.sint, gsi);
+		}
+	}
+}
+
+static void synic_init(struct kvm_vcpu_hv_synic *synic)
+{
+	int i;
+
+	memset(synic, 0, sizeof(*synic));
+	synic->version = HV_SYNIC_VERSION_1;
+	for (i = 0; i < ARRAY_SIZE(synic->sint); i++) {
+		atomic64_set(&synic->sint[i], HV_SYNIC_SINT_MASKED);
+		atomic_set(&synic->sint_to_gsi[i], -1);
+	}
+}
+
+void kvm_hv_vcpu_init(struct kvm_vcpu *vcpu)
+{
+	synic_init(vcpu_to_synic(vcpu));
+}
+
+int kvm_hv_activate_synic(struct kvm_vcpu *vcpu)
+{
+	/*
+	 * Hyper-V SynIC auto EOI SINT's are
+	 * not compatible with APICV, so deactivate APICV
+	 */
+	kvm_vcpu_deactivate_apicv(vcpu);
+	vcpu_to_synic(vcpu)->active = true;
+	return 0;
+}
+
 static bool kvm_hv_msr_partition_wide(u32 msr)
 {
 	bool r = false;
@@ -226,6 +527,13 @@ static int kvm_hv_set_msr(struct kvm_vcpu *vcpu, u32 msr, u64 data, bool host)
 			return 1;
 		hv->runtime_offset = data - current_task_runtime_100ns();
 		break;
+	case HV_X64_MSR_SCONTROL:
+	case HV_X64_MSR_SVERSION:
+	case HV_X64_MSR_SIEFP:
+	case HV_X64_MSR_SIMP:
+	case HV_X64_MSR_EOM:
+	case HV_X64_MSR_SINT0 ... HV_X64_MSR_SINT15:
+		return synic_set_msr(vcpu_to_synic(vcpu), msr, data, host);
 	default:
 		vcpu_unimpl(vcpu, "Hyper-V uhandled wrmsr: 0x%x data 0x%llx\n",
 			    msr, data);
@@ -304,6 +612,13 @@ static int kvm_hv_get_msr(struct kvm_vcpu *vcpu, u32 msr, u64 *pdata)
 	case HV_X64_MSR_VP_RUNTIME:
 		data = current_task_runtime_100ns() + hv->runtime_offset;
 		break;
+	case HV_X64_MSR_SCONTROL:
+	case HV_X64_MSR_SVERSION:
+	case HV_X64_MSR_SIEFP:
+	case HV_X64_MSR_SIMP:
+	case HV_X64_MSR_EOM:
+	case HV_X64_MSR_SINT0 ... HV_X64_MSR_SINT15:
+		return synic_get_msr(vcpu_to_synic(vcpu), msr, pdata);
 	default:
 		vcpu_unimpl(vcpu, "Hyper-V unhandled rdmsr: 0x%x\n", msr);
 		return 1;
diff --git a/arch/x86/kvm/hyperv.h b/arch/x86/kvm/hyperv.h
index c7bce55..315af4b 100644
--- a/arch/x86/kvm/hyperv.h
+++ b/arch/x86/kvm/hyperv.h
@@ -29,4 +29,27 @@ int kvm_hv_get_msr_common(struct kvm_vcpu *vcpu, u32 msr, u64 *pdata);
 bool kvm_hv_hypercall_enabled(struct kvm *kvm);
 int kvm_hv_hypercall(struct kvm_vcpu *vcpu);
 
+int kvm_hv_synic_set_irq(struct kvm *kvm, u32 vcpu_id, u32 sint);
+void kvm_hv_synic_send_eoi(struct kvm_vcpu *vcpu, int vector);
+
+static inline struct kvm_vcpu_hv_synic *vcpu_to_synic(struct kvm_vcpu *vcpu)
+{
+	return &vcpu->arch.hyperv.synic;
+}
+
+static inline struct kvm_vcpu *synic_to_vcpu(struct kvm_vcpu_hv_synic *synic)
+{
+	struct kvm_vcpu_hv *hv;
+	struct kvm_vcpu_arch *arch;
+
+	hv = container_of(synic, struct kvm_vcpu_hv, synic);
+	arch = container_of(hv, struct kvm_vcpu_arch, hyperv);
+	return container_of(arch, struct kvm_vcpu, arch);
+}
+void kvm_hv_irq_routing_update(struct kvm *kvm);
+
+void kvm_hv_vcpu_init(struct kvm_vcpu *vcpu);
+
+int kvm_hv_activate_synic(struct kvm_vcpu *vcpu);
+
 #endif
diff --git a/arch/x86/kvm/irq_comm.c b/arch/x86/kvm/irq_comm.c
index ece901c..8fc89ef 100644
--- a/arch/x86/kvm/irq_comm.c
+++ b/arch/x86/kvm/irq_comm.c
@@ -33,6 +33,8 @@
 
 #include "lapic.h"
 
+#include "hyperv.h"
+
 static int kvm_set_pic_irq(struct kvm_kernel_irq_routing_entry *e,
 			   struct kvm *kvm, int irq_source_id, int level,
 			   bool line_status)
@@ -219,6 +221,16 @@ void kvm_fire_mask_notifiers(struct kvm *kvm, unsigned irqchip, unsigned pin,
 	srcu_read_unlock(&kvm->irq_srcu, idx);
 }
 
+static int kvm_hv_set_sint(struct kvm_kernel_irq_routing_entry *e,
+		    struct kvm *kvm, int irq_source_id, int level,
+		    bool line_status)
+{
+	if (!level)
+		return -1;
+
+	return kvm_hv_synic_set_irq(kvm, e->hv_sint.vcpu, e->hv_sint.sint);
+}
+
 int kvm_set_routing_entry(struct kvm_kernel_irq_routing_entry *e,
 			  const struct kvm_irq_routing_entry *ue)
 {
@@ -257,6 +269,11 @@ int kvm_set_routing_entry(struct kvm_kernel_irq_routing_entry *e,
 		e->msi.address_hi = ue->u.msi.address_hi;
 		e->msi.data = ue->u.msi.data;
 		break;
+	case KVM_IRQ_ROUTING_HV_SINT:
+		e->set = kvm_hv_set_sint;
+		e->hv_sint.vcpu = ue->u.hv_sint.vcpu;
+		e->hv_sint.sint = ue->u.hv_sint.sint;
+		break;
 	default:
 		goto out;
 	}
@@ -376,3 +393,20 @@ void kvm_scan_ioapic_routes(struct kvm_vcpu *vcpu,
 	}
 	srcu_read_unlock(&kvm->irq_srcu, idx);
 }
+
+int kvm_arch_set_irq(struct kvm_kernel_irq_routing_entry *irq, struct kvm *kvm,
+		     int irq_source_id, int level, bool line_status)
+{
+	switch (irq->type) {
+	case KVM_IRQ_ROUTING_HV_SINT:
+		return kvm_hv_set_sint(irq, kvm, irq_source_id, level,
+				       line_status);
+	default:
+		return -EWOULDBLOCK;
+	}
+}
+
+void kvm_arch_irq_routing_update(struct kvm *kvm)
+{
+	kvm_hv_irq_routing_update(kvm);
+}
diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index 14d6fcc..80a13c0 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -41,6 +41,7 @@
 #include "trace.h"
 #include "x86.h"
 #include "cpuid.h"
+#include "hyperv.h"
 
 #ifndef CONFIG_X86_64
 #define mod_64(x, y) ((x) - (y) * div64_u64(x, y))
@@ -128,11 +129,6 @@ static inline int apic_enabled(struct kvm_lapic *apic)
 	(LVT_MASK | APIC_MODE_MASK | APIC_INPUT_POLARITY | \
 	 APIC_LVT_REMOTE_IRR | APIC_LVT_LEVEL_TRIGGER)
 
-static inline int kvm_apic_id(struct kvm_lapic *apic)
-{
-	return (kvm_apic_get_reg(apic, APIC_ID) >> 24) & 0xff;
-}
-
 /* The logical map is definitely wrong if we have multiple
  * modes at the same time.  (Physical map is always right.)
  */
@@ -975,6 +971,9 @@ static int apic_set_eoi(struct kvm_lapic *apic)
 	apic_clear_isr(vector, apic);
 	apic_update_ppr(apic);
 
+	if (test_bit(vector, vcpu_to_synic(apic->vcpu)->vec_bitmap))
+		kvm_hv_synic_send_eoi(apic->vcpu, vector);
+
 	kvm_ioapic_send_eoi(apic, vector);
 	kvm_make_request(KVM_REQ_EVENT, apic->vcpu);
 	return vector;
@@ -1884,6 +1883,12 @@ int kvm_get_apic_interrupt(struct kvm_vcpu *vcpu)
 	apic_set_isr(vector, apic);
 	apic_update_ppr(apic);
 	apic_clear_irr(vector, apic);
+
+	if (test_bit(vector, vcpu_to_synic(vcpu)->auto_eoi_bitmap)) {
+		apic_clear_isr(vector, apic);
+		apic_update_ppr(apic);
+	}
+
 	return vector;
 }
 
diff --git a/arch/x86/kvm/lapic.h b/arch/x86/kvm/lapic.h
index 5fc60e4..41bdb35 100644
--- a/arch/x86/kvm/lapic.h
+++ b/arch/x86/kvm/lapic.h
@@ -164,6 +164,11 @@ static inline int kvm_lapic_latched_init(struct kvm_vcpu *vcpu)
 	return kvm_vcpu_has_lapic(vcpu) && test_bit(KVM_APIC_INIT, &vcpu->arch.apic->pending_events);
 }
 
+static inline int kvm_apic_id(struct kvm_lapic *apic)
+{
+	return (kvm_apic_get_reg(apic, APIC_ID) >> 24) & 0xff;
+}
+
 bool kvm_apic_pending_eoi(struct kvm_vcpu *vcpu, int vector);
 
 void wait_lapic_expire(struct kvm_vcpu *vcpu);
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index afb0a60..41f3030 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -960,6 +960,7 @@ static u32 emulated_msrs[] = {
 	HV_X64_MSR_RESET,
 	HV_X64_MSR_VP_INDEX,
 	HV_X64_MSR_VP_RUNTIME,
+	HV_X64_MSR_SCONTROL,
 	HV_X64_MSR_APIC_ASSIST_PAGE, MSR_KVM_ASYNC_PF_EN, MSR_KVM_STEAL_TIME,
 	MSR_KVM_PV_EOI_EN,
 
@@ -2438,6 +2439,7 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
 	case KVM_CAP_HYPERV:
 	case KVM_CAP_HYPERV_VAPIC:
 	case KVM_CAP_HYPERV_SPIN:
+	case KVM_CAP_HYPERV_SYNIC:
 	case KVM_CAP_PCI_SEGMENT:
 	case KVM_CAP_DEBUGREGS:
 	case KVM_CAP_X86_ROBUST_SINGLESTEP:
@@ -3069,6 +3071,20 @@ static int kvm_set_guest_paused(struct kvm_vcpu *vcpu)
 	return 0;
 }
 
+static int kvm_vcpu_ioctl_enable_cap(struct kvm_vcpu *vcpu,
+				     struct kvm_enable_cap *cap)
+{
+	if (cap->flags)
+		return -EINVAL;
+
+	switch (cap->cap) {
+	case KVM_CAP_HYPERV_SYNIC:
+		return kvm_hv_activate_synic(vcpu);
+	default:
+		return -EINVAL;
+	}
+}
+
 long kvm_arch_vcpu_ioctl(struct file *filp,
 			 unsigned int ioctl, unsigned long arg)
 {
@@ -3333,6 +3349,15 @@ long kvm_arch_vcpu_ioctl(struct file *filp,
 		r = kvm_set_guest_paused(vcpu);
 		goto out;
 	}
+	case KVM_ENABLE_CAP: {
+		struct kvm_enable_cap cap;
+
+		r = -EFAULT;
+		if (copy_from_user(&cap, argp, sizeof(cap)))
+			goto out;
+		r = kvm_vcpu_ioctl_enable_cap(vcpu, &cap);
+		break;
+	}
 	default:
 		r = -EINVAL;
 	}
@@ -6206,6 +6231,8 @@ static void process_smi(struct kvm_vcpu *vcpu)
 
 static void vcpu_scan_ioapic(struct kvm_vcpu *vcpu)
 {
+	u64 eoi_exit_bitmap[4];
+
 	if (!kvm_apic_hw_enabled(vcpu->arch.apic))
 		return;
 
@@ -6218,8 +6245,9 @@ static void vcpu_scan_ioapic(struct kvm_vcpu *vcpu)
 			kvm_x86_ops->sync_pir_to_irr(vcpu);
 		kvm_ioapic_scan_entry(vcpu, vcpu->arch.ioapic_handled_vectors);
 	}
-	kvm_x86_ops->load_eoi_exitmap(vcpu,
-				      (u64 *)vcpu->arch.ioapic_handled_vectors);
+	bitmap_or((ulong *)eoi_exit_bitmap, vcpu->arch.ioapic_handled_vectors,
+		  vcpu_to_synic(vcpu)->vec_bitmap, 256);
+	kvm_x86_ops->load_eoi_exitmap(vcpu, eoi_exit_bitmap);
 }
 
 static void kvm_vcpu_flush_tlb(struct kvm_vcpu *vcpu)
@@ -7475,6 +7503,8 @@ int kvm_arch_vcpu_init(struct kvm_vcpu *vcpu)
 
 	vcpu->arch.pending_external_vector = -1;
 
+	kvm_hv_vcpu_init(vcpu);
+
 	return 0;
 
 fail_free_mce_banks:
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index dbe2a2f..a998e55 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -318,6 +318,11 @@ struct kvm_s390_adapter_int {
 	u32 adapter_id;
 };
 
+struct kvm_hv_sint {
+	u32 vcpu;
+	u32 sint;
+};
+
 struct kvm_kernel_irq_routing_entry {
 	u32 gsi;
 	u32 type;
@@ -331,6 +336,7 @@ struct kvm_kernel_irq_routing_entry {
 		} irqchip;
 		struct msi_msg msi;
 		struct kvm_s390_adapter_int adapter;
+		struct kvm_hv_sint hv_sint;
 	};
 	struct hlist_node link;
 };
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index 03f3618..27ce460 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -831,6 +831,7 @@ struct kvm_ppc_smmu_info {
 #define KVM_CAP_GUEST_DEBUG_HW_WPS 120
 #define KVM_CAP_SPLIT_IRQCHIP 121
 #define KVM_CAP_IOEVENTFD_ANY_LENGTH 122
+#define KVM_CAP_HYPERV_SYNIC 123
 
 #ifdef KVM_CAP_IRQ_ROUTING
 
@@ -854,10 +855,16 @@ struct kvm_irq_routing_s390_adapter {
 	__u32 adapter_id;
 };
 
+struct kvm_irq_routing_hv_sint {
+	__u32 vcpu;
+	__u32 sint;
+};
+
 /* gsi routing entry types */
 #define KVM_IRQ_ROUTING_IRQCHIP 1
 #define KVM_IRQ_ROUTING_MSI 2
 #define KVM_IRQ_ROUTING_S390_ADAPTER 3
+#define KVM_IRQ_ROUTING_HV_SINT 4
 
 struct kvm_irq_routing_entry {
 	__u32 gsi;
@@ -868,6 +875,7 @@ struct kvm_irq_routing_entry {
 		struct kvm_irq_routing_irqchip irqchip;
 		struct kvm_irq_routing_msi msi;
 		struct kvm_irq_routing_s390_adapter adapter;
+		struct kvm_irq_routing_hv_sint hv_sint;
 		__u32 pad[8];
 	} u;
 };
-- 
2.4.3


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [Qemu-devel] [PATCH v4 4/5] kvm/x86: Hyper-V synthetic interrupt controller
@ 2015-11-10 12:36   ` Andrey Smetanin
  0 siblings, 0 replies; 38+ messages in thread
From: Andrey Smetanin @ 2015-11-10 12:36 UTC (permalink / raw)
  To: kvm; +Cc: Gleb Natapov, Denis V. Lunev, qemu-devel, Roman Kagan, Paolo Bonzini

SynIC (synthetic interrupt controller) is a lapic extension,
which is controlled via MSRs and maintains for each vCPU
 - 16 synthetic interrupt "lines" (SINT's); each can be configured to
   trigger a specific interrupt vector optionally with auto-EOI
   semantics
 - a message page in the guest memory with 16 256-byte per-SINT message
   slots
 - an event flag page in the guest memory with 16 2048-bit per-SINT
   event flag areas

The host triggers a SINT whenever it delivers a new message to the
corresponding slot or flips an event flag bit in the corresponding area.
The guest informs the host that it can try delivering a message by
explicitly asserting EOI in lapic or writing to End-Of-Message (EOM)
MSR.

The userspace (qemu) triggers interrupts and receives EOM notifications
via irqfd with resampler; for that, a GSI is allocated for each
configured SINT, and irq_routing api is extended to support GSI-SINT
mapping.

Changes v4:
* added activation of SynIC by vcpu KVM_ENABLE_CAP
* added per SynIC active flag
* added deactivation of APICv upon SynIC activation

Changes v3:
* added KVM_CAP_HYPERV_SYNIC and KVM_IRQ_ROUTING_HV_SINT notes into
docs

Changes v2:
* do not use posted interrupts for Hyper-V SynIC AutoEOI vectors
* add Hyper-V SynIC vectors into EOI exit bitmap
* Hyper-V SyniIC SINT msr write logic simplified

Signed-off-by: Andrey Smetanin <asmetanin@virtuozzo.com>
Reviewed-by: Roman Kagan <rkagan@virtuozzo.com>
Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Gleb Natapov <gleb@kernel.org>
CC: Paolo Bonzini <pbonzini@redhat.com>
CC: Roman Kagan <rkagan@virtuozzo.com>
CC: Denis V. Lunev <den@openvz.org>
CC: qemu-devel@nongnu.org

---
 Documentation/virtual/kvm/api.txt |  19 +++
 arch/x86/include/asm/kvm_host.h   |  15 ++
 arch/x86/kvm/hyperv.c             | 315 ++++++++++++++++++++++++++++++++++++++
 arch/x86/kvm/hyperv.h             |  23 +++
 arch/x86/kvm/irq_comm.c           |  34 ++++
 arch/x86/kvm/lapic.c              |  15 +-
 arch/x86/kvm/lapic.h              |   5 +
 arch/x86/kvm/x86.c                |  34 +++-
 include/linux/kvm_host.h          |   6 +
 include/uapi/linux/kvm.h          |   8 +
 10 files changed, 467 insertions(+), 7 deletions(-)

diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt
index 34cc068..16096a2 100644
--- a/Documentation/virtual/kvm/api.txt
+++ b/Documentation/virtual/kvm/api.txt
@@ -1451,6 +1451,7 @@ struct kvm_irq_routing_entry {
 		struct kvm_irq_routing_irqchip irqchip;
 		struct kvm_irq_routing_msi msi;
 		struct kvm_irq_routing_s390_adapter adapter;
+		struct kvm_irq_routing_hv_sint hv_sint;
 		__u32 pad[8];
 	} u;
 };
@@ -1459,6 +1460,7 @@ struct kvm_irq_routing_entry {
 #define KVM_IRQ_ROUTING_IRQCHIP 1
 #define KVM_IRQ_ROUTING_MSI 2
 #define KVM_IRQ_ROUTING_S390_ADAPTER 3
+#define KVM_IRQ_ROUTING_HV_SINT 4
 
 No flags are specified so far, the corresponding field must be set to zero.
 
@@ -1482,6 +1484,10 @@ struct kvm_irq_routing_s390_adapter {
 	__u32 adapter_id;
 };
 
+struct kvm_irq_routing_hv_sint {
+	__u32 vcpu;
+	__u32 sint;
+};
 
 4.53 KVM_ASSIGN_SET_MSIX_NR (deprecated)
 
@@ -3685,3 +3691,16 @@ available, means that that the kernel has an implementation of the
 H_RANDOM hypercall backed by a hardware random-number generator.
 If present, the kernel H_RANDOM handler can be enabled for guest use
 with the KVM_CAP_PPC_ENABLE_HCALL capability.
+
+8.2 KVM_CAP_HYPERV_SYNIC
+
+Architectures: x86
+This capability, if KVM_CHECK_EXTENSION indicates that it is
+available, means that that the kernel has an implementation of the
+Hyper-V Synthetic interrupt controller(SynIC). Hyper-V SynIC is
+used to support Windows Hyper-V based guest paravirt drivers(VMBus).
+
+In order to use SynIC, it has to be activated by setting this
+capability via KVM_ENABLE_CAP ioctl on the vcpu fd. Note that this
+will disable the use of APIC hardware virtualization even if supported
+by the CPU, as it's incompatible with SynIC auto-EOI behavior.
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index a60a461..ad29e89 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -25,6 +25,7 @@
 #include <linux/pvclock_gtod.h>
 #include <linux/clocksource.h>
 #include <linux/irqbypass.h>
+#include <linux/hyperv.h>
 
 #include <asm/pvclock-abi.h>
 #include <asm/desc.h>
@@ -374,10 +375,24 @@ struct kvm_mtrr {
 	struct list_head head;
 };
 
+/* Hyper-V synthetic interrupt controller (SynIC)*/
+struct kvm_vcpu_hv_synic {
+	u64 version;
+	u64 control;
+	u64 msg_page;
+	u64 evt_page;
+	atomic64_t sint[HV_SYNIC_SINT_COUNT];
+	atomic_t sint_to_gsi[HV_SYNIC_SINT_COUNT];
+	DECLARE_BITMAP(auto_eoi_bitmap, 256);
+	DECLARE_BITMAP(vec_bitmap, 256);
+	bool active;
+};
+
 /* Hyper-V per vcpu emulation context */
 struct kvm_vcpu_hv {
 	u64 hv_vapic;
 	s64 runtime_offset;
+	struct kvm_vcpu_hv_synic synic;
 };
 
 struct kvm_vcpu_arch {
diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
index 62cf8c9..83a3c0c 100644
--- a/arch/x86/kvm/hyperv.c
+++ b/arch/x86/kvm/hyperv.c
@@ -23,13 +23,314 @@
 
 #include "x86.h"
 #include "lapic.h"
+#include "ioapic.h"
 #include "hyperv.h"
 
 #include <linux/kvm_host.h>
+#include <asm/apicdef.h>
 #include <trace/events/kvm.h>
 
 #include "trace.h"
 
+static inline u64 synic_read_sint(struct kvm_vcpu_hv_synic *synic, int sint)
+{
+	return atomic64_read(&synic->sint[sint]);
+}
+
+static inline int synic_get_sint_vector(u64 sint_value)
+{
+	if (sint_value & HV_SYNIC_SINT_MASKED)
+		return -1;
+	return sint_value & HV_SYNIC_SINT_VECTOR_MASK;
+}
+
+static bool synic_has_vector_connected(struct kvm_vcpu_hv_synic *synic,
+				      int vector)
+{
+	int i;
+
+	for (i = 0; i < ARRAY_SIZE(synic->sint); i++) {
+		if (synic_get_sint_vector(synic_read_sint(synic, i)) == vector)
+			return true;
+	}
+	return false;
+}
+
+static bool synic_has_vector_auto_eoi(struct kvm_vcpu_hv_synic *synic,
+				     int vector)
+{
+	int i;
+	u64 sint_value;
+
+	for (i = 0; i < ARRAY_SIZE(synic->sint); i++) {
+		sint_value = synic_read_sint(synic, i);
+		if (synic_get_sint_vector(sint_value) == vector &&
+		    sint_value & HV_SYNIC_SINT_AUTO_EOI)
+			return true;
+	}
+	return false;
+}
+
+static int synic_set_sint(struct kvm_vcpu_hv_synic *synic, int sint, u64 data)
+{
+	int vector;
+
+	vector = data & HV_SYNIC_SINT_VECTOR_MASK;
+	if (vector < 16)
+		return 1;
+	/*
+	 * Guest may configure multiple SINTs to use the same vector, so
+	 * we maintain a bitmap of vectors handled by synic, and a
+	 * bitmap of vectors with auto-eoi behavior.  The bitmaps are
+	 * updated here, and atomically queried on fast paths.
+	 */
+
+	atomic64_set(&synic->sint[sint], data);
+
+	if (synic_has_vector_connected(synic, vector))
+		__set_bit(vector, synic->vec_bitmap);
+	else
+		__clear_bit(vector, synic->vec_bitmap);
+
+	if (synic_has_vector_auto_eoi(synic, vector))
+		__set_bit(vector, synic->auto_eoi_bitmap);
+	else
+		__clear_bit(vector, synic->auto_eoi_bitmap);
+
+	/* Load SynIC vectors into EOI exit bitmap */
+	kvm_make_request(KVM_REQ_SCAN_IOAPIC, synic_to_vcpu(synic));
+	return 0;
+}
+
+static struct kvm_vcpu_hv_synic *synic_get(struct kvm *kvm, u32 vcpu_id)
+{
+	struct kvm_vcpu *vcpu;
+	struct kvm_vcpu_hv_synic *synic;
+
+	if (vcpu_id >= atomic_read(&kvm->online_vcpus))
+		return NULL;
+	vcpu = kvm_get_vcpu(kvm, vcpu_id);
+	if (!vcpu)
+		return NULL;
+	synic = vcpu_to_synic(vcpu);
+	return (synic->active) ? synic : NULL;
+}
+
+static void kvm_hv_notify_acked_sint(struct kvm_vcpu *vcpu, u32 sint)
+{
+	struct kvm *kvm = vcpu->kvm;
+	int gsi, idx;
+
+	vcpu_debug(vcpu, "Hyper-V SynIC acked sint %d\n", sint);
+
+	idx = srcu_read_lock(&kvm->irq_srcu);
+	gsi = atomic_read(&vcpu_to_synic(vcpu)->sint_to_gsi[sint]);
+	if (gsi != -1)
+		kvm_notify_acked_gsi(kvm, gsi);
+	srcu_read_unlock(&kvm->irq_srcu, idx);
+}
+
+static int synic_set_msr(struct kvm_vcpu_hv_synic *synic,
+			 u32 msr, u64 data, bool host)
+{
+	struct kvm_vcpu *vcpu = synic_to_vcpu(synic);
+	int ret;
+
+	if (!synic->active)
+		return 1;
+
+	vcpu_debug(vcpu, "Hyper-V SynIC set msr 0x%x 0x%llx host %d\n",
+		   msr, data, host);
+	ret = 0;
+	switch (msr) {
+	case HV_X64_MSR_SCONTROL:
+		synic->control = data;
+		break;
+	case HV_X64_MSR_SVERSION:
+		if (!host) {
+			ret = 1;
+			break;
+		}
+		synic->version = data;
+		break;
+	case HV_X64_MSR_SIEFP:
+		if (data & HV_SYNIC_SIEFP_ENABLE)
+			if (kvm_clear_guest(vcpu->kvm,
+					    data & PAGE_MASK, PAGE_SIZE)) {
+				ret = 1;
+				break;
+			}
+		synic->evt_page = data;
+		break;
+	case HV_X64_MSR_SIMP:
+		if (data & HV_SYNIC_SIMP_ENABLE)
+			if (kvm_clear_guest(vcpu->kvm,
+					    data & PAGE_MASK, PAGE_SIZE)) {
+				ret = 1;
+				break;
+			}
+		synic->msg_page = data;
+		break;
+	case HV_X64_MSR_EOM: {
+		int i;
+
+		for (i = 0; i < ARRAY_SIZE(synic->sint); i++)
+			kvm_hv_notify_acked_sint(vcpu, i);
+		break;
+	}
+	case HV_X64_MSR_SINT0 ... HV_X64_MSR_SINT15:
+		ret = synic_set_sint(synic, msr - HV_X64_MSR_SINT0, data);
+		break;
+	default:
+		ret = 1;
+		break;
+	}
+	return ret;
+}
+
+static int synic_get_msr(struct kvm_vcpu_hv_synic *synic, u32 msr, u64 *pdata)
+{
+	int ret;
+
+	if (!synic->active)
+		return 1;
+
+	ret = 0;
+	switch (msr) {
+	case HV_X64_MSR_SCONTROL:
+		*pdata = synic->control;
+		break;
+	case HV_X64_MSR_SVERSION:
+		*pdata = synic->version;
+		break;
+	case HV_X64_MSR_SIEFP:
+		*pdata = synic->evt_page;
+		break;
+	case HV_X64_MSR_SIMP:
+		*pdata = synic->msg_page;
+		break;
+	case HV_X64_MSR_EOM:
+		*pdata = 0;
+		break;
+	case HV_X64_MSR_SINT0 ... HV_X64_MSR_SINT15:
+		*pdata = atomic64_read(&synic->sint[msr - HV_X64_MSR_SINT0]);
+		break;
+	default:
+		ret = 1;
+		break;
+	}
+	return ret;
+}
+
+int synic_set_irq(struct kvm_vcpu_hv_synic *synic, u32 sint)
+{
+	struct kvm_vcpu *vcpu = synic_to_vcpu(synic);
+	struct kvm_lapic_irq irq;
+	int ret, vector;
+
+	if (sint >= ARRAY_SIZE(synic->sint))
+		return -EINVAL;
+
+	vector = synic_get_sint_vector(synic_read_sint(synic, sint));
+	if (vector < 0)
+		return -ENOENT;
+
+	memset(&irq, 0, sizeof(irq));
+	irq.dest_id = kvm_apic_id(vcpu->arch.apic);
+	irq.dest_mode = APIC_DEST_PHYSICAL;
+	irq.delivery_mode = APIC_DM_FIXED;
+	irq.vector = vector;
+	irq.level = 1;
+
+	ret = kvm_irq_delivery_to_apic(vcpu->kvm, NULL, &irq, NULL);
+	vcpu_debug(vcpu, "Hyper-V SynIC set irq ret %d\n", ret);
+	return ret;
+}
+
+int kvm_hv_synic_set_irq(struct kvm *kvm, u32 vcpu_id, u32 sint)
+{
+	struct kvm_vcpu_hv_synic *synic;
+
+	synic = synic_get(kvm, vcpu_id);
+	if (!synic)
+		return -EINVAL;
+
+	return synic_set_irq(synic, sint);
+}
+
+void kvm_hv_synic_send_eoi(struct kvm_vcpu *vcpu, int vector)
+{
+	struct kvm_vcpu_hv_synic *synic = vcpu_to_synic(vcpu);
+	int i;
+
+	vcpu_debug(vcpu, "Hyper-V SynIC send eoi vec %d\n", vector);
+
+	for (i = 0; i < ARRAY_SIZE(synic->sint); i++)
+		if (synic_get_sint_vector(synic_read_sint(synic, i)) == vector)
+			kvm_hv_notify_acked_sint(vcpu, i);
+}
+
+static int kvm_hv_set_sint_gsi(struct kvm *kvm, u32 vcpu_id, u32 sint, int gsi)
+{
+	struct kvm_vcpu_hv_synic *synic;
+
+	synic = synic_get(kvm, vcpu_id);
+	if (!synic)
+		return -EINVAL;
+
+	if (sint >= ARRAY_SIZE(synic->sint_to_gsi))
+		return -EINVAL;
+
+	atomic_set(&synic->sint_to_gsi[sint], gsi);
+	return 0;
+}
+
+void kvm_hv_irq_routing_update(struct kvm *kvm)
+{
+	struct kvm_irq_routing_table *irq_rt;
+	struct kvm_kernel_irq_routing_entry *e;
+	u32 gsi;
+
+	irq_rt = srcu_dereference_check(kvm->irq_routing, &kvm->irq_srcu,
+					lockdep_is_held(&kvm->irq_lock));
+
+	for (gsi = 0; gsi < irq_rt->nr_rt_entries; gsi++) {
+		hlist_for_each_entry(e, &irq_rt->map[gsi], link) {
+			if (e->type == KVM_IRQ_ROUTING_HV_SINT)
+				kvm_hv_set_sint_gsi(kvm, e->hv_sint.vcpu,
+						    e->hv_sint.sint, gsi);
+		}
+	}
+}
+
+static void synic_init(struct kvm_vcpu_hv_synic *synic)
+{
+	int i;
+
+	memset(synic, 0, sizeof(*synic));
+	synic->version = HV_SYNIC_VERSION_1;
+	for (i = 0; i < ARRAY_SIZE(synic->sint); i++) {
+		atomic64_set(&synic->sint[i], HV_SYNIC_SINT_MASKED);
+		atomic_set(&synic->sint_to_gsi[i], -1);
+	}
+}
+
+void kvm_hv_vcpu_init(struct kvm_vcpu *vcpu)
+{
+	synic_init(vcpu_to_synic(vcpu));
+}
+
+int kvm_hv_activate_synic(struct kvm_vcpu *vcpu)
+{
+	/*
+	 * Hyper-V SynIC auto EOI SINT's are
+	 * not compatible with APICV, so deactivate APICV
+	 */
+	kvm_vcpu_deactivate_apicv(vcpu);
+	vcpu_to_synic(vcpu)->active = true;
+	return 0;
+}
+
 static bool kvm_hv_msr_partition_wide(u32 msr)
 {
 	bool r = false;
@@ -226,6 +527,13 @@ static int kvm_hv_set_msr(struct kvm_vcpu *vcpu, u32 msr, u64 data, bool host)
 			return 1;
 		hv->runtime_offset = data - current_task_runtime_100ns();
 		break;
+	case HV_X64_MSR_SCONTROL:
+	case HV_X64_MSR_SVERSION:
+	case HV_X64_MSR_SIEFP:
+	case HV_X64_MSR_SIMP:
+	case HV_X64_MSR_EOM:
+	case HV_X64_MSR_SINT0 ... HV_X64_MSR_SINT15:
+		return synic_set_msr(vcpu_to_synic(vcpu), msr, data, host);
 	default:
 		vcpu_unimpl(vcpu, "Hyper-V uhandled wrmsr: 0x%x data 0x%llx\n",
 			    msr, data);
@@ -304,6 +612,13 @@ static int kvm_hv_get_msr(struct kvm_vcpu *vcpu, u32 msr, u64 *pdata)
 	case HV_X64_MSR_VP_RUNTIME:
 		data = current_task_runtime_100ns() + hv->runtime_offset;
 		break;
+	case HV_X64_MSR_SCONTROL:
+	case HV_X64_MSR_SVERSION:
+	case HV_X64_MSR_SIEFP:
+	case HV_X64_MSR_SIMP:
+	case HV_X64_MSR_EOM:
+	case HV_X64_MSR_SINT0 ... HV_X64_MSR_SINT15:
+		return synic_get_msr(vcpu_to_synic(vcpu), msr, pdata);
 	default:
 		vcpu_unimpl(vcpu, "Hyper-V unhandled rdmsr: 0x%x\n", msr);
 		return 1;
diff --git a/arch/x86/kvm/hyperv.h b/arch/x86/kvm/hyperv.h
index c7bce55..315af4b 100644
--- a/arch/x86/kvm/hyperv.h
+++ b/arch/x86/kvm/hyperv.h
@@ -29,4 +29,27 @@ int kvm_hv_get_msr_common(struct kvm_vcpu *vcpu, u32 msr, u64 *pdata);
 bool kvm_hv_hypercall_enabled(struct kvm *kvm);
 int kvm_hv_hypercall(struct kvm_vcpu *vcpu);
 
+int kvm_hv_synic_set_irq(struct kvm *kvm, u32 vcpu_id, u32 sint);
+void kvm_hv_synic_send_eoi(struct kvm_vcpu *vcpu, int vector);
+
+static inline struct kvm_vcpu_hv_synic *vcpu_to_synic(struct kvm_vcpu *vcpu)
+{
+	return &vcpu->arch.hyperv.synic;
+}
+
+static inline struct kvm_vcpu *synic_to_vcpu(struct kvm_vcpu_hv_synic *synic)
+{
+	struct kvm_vcpu_hv *hv;
+	struct kvm_vcpu_arch *arch;
+
+	hv = container_of(synic, struct kvm_vcpu_hv, synic);
+	arch = container_of(hv, struct kvm_vcpu_arch, hyperv);
+	return container_of(arch, struct kvm_vcpu, arch);
+}
+void kvm_hv_irq_routing_update(struct kvm *kvm);
+
+void kvm_hv_vcpu_init(struct kvm_vcpu *vcpu);
+
+int kvm_hv_activate_synic(struct kvm_vcpu *vcpu);
+
 #endif
diff --git a/arch/x86/kvm/irq_comm.c b/arch/x86/kvm/irq_comm.c
index ece901c..8fc89ef 100644
--- a/arch/x86/kvm/irq_comm.c
+++ b/arch/x86/kvm/irq_comm.c
@@ -33,6 +33,8 @@
 
 #include "lapic.h"
 
+#include "hyperv.h"
+
 static int kvm_set_pic_irq(struct kvm_kernel_irq_routing_entry *e,
 			   struct kvm *kvm, int irq_source_id, int level,
 			   bool line_status)
@@ -219,6 +221,16 @@ void kvm_fire_mask_notifiers(struct kvm *kvm, unsigned irqchip, unsigned pin,
 	srcu_read_unlock(&kvm->irq_srcu, idx);
 }
 
+static int kvm_hv_set_sint(struct kvm_kernel_irq_routing_entry *e,
+		    struct kvm *kvm, int irq_source_id, int level,
+		    bool line_status)
+{
+	if (!level)
+		return -1;
+
+	return kvm_hv_synic_set_irq(kvm, e->hv_sint.vcpu, e->hv_sint.sint);
+}
+
 int kvm_set_routing_entry(struct kvm_kernel_irq_routing_entry *e,
 			  const struct kvm_irq_routing_entry *ue)
 {
@@ -257,6 +269,11 @@ int kvm_set_routing_entry(struct kvm_kernel_irq_routing_entry *e,
 		e->msi.address_hi = ue->u.msi.address_hi;
 		e->msi.data = ue->u.msi.data;
 		break;
+	case KVM_IRQ_ROUTING_HV_SINT:
+		e->set = kvm_hv_set_sint;
+		e->hv_sint.vcpu = ue->u.hv_sint.vcpu;
+		e->hv_sint.sint = ue->u.hv_sint.sint;
+		break;
 	default:
 		goto out;
 	}
@@ -376,3 +393,20 @@ void kvm_scan_ioapic_routes(struct kvm_vcpu *vcpu,
 	}
 	srcu_read_unlock(&kvm->irq_srcu, idx);
 }
+
+int kvm_arch_set_irq(struct kvm_kernel_irq_routing_entry *irq, struct kvm *kvm,
+		     int irq_source_id, int level, bool line_status)
+{
+	switch (irq->type) {
+	case KVM_IRQ_ROUTING_HV_SINT:
+		return kvm_hv_set_sint(irq, kvm, irq_source_id, level,
+				       line_status);
+	default:
+		return -EWOULDBLOCK;
+	}
+}
+
+void kvm_arch_irq_routing_update(struct kvm *kvm)
+{
+	kvm_hv_irq_routing_update(kvm);
+}
diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index 14d6fcc..80a13c0 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -41,6 +41,7 @@
 #include "trace.h"
 #include "x86.h"
 #include "cpuid.h"
+#include "hyperv.h"
 
 #ifndef CONFIG_X86_64
 #define mod_64(x, y) ((x) - (y) * div64_u64(x, y))
@@ -128,11 +129,6 @@ static inline int apic_enabled(struct kvm_lapic *apic)
 	(LVT_MASK | APIC_MODE_MASK | APIC_INPUT_POLARITY | \
 	 APIC_LVT_REMOTE_IRR | APIC_LVT_LEVEL_TRIGGER)
 
-static inline int kvm_apic_id(struct kvm_lapic *apic)
-{
-	return (kvm_apic_get_reg(apic, APIC_ID) >> 24) & 0xff;
-}
-
 /* The logical map is definitely wrong if we have multiple
  * modes at the same time.  (Physical map is always right.)
  */
@@ -975,6 +971,9 @@ static int apic_set_eoi(struct kvm_lapic *apic)
 	apic_clear_isr(vector, apic);
 	apic_update_ppr(apic);
 
+	if (test_bit(vector, vcpu_to_synic(apic->vcpu)->vec_bitmap))
+		kvm_hv_synic_send_eoi(apic->vcpu, vector);
+
 	kvm_ioapic_send_eoi(apic, vector);
 	kvm_make_request(KVM_REQ_EVENT, apic->vcpu);
 	return vector;
@@ -1884,6 +1883,12 @@ int kvm_get_apic_interrupt(struct kvm_vcpu *vcpu)
 	apic_set_isr(vector, apic);
 	apic_update_ppr(apic);
 	apic_clear_irr(vector, apic);
+
+	if (test_bit(vector, vcpu_to_synic(vcpu)->auto_eoi_bitmap)) {
+		apic_clear_isr(vector, apic);
+		apic_update_ppr(apic);
+	}
+
 	return vector;
 }
 
diff --git a/arch/x86/kvm/lapic.h b/arch/x86/kvm/lapic.h
index 5fc60e4..41bdb35 100644
--- a/arch/x86/kvm/lapic.h
+++ b/arch/x86/kvm/lapic.h
@@ -164,6 +164,11 @@ static inline int kvm_lapic_latched_init(struct kvm_vcpu *vcpu)
 	return kvm_vcpu_has_lapic(vcpu) && test_bit(KVM_APIC_INIT, &vcpu->arch.apic->pending_events);
 }
 
+static inline int kvm_apic_id(struct kvm_lapic *apic)
+{
+	return (kvm_apic_get_reg(apic, APIC_ID) >> 24) & 0xff;
+}
+
 bool kvm_apic_pending_eoi(struct kvm_vcpu *vcpu, int vector);
 
 void wait_lapic_expire(struct kvm_vcpu *vcpu);
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index afb0a60..41f3030 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -960,6 +960,7 @@ static u32 emulated_msrs[] = {
 	HV_X64_MSR_RESET,
 	HV_X64_MSR_VP_INDEX,
 	HV_X64_MSR_VP_RUNTIME,
+	HV_X64_MSR_SCONTROL,
 	HV_X64_MSR_APIC_ASSIST_PAGE, MSR_KVM_ASYNC_PF_EN, MSR_KVM_STEAL_TIME,
 	MSR_KVM_PV_EOI_EN,
 
@@ -2438,6 +2439,7 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
 	case KVM_CAP_HYPERV:
 	case KVM_CAP_HYPERV_VAPIC:
 	case KVM_CAP_HYPERV_SPIN:
+	case KVM_CAP_HYPERV_SYNIC:
 	case KVM_CAP_PCI_SEGMENT:
 	case KVM_CAP_DEBUGREGS:
 	case KVM_CAP_X86_ROBUST_SINGLESTEP:
@@ -3069,6 +3071,20 @@ static int kvm_set_guest_paused(struct kvm_vcpu *vcpu)
 	return 0;
 }
 
+static int kvm_vcpu_ioctl_enable_cap(struct kvm_vcpu *vcpu,
+				     struct kvm_enable_cap *cap)
+{
+	if (cap->flags)
+		return -EINVAL;
+
+	switch (cap->cap) {
+	case KVM_CAP_HYPERV_SYNIC:
+		return kvm_hv_activate_synic(vcpu);
+	default:
+		return -EINVAL;
+	}
+}
+
 long kvm_arch_vcpu_ioctl(struct file *filp,
 			 unsigned int ioctl, unsigned long arg)
 {
@@ -3333,6 +3349,15 @@ long kvm_arch_vcpu_ioctl(struct file *filp,
 		r = kvm_set_guest_paused(vcpu);
 		goto out;
 	}
+	case KVM_ENABLE_CAP: {
+		struct kvm_enable_cap cap;
+
+		r = -EFAULT;
+		if (copy_from_user(&cap, argp, sizeof(cap)))
+			goto out;
+		r = kvm_vcpu_ioctl_enable_cap(vcpu, &cap);
+		break;
+	}
 	default:
 		r = -EINVAL;
 	}
@@ -6206,6 +6231,8 @@ static void process_smi(struct kvm_vcpu *vcpu)
 
 static void vcpu_scan_ioapic(struct kvm_vcpu *vcpu)
 {
+	u64 eoi_exit_bitmap[4];
+
 	if (!kvm_apic_hw_enabled(vcpu->arch.apic))
 		return;
 
@@ -6218,8 +6245,9 @@ static void vcpu_scan_ioapic(struct kvm_vcpu *vcpu)
 			kvm_x86_ops->sync_pir_to_irr(vcpu);
 		kvm_ioapic_scan_entry(vcpu, vcpu->arch.ioapic_handled_vectors);
 	}
-	kvm_x86_ops->load_eoi_exitmap(vcpu,
-				      (u64 *)vcpu->arch.ioapic_handled_vectors);
+	bitmap_or((ulong *)eoi_exit_bitmap, vcpu->arch.ioapic_handled_vectors,
+		  vcpu_to_synic(vcpu)->vec_bitmap, 256);
+	kvm_x86_ops->load_eoi_exitmap(vcpu, eoi_exit_bitmap);
 }
 
 static void kvm_vcpu_flush_tlb(struct kvm_vcpu *vcpu)
@@ -7475,6 +7503,8 @@ int kvm_arch_vcpu_init(struct kvm_vcpu *vcpu)
 
 	vcpu->arch.pending_external_vector = -1;
 
+	kvm_hv_vcpu_init(vcpu);
+
 	return 0;
 
 fail_free_mce_banks:
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index dbe2a2f..a998e55 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -318,6 +318,11 @@ struct kvm_s390_adapter_int {
 	u32 adapter_id;
 };
 
+struct kvm_hv_sint {
+	u32 vcpu;
+	u32 sint;
+};
+
 struct kvm_kernel_irq_routing_entry {
 	u32 gsi;
 	u32 type;
@@ -331,6 +336,7 @@ struct kvm_kernel_irq_routing_entry {
 		} irqchip;
 		struct msi_msg msi;
 		struct kvm_s390_adapter_int adapter;
+		struct kvm_hv_sint hv_sint;
 	};
 	struct hlist_node link;
 };
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index 03f3618..27ce460 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -831,6 +831,7 @@ struct kvm_ppc_smmu_info {
 #define KVM_CAP_GUEST_DEBUG_HW_WPS 120
 #define KVM_CAP_SPLIT_IRQCHIP 121
 #define KVM_CAP_IOEVENTFD_ANY_LENGTH 122
+#define KVM_CAP_HYPERV_SYNIC 123
 
 #ifdef KVM_CAP_IRQ_ROUTING
 
@@ -854,10 +855,16 @@ struct kvm_irq_routing_s390_adapter {
 	__u32 adapter_id;
 };
 
+struct kvm_irq_routing_hv_sint {
+	__u32 vcpu;
+	__u32 sint;
+};
+
 /* gsi routing entry types */
 #define KVM_IRQ_ROUTING_IRQCHIP 1
 #define KVM_IRQ_ROUTING_MSI 2
 #define KVM_IRQ_ROUTING_S390_ADAPTER 3
+#define KVM_IRQ_ROUTING_HV_SINT 4
 
 struct kvm_irq_routing_entry {
 	__u32 gsi;
@@ -868,6 +875,7 @@ struct kvm_irq_routing_entry {
 		struct kvm_irq_routing_irqchip irqchip;
 		struct kvm_irq_routing_msi msi;
 		struct kvm_irq_routing_s390_adapter adapter;
+		struct kvm_irq_routing_hv_sint hv_sint;
 		__u32 pad[8];
 	} u;
 };
-- 
2.4.3

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [PATCH v4 5/5] kvm/x86: Hyper-V kvm exit
  2015-11-10 12:36 ` [Qemu-devel] " Andrey Smetanin
@ 2015-11-10 12:36   ` Andrey Smetanin
  -1 siblings, 0 replies; 38+ messages in thread
From: Andrey Smetanin @ 2015-11-10 12:36 UTC (permalink / raw)
  To: kvm; +Cc: Denis V. Lunev, Gleb Natapov, Paolo Bonzini, Roman Kagan, qemu-devel

A new vcpu exit is introduced to notify the userspace of the
changes in Hyper-V SynIC configuration triggered by guest writing to the
corresponding MSRs.

Changes v4:
* exit into userspace only if guest writes into SynIC MSR's

Changes v3:
* added KVM_EXIT_HYPERV types and structs notes into docs

Signed-off-by: Andrey Smetanin <asmetanin@virtuozzo.com>
Reviewed-by: Roman Kagan <rkagan@virtuozzo.com>
Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Gleb Natapov <gleb@kernel.org>
CC: Paolo Bonzini <pbonzini@redhat.com>
CC: Roman Kagan <rkagan@virtuozzo.com>
CC: Denis V. Lunev <den@openvz.org>
CC: qemu-devel@nongnu.org

---
 Documentation/virtual/kvm/api.txt | 22 ++++++++++++++++++++++
 arch/x86/include/asm/kvm_host.h   |  1 +
 arch/x86/kvm/hyperv.c             | 20 ++++++++++++++++++++
 arch/x86/kvm/x86.c                |  6 ++++++
 include/linux/kvm_host.h          |  1 +
 include/uapi/linux/kvm.h          | 17 +++++++++++++++++
 6 files changed, 67 insertions(+)

diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt
index 16096a2..abc4f48 100644
--- a/Documentation/virtual/kvm/api.txt
+++ b/Documentation/virtual/kvm/api.txt
@@ -3337,6 +3337,28 @@ the userspace IOAPIC should process the EOI and retrigger the interrupt if
 it is still asserted.  Vector is the LAPIC interrupt vector for which the
 EOI was received.
 
+		struct kvm_hyperv_exit {
+#define KVM_EXIT_HYPERV_SYNIC          1
+			__u32 type;
+			union {
+				struct {
+					__u32 msr;
+					__u64 control;
+					__u64 evt_page;
+					__u64 msg_page;
+				} synic;
+			} u;
+		};
+		/* KVM_EXIT_HYPERV */
+                struct kvm_hyperv_exit hyperv;
+Indicates that the VCPU exits into userspace to process some tasks
+related to Hyper-V emulation.
+Valid values for 'type' are:
+	KVM_EXIT_HYPERV_SYNIC -- synchronously notify user-space about
+Hyper-V SynIC state change. Notification is used to remap SynIC
+event/message pages and to enable/disable SynIC messages/events processing
+in userspace.
+
 		/* Fix the size of the union. */
 		char padding[256];
 	};
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index ad29e89..1cefa1e 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -393,6 +393,7 @@ struct kvm_vcpu_hv {
 	u64 hv_vapic;
 	s64 runtime_offset;
 	struct kvm_vcpu_hv_synic synic;
+	struct kvm_hyperv_exit exit;
 };
 
 struct kvm_vcpu_arch {
diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
index 83a3c0c..41869a9 100644
--- a/arch/x86/kvm/hyperv.c
+++ b/arch/x86/kvm/hyperv.c
@@ -130,6 +130,20 @@ static void kvm_hv_notify_acked_sint(struct kvm_vcpu *vcpu, u32 sint)
 	srcu_read_unlock(&kvm->irq_srcu, idx);
 }
 
+static void synic_exit(struct kvm_vcpu_hv_synic *synic, u32 msr)
+{
+	struct kvm_vcpu *vcpu = synic_to_vcpu(synic);
+	struct kvm_vcpu_hv *hv_vcpu = &vcpu->arch.hyperv;
+
+	hv_vcpu->exit.type = KVM_EXIT_HYPERV_SYNIC;
+	hv_vcpu->exit.u.synic.msr = msr;
+	hv_vcpu->exit.u.synic.control = synic->control;
+	hv_vcpu->exit.u.synic.evt_page = synic->evt_page;
+	hv_vcpu->exit.u.synic.msg_page = synic->msg_page;
+
+	kvm_make_request(KVM_REQ_HV_EXIT, vcpu);
+}
+
 static int synic_set_msr(struct kvm_vcpu_hv_synic *synic,
 			 u32 msr, u64 data, bool host)
 {
@@ -145,6 +159,8 @@ static int synic_set_msr(struct kvm_vcpu_hv_synic *synic,
 	switch (msr) {
 	case HV_X64_MSR_SCONTROL:
 		synic->control = data;
+		if (!host)
+			synic_exit(synic, msr);
 		break;
 	case HV_X64_MSR_SVERSION:
 		if (!host) {
@@ -161,6 +177,8 @@ static int synic_set_msr(struct kvm_vcpu_hv_synic *synic,
 				break;
 			}
 		synic->evt_page = data;
+		if (!host)
+			synic_exit(synic, msr);
 		break;
 	case HV_X64_MSR_SIMP:
 		if (data & HV_SYNIC_SIMP_ENABLE)
@@ -170,6 +188,8 @@ static int synic_set_msr(struct kvm_vcpu_hv_synic *synic,
 				break;
 			}
 		synic->msg_page = data;
+		if (!host)
+			synic_exit(synic, msr);
 		break;
 	case HV_X64_MSR_EOM: {
 		int i;
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 41f3030..04daf32 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -6377,6 +6377,12 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
 			r = 0;
 			goto out;
 		}
+		if (kvm_check_request(KVM_REQ_HV_EXIT, vcpu)) {
+			vcpu->run->exit_reason = KVM_EXIT_HYPERV;
+			vcpu->run->hyperv = vcpu->arch.hyperv.exit;
+			r = 0;
+			goto out;
+		}
 	}
 
 	/*
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index a998e55..092bbd7 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -143,6 +143,7 @@ static inline bool is_error_page(struct page *page)
 #define KVM_REQ_HV_CRASH          27
 #define KVM_REQ_IOAPIC_EOI_EXIT   28
 #define KVM_REQ_HV_RESET          29
+#define KVM_REQ_HV_EXIT           30
 
 #define KVM_USERSPACE_IRQ_SOURCE_ID		0
 #define KVM_IRQFD_RESAMPLE_IRQ_SOURCE_ID	1
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index 27ce460..6e32f75 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -154,6 +154,20 @@ struct kvm_s390_skeys {
 	__u32 flags;
 	__u32 reserved[9];
 };
+
+struct kvm_hyperv_exit {
+#define KVM_EXIT_HYPERV_SYNIC          1
+	__u32 type;
+	union {
+		struct {
+			__u32 msr;
+			__u64 control;
+			__u64 evt_page;
+			__u64 msg_page;
+		} synic;
+	} u;
+};
+
 #define KVM_S390_GET_SKEYS_NONE   1
 #define KVM_S390_SKEYS_MAX        1048576
 
@@ -184,6 +198,7 @@ struct kvm_s390_skeys {
 #define KVM_EXIT_SYSTEM_EVENT     24
 #define KVM_EXIT_S390_STSI        25
 #define KVM_EXIT_IOAPIC_EOI       26
+#define KVM_EXIT_HYPERV           27
 
 /* For KVM_EXIT_INTERNAL_ERROR */
 /* Emulate instruction failed. */
@@ -338,6 +353,8 @@ struct kvm_run {
 		struct {
 			__u8 vector;
 		} eoi;
+		/* KVM_EXIT_HYPERV */
+		struct kvm_hyperv_exit hyperv;
 		/* Fix the size of the union. */
 		char padding[256];
 	};
-- 
2.4.3


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [Qemu-devel] [PATCH v4 5/5] kvm/x86: Hyper-V kvm exit
@ 2015-11-10 12:36   ` Andrey Smetanin
  0 siblings, 0 replies; 38+ messages in thread
From: Andrey Smetanin @ 2015-11-10 12:36 UTC (permalink / raw)
  To: kvm; +Cc: Gleb Natapov, Denis V. Lunev, qemu-devel, Roman Kagan, Paolo Bonzini

A new vcpu exit is introduced to notify the userspace of the
changes in Hyper-V SynIC configuration triggered by guest writing to the
corresponding MSRs.

Changes v4:
* exit into userspace only if guest writes into SynIC MSR's

Changes v3:
* added KVM_EXIT_HYPERV types and structs notes into docs

Signed-off-by: Andrey Smetanin <asmetanin@virtuozzo.com>
Reviewed-by: Roman Kagan <rkagan@virtuozzo.com>
Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Gleb Natapov <gleb@kernel.org>
CC: Paolo Bonzini <pbonzini@redhat.com>
CC: Roman Kagan <rkagan@virtuozzo.com>
CC: Denis V. Lunev <den@openvz.org>
CC: qemu-devel@nongnu.org

---
 Documentation/virtual/kvm/api.txt | 22 ++++++++++++++++++++++
 arch/x86/include/asm/kvm_host.h   |  1 +
 arch/x86/kvm/hyperv.c             | 20 ++++++++++++++++++++
 arch/x86/kvm/x86.c                |  6 ++++++
 include/linux/kvm_host.h          |  1 +
 include/uapi/linux/kvm.h          | 17 +++++++++++++++++
 6 files changed, 67 insertions(+)

diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt
index 16096a2..abc4f48 100644
--- a/Documentation/virtual/kvm/api.txt
+++ b/Documentation/virtual/kvm/api.txt
@@ -3337,6 +3337,28 @@ the userspace IOAPIC should process the EOI and retrigger the interrupt if
 it is still asserted.  Vector is the LAPIC interrupt vector for which the
 EOI was received.
 
+		struct kvm_hyperv_exit {
+#define KVM_EXIT_HYPERV_SYNIC          1
+			__u32 type;
+			union {
+				struct {
+					__u32 msr;
+					__u64 control;
+					__u64 evt_page;
+					__u64 msg_page;
+				} synic;
+			} u;
+		};
+		/* KVM_EXIT_HYPERV */
+                struct kvm_hyperv_exit hyperv;
+Indicates that the VCPU exits into userspace to process some tasks
+related to Hyper-V emulation.
+Valid values for 'type' are:
+	KVM_EXIT_HYPERV_SYNIC -- synchronously notify user-space about
+Hyper-V SynIC state change. Notification is used to remap SynIC
+event/message pages and to enable/disable SynIC messages/events processing
+in userspace.
+
 		/* Fix the size of the union. */
 		char padding[256];
 	};
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index ad29e89..1cefa1e 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -393,6 +393,7 @@ struct kvm_vcpu_hv {
 	u64 hv_vapic;
 	s64 runtime_offset;
 	struct kvm_vcpu_hv_synic synic;
+	struct kvm_hyperv_exit exit;
 };
 
 struct kvm_vcpu_arch {
diff --git a/arch/x86/kvm/hyperv.c b/arch/x86/kvm/hyperv.c
index 83a3c0c..41869a9 100644
--- a/arch/x86/kvm/hyperv.c
+++ b/arch/x86/kvm/hyperv.c
@@ -130,6 +130,20 @@ static void kvm_hv_notify_acked_sint(struct kvm_vcpu *vcpu, u32 sint)
 	srcu_read_unlock(&kvm->irq_srcu, idx);
 }
 
+static void synic_exit(struct kvm_vcpu_hv_synic *synic, u32 msr)
+{
+	struct kvm_vcpu *vcpu = synic_to_vcpu(synic);
+	struct kvm_vcpu_hv *hv_vcpu = &vcpu->arch.hyperv;
+
+	hv_vcpu->exit.type = KVM_EXIT_HYPERV_SYNIC;
+	hv_vcpu->exit.u.synic.msr = msr;
+	hv_vcpu->exit.u.synic.control = synic->control;
+	hv_vcpu->exit.u.synic.evt_page = synic->evt_page;
+	hv_vcpu->exit.u.synic.msg_page = synic->msg_page;
+
+	kvm_make_request(KVM_REQ_HV_EXIT, vcpu);
+}
+
 static int synic_set_msr(struct kvm_vcpu_hv_synic *synic,
 			 u32 msr, u64 data, bool host)
 {
@@ -145,6 +159,8 @@ static int synic_set_msr(struct kvm_vcpu_hv_synic *synic,
 	switch (msr) {
 	case HV_X64_MSR_SCONTROL:
 		synic->control = data;
+		if (!host)
+			synic_exit(synic, msr);
 		break;
 	case HV_X64_MSR_SVERSION:
 		if (!host) {
@@ -161,6 +177,8 @@ static int synic_set_msr(struct kvm_vcpu_hv_synic *synic,
 				break;
 			}
 		synic->evt_page = data;
+		if (!host)
+			synic_exit(synic, msr);
 		break;
 	case HV_X64_MSR_SIMP:
 		if (data & HV_SYNIC_SIMP_ENABLE)
@@ -170,6 +188,8 @@ static int synic_set_msr(struct kvm_vcpu_hv_synic *synic,
 				break;
 			}
 		synic->msg_page = data;
+		if (!host)
+			synic_exit(synic, msr);
 		break;
 	case HV_X64_MSR_EOM: {
 		int i;
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 41f3030..04daf32 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -6377,6 +6377,12 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
 			r = 0;
 			goto out;
 		}
+		if (kvm_check_request(KVM_REQ_HV_EXIT, vcpu)) {
+			vcpu->run->exit_reason = KVM_EXIT_HYPERV;
+			vcpu->run->hyperv = vcpu->arch.hyperv.exit;
+			r = 0;
+			goto out;
+		}
 	}
 
 	/*
diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index a998e55..092bbd7 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -143,6 +143,7 @@ static inline bool is_error_page(struct page *page)
 #define KVM_REQ_HV_CRASH          27
 #define KVM_REQ_IOAPIC_EOI_EXIT   28
 #define KVM_REQ_HV_RESET          29
+#define KVM_REQ_HV_EXIT           30
 
 #define KVM_USERSPACE_IRQ_SOURCE_ID		0
 #define KVM_IRQFD_RESAMPLE_IRQ_SOURCE_ID	1
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index 27ce460..6e32f75 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -154,6 +154,20 @@ struct kvm_s390_skeys {
 	__u32 flags;
 	__u32 reserved[9];
 };
+
+struct kvm_hyperv_exit {
+#define KVM_EXIT_HYPERV_SYNIC          1
+	__u32 type;
+	union {
+		struct {
+			__u32 msr;
+			__u64 control;
+			__u64 evt_page;
+			__u64 msg_page;
+		} synic;
+	} u;
+};
+
 #define KVM_S390_GET_SKEYS_NONE   1
 #define KVM_S390_SKEYS_MAX        1048576
 
@@ -184,6 +198,7 @@ struct kvm_s390_skeys {
 #define KVM_EXIT_SYSTEM_EVENT     24
 #define KVM_EXIT_S390_STSI        25
 #define KVM_EXIT_IOAPIC_EOI       26
+#define KVM_EXIT_HYPERV           27
 
 /* For KVM_EXIT_INTERNAL_ERROR */
 /* Emulate instruction failed. */
@@ -338,6 +353,8 @@ struct kvm_run {
 		struct {
 			__u8 vector;
 		} eoi;
+		/* KVM_EXIT_HYPERV */
+		struct kvm_hyperv_exit hyperv;
 		/* Fix the size of the union. */
 		char padding[256];
 	};
-- 
2.4.3

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* RE: [PATCH v4 5/5] kvm/x86: Hyper-V kvm exit
  2015-11-10 12:36   ` [Qemu-devel] " Andrey Smetanin
@ 2015-12-18 15:19     ` Pavel Fedin
  -1 siblings, 0 replies; 38+ messages in thread
From: Pavel Fedin @ 2015-12-18 15:19 UTC (permalink / raw)
  To: 'Andrey Smetanin', kvm
  Cc: 'Denis V. Lunev', 'Gleb Natapov',
	'Paolo Bonzini', 'Roman Kagan',
	qemu-devel

 Hello!

 I realize that it's perhaps too late, because patches are already on Linux-next, but i have one concern... May be it's not too
late...

 I dislike implementing architecture-dependent exit code where we could implement an architecture-independent one.

 As far as i understand this code, KVM_EXIT_HYPERV is called when one of three MSRs are accessed. But, shouldn't we have implemented
instead something more generic, like KVM_EXIT_REG_IO, which would work similar to KVM_EXIT_PIO or KVM_EXIT_MMIO, but carry register
code and value?

 This would allow us to solve the same task which we have done here, but this solution would be reusable for other devices and other
archirectures. What if in future we have more system registers to emulate in userspace?

 I write this because at one point i suggested similar thing for ARM64 (but i never actually wrote it), to emulate physical CP15
timer. And it would require exactly the same capability - process some trapped system register accesses in userspace.

Kind regards,
Pavel Fedin
Expert Engineer
Samsung Electronics Research center Russia



^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [Qemu-devel] [PATCH v4 5/5] kvm/x86: Hyper-V kvm exit
@ 2015-12-18 15:19     ` Pavel Fedin
  0 siblings, 0 replies; 38+ messages in thread
From: Pavel Fedin @ 2015-12-18 15:19 UTC (permalink / raw)
  To: 'Andrey Smetanin', kvm
  Cc: 'Gleb Natapov', 'Denis V. Lunev',
	qemu-devel, 'Roman Kagan', 'Paolo Bonzini'

 Hello!

 I realize that it's perhaps too late, because patches are already on Linux-next, but i have one concern... May be it's not too
late...

 I dislike implementing architecture-dependent exit code where we could implement an architecture-independent one.

 As far as i understand this code, KVM_EXIT_HYPERV is called when one of three MSRs are accessed. But, shouldn't we have implemented
instead something more generic, like KVM_EXIT_REG_IO, which would work similar to KVM_EXIT_PIO or KVM_EXIT_MMIO, but carry register
code and value?

 This would allow us to solve the same task which we have done here, but this solution would be reusable for other devices and other
archirectures. What if in future we have more system registers to emulate in userspace?

 I write this because at one point i suggested similar thing for ARM64 (but i never actually wrote it), to emulate physical CP15
timer. And it would require exactly the same capability - process some trapped system register accesses in userspace.

Kind regards,
Pavel Fedin
Expert Engineer
Samsung Electronics Research center Russia

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v4 5/5] kvm/x86: Hyper-V kvm exit
  2015-12-18 15:19     ` [Qemu-devel] " Pavel Fedin
@ 2015-12-18 15:53       ` Denis V. Lunev
  -1 siblings, 0 replies; 38+ messages in thread
From: Denis V. Lunev @ 2015-12-18 15:53 UTC (permalink / raw)
  To: Pavel Fedin, 'Andrey Smetanin', kvm
  Cc: 'Gleb Natapov', 'Paolo Bonzini',
	'Roman Kagan',
	qemu-devel

On 12/18/2015 06:19 PM, Pavel Fedin wrote:
>   Hello!
>
>   I realize that it's perhaps too late, because patches are already on Linux-next, but i have one concern... May be it's not too
> late...
>
>   I dislike implementing architecture-dependent exit code where we could implement an architecture-independent one.
>
>   As far as i understand this code, KVM_EXIT_HYPERV is called when one of three MSRs are accessed. But, shouldn't we have implemented
> instead something more generic, like KVM_EXIT_REG_IO, which would work similar to KVM_EXIT_PIO or KVM_EXIT_MMIO, but carry register
> code and value?
>
>   This would allow us to solve the same task which we have done here, but this solution would be reusable for other devices and other
> archirectures. What if in future we have more system registers to emulate in userspace?
>
>   I write this because at one point i suggested similar thing for ARM64 (but i never actually wrote it), to emulate physical CP15
> timer. And it would require exactly the same capability - process some trapped system register accesses in userspace.
>
> Kind regards,
> Pavel Fedin
> Expert Engineer
> Samsung Electronics Research center Russia
>
>
we have discussed this AFAIR. HyperV guest implementation
is available in Linux kernel and thus technically we can have
this stuff on any platform.

Den

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [Qemu-devel] [PATCH v4 5/5] kvm/x86: Hyper-V kvm exit
@ 2015-12-18 15:53       ` Denis V. Lunev
  0 siblings, 0 replies; 38+ messages in thread
From: Denis V. Lunev @ 2015-12-18 15:53 UTC (permalink / raw)
  To: Pavel Fedin, 'Andrey Smetanin', kvm
  Cc: 'Gleb Natapov', 'Paolo Bonzini',
	'Roman Kagan',
	qemu-devel

On 12/18/2015 06:19 PM, Pavel Fedin wrote:
>   Hello!
>
>   I realize that it's perhaps too late, because patches are already on Linux-next, but i have one concern... May be it's not too
> late...
>
>   I dislike implementing architecture-dependent exit code where we could implement an architecture-independent one.
>
>   As far as i understand this code, KVM_EXIT_HYPERV is called when one of three MSRs are accessed. But, shouldn't we have implemented
> instead something more generic, like KVM_EXIT_REG_IO, which would work similar to KVM_EXIT_PIO or KVM_EXIT_MMIO, but carry register
> code and value?
>
>   This would allow us to solve the same task which we have done here, but this solution would be reusable for other devices and other
> archirectures. What if in future we have more system registers to emulate in userspace?
>
>   I write this because at one point i suggested similar thing for ARM64 (but i never actually wrote it), to emulate physical CP15
> timer. And it would require exactly the same capability - process some trapped system register accesses in userspace.
>
> Kind regards,
> Pavel Fedin
> Expert Engineer
> Samsung Electronics Research center Russia
>
>
we have discussed this AFAIR. HyperV guest implementation
is available in Linux kernel and thus technically we can have
this stuff on any platform.

Den

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v4 5/5] kvm/x86: Hyper-V kvm exit
  2015-12-18 15:19     ` [Qemu-devel] " Pavel Fedin
@ 2015-12-18 16:01       ` Paolo Bonzini
  -1 siblings, 0 replies; 38+ messages in thread
From: Paolo Bonzini @ 2015-12-18 16:01 UTC (permalink / raw)
  To: Pavel Fedin, 'Andrey Smetanin', kvm
  Cc: 'Denis V. Lunev', 'Gleb Natapov',
	'Roman Kagan',
	qemu-devel, Peter Hornyack



On 18/12/2015 16:19, Pavel Fedin wrote:
> As far as i understand this code, KVM_EXIT_HYPERV is called when one
> of three MSRs are accessed. But, shouldn't we have implemented 
> instead something more generic, like KVM_EXIT_REG_IO, which would
> work similar to KVM_EXIT_PIO or KVM_EXIT_MMIO, but carry register 
> code and value?

Yes, we considered that.  There were actually patches for this as well.
 However, in this case the register is still emulated in the kernel, and
userspace just gets informed of the new value.

> This would allow us to solve the same task which we have done here,
> but this solution would be reusable for other devices and other 
> archirectures. What if in future we have more system registers to
> emulate in userspace?

If we do get that, we will just rename KVM_EXIT_HYPERV to
KVM_EXIT_MSR_ACCESS, and KVM_EXIT_HYPERV_SYNIC to
KVM_EXIT_MSR_HYPERV_SYNIC, and struct kvm_hyperv_exit to kvm_msr_exit.

Actually, we can do it now.  What do you guys think?  Peter?  I might
even be convinced to document the capability (in Documentation/ and
uapi/) even if the upstream kernel doesn't provide it.  We still have a
lot of time until 4.5 is out, it can be done after the merge window even.

Paolo

> I write this because at one point i suggested similar thing for ARM64
> (but i never actually wrote it), to emulate physical CP15 timer. And
> it would require exactly the same capability - process some trapped
> system register accesses in userspace.

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [Qemu-devel] [PATCH v4 5/5] kvm/x86: Hyper-V kvm exit
@ 2015-12-18 16:01       ` Paolo Bonzini
  0 siblings, 0 replies; 38+ messages in thread
From: Paolo Bonzini @ 2015-12-18 16:01 UTC (permalink / raw)
  To: Pavel Fedin, 'Andrey Smetanin', kvm
  Cc: 'Gleb Natapov', 'Denis V. Lunev',
	Peter Hornyack, 'Roman Kagan',
	qemu-devel



On 18/12/2015 16:19, Pavel Fedin wrote:
> As far as i understand this code, KVM_EXIT_HYPERV is called when one
> of three MSRs are accessed. But, shouldn't we have implemented 
> instead something more generic, like KVM_EXIT_REG_IO, which would
> work similar to KVM_EXIT_PIO or KVM_EXIT_MMIO, but carry register 
> code and value?

Yes, we considered that.  There were actually patches for this as well.
 However, in this case the register is still emulated in the kernel, and
userspace just gets informed of the new value.

> This would allow us to solve the same task which we have done here,
> but this solution would be reusable for other devices and other 
> archirectures. What if in future we have more system registers to
> emulate in userspace?

If we do get that, we will just rename KVM_EXIT_HYPERV to
KVM_EXIT_MSR_ACCESS, and KVM_EXIT_HYPERV_SYNIC to
KVM_EXIT_MSR_HYPERV_SYNIC, and struct kvm_hyperv_exit to kvm_msr_exit.

Actually, we can do it now.  What do you guys think?  Peter?  I might
even be convinced to document the capability (in Documentation/ and
uapi/) even if the upstream kernel doesn't provide it.  We still have a
lot of time until 4.5 is out, it can be done after the merge window even.

Paolo

> I write this because at one point i suggested similar thing for ARM64
> (but i never actually wrote it), to emulate physical CP15 timer. And
> it would require exactly the same capability - process some trapped
> system register accesses in userspace.

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v4 5/5] kvm/x86: Hyper-V kvm exit
  2015-12-18 15:19     ` [Qemu-devel] " Pavel Fedin
@ 2015-12-18 18:00       ` 'Roman Kagan'
  -1 siblings, 0 replies; 38+ messages in thread
From: 'Roman Kagan' @ 2015-12-18 18:00 UTC (permalink / raw)
  To: Pavel Fedin
  Cc: 'Andrey Smetanin', kvm, 'Denis V. Lunev',
	'Gleb Natapov', 'Paolo Bonzini',
	qemu-devel

On Fri, Dec 18, 2015 at 06:19:25PM +0300, Pavel Fedin wrote:
>  I dislike implementing architecture-dependent exit code where we could implement an architecture-independent one.
> 
>  As far as i understand this code, KVM_EXIT_HYPERV is called when one of three MSRs are accessed. But, shouldn't we have implemented
> instead something more generic, like KVM_EXIT_REG_IO, which would work similar to KVM_EXIT_PIO or KVM_EXIT_MMIO, but carry register
> code and value?
> 
>  This would allow us to solve the same task which we have done here, but this solution would be reusable for other devices and other
> archirectures. What if in future we have more system registers to emulate in userspace?
> 
>  I write this because at one point i suggested similar thing for ARM64 (but i never actually wrote it), to emulate physical CP15
> timer. And it would require exactly the same capability - process some trapped system register accesses in userspace.

Note that, as Paolo pointed out, in case of HyperV the value of the
register is of interest not only to the userspace but to the KVM, too.
Otherwise I don't see off hand why a generic infrastructure wouldn't fit
in our usecase.

Roman.

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [Qemu-devel] [PATCH v4 5/5] kvm/x86: Hyper-V kvm exit
@ 2015-12-18 18:00       ` 'Roman Kagan'
  0 siblings, 0 replies; 38+ messages in thread
From: 'Roman Kagan' @ 2015-12-18 18:00 UTC (permalink / raw)
  To: Pavel Fedin
  Cc: kvm, 'Gleb Natapov', qemu-devel, 'Paolo Bonzini',
	'Andrey Smetanin', 'Denis V. Lunev'

On Fri, Dec 18, 2015 at 06:19:25PM +0300, Pavel Fedin wrote:
>  I dislike implementing architecture-dependent exit code where we could implement an architecture-independent one.
> 
>  As far as i understand this code, KVM_EXIT_HYPERV is called when one of three MSRs are accessed. But, shouldn't we have implemented
> instead something more generic, like KVM_EXIT_REG_IO, which would work similar to KVM_EXIT_PIO or KVM_EXIT_MMIO, but carry register
> code and value?
> 
>  This would allow us to solve the same task which we have done here, but this solution would be reusable for other devices and other
> archirectures. What if in future we have more system registers to emulate in userspace?
> 
>  I write this because at one point i suggested similar thing for ARM64 (but i never actually wrote it), to emulate physical CP15
> timer. And it would require exactly the same capability - process some trapped system register accesses in userspace.

Note that, as Paolo pointed out, in case of HyperV the value of the
register is of interest not only to the userspace but to the KVM, too.
Otherwise I don't see off hand why a generic infrastructure wouldn't fit
in our usecase.

Roman.

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v4 5/5] kvm/x86: Hyper-V kvm exit
  2015-12-18 16:01       ` [Qemu-devel] " Paolo Bonzini
@ 2015-12-18 18:10         ` Peter Hornyack
  -1 siblings, 0 replies; 38+ messages in thread
From: Peter Hornyack @ 2015-12-18 18:10 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Pavel Fedin, Andrey Smetanin, kvm list, Denis V. Lunev,
	Gleb Natapov, Roman Kagan, qemu-devel

On Fri, Dec 18, 2015 at 8:01 AM, Paolo Bonzini <pbonzini@redhat.com> wrote:
>
>
> On 18/12/2015 16:19, Pavel Fedin wrote:
>> As far as i understand this code, KVM_EXIT_HYPERV is called when one
>> of three MSRs are accessed. But, shouldn't we have implemented
>> instead something more generic, like KVM_EXIT_REG_IO, which would
>> work similar to KVM_EXIT_PIO or KVM_EXIT_MMIO, but carry register
>> code and value?
>
> Yes, we considered that.  There were actually patches for this as well.
>  However, in this case the register is still emulated in the kernel, and
> userspace just gets informed of the new value.

On brief inspection of Andrey's patch (I have not been following
closely) it looks like the kvm_hyperv_exit struct that's returned to
userspace contains more data (control, evt_page, and msg_page fields)
than simply the value of the MSR, so would the desired SynIC exit fit
into a general-purpose exit for MSR emulation?

>> This would allow us to solve the same task which we have done here,
>> but this solution would be reusable for other devices and other
>> archirectures. What if in future we have more system registers to
>> emulate in userspace?
>
> If we do get that, we will just rename KVM_EXIT_HYPERV to
> KVM_EXIT_MSR_ACCESS, and KVM_EXIT_HYPERV_SYNIC to
> KVM_EXIT_MSR_HYPERV_SYNIC, and struct kvm_hyperv_exit to kvm_msr_exit.
>
> Actually, we can do it now.  What do you guys think?  Peter?  I might
> even be convinced to document the capability (in Documentation/ and
> uapi/) even if the upstream kernel doesn't provide it.  We still have a
> lot of time until 4.5 is out, it can be done after the merge window even.

I will update and re-send that patchset for discussion.

>
> Paolo
>
>> I write this because at one point i suggested similar thing for ARM64
>> (but i never actually wrote it), to emulate physical CP15 timer. And
>> it would require exactly the same capability - process some trapped
>> system register accesses in userspace.

Peter

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [Qemu-devel] [PATCH v4 5/5] kvm/x86: Hyper-V kvm exit
@ 2015-12-18 18:10         ` Peter Hornyack
  0 siblings, 0 replies; 38+ messages in thread
From: Peter Hornyack @ 2015-12-18 18:10 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: kvm list, Gleb Natapov, Pavel Fedin, qemu-devel, Roman Kagan,
	Andrey Smetanin, Denis V. Lunev

On Fri, Dec 18, 2015 at 8:01 AM, Paolo Bonzini <pbonzini@redhat.com> wrote:
>
>
> On 18/12/2015 16:19, Pavel Fedin wrote:
>> As far as i understand this code, KVM_EXIT_HYPERV is called when one
>> of three MSRs are accessed. But, shouldn't we have implemented
>> instead something more generic, like KVM_EXIT_REG_IO, which would
>> work similar to KVM_EXIT_PIO or KVM_EXIT_MMIO, but carry register
>> code and value?
>
> Yes, we considered that.  There were actually patches for this as well.
>  However, in this case the register is still emulated in the kernel, and
> userspace just gets informed of the new value.

On brief inspection of Andrey's patch (I have not been following
closely) it looks like the kvm_hyperv_exit struct that's returned to
userspace contains more data (control, evt_page, and msg_page fields)
than simply the value of the MSR, so would the desired SynIC exit fit
into a general-purpose exit for MSR emulation?

>> This would allow us to solve the same task which we have done here,
>> but this solution would be reusable for other devices and other
>> archirectures. What if in future we have more system registers to
>> emulate in userspace?
>
> If we do get that, we will just rename KVM_EXIT_HYPERV to
> KVM_EXIT_MSR_ACCESS, and KVM_EXIT_HYPERV_SYNIC to
> KVM_EXIT_MSR_HYPERV_SYNIC, and struct kvm_hyperv_exit to kvm_msr_exit.
>
> Actually, we can do it now.  What do you guys think?  Peter?  I might
> even be convinced to document the capability (in Documentation/ and
> uapi/) even if the upstream kernel doesn't provide it.  We still have a
> lot of time until 4.5 is out, it can be done after the merge window even.

I will update and re-send that patchset for discussion.

>
> Paolo
>
>> I write this because at one point i suggested similar thing for ARM64
>> (but i never actually wrote it), to emulate physical CP15 timer. And
>> it would require exactly the same capability - process some trapped
>> system register accesses in userspace.

Peter

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v4 5/5] kvm/x86: Hyper-V kvm exit
  2015-12-18 18:10         ` [Qemu-devel] " Peter Hornyack
@ 2015-12-18 18:23           ` Paolo Bonzini
  -1 siblings, 0 replies; 38+ messages in thread
From: Paolo Bonzini @ 2015-12-18 18:23 UTC (permalink / raw)
  To: Peter Hornyack
  Cc: kvm list, Gleb Natapov, Pavel Fedin, qemu-devel, Roman Kagan,
	Andrey Smetanin, Denis V. Lunev



On 18/12/2015 19:10, Peter Hornyack wrote:
> On brief inspection of Andrey's patch (I have not been following
> closely) it looks like the kvm_hyperv_exit struct that's returned to
> userspace contains more data (control, evt_page, and msg_page fields)
> than simply the value of the MSR, so would the desired SynIC exit fit
> into a general-purpose exit for MSR emulation?

This would be a special case that is used even if the hyperv MSRs are
emulated in the kernel.  Other exit subcodes than
KVM_EXIT_(MSR_)HYPERV_SYNIC would be used for your usecase of for MSR
emulation in userspace.

Paolo

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [Qemu-devel] [PATCH v4 5/5] kvm/x86: Hyper-V kvm exit
@ 2015-12-18 18:23           ` Paolo Bonzini
  0 siblings, 0 replies; 38+ messages in thread
From: Paolo Bonzini @ 2015-12-18 18:23 UTC (permalink / raw)
  To: Peter Hornyack
  Cc: kvm list, Gleb Natapov, Pavel Fedin, qemu-devel, Roman Kagan,
	Andrey Smetanin, Denis V. Lunev



On 18/12/2015 19:10, Peter Hornyack wrote:
> On brief inspection of Andrey's patch (I have not been following
> closely) it looks like the kvm_hyperv_exit struct that's returned to
> userspace contains more data (control, evt_page, and msg_page fields)
> than simply the value of the MSR, so would the desired SynIC exit fit
> into a general-purpose exit for MSR emulation?

This would be a special case that is used even if the hyperv MSRs are
emulated in the kernel.  Other exit subcodes than
KVM_EXIT_(MSR_)HYPERV_SYNIC would be used for your usecase of for MSR
emulation in userspace.

Paolo

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v4 5/5] kvm/x86: Hyper-V kvm exit
  2015-12-18 16:01       ` [Qemu-devel] " Paolo Bonzini
@ 2015-12-18 18:25         ` 'Roman Kagan'
  -1 siblings, 0 replies; 38+ messages in thread
From: 'Roman Kagan' @ 2015-12-18 18:25 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: Pavel Fedin, 'Andrey Smetanin',
	kvm, 'Denis V. Lunev', 'Gleb Natapov',
	qemu-devel, Peter Hornyack

On Fri, Dec 18, 2015 at 05:01:59PM +0100, Paolo Bonzini wrote:
> On 18/12/2015 16:19, Pavel Fedin wrote:
> > As far as i understand this code, KVM_EXIT_HYPERV is called when one
> > of three MSRs are accessed. But, shouldn't we have implemented 
> > instead something more generic, like KVM_EXIT_REG_IO, which would
> > work similar to KVM_EXIT_PIO or KVM_EXIT_MMIO, but carry register 
> > code and value?
> 
> Yes, we considered that.  There were actually patches for this as well.
>  However, in this case the register is still emulated in the kernel, and
> userspace just gets informed of the new value.
> 
> > This would allow us to solve the same task which we have done here,
> > but this solution would be reusable for other devices and other 
> > archirectures. What if in future we have more system registers to
> > emulate in userspace?
> 
> If we do get that, we will just rename KVM_EXIT_HYPERV to
> KVM_EXIT_MSR_ACCESS, and KVM_EXIT_HYPERV_SYNIC to
> KVM_EXIT_MSR_HYPERV_SYNIC, and struct kvm_hyperv_exit to kvm_msr_exit.

A generic implemenation will probably just convey (msr#, value) pair,
and KVM_EXIT_MSR_HYPERV_SYNIC wouldn't be needed at all.

I don't immediately see why it wouldn't work for us; we'd have reused
the infrastructure if it existed when we started our work.  I didn't see
Peter's patches yet; maybe we can come up with an interim solution to
fit in the merge window but expose a sufficiently generic API.

Roman.

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [Qemu-devel] [PATCH v4 5/5] kvm/x86: Hyper-V kvm exit
@ 2015-12-18 18:25         ` 'Roman Kagan'
  0 siblings, 0 replies; 38+ messages in thread
From: 'Roman Kagan' @ 2015-12-18 18:25 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: kvm, 'Gleb Natapov',
	Pavel Fedin, qemu-devel, Peter Hornyack,
	'Andrey Smetanin', 'Denis V. Lunev'

On Fri, Dec 18, 2015 at 05:01:59PM +0100, Paolo Bonzini wrote:
> On 18/12/2015 16:19, Pavel Fedin wrote:
> > As far as i understand this code, KVM_EXIT_HYPERV is called when one
> > of three MSRs are accessed. But, shouldn't we have implemented 
> > instead something more generic, like KVM_EXIT_REG_IO, which would
> > work similar to KVM_EXIT_PIO or KVM_EXIT_MMIO, but carry register 
> > code and value?
> 
> Yes, we considered that.  There were actually patches for this as well.
>  However, in this case the register is still emulated in the kernel, and
> userspace just gets informed of the new value.
> 
> > This would allow us to solve the same task which we have done here,
> > but this solution would be reusable for other devices and other 
> > archirectures. What if in future we have more system registers to
> > emulate in userspace?
> 
> If we do get that, we will just rename KVM_EXIT_HYPERV to
> KVM_EXIT_MSR_ACCESS, and KVM_EXIT_HYPERV_SYNIC to
> KVM_EXIT_MSR_HYPERV_SYNIC, and struct kvm_hyperv_exit to kvm_msr_exit.

A generic implemenation will probably just convey (msr#, value) pair,
and KVM_EXIT_MSR_HYPERV_SYNIC wouldn't be needed at all.

I don't immediately see why it wouldn't work for us; we'd have reused
the infrastructure if it existed when we started our work.  I didn't see
Peter's patches yet; maybe we can come up with an interim solution to
fit in the merge window but expose a sufficiently generic API.

Roman.

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v4 5/5] kvm/x86: Hyper-V kvm exit
  2015-12-18 18:10         ` [Qemu-devel] " Peter Hornyack
@ 2015-12-18 18:39           ` Roman Kagan
  -1 siblings, 0 replies; 38+ messages in thread
From: Roman Kagan @ 2015-12-18 18:39 UTC (permalink / raw)
  To: Peter Hornyack
  Cc: Paolo Bonzini, Pavel Fedin, Andrey Smetanin, kvm list,
	Denis V. Lunev, Gleb Natapov, qemu-devel

On Fri, Dec 18, 2015 at 10:10:11AM -0800, Peter Hornyack wrote:
> On Fri, Dec 18, 2015 at 8:01 AM, Paolo Bonzini <pbonzini@redhat.com> wrote:
> > On 18/12/2015 16:19, Pavel Fedin wrote:
> >> As far as i understand this code, KVM_EXIT_HYPERV is called when one
> >> of three MSRs are accessed. But, shouldn't we have implemented
> >> instead something more generic, like KVM_EXIT_REG_IO, which would
> >> work similar to KVM_EXIT_PIO or KVM_EXIT_MMIO, but carry register
> >> code and value?
> >
> > Yes, we considered that.  There were actually patches for this as well.
> >  However, in this case the register is still emulated in the kernel, and
> > userspace just gets informed of the new value.
> 
> On brief inspection of Andrey's patch (I have not been following
> closely) it looks like the kvm_hyperv_exit struct that's returned to
> userspace contains more data (control, evt_page, and msg_page fields)
> than simply the value of the MSR, so would the desired SynIC exit fit
> into a general-purpose exit for MSR emulation?

Frankly I'm struggling trying to recall why we implemented it this way.
Actually all three fields are the values of respective MSRs and I don't
see any necessity to pass all three at the same time when any of them
gets updated.  The patch for QEMU adds an exit handler which processes
the fields individually, so I have a strong suspicion that union was
meant here rather than struct.

I hope Andrey will help to shed some light on that when he's back in the
office on Monday; meanwhile I think this peculiarity can be ignored.

Roman.

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [Qemu-devel] [PATCH v4 5/5] kvm/x86: Hyper-V kvm exit
@ 2015-12-18 18:39           ` Roman Kagan
  0 siblings, 0 replies; 38+ messages in thread
From: Roman Kagan @ 2015-12-18 18:39 UTC (permalink / raw)
  To: Peter Hornyack
  Cc: kvm list, Gleb Natapov, Pavel Fedin, qemu-devel, Denis V. Lunev,
	Andrey Smetanin, Paolo Bonzini

On Fri, Dec 18, 2015 at 10:10:11AM -0800, Peter Hornyack wrote:
> On Fri, Dec 18, 2015 at 8:01 AM, Paolo Bonzini <pbonzini@redhat.com> wrote:
> > On 18/12/2015 16:19, Pavel Fedin wrote:
> >> As far as i understand this code, KVM_EXIT_HYPERV is called when one
> >> of three MSRs are accessed. But, shouldn't we have implemented
> >> instead something more generic, like KVM_EXIT_REG_IO, which would
> >> work similar to KVM_EXIT_PIO or KVM_EXIT_MMIO, but carry register
> >> code and value?
> >
> > Yes, we considered that.  There were actually patches for this as well.
> >  However, in this case the register is still emulated in the kernel, and
> > userspace just gets informed of the new value.
> 
> On brief inspection of Andrey's patch (I have not been following
> closely) it looks like the kvm_hyperv_exit struct that's returned to
> userspace contains more data (control, evt_page, and msg_page fields)
> than simply the value of the MSR, so would the desired SynIC exit fit
> into a general-purpose exit for MSR emulation?

Frankly I'm struggling trying to recall why we implemented it this way.
Actually all three fields are the values of respective MSRs and I don't
see any necessity to pass all three at the same time when any of them
gets updated.  The patch for QEMU adds an exit handler which processes
the fields individually, so I have a strong suspicion that union was
meant here rather than struct.

I hope Andrey will help to shed some light on that when he's back in the
office on Monday; meanwhile I think this peculiarity can be ignored.

Roman.

^ permalink raw reply	[flat|nested] 38+ messages in thread

* RE: [PATCH v4 5/5] kvm/x86: Hyper-V kvm exit
  2015-12-18 16:01       ` [Qemu-devel] " Paolo Bonzini
@ 2015-12-21  7:44         ` Pavel Fedin
  -1 siblings, 0 replies; 38+ messages in thread
From: Pavel Fedin @ 2015-12-21  7:44 UTC (permalink / raw)
  To: 'Paolo Bonzini', 'Andrey Smetanin', kvm
  Cc: 'Denis V. Lunev', 'Gleb Natapov',
	'Roman Kagan', qemu-devel, 'Peter Hornyack'

 Hello!

 Replying to everything in one message.

> > As far as i understand this code, KVM_EXIT_HYPERV is called when one
> > of three MSRs are accessed. But, shouldn't we have implemented
> > instead something more generic, like KVM_EXIT_REG_IO, which would
> > work similar to KVM_EXIT_PIO or KVM_EXIT_MMIO, but carry register
> > code and value?
> 
> Yes, we considered that.  There were actually patches for this as well.

 Ah, i missed them, what a pity. There are lots of patches, i don't review them all. Actually i have noticed the change only after
it appeared in linux-next.

>  However, in this case the register is still emulated in the kernel, and
> userspace just gets informed of the new value.

 I see, but this doesn't change the semantic. All we need to do is to tell the userland that "register has been written".
Additionally to this we could do whatever we want, including caching the data in kernel, using it in kernel, and processing reads in
kernel.

> If we do get that, we will just rename KVM_EXIT_HYPERV to
> KVM_EXIT_MSR_ACCESS, and KVM_EXIT_HYPERV_SYNIC to
> KVM_EXIT_MSR_HYPERV_SYNIC, and struct kvm_hyperv_exit to kvm_msr_exit.

 Actually, i see this in more generic way, something like:

		/* KVM_EXIT_REG_ACCESS */
		struct {
			__u64 reg;
			__u64 data;
			__u8  is_write;
		} mmio;
 
 'data' and 'is_write' are self-explanatory, 'reg' would be generalized register code, the same as used for KVM_(GET|SET)_ONE_REG:
 - for ARM64: ARM64_SYS_REG(op0, op1, crn, crm, op2) - see
http://lxr.free-electrons.com/source/arch/arm64/include/uapi/asm/kvm.h#L189
 - for x86  : to be defined (i know, we don't use ..._ONE_REG operations here yet), like X86_MSR_REG(id), where the macro itself is:

	#define X86_MSR_REG(id) (KVM_REG_X86 | KVM_REG_X86_MSR | KVM_REG_SIZE_U64 | (id))

 - for other architectures: to be defined in a similar way, once needed.

> On brief inspection of Andrey's patch (I have not been following
> closely) it looks like the kvm_hyperv_exit struct that's returned to
> userspace contains more data (control, evt_page, and msg_page fields)
> than simply the value of the MSR, so would the desired SynIC exit fit
> into a general-purpose exit for MSR emulation?

 I have looked at the code too, and these three fields are nothing more than values of respective MSR's:

	case HV_X64_MSR_SCONTROL:
		synic->control = data;
		if (!host)
			synic_exit(synic, msr);
		break;

....

	case HV_X64_MSR_SIEFP:
		if (data & HV_SYNIC_SIEFP_ENABLE)
			if (kvm_clear_guest(vcpu->kvm,
					    data & PAGE_MASK, PAGE_SIZE)) {
				ret = 1;
				break;
			}
		synic->evt_page = data;
		if (!host)
			synic_exit(synic, msr);
		break;
	case HV_X64_MSR_SIMP:
		if (data & HV_SYNIC_SIMP_ENABLE)
			if (kvm_clear_guest(vcpu->kvm,
					    data & PAGE_MASK, PAGE_SIZE)) {
				ret = 1;
				break;
			}
		synic->msg_page = data;
		if (!host)
			synic_exit(synic, msr);
		break;

 So, every time one of these thee MSRs is written, we get a vmexit with values of all three registers, and that's all. We could
easily have 'synic_exit(synic, msr, data)' in all three cases, and i think the userspace could easily deal with proposed
KVM_EXIT_REG_ACCESS, just cache these values internally if needed.

Kind regards,
Pavel Fedin
Expert Engineer
Samsung Electronics Research center Russia



^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [Qemu-devel] [PATCH v4 5/5] kvm/x86: Hyper-V kvm exit
@ 2015-12-21  7:44         ` Pavel Fedin
  0 siblings, 0 replies; 38+ messages in thread
From: Pavel Fedin @ 2015-12-21  7:44 UTC (permalink / raw)
  To: 'Paolo Bonzini', 'Andrey Smetanin', kvm
  Cc: 'Gleb Natapov', 'Denis V. Lunev',
	'Peter Hornyack', 'Roman Kagan',
	qemu-devel

 Hello!

 Replying to everything in one message.

> > As far as i understand this code, KVM_EXIT_HYPERV is called when one
> > of three MSRs are accessed. But, shouldn't we have implemented
> > instead something more generic, like KVM_EXIT_REG_IO, which would
> > work similar to KVM_EXIT_PIO or KVM_EXIT_MMIO, but carry register
> > code and value?
> 
> Yes, we considered that.  There were actually patches for this as well.

 Ah, i missed them, what a pity. There are lots of patches, i don't review them all. Actually i have noticed the change only after
it appeared in linux-next.

>  However, in this case the register is still emulated in the kernel, and
> userspace just gets informed of the new value.

 I see, but this doesn't change the semantic. All we need to do is to tell the userland that "register has been written".
Additionally to this we could do whatever we want, including caching the data in kernel, using it in kernel, and processing reads in
kernel.

> If we do get that, we will just rename KVM_EXIT_HYPERV to
> KVM_EXIT_MSR_ACCESS, and KVM_EXIT_HYPERV_SYNIC to
> KVM_EXIT_MSR_HYPERV_SYNIC, and struct kvm_hyperv_exit to kvm_msr_exit.

 Actually, i see this in more generic way, something like:

		/* KVM_EXIT_REG_ACCESS */
		struct {
			__u64 reg;
			__u64 data;
			__u8  is_write;
		} mmio;
 
 'data' and 'is_write' are self-explanatory, 'reg' would be generalized register code, the same as used for KVM_(GET|SET)_ONE_REG:
 - for ARM64: ARM64_SYS_REG(op0, op1, crn, crm, op2) - see
http://lxr.free-electrons.com/source/arch/arm64/include/uapi/asm/kvm.h#L189
 - for x86  : to be defined (i know, we don't use ..._ONE_REG operations here yet), like X86_MSR_REG(id), where the macro itself is:

	#define X86_MSR_REG(id) (KVM_REG_X86 | KVM_REG_X86_MSR | KVM_REG_SIZE_U64 | (id))

 - for other architectures: to be defined in a similar way, once needed.

> On brief inspection of Andrey's patch (I have not been following
> closely) it looks like the kvm_hyperv_exit struct that's returned to
> userspace contains more data (control, evt_page, and msg_page fields)
> than simply the value of the MSR, so would the desired SynIC exit fit
> into a general-purpose exit for MSR emulation?

 I have looked at the code too, and these three fields are nothing more than values of respective MSR's:

	case HV_X64_MSR_SCONTROL:
		synic->control = data;
		if (!host)
			synic_exit(synic, msr);
		break;

....

	case HV_X64_MSR_SIEFP:
		if (data & HV_SYNIC_SIEFP_ENABLE)
			if (kvm_clear_guest(vcpu->kvm,
					    data & PAGE_MASK, PAGE_SIZE)) {
				ret = 1;
				break;
			}
		synic->evt_page = data;
		if (!host)
			synic_exit(synic, msr);
		break;
	case HV_X64_MSR_SIMP:
		if (data & HV_SYNIC_SIMP_ENABLE)
			if (kvm_clear_guest(vcpu->kvm,
					    data & PAGE_MASK, PAGE_SIZE)) {
				ret = 1;
				break;
			}
		synic->msg_page = data;
		if (!host)
			synic_exit(synic, msr);
		break;

 So, every time one of these thee MSRs is written, we get a vmexit with values of all three registers, and that's all. We could
easily have 'synic_exit(synic, msr, data)' in all three cases, and i think the userspace could easily deal with proposed
KVM_EXIT_REG_ACCESS, just cache these values internally if needed.

Kind regards,
Pavel Fedin
Expert Engineer
Samsung Electronics Research center Russia

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v4 5/5] kvm/x86: Hyper-V kvm exit
  2015-12-18 18:39           ` [Qemu-devel] " Roman Kagan
@ 2015-12-21 12:59             ` Andrey Smetanin
  -1 siblings, 0 replies; 38+ messages in thread
From: Andrey Smetanin @ 2015-12-21 12:59 UTC (permalink / raw)
  To: Roman Kagan, Peter Hornyack, Paolo Bonzini, Pavel Fedin,
	kvm list, Denis V. Lunev, Gleb Natapov, qemu-devel



On 12/18/2015 09:39 PM, Roman Kagan wrote:
> On Fri, Dec 18, 2015 at 10:10:11AM -0800, Peter Hornyack wrote:
>> On Fri, Dec 18, 2015 at 8:01 AM, Paolo Bonzini <pbonzini@redhat.com> wrote:
>>> On 18/12/2015 16:19, Pavel Fedin wrote:
>>>> As far as i understand this code, KVM_EXIT_HYPERV is called when one
>>>> of three MSRs are accessed. But, shouldn't we have implemented
>>>> instead something more generic, like KVM_EXIT_REG_IO, which would
>>>> work similar to KVM_EXIT_PIO or KVM_EXIT_MMIO, but carry register
>>>> code and value?
>>>
>>> Yes, we considered that.  There were actually patches for this as well.
>>>   However, in this case the register is still emulated in the kernel, and
>>> userspace just gets informed of the new value.
>>
>> On brief inspection of Andrey's patch (I have not been following
>> closely) it looks like the kvm_hyperv_exit struct that's returned to
>> userspace contains more data (control, evt_page, and msg_page fields)
>> than simply the value of the MSR, so would the desired SynIC exit fit
>> into a general-purpose exit for MSR emulation?
>
> Frankly I'm struggling trying to recall why we implemented it this way.
> Actually all three fields are the values of respective MSRs and I don't
> see any necessity to pass all three at the same time when any of them
> gets updated.  The patch for QEMU adds an exit handler which processes
> the fields individually, so I have a strong suspicion that union was
> meant here rather than struct.
>
> I hope Andrey will help to shed some light on that when he's back in the
> office on Monday; meanwhile I think this peculiarity can be ignored.
Hello!

We have implemented Hyper-V related Vcpu exit not only for Hyper-V SynIC 
MSR's changes but also to provide future interface to transfer guest 
VMBus hypercalls parameters into QEMU.

Yes, we can use  KVM_EXIT_REG_IO/MSR_IO for Hyper-V SynIC MSRS's changes 
and can even use only one MSR value . So union inside struct 
kvm_hyperv_exit is excessive.

But we still need Vcpu exit to handle VMBus hypercalls by QEMU to 
emulate VMBus devices inside QEMU.

And currently we are going to extend struct kvm_hyperv_exit
to store Hyper-V VMBus hypercall parameters.

SynIC MSR's changes could be replaced by KVM_EXIT_REG_IO/MSR_IO
but could we replace Hyper-V VMBus hypercall and it's parameters
by KVM_EXIT_REG_IO/MSR_IO too?

>
> Roman.
>

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [Qemu-devel] [PATCH v4 5/5] kvm/x86: Hyper-V kvm exit
@ 2015-12-21 12:59             ` Andrey Smetanin
  0 siblings, 0 replies; 38+ messages in thread
From: Andrey Smetanin @ 2015-12-21 12:59 UTC (permalink / raw)
  To: Roman Kagan, Peter Hornyack, Paolo Bonzini, Pavel Fedin,
	kvm list, Denis V. Lunev, Gleb Natapov, qemu-devel



On 12/18/2015 09:39 PM, Roman Kagan wrote:
> On Fri, Dec 18, 2015 at 10:10:11AM -0800, Peter Hornyack wrote:
>> On Fri, Dec 18, 2015 at 8:01 AM, Paolo Bonzini <pbonzini@redhat.com> wrote:
>>> On 18/12/2015 16:19, Pavel Fedin wrote:
>>>> As far as i understand this code, KVM_EXIT_HYPERV is called when one
>>>> of three MSRs are accessed. But, shouldn't we have implemented
>>>> instead something more generic, like KVM_EXIT_REG_IO, which would
>>>> work similar to KVM_EXIT_PIO or KVM_EXIT_MMIO, but carry register
>>>> code and value?
>>>
>>> Yes, we considered that.  There were actually patches for this as well.
>>>   However, in this case the register is still emulated in the kernel, and
>>> userspace just gets informed of the new value.
>>
>> On brief inspection of Andrey's patch (I have not been following
>> closely) it looks like the kvm_hyperv_exit struct that's returned to
>> userspace contains more data (control, evt_page, and msg_page fields)
>> than simply the value of the MSR, so would the desired SynIC exit fit
>> into a general-purpose exit for MSR emulation?
>
> Frankly I'm struggling trying to recall why we implemented it this way.
> Actually all three fields are the values of respective MSRs and I don't
> see any necessity to pass all three at the same time when any of them
> gets updated.  The patch for QEMU adds an exit handler which processes
> the fields individually, so I have a strong suspicion that union was
> meant here rather than struct.
>
> I hope Andrey will help to shed some light on that when he's back in the
> office on Monday; meanwhile I think this peculiarity can be ignored.
Hello!

We have implemented Hyper-V related Vcpu exit not only for Hyper-V SynIC 
MSR's changes but also to provide future interface to transfer guest 
VMBus hypercalls parameters into QEMU.

Yes, we can use  KVM_EXIT_REG_IO/MSR_IO for Hyper-V SynIC MSRS's changes 
and can even use only one MSR value . So union inside struct 
kvm_hyperv_exit is excessive.

But we still need Vcpu exit to handle VMBus hypercalls by QEMU to 
emulate VMBus devices inside QEMU.

And currently we are going to extend struct kvm_hyperv_exit
to store Hyper-V VMBus hypercall parameters.

SynIC MSR's changes could be replaced by KVM_EXIT_REG_IO/MSR_IO
but could we replace Hyper-V VMBus hypercall and it's parameters
by KVM_EXIT_REG_IO/MSR_IO too?

>
> Roman.
>

^ permalink raw reply	[flat|nested] 38+ messages in thread

* RE: [PATCH v4 5/5] kvm/x86: Hyper-V kvm exit
  2015-12-21 12:59             ` [Qemu-devel] " Andrey Smetanin
@ 2015-12-21 13:28               ` Pavel Fedin
  -1 siblings, 0 replies; 38+ messages in thread
From: Pavel Fedin @ 2015-12-21 13:28 UTC (permalink / raw)
  To: asmetanin, 'Roman Kagan', 'Peter Hornyack',
	'Paolo Bonzini', 'kvm list',
	'Denis V. Lunev', 'Gleb Natapov',
	qemu-devel

 Hello!

> Yes, we can use  KVM_EXIT_REG_IO/MSR_IO for Hyper-V SynIC MSRS's changes
> and can even use only one MSR value . So union inside struct
> kvm_hyperv_exit is excessive.
> 
> But we still need Vcpu exit to handle VMBus hypercalls by QEMU to
> emulate VMBus devices inside QEMU.
> 
> And currently we are going to extend struct kvm_hyperv_exit
> to store Hyper-V VMBus hypercall parameters.

 Hm... Hypercalls, you say?
 We already have KVM_EXIT_HYPERCALL. Documentation says it's currently unused. Is it a leftover from ia64 KVM? Could we reuse it for
the purpose?

> but could we replace Hyper-V VMBus hypercall and it's parameters
> by KVM_EXIT_REG_IO/MSR_IO too?

 It depends. Can i read about these hypercalls somewhere? Is there any documentation?

Kind regards,
Pavel Fedin
Expert Engineer
Samsung Electronics Research center Russia



^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [Qemu-devel] [PATCH v4 5/5] kvm/x86: Hyper-V kvm exit
@ 2015-12-21 13:28               ` Pavel Fedin
  0 siblings, 0 replies; 38+ messages in thread
From: Pavel Fedin @ 2015-12-21 13:28 UTC (permalink / raw)
  To: asmetanin, 'Roman Kagan', 'Peter Hornyack',
	'Paolo Bonzini', 'kvm list',
	'Denis V. Lunev', 'Gleb Natapov',
	qemu-devel

 Hello!

> Yes, we can use  KVM_EXIT_REG_IO/MSR_IO for Hyper-V SynIC MSRS's changes
> and can even use only one MSR value . So union inside struct
> kvm_hyperv_exit is excessive.
> 
> But we still need Vcpu exit to handle VMBus hypercalls by QEMU to
> emulate VMBus devices inside QEMU.
> 
> And currently we are going to extend struct kvm_hyperv_exit
> to store Hyper-V VMBus hypercall parameters.

 Hm... Hypercalls, you say?
 We already have KVM_EXIT_HYPERCALL. Documentation says it's currently unused. Is it a leftover from ia64 KVM? Could we reuse it for
the purpose?

> but could we replace Hyper-V VMBus hypercall and it's parameters
> by KVM_EXIT_REG_IO/MSR_IO too?

 It depends. Can i read about these hypercalls somewhere? Is there any documentation?

Kind regards,
Pavel Fedin
Expert Engineer
Samsung Electronics Research center Russia

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [PATCH v4 5/5] kvm/x86: Hyper-V kvm exit
  2015-12-21 13:28               ` [Qemu-devel] " Pavel Fedin
@ 2015-12-21 13:48                 ` Andrey Smetanin
  -1 siblings, 0 replies; 38+ messages in thread
From: Andrey Smetanin @ 2015-12-21 13:48 UTC (permalink / raw)
  To: Pavel Fedin, 'Roman Kagan', 'Peter Hornyack',
	'Paolo Bonzini', 'kvm list',
	'Denis V. Lunev', 'Gleb Natapov',
	qemu-devel



On 12/21/2015 04:28 PM, Pavel Fedin wrote:
>   Hello!
>
>> Yes, we can use  KVM_EXIT_REG_IO/MSR_IO for Hyper-V SynIC MSRS's changes
>> and can even use only one MSR value . So union inside struct
>> kvm_hyperv_exit is excessive.
>>
>> But we still need Vcpu exit to handle VMBus hypercalls by QEMU to
>> emulate VMBus devices inside QEMU.
>>
>> And currently we are going to extend struct kvm_hyperv_exit
>> to store Hyper-V VMBus hypercall parameters.
>
>   Hm... Hypercalls, you say?
>   We already have KVM_EXIT_HYPERCALL. Documentation says it's currently unused. Is it a leftover from ia64 KVM? Could we reuse it for
> the purpose?
>
>> but could we replace Hyper-V VMBus hypercall and it's parameters
>> by KVM_EXIT_REG_IO/MSR_IO too?
>
>   It depends. Can i read about these hypercalls somewhere? Is there any documentation?
I don't know about a documentation, but you can look at the code of 
Hyper-V hypercall handling inside KVM:

https://github.com/torvalds/linux/blob/master/arch/x86/kvm/hyperv.c#L346

The code simply decodes hypercall parameters from vcpu registers then 
handle hypercall code in switch and encode return code inside vcpu 
registers. Probably encode and decode of hypercall parameters/return 
code can be done in QEMU so we need only some exit with parameter that 
this is Hyper-V hypercall and probably KVM_EXIT_HYPERCALL is good for it.

But KVM_EXIT_HYPERCALL is not used inside KVM/QEMU so requires
implementation.

>
> Kind regards,
> Pavel Fedin
> Expert Engineer
> Samsung Electronics Research center Russia
>
>

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [Qemu-devel] [PATCH v4 5/5] kvm/x86: Hyper-V kvm exit
@ 2015-12-21 13:48                 ` Andrey Smetanin
  0 siblings, 0 replies; 38+ messages in thread
From: Andrey Smetanin @ 2015-12-21 13:48 UTC (permalink / raw)
  To: Pavel Fedin, 'Roman Kagan', 'Peter Hornyack',
	'Paolo Bonzini', 'kvm list',
	'Denis V. Lunev', 'Gleb Natapov',
	qemu-devel



On 12/21/2015 04:28 PM, Pavel Fedin wrote:
>   Hello!
>
>> Yes, we can use  KVM_EXIT_REG_IO/MSR_IO for Hyper-V SynIC MSRS's changes
>> and can even use only one MSR value . So union inside struct
>> kvm_hyperv_exit is excessive.
>>
>> But we still need Vcpu exit to handle VMBus hypercalls by QEMU to
>> emulate VMBus devices inside QEMU.
>>
>> And currently we are going to extend struct kvm_hyperv_exit
>> to store Hyper-V VMBus hypercall parameters.
>
>   Hm... Hypercalls, you say?
>   We already have KVM_EXIT_HYPERCALL. Documentation says it's currently unused. Is it a leftover from ia64 KVM? Could we reuse it for
> the purpose?
>
>> but could we replace Hyper-V VMBus hypercall and it's parameters
>> by KVM_EXIT_REG_IO/MSR_IO too?
>
>   It depends. Can i read about these hypercalls somewhere? Is there any documentation?
I don't know about a documentation, but you can look at the code of 
Hyper-V hypercall handling inside KVM:

https://github.com/torvalds/linux/blob/master/arch/x86/kvm/hyperv.c#L346

The code simply decodes hypercall parameters from vcpu registers then 
handle hypercall code in switch and encode return code inside vcpu 
registers. Probably encode and decode of hypercall parameters/return 
code can be done in QEMU so we need only some exit with parameter that 
this is Hyper-V hypercall and probably KVM_EXIT_HYPERCALL is good for it.

But KVM_EXIT_HYPERCALL is not used inside KVM/QEMU so requires
implementation.

>
> Kind regards,
> Pavel Fedin
> Expert Engineer
> Samsung Electronics Research center Russia
>
>

^ permalink raw reply	[flat|nested] 38+ messages in thread

* RE: [PATCH v4 5/5] kvm/x86: Hyper-V kvm exit
  2015-12-21 13:48                 ` [Qemu-devel] " Andrey Smetanin
@ 2015-12-21 15:21                   ` Pavel Fedin
  -1 siblings, 0 replies; 38+ messages in thread
From: Pavel Fedin @ 2015-12-21 15:21 UTC (permalink / raw)
  To: asmetanin, 'Roman Kagan', 'Peter Hornyack',
	'Paolo Bonzini', 'kvm list',
	'Denis V. Lunev', 'Gleb Natapov',
	qemu-devel

 Hello!

> >   It depends. Can i read about these hypercalls somewhere? Is there any documentation?
> I don't know about a documentation, but you can look at the code of
> Hyper-V hypercall handling inside KVM:
> 
> https://github.com/torvalds/linux/blob/master/arch/x86/kvm/hyperv.c#L346

 Aha, i see, so vmmcall CPU instruction is employed. Well, i believe this very well fits into the sematics of KVM_EXIT_HYPERCALL,
because it's a true hypercall.

> The code simply decodes hypercall parameters from vcpu registers then
> handle hypercall code in switch and encode return code inside vcpu
> registers. Probably encode and decode of hypercall parameters/return
> code can be done in QEMU so we need only some exit with parameter that
> this is Hyper-V hypercall and probably KVM_EXIT_HYPERCALL is good for it.

 Or you could even reuse the whole structure, it has all you need:

 			__u64 nr;		/* Reserved for x86, other architectures can use it, for example ARM "hvc #nr" */
			__u64 args[6];  /* rax, rbx, rcx, rdx, rdi, rsi */
			__u64 ret;
			__u32 longmode; /* longmode; other architectures (like ARM64) can also make sense of it */

 Or you could put in struct kvm_regs instead of args and ret, and allow the userspace to manipulate it.

> But KVM_EXIT_HYPERCALL is not used inside KVM/QEMU so requires
> implementation.

 I guess your hypercalls to be introduced using KVM_EXIT_HYPERV are also not used inside qemu so require implementation :)

Kind regards,
Pavel Fedin
Expert Engineer
Samsung Electronics Research center Russia



^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [Qemu-devel] [PATCH v4 5/5] kvm/x86: Hyper-V kvm exit
@ 2015-12-21 15:21                   ` Pavel Fedin
  0 siblings, 0 replies; 38+ messages in thread
From: Pavel Fedin @ 2015-12-21 15:21 UTC (permalink / raw)
  To: asmetanin, 'Roman Kagan', 'Peter Hornyack',
	'Paolo Bonzini', 'kvm list',
	'Denis V. Lunev', 'Gleb Natapov',
	qemu-devel

 Hello!

> >   It depends. Can i read about these hypercalls somewhere? Is there any documentation?
> I don't know about a documentation, but you can look at the code of
> Hyper-V hypercall handling inside KVM:
> 
> https://github.com/torvalds/linux/blob/master/arch/x86/kvm/hyperv.c#L346

 Aha, i see, so vmmcall CPU instruction is employed. Well, i believe this very well fits into the sematics of KVM_EXIT_HYPERCALL,
because it's a true hypercall.

> The code simply decodes hypercall parameters from vcpu registers then
> handle hypercall code in switch and encode return code inside vcpu
> registers. Probably encode and decode of hypercall parameters/return
> code can be done in QEMU so we need only some exit with parameter that
> this is Hyper-V hypercall and probably KVM_EXIT_HYPERCALL is good for it.

 Or you could even reuse the whole structure, it has all you need:

 			__u64 nr;		/* Reserved for x86, other architectures can use it, for example ARM "hvc #nr" */
			__u64 args[6];  /* rax, rbx, rcx, rdx, rdi, rsi */
			__u64 ret;
			__u32 longmode; /* longmode; other architectures (like ARM64) can also make sense of it */

 Or you could put in struct kvm_regs instead of args and ret, and allow the userspace to manipulate it.

> But KVM_EXIT_HYPERCALL is not used inside KVM/QEMU so requires
> implementation.

 I guess your hypercalls to be introduced using KVM_EXIT_HYPERV are also not used inside qemu so require implementation :)

Kind regards,
Pavel Fedin
Expert Engineer
Samsung Electronics Research center Russia

^ permalink raw reply	[flat|nested] 38+ messages in thread

end of thread, other threads:[~2015-12-21 15:21 UTC | newest]

Thread overview: 38+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-11-10 12:36 [PATCH v4 0/5] KVM: Hyper-V synthetic interrupt controller Andrey Smetanin
2015-11-10 12:36 ` [Qemu-devel] " Andrey Smetanin
2015-11-10 12:36 ` [PATCH v4 1/5] kvm/irqchip: kvm_arch_irq_routing_update renaming split Andrey Smetanin
2015-11-10 12:36   ` [Qemu-devel] " Andrey Smetanin
2015-11-10 12:36 ` [PATCH v4 2/5] kvm/x86: split ioapic-handled and EOI exit bitmaps Andrey Smetanin
2015-11-10 12:36   ` [Qemu-devel] " Andrey Smetanin
2015-11-10 12:36 ` [PATCH v4 3/5] kvm/x86: per-vcpu apicv deactivation support Andrey Smetanin
2015-11-10 12:36   ` [Qemu-devel] " Andrey Smetanin
2015-11-10 12:36 ` [PATCH v4 4/5] kvm/x86: Hyper-V synthetic interrupt controller Andrey Smetanin
2015-11-10 12:36   ` [Qemu-devel] " Andrey Smetanin
2015-11-10 12:36 ` [PATCH v4 5/5] kvm/x86: Hyper-V kvm exit Andrey Smetanin
2015-11-10 12:36   ` [Qemu-devel] " Andrey Smetanin
2015-12-18 15:19   ` Pavel Fedin
2015-12-18 15:19     ` [Qemu-devel] " Pavel Fedin
2015-12-18 15:53     ` Denis V. Lunev
2015-12-18 15:53       ` [Qemu-devel] " Denis V. Lunev
2015-12-18 16:01     ` Paolo Bonzini
2015-12-18 16:01       ` [Qemu-devel] " Paolo Bonzini
2015-12-18 18:10       ` Peter Hornyack
2015-12-18 18:10         ` [Qemu-devel] " Peter Hornyack
2015-12-18 18:23         ` Paolo Bonzini
2015-12-18 18:23           ` [Qemu-devel] " Paolo Bonzini
2015-12-18 18:39         ` Roman Kagan
2015-12-18 18:39           ` [Qemu-devel] " Roman Kagan
2015-12-21 12:59           ` Andrey Smetanin
2015-12-21 12:59             ` [Qemu-devel] " Andrey Smetanin
2015-12-21 13:28             ` Pavel Fedin
2015-12-21 13:28               ` [Qemu-devel] " Pavel Fedin
2015-12-21 13:48               ` Andrey Smetanin
2015-12-21 13:48                 ` [Qemu-devel] " Andrey Smetanin
2015-12-21 15:21                 ` Pavel Fedin
2015-12-21 15:21                   ` [Qemu-devel] " Pavel Fedin
2015-12-18 18:25       ` 'Roman Kagan'
2015-12-18 18:25         ` [Qemu-devel] " 'Roman Kagan'
2015-12-21  7:44       ` Pavel Fedin
2015-12-21  7:44         ` [Qemu-devel] " Pavel Fedin
2015-12-18 18:00     ` 'Roman Kagan'
2015-12-18 18:00       ` [Qemu-devel] " 'Roman Kagan'

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.