* [v7 0/4] prerequisite changes for VT-d posted-interrupts
@ 2015-05-19  9:07 Feng Wu
  2015-05-19  9:07 ` [v7 1/4] genirq: Introduce irq_set_vcpu_affinity() to target an interrupt to a VCPU Feng Wu
                   ` (3 more replies)
  0 siblings, 4 replies; 9+ messages in thread
From: Feng Wu @ 2015-05-19  9:07 UTC
  To: tglx, mingo, hpa; +Cc: linux-kernel, jiang.liu, Feng Wu

VT-d Posted-Interrupts is an enhancement to CPU-side Posted-Interrupts.
With VT-d Posted-Interrupts enabled, external interrupts from
direct-assigned devices can be delivered to guests without VMM
intervention when the guest is running in non-root mode.

You can find the VT-d Posted-Interrupts specification at the following URL:
http://www.intel.com/content/www/us/en/intelligent-systems/intel-technology/vt-directed-io-spec.html

This series implements the prerequisite parts for VT-d posted-interrupts. It was part of
http://thread.gmane.org/gmane.linux.kernel.iommu/7708. To make things clear, I will divide
the whole series, which contains multiple components, into three parts:
- prerequisite changes (included in this series)
- IOMMU part (v4 was reviewed, some comments need to be addressed)
- KVM and VFIO parts (will send out this part once the first two parts are accepted)

This series is rebased on the x86-apic branch of tip tree.

v6 --> v7:
[1/4]:
- Add a kernel-doc comment for function irq_set_vcpu_affinity().

v5 --> v6:
[3/4]:
- Avoid the conditional in the exception handler smp_kvm_posted_intr_wakeup_ipi().
- Rename "wakeup_handler_callback" to "kvm_posted_intr_wakeup_handler".

[4/4]
- Newly added in this series: show statistics for posted-interrupts.


v4 --> v5:
- Move the declaration of "irq_chip_set_vcpu_affinity_parent()" to [1/3].
- Use the accessor to get "struct irq_data", "struct irq_chip".
- Use "irq_get_desc_lock()" instead of "irq_to_desc()".
- Declare "wakeup_handler_callback" in "asm/irq.h".
- Use entering_ack_irq()/exiting_irq() in smp_kvm_posted_intr_wakeup_ipi().


Feng Wu (3):
  x86/irq: Implement irq_set_vcpu_affinity for pci_msi_ir_controller
  x86/irq: Define a global vector for VT-d Posted-Interrupts
  x86/irq: Show statistics information for posted-interrupts

Jiang Liu (1):
  genirq: Introduce irq_set_vcpu_affinity() to target an interrupt to a
    VCPU

 arch/x86/include/asm/entry_arch.h  |  2 ++
 arch/x86/include/asm/hardirq.h     |  1 +
 arch/x86/include/asm/hw_irq.h      |  2 ++
 arch/x86/include/asm/irq.h         |  4 ++++
 arch/x86/include/asm/irq_vectors.h |  1 +
 arch/x86/kernel/apic/msi.c         |  1 +
 arch/x86/kernel/entry_64.S         |  2 ++
 arch/x86/kernel/irq.c              | 43 ++++++++++++++++++++++++++++++++++++++
 arch/x86/kernel/irqinit.c          |  2 ++
 include/linux/irq.h                |  6 ++++++
 kernel/irq/chip.c                  | 14 +++++++++++++
 kernel/irq/manage.c                | 31 +++++++++++++++++++++++++++
 12 files changed, 109 insertions(+)

-- 
2.1.0



* [v7 1/4] genirq: Introduce irq_set_vcpu_affinity() to target an interrupt to a VCPU
  2015-05-19  9:07 [v7 0/4] prerequisite changes for VT-d posted-interrupts Feng Wu
@ 2015-05-19  9:07 ` Feng Wu
  2015-05-19 13:45   ` [tip:irq/core] " tip-bot for Jiang Liu
  2015-05-19  9:07 ` [v7 2/4] x86/irq: Implement irq_set_vcpu_affinity for pci_msi_ir_controller Feng Wu
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 9+ messages in thread
From: Feng Wu @ 2015-05-19  9:07 UTC
  To: tglx, mingo, hpa; +Cc: linux-kernel, jiang.liu, Feng Wu

From: Jiang Liu <jiang.liu@linux.intel.com>

With Posted-Interrupts support in Intel CPUs and IOMMUs, an external
interrupt from an assigned device can be delivered directly to a
virtual CPU in a virtual machine. Instead of hacking KVM and Intel
IOMMU drivers, we propose a platform independent interface to target
an interrupt to a specific virtual CPU in a virtual machine, or set
virtual CPU affinity for an interrupt.

By adopting this new interface and hierarchical irqdomains, we can
easily support posted-interrupts on Intel platforms, and also provide
interfaces flexible enough for other platforms to support similar
features.

Here is the usage scenario for this interface:
Guest updates MSI/MSI-X interrupt configuration
        -->QEMU and KVM handle this
        -->KVM calls this interface (passing the posted-interrupts
           descriptor and guest vector)
        -->irq core transfers control to the IOMMU
        -->IOMMU does the real work of updating the IRTE (the IRTE has
           a new format for VT-d Posted-Interrupts)

Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
Signed-off-by: Feng Wu <feng.wu@intel.com>
---
 include/linux/irq.h |  6 ++++++
 kernel/irq/chip.c   | 14 ++++++++++++++
 kernel/irq/manage.c | 31 +++++++++++++++++++++++++++++++
 3 files changed, 51 insertions(+)

diff --git a/include/linux/irq.h b/include/linux/irq.h
index 62c6901..48cb7d1 100644
--- a/include/linux/irq.h
+++ b/include/linux/irq.h
@@ -327,6 +327,7 @@ static inline irq_hw_number_t irqd_to_hwirq(struct irq_data *d)
  * @irq_write_msi_msg:	optional to write message content for MSI
  * @irq_get_irqchip_state:	return the internal state of an interrupt
  * @irq_set_irqchip_state:	set the internal state of a interrupt
+ * @irq_set_vcpu_affinity:	optional to target a vCPU in a virtual machine
  * @flags:		chip specific flags
  */
 struct irq_chip {
@@ -369,6 +370,8 @@ struct irq_chip {
 	int		(*irq_get_irqchip_state)(struct irq_data *data, enum irqchip_irq_state which, bool *state);
 	int		(*irq_set_irqchip_state)(struct irq_data *data, enum irqchip_irq_state which, bool state);
 
+	int		(*irq_set_vcpu_affinity)(struct irq_data *data, void *vcpu_info);
+
 	unsigned long	flags;
 };
 
@@ -422,6 +425,7 @@ extern void irq_cpu_online(void);
 extern void irq_cpu_offline(void);
 extern int irq_set_affinity_locked(struct irq_data *data,
 				   const struct cpumask *cpumask, bool force);
+extern int irq_set_vcpu_affinity(unsigned int irq, void *vcpu_info);
 
 #if defined(CONFIG_SMP) && defined(CONFIG_GENERIC_PENDING_IRQ)
 void irq_move_irq(struct irq_data *data);
@@ -467,6 +471,8 @@ extern int irq_chip_set_affinity_parent(struct irq_data *data,
 					const struct cpumask *dest,
 					bool force);
 extern int irq_chip_set_wake_parent(struct irq_data *data, unsigned int on);
+extern int irq_chip_set_vcpu_affinity_parent(struct irq_data *data,
+					     void *vcpu_info);
 #endif
 
 /* Handling of unhandled and spurious interrupts: */
diff --git a/kernel/irq/chip.c b/kernel/irq/chip.c
index eb9a4ea..55016b2 100644
--- a/kernel/irq/chip.c
+++ b/kernel/irq/chip.c
@@ -950,6 +950,20 @@ int irq_chip_retrigger_hierarchy(struct irq_data *data)
 }
 
 /**
+ * irq_chip_set_vcpu_affinity_parent - Set vcpu affinity on the parent interrupt
+ * @data:	Pointer to interrupt specific data
+ * @vcpu_info:	The vcpu affinity information
+ */
+int irq_chip_set_vcpu_affinity_parent(struct irq_data *data, void *vcpu_info)
+{
+	data = data->parent_data;
+	if (data->chip->irq_set_vcpu_affinity)
+		return data->chip->irq_set_vcpu_affinity(data, vcpu_info);
+
+	return -ENOSYS;
+}
+
+/**
  * irq_chip_set_wake_parent - Set/reset wake-up on the parent interrupt
  * @data:	Pointer to interrupt specific data
  * @on:		Whether to set or reset the wake-up capability of this irq
diff --git a/kernel/irq/manage.c b/kernel/irq/manage.c
index e68932b..b1c7e8f 100644
--- a/kernel/irq/manage.c
+++ b/kernel/irq/manage.c
@@ -256,6 +256,37 @@ int irq_set_affinity_hint(unsigned int irq, const struct cpumask *m)
 }
 EXPORT_SYMBOL_GPL(irq_set_affinity_hint);
 
+/**
+ *	irq_set_vcpu_affinity - Set vcpu affinity for the interrupt
+ *	@irq: interrupt number to set affinity
+ *	@vcpu_info: vCPU specific data
+ *
+ *	This function uses the vCPU specific data to set the vCPU
+ *	affinity for an irq. The vCPU specific data is passed from
+ *	outside, such as KVM. One example code path is as below:
+ *	KVM -> IOMMU -> irq_set_vcpu_affinity().
+ */
+int irq_set_vcpu_affinity(unsigned int irq, void *vcpu_info)
+{
+	unsigned long flags;
+	struct irq_desc *desc = irq_get_desc_lock(irq, &flags, 0);
+	struct irq_data *data;
+	struct irq_chip *chip;
+	int ret = -ENOSYS;
+
+	if (!desc)
+		return -EINVAL;
+
+	data = irq_desc_get_irq_data(desc);
+	chip = irq_data_get_irq_chip(data);
+	if (chip && chip->irq_set_vcpu_affinity)
+		ret = chip->irq_set_vcpu_affinity(data, vcpu_info);
+	irq_put_desc_unlock(desc, flags);
+
+	return ret;
+}
+EXPORT_SYMBOL_GPL(irq_set_vcpu_affinity);
+
 static void irq_affinity_notify(struct work_struct *work)
 {
 	struct irq_affinity_notify *notify =
-- 
2.1.0
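
For readers following the callback chain, here is a minimal user-space
sketch of the dispatch pattern this patch adds. It is not kernel code:
the mock_* types, chips and functions are hypothetical stand-ins for
struct irq_chip, struct irq_data, irq_set_vcpu_affinity() and
irq_chip_set_vcpu_affinity_parent(), and the descriptor locking done by
irq_get_desc_lock()/irq_put_desc_unlock() is left out on purpose.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct mock_irq_data;

struct mock_irq_chip {
	const char *name;
	/* Optional callback, modelled on irq_chip::irq_set_vcpu_affinity. */
	int (*irq_set_vcpu_affinity)(struct mock_irq_data *data,
				     void *vcpu_info);
};

struct mock_irq_data {
	struct mock_irq_chip *chip;
	struct mock_irq_data *parent_data;
	void *last_vcpu_info;	/* what the bottom-level chip last received */
};

/* Forward the request one level up the hierarchy. */
int mock_set_vcpu_affinity_parent(struct mock_irq_data *data, void *vcpu_info)
{
	data = data->parent_data;
	assert(data != NULL);	/* this sketch requires a parent to exist */
	if (data->chip->irq_set_vcpu_affinity)
		return data->chip->irq_set_vcpu_affinity(data, vcpu_info);

	return -ENOSYS;
}

/* Pretend to be the remapping unit that would update the IRTE. */
int mock_remap_set_vcpu_affinity(struct mock_irq_data *data, void *vcpu_info)
{
	data->last_vcpu_info = vcpu_info;
	return 0;
}

/* Entry point: call the chip's callback if it has one (no locking here). */
int mock_irq_set_vcpu_affinity(struct mock_irq_data *data, void *vcpu_info)
{
	int ret = -ENOSYS;

	if (!data)
		return -EINVAL;

	if (data->chip && data->chip->irq_set_vcpu_affinity)
		ret = data->chip->irq_set_vcpu_affinity(data, vcpu_info);

	return ret;
}

/* Three example chips: remapping parent, MSI child, and one without support. */
struct mock_irq_chip mock_remap_chip = {
	.name			= "mock-remap",
	.irq_set_vcpu_affinity	= mock_remap_set_vcpu_affinity,
};

struct mock_irq_chip mock_msi_chip = {
	.name			= "mock-msi",
	.irq_set_vcpu_affinity	= mock_set_vcpu_affinity_parent,
};

struct mock_irq_chip mock_plain_chip = {
	.name			= "mock-plain",
};
```

In this sketch, calling mock_irq_set_vcpu_affinity() on a mock_irq_data
that uses mock_msi_chip and whose parent_data uses mock_remap_chip ends
up in mock_remap_set_vcpu_affinity(). A chip without the callback yields
-ENOSYS, the same default the patch uses.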



* [v7 2/4] x86/irq: Implement irq_set_vcpu_affinity for pci_msi_ir_controller
  2015-05-19  9:07 [v7 0/4] prerequisite changes for VT-d posted-interrupts Feng Wu
  2015-05-19  9:07 ` [v7 1/4] genirq: Introduce irq_set_vcpu_affinity() to target an interrupt to a VCPU Feng Wu
@ 2015-05-19  9:07 ` Feng Wu
  2015-05-19 13:54   ` [tip:x86/apic] x86/irq/msi: Implement irq_set_vcpu_affinity for remapped MSI irqs tip-bot for Feng Wu
  2015-05-19  9:07 ` [v7 3/4] x86/irq: Define a global vector for VT-d Posted-Interrupts Feng Wu
  2015-05-19  9:07 ` [v7 4/4] x86/irq: Show statistics information for posted-interrupts Feng Wu
  3 siblings, 1 reply; 9+ messages in thread
From: Feng Wu @ 2015-05-19  9:07 UTC
  To: tglx, mingo, hpa; +Cc: linux-kernel, jiang.liu, Feng Wu

Implement irq_set_vcpu_affinity for pci_msi_ir_controller.

Signed-off-by: Feng Wu <feng.wu@intel.com>
Reviewed-by: Jiang Liu <jiang.liu@linux.intel.com>
---
 arch/x86/kernel/apic/msi.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/x86/kernel/apic/msi.c b/arch/x86/kernel/apic/msi.c
index 58fde66..d2d95e2 100644
--- a/arch/x86/kernel/apic/msi.c
+++ b/arch/x86/kernel/apic/msi.c
@@ -152,6 +152,7 @@ static struct irq_chip pci_msi_ir_controller = {
 	.irq_mask		= pci_msi_mask_irq,
 	.irq_ack		= irq_chip_ack_parent,
 	.irq_retrigger		= irq_chip_retrigger_hierarchy,
+	.irq_set_vcpu_affinity	= irq_chip_set_vcpu_affinity_parent,
 	.flags			= IRQCHIP_SKIP_SET_WAKE,
 };
 
-- 
2.1.0



* [v7 3/4] x86/irq: Define a global vector for VT-d Posted-Interrupts
  2015-05-19  9:07 [v7 0/4] prerequisite changes for VT-d posted-interrupts Feng Wu
  2015-05-19  9:07 ` [v7 1/4] genirq: Introduce irq_set_vcpu_affinity() to target an interrupt to a VCPU Feng Wu
  2015-05-19  9:07 ` [v7 2/4] x86/irq: Implement irq_set_vcpu_affinity for pci_msi_ir_controller Feng Wu
@ 2015-05-19  9:07 ` Feng Wu
  2015-05-19 13:54   ` [tip:x86/apic] " tip-bot for Feng Wu
  2015-05-19  9:07 ` [v7 4/4] x86/irq: Show statistics information for posted-interrupts Feng Wu
  3 siblings, 1 reply; 9+ messages in thread
From: Feng Wu @ 2015-05-19  9:07 UTC
  To: tglx, mingo, hpa; +Cc: linux-kernel, jiang.liu, Feng Wu

Currently, we use a global vector as the Posted-Interrupts
Notification Event for all the vCPUs in the system. We need
to introduce another global vector for VT-d Posted-Interrupts,
which will be used to wake up a sleeping vCPU when an external
interrupt from a direct-assigned device arrives for that vCPU.

Signed-off-by: Feng Wu <feng.wu@intel.com>
Suggested-by: Yang Zhang <yang.z.zhang@intel.com>
Acked-by: H. Peter Anvin <hpa@linux.intel.com>
---
 arch/x86/include/asm/entry_arch.h  |  2 ++
 arch/x86/include/asm/hardirq.h     |  1 +
 arch/x86/include/asm/hw_irq.h      |  2 ++
 arch/x86/include/asm/irq.h         |  4 ++++
 arch/x86/include/asm/irq_vectors.h |  1 +
 arch/x86/kernel/entry_64.S         |  2 ++
 arch/x86/kernel/irq.c              | 31 +++++++++++++++++++++++++++++++
 arch/x86/kernel/irqinit.c          |  2 ++
 8 files changed, 45 insertions(+)

diff --git a/arch/x86/include/asm/entry_arch.h b/arch/x86/include/asm/entry_arch.h
index dc5fa66..27ca0af 100644
--- a/arch/x86/include/asm/entry_arch.h
+++ b/arch/x86/include/asm/entry_arch.h
@@ -23,6 +23,8 @@ BUILD_INTERRUPT(x86_platform_ipi, X86_PLATFORM_IPI_VECTOR)
 #ifdef CONFIG_HAVE_KVM
 BUILD_INTERRUPT3(kvm_posted_intr_ipi, POSTED_INTR_VECTOR,
 		 smp_kvm_posted_intr_ipi)
+BUILD_INTERRUPT3(kvm_posted_intr_wakeup_ipi, POSTED_INTR_WAKEUP_VECTOR,
+		 smp_kvm_posted_intr_wakeup_ipi)
 #endif
 
 /*
diff --git a/arch/x86/include/asm/hardirq.h b/arch/x86/include/asm/hardirq.h
index 0f5fb6b..9866065 100644
--- a/arch/x86/include/asm/hardirq.h
+++ b/arch/x86/include/asm/hardirq.h
@@ -14,6 +14,7 @@ typedef struct {
 #endif
 #ifdef CONFIG_HAVE_KVM
 	unsigned int kvm_posted_intr_ipis;
+	unsigned int kvm_posted_intr_wakeup_ipis;
 #endif
 	unsigned int x86_platform_ipis;	/* arch dependent */
 	unsigned int apic_perf_irqs;
diff --git a/arch/x86/include/asm/hw_irq.h b/arch/x86/include/asm/hw_irq.h
index 1f88e71..6ffc847 100644
--- a/arch/x86/include/asm/hw_irq.h
+++ b/arch/x86/include/asm/hw_irq.h
@@ -29,6 +29,7 @@
 extern asmlinkage void apic_timer_interrupt(void);
 extern asmlinkage void x86_platform_ipi(void);
 extern asmlinkage void kvm_posted_intr_ipi(void);
+extern asmlinkage void kvm_posted_intr_wakeup_ipi(void);
 extern asmlinkage void error_interrupt(void);
 extern asmlinkage void irq_work_interrupt(void);
 
@@ -92,6 +93,7 @@ extern void trace_call_function_single_interrupt(void);
 #define trace_irq_move_cleanup_interrupt  irq_move_cleanup_interrupt
 #define trace_reboot_interrupt  reboot_interrupt
 #define trace_kvm_posted_intr_ipi kvm_posted_intr_ipi
+#define trace_kvm_posted_intr_wakeup_ipi kvm_posted_intr_wakeup_ipi
 #endif /* CONFIG_TRACING */
 
 #ifdef	CONFIG_X86_LOCAL_APIC
diff --git a/arch/x86/include/asm/irq.h b/arch/x86/include/asm/irq.h
index a80cbb8..8008d06 100644
--- a/arch/x86/include/asm/irq.h
+++ b/arch/x86/include/asm/irq.h
@@ -30,6 +30,10 @@ extern void fixup_irqs(void);
 extern void irq_force_complete_move(int);
 #endif
 
+#ifdef CONFIG_HAVE_KVM
+extern void kvm_set_posted_intr_wakeup_handler(void (*handler)(void));
+#endif
+
 extern void (*x86_platform_ipi_callback)(void);
 extern void native_init_IRQ(void);
 extern bool handle_irq(unsigned irq, struct pt_regs *regs);
diff --git a/arch/x86/include/asm/irq_vectors.h b/arch/x86/include/asm/irq_vectors.h
index b26cb12..dca94f2 100644
--- a/arch/x86/include/asm/irq_vectors.h
+++ b/arch/x86/include/asm/irq_vectors.h
@@ -105,6 +105,7 @@
 /* Vector for KVM to deliver posted interrupt IPI */
 #ifdef CONFIG_HAVE_KVM
 #define POSTED_INTR_VECTOR		0xf2
+#define POSTED_INTR_WAKEUP_VECTOR	0xf1
 #endif
 
 /*
diff --git a/arch/x86/kernel/entry_64.S b/arch/x86/kernel/entry_64.S
index c7b2384..177feec 100644
--- a/arch/x86/kernel/entry_64.S
+++ b/arch/x86/kernel/entry_64.S
@@ -919,6 +919,8 @@ apicinterrupt X86_PLATFORM_IPI_VECTOR \
 #ifdef CONFIG_HAVE_KVM
 apicinterrupt3 POSTED_INTR_VECTOR \
 	kvm_posted_intr_ipi smp_kvm_posted_intr_ipi
+apicinterrupt3 POSTED_INTR_WAKEUP_VECTOR \
+	kvm_posted_intr_wakeup_ipi smp_kvm_posted_intr_wakeup_ipi
 #endif
 
 #ifdef CONFIG_X86_MCE_THRESHOLD
diff --git a/arch/x86/kernel/irq.c b/arch/x86/kernel/irq.c
index e5952c2..2ec339a 100644
--- a/arch/x86/kernel/irq.c
+++ b/arch/x86/kernel/irq.c
@@ -237,6 +237,18 @@ __visible void smp_x86_platform_ipi(struct pt_regs *regs)
 }
 
 #ifdef CONFIG_HAVE_KVM
+static void dummy_handler(void) {}
+static void (*kvm_posted_intr_wakeup_handler)(void) = dummy_handler;
+
+void kvm_set_posted_intr_wakeup_handler(void (*handler)(void))
+{
+	if (handler)
+		kvm_posted_intr_wakeup_handler = handler;
+	else
+		kvm_posted_intr_wakeup_handler = dummy_handler;
+}
+EXPORT_SYMBOL_GPL(kvm_set_posted_intr_wakeup_handler);
+
 /*
  * Handler for POSTED_INTERRUPT_VECTOR.
  */
@@ -256,6 +268,25 @@ __visible void smp_kvm_posted_intr_ipi(struct pt_regs *regs)
 
 	set_irq_regs(old_regs);
 }
+
+/*
+ * Handler for POSTED_INTR_WAKEUP_VECTOR.
+ */
+__visible void smp_kvm_posted_intr_wakeup_ipi(struct pt_regs *regs)
+{
+	struct pt_regs *old_regs = set_irq_regs(regs);
+
+	entering_ack_irq();
+
+	inc_irq_stat(kvm_posted_intr_wakeup_ipis);
+
+	kvm_posted_intr_wakeup_handler();
+
+	exiting_irq();
+
+	set_irq_regs(old_regs);
+}
+
 #endif
 
 __visible void smp_trace_x86_platform_ipi(struct pt_regs *regs)
diff --git a/arch/x86/kernel/irqinit.c b/arch/x86/kernel/irqinit.c
index cd10a64..895941d 100644
--- a/arch/x86/kernel/irqinit.c
+++ b/arch/x86/kernel/irqinit.c
@@ -144,6 +144,8 @@ static void __init apic_intr_init(void)
 #ifdef CONFIG_HAVE_KVM
 	/* IPI for KVM to deliver posted interrupt */
 	alloc_intr_gate(POSTED_INTR_VECTOR, kvm_posted_intr_ipi);
+	/* IPI for KVM to deliver interrupt to wake up tasks */
+	alloc_intr_gate(POSTED_INTR_WAKEUP_VECTOR, kvm_posted_intr_wakeup_ipi);
 #endif
 
 	/* IPI vectors for APIC spurious and error interrupts */
-- 
2.1.0
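
The reason for dummy_handler() may be easier to see outside the kernel.
Below is a minimal user-space sketch of the same pattern: the function
pointer always points at something callable, so the interrupt path can
call it unconditionally. All names here (set_wakeup_handler(),
simulate_wakeup_ipi(), example_wake_vcpus(), the two counters) are
hypothetical and only mirror the shape of the code in this patch;
wakeup_ipi_count stands in for the per-CPU irq_stat counter.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the per-CPU kvm_posted_intr_wakeup_ipis counter. */
unsigned int wakeup_ipi_count;

/* Counts how often the example consumer handler has run. */
unsigned int example_wakeups;

static void dummy_handler(void) {}

/* Never NULL: defaults to the dummy so callers need no NULL check. */
static void (*wakeup_handler)(void) = dummy_handler;

/* Install a handler, or restore the dummy when passed NULL. */
void set_wakeup_handler(void (*handler)(void))
{
	wakeup_handler = handler ? handler : dummy_handler;
}

/* What the vector's handler does, minus the APIC ack and pt_regs work. */
void simulate_wakeup_ipi(void)
{
	assert(wakeup_handler != NULL);	/* invariant kept by the setter */
	wakeup_ipi_count++;
	wakeup_handler();
}

/* Example consumer: a real one would wake blocked vCPUs. */
void example_wake_vcpus(void)
{
	example_wakeups++;
}
```

A consumer would call set_wakeup_handler(example_wake_vcpus) when it
loads and set_wakeup_handler(NULL) when it unloads. The counter still
advances in between, but the call lands in the dummy.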



* [v7 4/4] x86/irq: Show statistics information for posted-interrupts
  2015-05-19  9:07 [v7 0/4] prerequisite changes for VT-d posted-interrupts Feng Wu
                   ` (2 preceding siblings ...)
  2015-05-19  9:07 ` [v7 3/4] x86/irq: Define a global vector for VT-d Posted-Interrupts Feng Wu
@ 2015-05-19  9:07 ` Feng Wu
  2015-05-19 13:55   ` [tip:x86/apic] " tip-bot for Feng Wu
  3 siblings, 1 reply; 9+ messages in thread
From: Feng Wu @ 2015-05-19  9:07 UTC
  To: tglx, mingo, hpa; +Cc: linux-kernel, jiang.liu, Feng Wu

Show statistics for the posted-interrupt notification and wakeup
events in /proc/interrupts.

Signed-off-by: Feng Wu <feng.wu@intel.com>
---
 arch/x86/kernel/irq.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/arch/x86/kernel/irq.c b/arch/x86/kernel/irq.c
index 2ec339a..be466ff 100644
--- a/arch/x86/kernel/irq.c
+++ b/arch/x86/kernel/irq.c
@@ -136,6 +136,18 @@ int arch_show_interrupts(struct seq_file *p, int prec)
 #if defined(CONFIG_X86_IO_APIC)
 	seq_printf(p, "%*s: %10u\n", prec, "MIS", atomic_read(&irq_mis_count));
 #endif
+#ifdef CONFIG_HAVE_KVM
+	seq_printf(p, "%*s: ", prec, "NEV");
+	for_each_online_cpu(j)
+		seq_printf(p, "%10u ", irq_stats(j)->kvm_posted_intr_ipis);
+	seq_puts(p, "  Posted-interrupt notification event\n");
+
+	seq_printf(p, "%*s: ", prec, "WEV");
+	for_each_online_cpu(j)
+		seq_printf(p, "%10u ",
+			   irq_stats(j)->kvm_posted_intr_wakeup_ipis);
+	seq_puts(p, "  Posted-interrupt wakeup event\n");
+#endif
 	return 0;
 }
 
-- 
2.1.0
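
To illustrate the row layout that the seq_printf()/seq_puts() calls
above produce, here is a small user-space sketch that formats one such
row into a buffer with snprintf(). The helper format_irq_stat_row() and
its parameters are hypothetical; only the format strings ("%*s: ",
"%10u " and the two-space prefix before the description) are taken from
the patch.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Format one /proc/interrupts-style row: a right-aligned label, one
 * "%10u " column per CPU, then a description. Returns the number of
 * characters written, or 0 if the buffer is too small.
 */
size_t format_irq_stat_row(char *buf, size_t len, int prec,
			   const char *label, const unsigned int *counts,
			   size_t ncpus, const char *desc)
{
	size_t off, j;
	int n;

	assert(buf != NULL && len > 0);

	n = snprintf(buf, len, "%*s: ", prec, label);
	if (n < 0 || (size_t)n >= len)
		return 0;
	off = (size_t)n;

	for (j = 0; j < ncpus; j++) {
		n = snprintf(buf + off, len - off, "%10u ", counts[j]);
		if (n < 0 || (size_t)n >= len - off)
			return 0;
		off += (size_t)n;
	}

	n = snprintf(buf + off, len - off, "  %s\n", desc);
	if (n < 0 || (size_t)n >= len - off)
		return 0;

	return off + (size_t)n;
}
```

With a label of "NEV", a width of 3 and two per-CPU counts, the helper
produces one line with two right-aligned 10-character columns followed
by the description, which is the shape of the NEV and WEV rows this
patch adds.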



* [tip:irq/core] genirq: Introduce irq_set_vcpu_affinity() to target an interrupt to a VCPU
  2015-05-19  9:07 ` [v7 1/4] genirq: Introduce irq_set_vcpu_affinity() to target an interrupt to a VCPU Feng Wu
@ 2015-05-19 13:45   ` tip-bot for Jiang Liu
  0 siblings, 0 replies; 9+ messages in thread
From: tip-bot for Jiang Liu @ 2015-05-19 13:45 UTC
  To: linux-tip-commits; +Cc: feng.wu, hpa, tglx, mingo, linux-kernel, jiang.liu

Commit-ID:  0a4377de305684c883bf90ad21e3cbdeead70f5c
Gitweb:     http://git.kernel.org/tip/0a4377de305684c883bf90ad21e3cbdeead70f5c
Author:     Jiang Liu <jiang.liu@linux.intel.com>
AuthorDate: Tue, 19 May 2015 17:07:14 +0800
Committer:  Thomas Gleixner <tglx@linutronix.de>
CommitDate: Tue, 19 May 2015 15:41:19 +0200

genirq: Introduce irq_set_vcpu_affinity() to target an interrupt to a VCPU

With Posted-Interrupts support in Intel CPUs and IOMMUs, an external
interrupt from an assigned device can be delivered directly to a
virtual CPU in a virtual machine. Instead of hacking KVM and Intel
IOMMU drivers, we propose a platform independent interface to target
an interrupt to a specific virtual CPU in a virtual machine, or set
virtual CPU affinity for an interrupt.

By adopting this new interface and hierarchical irqdomains, we can
easily support posted-interrupts on Intel platforms, and also provide
interfaces flexible enough for other platforms to support similar
features.

Here is the usage scenario for this interface:
Guest updates MSI/MSI-X interrupt configuration
        -->QEMU and KVM handle this
        -->KVM calls this interface (passing the posted-interrupts
           descriptor and guest vector)
        -->irq core transfers control to the IOMMU
        -->IOMMU does the real work of updating the IRTE (the IRTE has
           a new format for VT-d Posted-Interrupts)

Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
Signed-off-by: Feng Wu <feng.wu@intel.com>
Link: http://lkml.kernel.org/r/1432026437-16560-2-git-send-email-feng.wu@intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 include/linux/irq.h |  6 ++++++
 kernel/irq/chip.c   | 14 ++++++++++++++
 kernel/irq/manage.c | 31 +++++++++++++++++++++++++++++++
 3 files changed, 51 insertions(+)

diff --git a/include/linux/irq.h b/include/linux/irq.h
index 62c6901..48cb7d1 100644
--- a/include/linux/irq.h
+++ b/include/linux/irq.h
@@ -327,6 +327,7 @@ static inline irq_hw_number_t irqd_to_hwirq(struct irq_data *d)
  * @irq_write_msi_msg:	optional to write message content for MSI
  * @irq_get_irqchip_state:	return the internal state of an interrupt
  * @irq_set_irqchip_state:	set the internal state of a interrupt
+ * @irq_set_vcpu_affinity:	optional to target a vCPU in a virtual machine
  * @flags:		chip specific flags
  */
 struct irq_chip {
@@ -369,6 +370,8 @@ struct irq_chip {
 	int		(*irq_get_irqchip_state)(struct irq_data *data, enum irqchip_irq_state which, bool *state);
 	int		(*irq_set_irqchip_state)(struct irq_data *data, enum irqchip_irq_state which, bool state);
 
+	int		(*irq_set_vcpu_affinity)(struct irq_data *data, void *vcpu_info);
+
 	unsigned long	flags;
 };
 
@@ -422,6 +425,7 @@ extern void irq_cpu_online(void);
 extern void irq_cpu_offline(void);
 extern int irq_set_affinity_locked(struct irq_data *data,
 				   const struct cpumask *cpumask, bool force);
+extern int irq_set_vcpu_affinity(unsigned int irq, void *vcpu_info);
 
 #if defined(CONFIG_SMP) && defined(CONFIG_GENERIC_PENDING_IRQ)
 void irq_move_irq(struct irq_data *data);
@@ -467,6 +471,8 @@ extern int irq_chip_set_affinity_parent(struct irq_data *data,
 					const struct cpumask *dest,
 					bool force);
 extern int irq_chip_set_wake_parent(struct irq_data *data, unsigned int on);
+extern int irq_chip_set_vcpu_affinity_parent(struct irq_data *data,
+					     void *vcpu_info);
 #endif
 
 /* Handling of unhandled and spurious interrupts: */
diff --git a/kernel/irq/chip.c b/kernel/irq/chip.c
index eb9a4ea..55016b2 100644
--- a/kernel/irq/chip.c
+++ b/kernel/irq/chip.c
@@ -950,6 +950,20 @@ int irq_chip_retrigger_hierarchy(struct irq_data *data)
 }
 
 /**
+ * irq_chip_set_vcpu_affinity_parent - Set vcpu affinity on the parent interrupt
+ * @data:	Pointer to interrupt specific data
+ * @vcpu_info:	The vcpu affinity information
+ */
+int irq_chip_set_vcpu_affinity_parent(struct irq_data *data, void *vcpu_info)
+{
+	data = data->parent_data;
+	if (data->chip->irq_set_vcpu_affinity)
+		return data->chip->irq_set_vcpu_affinity(data, vcpu_info);
+
+	return -ENOSYS;
+}
+
+/**
  * irq_chip_set_wake_parent - Set/reset wake-up on the parent interrupt
  * @data:	Pointer to interrupt specific data
  * @on:		Whether to set or reset the wake-up capability of this irq
diff --git a/kernel/irq/manage.c b/kernel/irq/manage.c
index e68932b..b1c7e8f 100644
--- a/kernel/irq/manage.c
+++ b/kernel/irq/manage.c
@@ -256,6 +256,37 @@ int irq_set_affinity_hint(unsigned int irq, const struct cpumask *m)
 }
 EXPORT_SYMBOL_GPL(irq_set_affinity_hint);
 
+/**
+ *	irq_set_vcpu_affinity - Set vcpu affinity for the interrupt
+ *	@irq: interrupt number to set affinity
+ *	@vcpu_info: vCPU specific data
+ *
+ *	This function uses the vCPU specific data to set the vCPU
+ *	affinity for an irq. The vCPU specific data is passed from
+ *	outside, such as KVM. One example code path is as below:
+ *	KVM -> IOMMU -> irq_set_vcpu_affinity().
+ */
+int irq_set_vcpu_affinity(unsigned int irq, void *vcpu_info)
+{
+	unsigned long flags;
+	struct irq_desc *desc = irq_get_desc_lock(irq, &flags, 0);
+	struct irq_data *data;
+	struct irq_chip *chip;
+	int ret = -ENOSYS;
+
+	if (!desc)
+		return -EINVAL;
+
+	data = irq_desc_get_irq_data(desc);
+	chip = irq_data_get_irq_chip(data);
+	if (chip && chip->irq_set_vcpu_affinity)
+		ret = chip->irq_set_vcpu_affinity(data, vcpu_info);
+	irq_put_desc_unlock(desc, flags);
+
+	return ret;
+}
+EXPORT_SYMBOL_GPL(irq_set_vcpu_affinity);
+
 static void irq_affinity_notify(struct work_struct *work)
 {
 	struct irq_affinity_notify *notify =


* [tip:x86/apic] x86/irq/msi: Implement irq_set_vcpu_affinity for remapped MSI irqs
  2015-05-19  9:07 ` [v7 2/4] x86/irq: Implement irq_set_vcpu_affinity for pci_msi_ir_controller Feng Wu
@ 2015-05-19 13:54   ` tip-bot for Feng Wu
  0 siblings, 0 replies; 9+ messages in thread
From: tip-bot for Feng Wu @ 2015-05-19 13:54 UTC
  To: linux-tip-commits; +Cc: hpa, feng.wu, jiang.liu, tglx, linux-kernel, mingo

Commit-ID:  a2f1c8bdc02bfcaa5a658283b883fdb54e328b36
Gitweb:     http://git.kernel.org/tip/a2f1c8bdc02bfcaa5a658283b883fdb54e328b36
Author:     Feng Wu <feng.wu@intel.com>
AuthorDate: Tue, 19 May 2015 17:07:15 +0800
Committer:  Thomas Gleixner <tglx@linutronix.de>
CommitDate: Tue, 19 May 2015 15:51:17 +0200

x86/irq/msi: Implement irq_set_vcpu_affinity for remapped MSI irqs

Implement irq_set_vcpu_affinity for pci_msi_ir_controller.

Signed-off-by: Feng Wu <feng.wu@intel.com>
Reviewed-by: Jiang Liu <jiang.liu@linux.intel.com>
Link: http://lkml.kernel.org/r/1432026437-16560-3-git-send-email-feng.wu@intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/kernel/apic/msi.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/x86/kernel/apic/msi.c b/arch/x86/kernel/apic/msi.c
index ef516af..1a9d735 100644
--- a/arch/x86/kernel/apic/msi.c
+++ b/arch/x86/kernel/apic/msi.c
@@ -152,6 +152,7 @@ static struct irq_chip pci_msi_ir_controller = {
 	.irq_mask		= pci_msi_mask_irq,
 	.irq_ack		= irq_chip_ack_parent,
 	.irq_retrigger		= irq_chip_retrigger_hierarchy,
+	.irq_set_vcpu_affinity	= irq_chip_set_vcpu_affinity_parent,
 	.flags			= IRQCHIP_SKIP_SET_WAKE,
 };
 


* [tip:x86/apic] x86/irq: Define a global vector for VT-d Posted-Interrupts
  2015-05-19  9:07 ` [v7 3/4] x86/irq: Define a global vector for VT-d Posted-Interrupts Feng Wu
@ 2015-05-19 13:54   ` tip-bot for Feng Wu
  0 siblings, 0 replies; 9+ messages in thread
From: tip-bot for Feng Wu @ 2015-05-19 13:54 UTC
  To: linux-tip-commits
  Cc: tglx, hpa, linux-kernel, yang.z.zhang, mingo, feng.wu, hpa

Commit-ID:  f6b3c72c23661e5534cd2eede16e9bac7ebb761c
Gitweb:     http://git.kernel.org/tip/f6b3c72c23661e5534cd2eede16e9bac7ebb761c
Author:     Feng Wu <feng.wu@intel.com>
AuthorDate: Tue, 19 May 2015 17:07:16 +0800
Committer:  Thomas Gleixner <tglx@linutronix.de>
CommitDate: Tue, 19 May 2015 15:51:17 +0200

x86/irq: Define a global vector for VT-d Posted-Interrupts

Currently, we use a global vector as the Posted-Interrupts
Notification Event for all the vCPUs in the system. We need
to introduce another global vector for VT-d Posted-Interrupts,
which will be used to wake up a sleeping vCPU when an external
interrupt from a direct-assigned device arrives for that vCPU.

[ tglx: Removed a gazillion of extra newlines ]

Signed-off-by: Feng Wu <feng.wu@intel.com>
Cc: jiang.liu@linux.intel.com
Link: http://lkml.kernel.org/r/1432026437-16560-4-git-send-email-feng.wu@intel.com
Suggested-by: Yang Zhang <yang.z.zhang@intel.com>
Acked-by: H. Peter Anvin <hpa@linux.intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/include/asm/entry_arch.h  |  2 ++
 arch/x86/include/asm/hardirq.h     |  1 +
 arch/x86/include/asm/hw_irq.h      |  2 ++
 arch/x86/include/asm/irq.h         |  4 ++++
 arch/x86/include/asm/irq_vectors.h |  1 +
 arch/x86/kernel/entry_64.S         |  2 ++
 arch/x86/kernel/irq.c              | 26 ++++++++++++++++++++++++++
 arch/x86/kernel/irqinit.c          |  2 ++
 8 files changed, 40 insertions(+)

diff --git a/arch/x86/include/asm/entry_arch.h b/arch/x86/include/asm/entry_arch.h
index dc5fa66..27ca0af 100644
--- a/arch/x86/include/asm/entry_arch.h
+++ b/arch/x86/include/asm/entry_arch.h
@@ -23,6 +23,8 @@ BUILD_INTERRUPT(x86_platform_ipi, X86_PLATFORM_IPI_VECTOR)
 #ifdef CONFIG_HAVE_KVM
 BUILD_INTERRUPT3(kvm_posted_intr_ipi, POSTED_INTR_VECTOR,
 		 smp_kvm_posted_intr_ipi)
+BUILD_INTERRUPT3(kvm_posted_intr_wakeup_ipi, POSTED_INTR_WAKEUP_VECTOR,
+		 smp_kvm_posted_intr_wakeup_ipi)
 #endif
 
 /*
diff --git a/arch/x86/include/asm/hardirq.h b/arch/x86/include/asm/hardirq.h
index 0f5fb6b..9866065 100644
--- a/arch/x86/include/asm/hardirq.h
+++ b/arch/x86/include/asm/hardirq.h
@@ -14,6 +14,7 @@ typedef struct {
 #endif
 #ifdef CONFIG_HAVE_KVM
 	unsigned int kvm_posted_intr_ipis;
+	unsigned int kvm_posted_intr_wakeup_ipis;
 #endif
 	unsigned int x86_platform_ipis;	/* arch dependent */
 	unsigned int apic_perf_irqs;
diff --git a/arch/x86/include/asm/hw_irq.h b/arch/x86/include/asm/hw_irq.h
index 9ec5d37..10c80d4 100644
--- a/arch/x86/include/asm/hw_irq.h
+++ b/arch/x86/include/asm/hw_irq.h
@@ -29,6 +29,7 @@
 extern asmlinkage void apic_timer_interrupt(void);
 extern asmlinkage void x86_platform_ipi(void);
 extern asmlinkage void kvm_posted_intr_ipi(void);
+extern asmlinkage void kvm_posted_intr_wakeup_ipi(void);
 extern asmlinkage void error_interrupt(void);
 extern asmlinkage void irq_work_interrupt(void);
 
@@ -58,6 +59,7 @@ extern void trace_call_function_single_interrupt(void);
 #define trace_irq_move_cleanup_interrupt  irq_move_cleanup_interrupt
 #define trace_reboot_interrupt  reboot_interrupt
 #define trace_kvm_posted_intr_ipi kvm_posted_intr_ipi
+#define trace_kvm_posted_intr_wakeup_ipi kvm_posted_intr_wakeup_ipi
 #endif /* CONFIG_TRACING */
 
 #ifdef	CONFIG_X86_LOCAL_APIC
diff --git a/arch/x86/include/asm/irq.h b/arch/x86/include/asm/irq.h
index a80cbb8..8008d06 100644
--- a/arch/x86/include/asm/irq.h
+++ b/arch/x86/include/asm/irq.h
@@ -30,6 +30,10 @@ extern void fixup_irqs(void);
 extern void irq_force_complete_move(int);
 #endif
 
+#ifdef CONFIG_HAVE_KVM
+extern void kvm_set_posted_intr_wakeup_handler(void (*handler)(void));
+#endif
+
 extern void (*x86_platform_ipi_callback)(void);
 extern void native_init_IRQ(void);
 extern bool handle_irq(unsigned irq, struct pt_regs *regs);
diff --git a/arch/x86/include/asm/irq_vectors.h b/arch/x86/include/asm/irq_vectors.h
index bf55235..0ed29ac 100644
--- a/arch/x86/include/asm/irq_vectors.h
+++ b/arch/x86/include/asm/irq_vectors.h
@@ -86,6 +86,7 @@
 /* Vector for KVM to deliver posted interrupt IPI */
 #ifdef CONFIG_HAVE_KVM
 #define POSTED_INTR_VECTOR		0xf2
+#define POSTED_INTR_WAKEUP_VECTOR	0xf1
 #endif
 
 /*
diff --git a/arch/x86/kernel/entry_64.S b/arch/x86/kernel/entry_64.S
index 47b9581..22aadc9 100644
--- a/arch/x86/kernel/entry_64.S
+++ b/arch/x86/kernel/entry_64.S
@@ -916,6 +916,8 @@ apicinterrupt X86_PLATFORM_IPI_VECTOR \
 #ifdef CONFIG_HAVE_KVM
 apicinterrupt3 POSTED_INTR_VECTOR \
 	kvm_posted_intr_ipi smp_kvm_posted_intr_ipi
+apicinterrupt3 POSTED_INTR_WAKEUP_VECTOR \
+	kvm_posted_intr_wakeup_ipi smp_kvm_posted_intr_wakeup_ipi
 #endif
 
 #ifdef CONFIG_X86_MCE_THRESHOLD
diff --git a/arch/x86/kernel/irq.c b/arch/x86/kernel/irq.c
index be38945..90b2f705 100644
--- a/arch/x86/kernel/irq.c
+++ b/arch/x86/kernel/irq.c
@@ -242,6 +242,18 @@ __visible void smp_x86_platform_ipi(struct pt_regs *regs)
 }
 
 #ifdef CONFIG_HAVE_KVM
+static void dummy_handler(void) {}
+static void (*kvm_posted_intr_wakeup_handler)(void) = dummy_handler;
+
+void kvm_set_posted_intr_wakeup_handler(void (*handler)(void))
+{
+	if (handler)
+		kvm_posted_intr_wakeup_handler = handler;
+	else
+		kvm_posted_intr_wakeup_handler = dummy_handler;
+}
+EXPORT_SYMBOL_GPL(kvm_set_posted_intr_wakeup_handler);
+
 /*
  * Handler for POSTED_INTERRUPT_VECTOR.
  */
@@ -254,6 +266,20 @@ __visible void smp_kvm_posted_intr_ipi(struct pt_regs *regs)
 	exiting_irq();
 	set_irq_regs(old_regs);
 }
+
+/*
+ * Handler for POSTED_INTR_WAKEUP_VECTOR.
+ */
+__visible void smp_kvm_posted_intr_wakeup_ipi(struct pt_regs *regs)
+{
+	struct pt_regs *old_regs = set_irq_regs(regs);
+
+	entering_ack_irq();
+	inc_irq_stat(kvm_posted_intr_wakeup_ipis);
+	kvm_posted_intr_wakeup_handler();
+	exiting_irq();
+	set_irq_regs(old_regs);
+}
 #endif
 
 __visible void smp_trace_x86_platform_ipi(struct pt_regs *regs)
diff --git a/arch/x86/kernel/irqinit.c b/arch/x86/kernel/irqinit.c
index dc1e08d..680723a 100644
--- a/arch/x86/kernel/irqinit.c
+++ b/arch/x86/kernel/irqinit.c
@@ -144,6 +144,8 @@ static void __init apic_intr_init(void)
 #ifdef CONFIG_HAVE_KVM
 	/* IPI for KVM to deliver posted interrupt */
 	alloc_intr_gate(POSTED_INTR_VECTOR, kvm_posted_intr_ipi);
+	/* IPI for KVM to deliver interrupt to wake up tasks */
+	alloc_intr_gate(POSTED_INTR_WAKEUP_VECTOR, kvm_posted_intr_wakeup_ipi);
 #endif
 
 	/* IPI vectors for APIC spurious and error interrupts */
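
The registration pattern in the irq.c hunk above can be modelled in plain userspace C. This is only a sketch of the idea, not kernel code: set_wakeup_handler(), simulate_wakeup_ipi(), demo_wakeup_handler() and both counters are hypothetical names invented for illustration. It shows why the function pointer defaults to a no-op: the dispatch path can then call through it unconditionally, and passing NULL to the setter simply restores the no-op.

```c
/* Userspace model of the handler-registration pattern; not kernel code. */
#include <assert.h>
#include <stddef.h>

/* Default no-op target, so the dispatch path never has to test for NULL. */
static void dummy_handler(void) {}
static void (*wakeup_handler)(void) = dummy_handler;

/* Models kvm_set_posted_intr_wakeup_handler(): NULL restores the no-op. */
void set_wakeup_handler(void (*handler)(void))
{
	wakeup_handler = handler ? handler : dummy_handler;
}

/* Hypothetical stand-in for the per-CPU kvm_posted_intr_wakeup_ipis count. */
static unsigned int wakeup_ipis;

/* Models smp_kvm_posted_intr_wakeup_ipi() without the IRQ entry/exit work. */
void simulate_wakeup_ipi(void)
{
	assert(wakeup_handler != NULL);	/* invariant maintained by the setter */
	wakeup_ipis++;			/* statistics first, as in the patch */
	wakeup_handler();		/* unconditional indirect call */
}

/* Hypothetical consumer callback, such as a KVM module might install. */
static unsigned int vcpus_woken;
void demo_wakeup_handler(void)
{
	vcpus_woken++;
}
```

A caller would install demo_wakeup_handler with set_wakeup_handler() at module load and pass NULL at unload. Every simulated IPI bumps the counter whether or not a real handler is installed, which matches the ordering of inc_irq_stat() and the handler call in the patch.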


* [tip:x86/apic] x86/irq: Show statistics information for posted-interrupts
  2015-05-19  9:07 ` [v7 4/4] x86/irq: Show statistics information for posted-interrupts Feng Wu
@ 2015-05-19 13:55   ` tip-bot for Feng Wu
  0 siblings, 0 replies; 9+ messages in thread
From: tip-bot for Feng Wu @ 2015-05-19 13:55 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: linux-kernel, mingo, feng.wu, tglx, hpa

Commit-ID:  501b32653ebf49114cccb9afbf9150cf18fd8700
Gitweb:     http://git.kernel.org/tip/501b32653ebf49114cccb9afbf9150cf18fd8700
Author:     Feng Wu <feng.wu@intel.com>
AuthorDate: Tue, 19 May 2015 17:07:17 +0800
Committer:  Thomas Gleixner <tglx@linutronix.de>
CommitDate: Tue, 19 May 2015 15:51:17 +0200

x86/irq: Show statistics information for posted-interrupts

Show per-CPU counts of posted-interrupt notification events and
wakeup events in /proc/interrupts.

[ tglx: Named the short identifiers PIN and PIW to match the long
  identifiers ]

Signed-off-by: Feng Wu <feng.wu@intel.com>
Cc: jiang.liu@linux.intel.com
Link: http://lkml.kernel.org/r/1432026437-16560-5-git-send-email-feng.wu@intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 arch/x86/kernel/irq.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/arch/x86/kernel/irq.c b/arch/x86/kernel/irq.c
index 90b2f705..7e10c8b 100644
--- a/arch/x86/kernel/irq.c
+++ b/arch/x86/kernel/irq.c
@@ -142,6 +142,18 @@ int arch_show_interrupts(struct seq_file *p, int prec)
 #if defined(CONFIG_X86_IO_APIC)
 	seq_printf(p, "%*s: %10u\n", prec, "MIS", atomic_read(&irq_mis_count));
 #endif
+#ifdef CONFIG_HAVE_KVM
+	seq_printf(p, "%*s: ", prec, "PIN");
+	for_each_online_cpu(j)
+		seq_printf(p, "%10u ", irq_stats(j)->kvm_posted_intr_ipis);
+	seq_puts(p, "  Posted-interrupt notification event\n");
+
+	seq_printf(p, "%*s: ", prec, "PIW");
+	for_each_online_cpu(j)
+		seq_printf(p, "%10u ",
+			   irq_stats(j)->kvm_posted_intr_wakeup_ipis);
+	seq_puts(p, "  Posted-interrupt wakeup event\n");
+#endif
 	return 0;
 }
 
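
The row layout produced by the hunk above can be reproduced in userspace with snprintf(). This is a rough sketch under stated assumptions: format_irq_stat_row() is a hypothetical helper, a plain array stands in for the per-CPU irq_stats() lookup, and a caller-supplied buffer stands in for the seq_file. Only the format strings ("%*s: ", "%10u " and the two-space description prefix) are taken from the patch.

```c
/* Userspace sketch of the /proc/interrupts row layout; not kernel code. */
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Format one statistics row: a label right-aligned to 'prec' columns,
 * one "%10u " field per CPU, then the description and a newline.
 * Returns the number of characters written, or -1 if 'buf' is too small.
 */
int format_irq_stat_row(char *buf, size_t len, int prec, const char *label,
			const unsigned int *counts, int ncpus,
			const char *desc)
{
	size_t off = 0;
	int n, cpu;

	assert(buf && label && counts && desc);

	n = snprintf(buf, len, "%*s: ", prec, label);
	if (n < 0 || (size_t)n >= len)
		return -1;
	off += (size_t)n;

	/* The array index stands in for for_each_online_cpu(j). */
	for (cpu = 0; cpu < ncpus; cpu++) {
		n = snprintf(buf + off, len - off, "%10u ", counts[cpu]);
		if (n < 0 || (size_t)n >= len - off)
			return -1;
		off += (size_t)n;
	}

	n = snprintf(buf + off, len - off, "  %s\n", desc);
	if (n < 0 || (size_t)n >= len - off)
		return -1;
	return (int)(off + (size_t)n);
}
```

Calling it with the label "PIN", one count per CPU and the description "Posted-interrupt notification event" yields a single line in the same column layout as the other per-CPU rows in /proc/interrupts. In the kernel, prec is chosen by the caller of arch_show_interrupts().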


Thread overview: 9+ messages
2015-05-19  9:07 [v7 0/4] prerequisite changes for VT-d posted-interrupts Feng Wu
2015-05-19  9:07 ` [v7 1/4] genirq: Introduce irq_set_vcpu_affinity() to target an interrupt to a VCPU Feng Wu
2015-05-19 13:45   ` [tip:irq/core] " tip-bot for Jiang Liu
2015-05-19  9:07 ` [v7 2/4] x86/irq: Implement irq_set_vcpu_affinity for pci_msi_ir_controller Feng Wu
2015-05-19 13:54   ` [tip:x86/apic] x86/irq/msi: Implement irq_set_vcpu_affinity for remapped MSI irqs tip-bot for Feng Wu
2015-05-19  9:07 ` [v7 3/4] x86/irq: Define a global vector for VT-d Posted-Interrupts Feng Wu
2015-05-19 13:54   ` [tip:x86/apic] " tip-bot for Feng Wu
2015-05-19  9:07 ` [v7 4/4] x86/irq: Show statistics information for posted-interrupts Feng Wu
2015-05-19 13:55   ` [tip:x86/apic] " tip-bot for Feng Wu
