linux-kernel.vger.kernel.org archive mirror
* [v7 0/8] Add VT-d Posted-Interrupts support - IOMMU part
@ 2015-05-25  5:28 Feng Wu
  2015-05-25  5:28 ` [v7 1/8] iommu: Add new member capability to struct irq_remap_ops Feng Wu
                   ` (7 more replies)
  0 siblings, 8 replies; 13+ messages in thread
From: Feng Wu @ 2015-05-25  5:28 UTC (permalink / raw)
  To: joro, dwmw2; +Cc: tglx, jiang.liu, iommu, linux-kernel, Feng Wu

VT-d Posted-Interrupts is an enhancement to CPU-side Posted-Interrupts.
With VT-d Posted-Interrupts enabled, external interrupts from
direct-assigned devices can be delivered to guests without VMM
intervention when the guest is running in non-root mode.

You can find the VT-d Posted-Interrupts spec at the following URL:
http://www.intel.com/content/www/us/en/intelligent-systems/intel-technology/vt-directed-io-spec.html

This series was part of http://thread.gmane.org/gmane.linux.kernel.iommu/7708. To keep things clear, the IOMMU part is sent out separately here.

This patch-set is based on the latest x86/apic branch of the tip tree.

The whole series, which contains multiple components, is divided into three parts:
- Prerequisite changes to irq subsystem (already merged in tip tree x86/apic branch)
- IOMMU part (in this series)
- KVM and VFIO parts (will send out this part once the first two parts are accepted)

v6->v7:
* Add a static inline helper function set_irq_posting_cap() to set
the PI capability.
* Add some comments for the new member "ir_data->irte_pi_entry".

v5->v6:
* Extend 'struct irte' for VT-d Posted-Interrupts, combine remapped
and posted mode into one irte structure.

v4->v5:
* Abstract modify_irte() to accept two formats of irte.

v3->v4:
* Change capability to an int flags variable instead of a function call.
* Add hotplug case for VT-d PI.

Feng Wu (7):
  iommu: Add new member capability to struct irq_remap_ops
  iommu, x86: Implement irq_set_vcpu_affinity for intel_ir_chip
  iommu, x86: No need to migrating irq for VT-d Posted-Interrupts
  iommu, x86: Add cap_pi_support() to detect VT-d PI capability
  iommu, x86: Setup Posted-Interrupts capability for Intel iommu
  iommu, x86: define irq_remapping_cap()
  iommu, x86: Properly handler PI for IOMMU hotplug

Thomas Gleixner (1):
  iommu: dmar: Extend struct irte for VT-d Posted-Interrupts

 arch/x86/include/asm/irq_remapping.h | 11 +++++
 drivers/iommu/intel_irq_remapping.c  | 84 +++++++++++++++++++++++++++++++++++-
 drivers/iommu/irq_remapping.c        | 11 +++++
 drivers/iommu/irq_remapping.h        |  6 +++
 include/linux/dmar.h                 | 70 +++++++++++++++++++++++-------
 include/linux/intel-iommu.h          |  1 +
 6 files changed, 167 insertions(+), 16 deletions(-)

-- 
2.1.0



* [v7 1/8] iommu: Add new member capability to struct irq_remap_ops
  2015-05-25  5:28 [v7 0/8] Add VT-d Posted-Interrupts support - IOMMU part Feng Wu
@ 2015-05-25  5:28 ` Feng Wu
  2015-05-25  5:28 ` [v7 2/8] iommu: dmar: Extend struct irte for VT-d Posted-Interrupts Feng Wu
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 13+ messages in thread
From: Feng Wu @ 2015-05-25  5:28 UTC (permalink / raw)
  To: joro, dwmw2; +Cc: tglx, jiang.liu, iommu, linux-kernel, Feng Wu

This patch adds a new member, capability, to struct irq_remap_ops.
The new member can be used to check whether certain features,
such as VT-d Posted-Interrupts, are supported.

Signed-off-by: Feng Wu <feng.wu@intel.com>
Reviewed-by: Jiang Liu <jiang.liu@linux.intel.com>
---
 arch/x86/include/asm/irq_remapping.h | 4 ++++
 drivers/iommu/irq_remapping.h        | 3 +++
 2 files changed, 7 insertions(+)

diff --git a/arch/x86/include/asm/irq_remapping.h b/arch/x86/include/asm/irq_remapping.h
index 78974fb..0953723 100644
--- a/arch/x86/include/asm/irq_remapping.h
+++ b/arch/x86/include/asm/irq_remapping.h
@@ -31,6 +31,10 @@ struct irq_alloc_info;
 
 #ifdef CONFIG_IRQ_REMAP
 
+enum irq_remap_cap {
+	IRQ_POSTING_CAP = 0,
+};
+
 extern void set_irq_remapping_broken(void);
 extern int irq_remapping_prepare(void);
 extern int irq_remapping_enable(void);
diff --git a/drivers/iommu/irq_remapping.h b/drivers/iommu/irq_remapping.h
index 91d5a11..b6ca30d 100644
--- a/drivers/iommu/irq_remapping.h
+++ b/drivers/iommu/irq_remapping.h
@@ -35,6 +35,9 @@ extern int no_x2apic_optout;
 extern int irq_remapping_enabled;
 
 struct irq_remap_ops {
+	/* The supported capabilities */
+	int capability;
+
 	/* Initializes hardware and makes it ready for remapping interrupts */
 	int  (*prepare)(void);
 
-- 
2.1.0



* [v7 2/8] iommu: dmar: Extend struct irte for VT-d Posted-Interrupts
  2015-05-25  5:28 [v7 0/8] Add VT-d Posted-Interrupts support - IOMMU part Feng Wu
  2015-05-25  5:28 ` [v7 1/8] iommu: Add new member capability to struct irq_remap_ops Feng Wu
@ 2015-05-25  5:28 ` Feng Wu
  2015-05-25  5:28 ` [v7 3/8] iommu, x86: Implement irq_set_vcpu_affinity for intel_ir_chip Feng Wu
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 13+ messages in thread
From: Feng Wu @ 2015-05-25  5:28 UTC (permalink / raw)
  To: joro, dwmw2; +Cc: tglx, jiang.liu, iommu, linux-kernel, Feng Wu

From: Thomas Gleixner <tglx@linutronix.de>

The IRTE (Interrupt Remapping Table Entry) is either an entry for
remapped or for posted interrupts. The hardware distinguishes between
remapped and posted entries by bit 15 in the low 64 bits of the
IRTE. If cleared the entry is remapped, if set it's posted.

The entries have common fields and, depending on the posted bit, fields
with different meanings.

Extend struct irte to handle the differences between remap and posted
mode by having three structs in the unions:

	- Shared
	- Remapped
	- Posted

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Feng Wu <feng.wu@intel.com>
---
 include/linux/dmar.h | 70 +++++++++++++++++++++++++++++++++++++++++-----------
 1 file changed, 55 insertions(+), 15 deletions(-)

diff --git a/include/linux/dmar.h b/include/linux/dmar.h
index 8473756..0dbcabc 100644
--- a/include/linux/dmar.h
+++ b/include/linux/dmar.h
@@ -185,33 +185,73 @@ static inline int dmar_device_remove(void *handle)
 
 struct irte {
 	union {
+		/* Shared between remapped and posted mode*/
 		struct {
-			__u64	present 	: 1,
-				fpd		: 1,
-				dst_mode	: 1,
-				redir_hint	: 1,
-				trigger_mode	: 1,
-				dlvry_mode	: 3,
-				avail		: 4,
-				__reserved_1	: 4,
-				vector		: 8,
-				__reserved_2	: 8,
-				dest_id		: 32;
+			__u64	present		: 1,  /*  0      */
+				fpd		: 1,  /*  1      */
+				__res0		: 6,  /*  2 -  7 */
+				avail		: 4,  /*  8 - 11 */
+				__res1		: 3,  /* 12 - 14 */
+				pst		: 1,  /* 15      */
+				vector		: 8,  /* 16 - 23 */
+				__res2		: 40; /* 24 - 63 */
+		};
+
+		/* Remapped mode */
+		struct {
+			__u64	r_present	: 1,  /*  0      */
+				r_fpd		: 1,  /*  1      */
+				dst_mode	: 1,  /*  2      */
+				redir_hint	: 1,  /*  3      */
+				trigger_mode	: 1,  /*  4      */
+				dlvry_mode	: 3,  /*  5 -  7 */
+				r_avail		: 4,  /*  8 - 11 */
+				r_res0		: 4,  /* 12 - 15 */
+				r_vector	: 8,  /* 16 - 23 */
+				r_res1		: 8,  /* 24 - 31 */
+				dest_id		: 32; /* 32 - 63 */
+		};
+
+		/* Posted mode */
+		struct {
+			__u64	p_present	: 1,  /*  0      */
+				p_fpd		: 1,  /*  1      */
+				p_res0		: 6,  /*  2 -  7 */
+				p_avail		: 4,  /*  8 - 11 */
+				p_res1		: 2,  /* 12 - 13 */
+				p_urgent	: 1,  /* 14      */
+				p_pst		: 1,  /* 15      */
+				p_vector	: 8,  /* 16 - 23 */
+				p_res2		: 14, /* 24 - 37 */
+				pda_l		: 26; /* 38 - 63 */
 		};
 		__u64 low;
 	};
 
 	union {
+		/* Shared between remapped and posted mode*/
 		struct {
-			__u64	sid		: 16,
-				sq		: 2,
-				svt		: 2,
-				__reserved_3	: 44;
+			__u64	sid		: 16,  /* 64 - 79  */
+				sq		: 2,   /* 80 - 81  */
+				svt		: 2,   /* 82 - 83  */
+				__res3		: 44;  /* 84 - 127 */
+		};
+
+		/* Posted mode*/
+		struct {
+			__u64	p_sid		: 16,  /* 64 - 79  */
+				p_sq		: 2,   /* 80 - 81  */
+				p_svt		: 2,   /* 82 - 83  */
+				p_res3		: 12,  /* 84 - 95  */
+				pda_h		: 32;  /* 96 - 127 */
 		};
 		__u64 high;
 	};
 };
 
+#define PDA_LOW_BIT    26
+#define PDA_HIGH_BIT   32
+
 enum {
 	IRQ_REMAP_XAPIC_MODE,
 	IRQ_REMAP_X2APIC_MODE,
-- 
2.1.0
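
A minimal userspace sketch of the rule described in this patch: bit 15
of the low 64 bits selects the posted or remapped format, while the
present bit and the vector sit at the same position in both formats.
It uses explicit shifts and masks rather than the kernel's bitfield
union, because bitfield layout is compiler-dependent outside the
kernel. The names struct irte_raw, IRTE_PST_BIT and the helpers are
hypothetical and are not part of the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical userspace mirror of the 128-bit IRTE: two raw words. */
struct irte_raw {
	uint64_t low;
	uint64_t high;
};

#define IRTE_PRESENT_BIT	0	/* shared: entry is valid */
#define IRTE_PST_BIT		15	/* shared: 0 = remapped, 1 = posted */
#define IRTE_VECTOR_SHIFT	16	/* shared: vector in bits 16 - 23 */

/* True if the entry is in posted format (bit 15 of the low word set). */
static bool irte_is_posted(const struct irte_raw *e)
{
	return (e->low >> IRTE_PST_BIT) & 1;
}

/* The vector occupies bits 16 - 23 in both formats. */
static uint8_t irte_vector(const struct irte_raw *e)
{
	return (uint8_t)(e->low >> IRTE_VECTOR_SHIFT);
}

/* Set or clear only the posted bit, leaving all other fields alone. */
static void irte_set_posted(struct irte_raw *e, bool posted)
{
	if (posted)
		e->low |= UINT64_C(1) << IRTE_PST_BIT;
	else
		e->low &= ~(UINT64_C(1) << IRTE_PST_BIT);
}
```

Note that flipping the bit alone does not produce a valid posted entry;
the format-specific fields (dest_id versus pda_l/pda_h, for instance)
have to be filled in for the selected mode as well.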



* [v7 3/8] iommu, x86: Implement irq_set_vcpu_affinity for intel_ir_chip
  2015-05-25  5:28 [v7 0/8] Add VT-d Posted-Interrupts support - IOMMU part Feng Wu
  2015-05-25  5:28 ` [v7 1/8] iommu: Add new member capability to struct irq_remap_ops Feng Wu
  2015-05-25  5:28 ` [v7 2/8] iommu: dmar: Extend struct irte for VT-d Posted-Interrupts Feng Wu
@ 2015-05-25  5:28 ` Feng Wu
  2015-05-25  5:28 ` [v7 4/8] iommu, x86: No need to migrating irq for VT-d Posted-Interrupts Feng Wu
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 13+ messages in thread
From: Feng Wu @ 2015-05-25  5:28 UTC (permalink / raw)
  To: joro, dwmw2; +Cc: tglx, jiang.liu, iommu, linux-kernel, Feng Wu

Implement irq_set_vcpu_affinity for intel_ir_chip.

Signed-off-by: Feng Wu <feng.wu@intel.com>
Reviewed-by: Jiang Liu <jiang.liu@linux.intel.com>
Acked-by: David Woodhouse <David.Woodhouse@intel.com>
---
 arch/x86/include/asm/irq_remapping.h |  5 ++++
 drivers/iommu/intel_irq_remapping.c  | 46 ++++++++++++++++++++++++++++++++++++
 2 files changed, 51 insertions(+)

diff --git a/arch/x86/include/asm/irq_remapping.h b/arch/x86/include/asm/irq_remapping.h
index 0953723..202e040 100644
--- a/arch/x86/include/asm/irq_remapping.h
+++ b/arch/x86/include/asm/irq_remapping.h
@@ -57,6 +57,11 @@ static inline struct irq_domain *arch_get_ir_parent_domain(void)
 	return x86_vector_domain;
 }
 
+struct vcpu_data {
+	u64 pi_desc_addr;	/* Physical address of PI Descriptor */
+	u32 vector;		/* Guest vector of the interrupt */
+};
+
 #else  /* CONFIG_IRQ_REMAP */
 
 static inline void set_irq_remapping_broken(void) { }
diff --git a/drivers/iommu/intel_irq_remapping.c b/drivers/iommu/intel_irq_remapping.c
index 8fad71c..1955b09 100644
--- a/drivers/iommu/intel_irq_remapping.c
+++ b/drivers/iommu/intel_irq_remapping.c
@@ -42,6 +42,7 @@ struct irq_2_iommu {
 struct intel_ir_data {
 	struct irq_2_iommu			irq_2_iommu;
 	struct irte				irte_entry;
+	struct irte				irte_pi_entry;
 	union {
 		struct msi_msg			msi_entry;
 	};
@@ -1013,10 +1014,55 @@ static void intel_ir_compose_msi_msg(struct irq_data *irq_data,
 	*msg = ir_data->msi_entry;
 }
 
+static int intel_ir_set_vcpu_affinity(struct irq_data *data, void *vcpu_info)
+{
+	struct intel_ir_data *ir_data = data->chip_data;
+	struct irte *irte_pi = &ir_data->irte_pi_entry;
+	struct vcpu_data *vcpu_pi_info;
+
+	/* stop posting interrupts, back to remapping mode */
+	if (!vcpu_info) {
+		modify_irte(&ir_data->irq_2_iommu, &ir_data->irte_entry);
+	} else {
+		vcpu_pi_info = (struct vcpu_data *)vcpu_info;
+
+		/*
+		 * "ir_data->irte_entry" holds the remapped format of the IRTE.
+		 * As a cached entry, it is still updated when the affinity is
+		 * set, even while we are in posted mode. This makes it possible
+		 * to switch back from posted mode to remapped mode by simply
+		 * writing "ir_data->irte_entry" to hardware. The posted format
+		 * of the IRTE is stored in a separate new member,
+		 * "ir_data->irte_pi_entry", so that "ir_data->irte_entry" is
+		 * not corrupted.
+		 */
+		memcpy(irte_pi, &ir_data->irte_entry, sizeof(struct irte));
+
+		irte_pi->p_urgent = 0;
+		irte_pi->p_vector = vcpu_pi_info->vector;
+		irte_pi->pda_l = (vcpu_pi_info->pi_desc_addr >>
+				 (32 - PDA_LOW_BIT)) & ~(-1UL << PDA_LOW_BIT);
+		irte_pi->pda_h = (vcpu_pi_info->pi_desc_addr >> 32) &
+				 ~(-1UL << PDA_HIGH_BIT);
+
+		irte_pi->p_res0 = 0;
+		irte_pi->p_res1 = 0;
+		irte_pi->p_res2 = 0;
+		irte_pi->p_res3 = 0;
+
+		irte_pi->p_pst = 1;
+
+		modify_irte(&ir_data->irq_2_iommu, irte_pi);
+	}
+
+	return 0;
+}
+
 static struct irq_chip intel_ir_chip = {
 	.irq_ack = ir_ack_apic_edge,
 	.irq_set_affinity = intel_ir_set_affinity,
 	.irq_compose_msi_msg = intel_ir_compose_msi_msg,
+	.irq_set_vcpu_affinity = intel_ir_set_vcpu_affinity,
 };
 
 static void intel_irq_remapping_prepare_irte(struct intel_ir_data *data,
-- 
2.1.0
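
A short userspace sketch of how the patch splits the PI descriptor
address across pda_l and pda_h. The shift by (32 - PDA_LOW_BIT) drops
the low 6 address bits, which presumably relies on the descriptor being
64-byte aligned; that alignment is an assumption here, not something
stated in the patch. pda_low(), pda_high() and pda_join() are
hypothetical helper names used only for illustration.

```c
#include <stdint.h>

#define PDA_LOW_BIT	26	/* width of pda_l (IRTE bits 38 - 63) */
#define PDA_HIGH_BIT	32	/* width of pda_h (IRTE bits 96 - 127) */

/* Address bits 6 - 31: shift out the low 6 bits, keep 26 bits. */
static uint32_t pda_low(uint64_t pi_desc_addr)
{
	return (uint32_t)((pi_desc_addr >> (32 - PDA_LOW_BIT)) &
			  ((UINT64_C(1) << PDA_LOW_BIT) - 1));
}

/* Address bits 32 - 63. */
static uint32_t pda_high(uint64_t pi_desc_addr)
{
	return (uint32_t)(pi_desc_addr >> 32);
}

/* Rebuild the address from both fields; the low 6 bits come back as 0. */
static uint64_t pda_join(uint32_t lo, uint32_t hi)
{
	return ((uint64_t)hi << 32) | ((uint64_t)lo << (32 - PDA_LOW_BIT));
}
```

For an aligned address, pda_join(pda_low(a), pda_high(a)) returns a
unchanged; for an unaligned one the low 6 bits are lost, so a caller
would have to guarantee the alignment.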



* [v7 4/8] iommu, x86: No need to migrating irq for VT-d Posted-Interrupts
  2015-05-25  5:28 [v7 0/8] Add VT-d Posted-Interrupts support - IOMMU part Feng Wu
                   ` (2 preceding siblings ...)
  2015-05-25  5:28 ` [v7 3/8] iommu, x86: Implement irq_set_vcpu_affinity for intel_ir_chip Feng Wu
@ 2015-05-25  5:28 ` Feng Wu
  2015-05-25  8:38   ` Thomas Gleixner
  2015-05-25  5:28 ` [v7 5/8] iommu, x86: Add cap_pi_support() to detect VT-d PI capability Feng Wu
                   ` (3 subsequent siblings)
  7 siblings, 1 reply; 13+ messages in thread
From: Feng Wu @ 2015-05-25  5:28 UTC (permalink / raw)
  To: joro, dwmw2; +Cc: tglx, jiang.liu, iommu, linux-kernel, Feng Wu

We don't need to migrate the irqs for VT-d Posted-Interrupts here.
When 'pst' is set in IRTE, the associated irq will be posted to
guests instead of interrupt remapping. The destination of the
interrupt is set in Posted-Interrupts Descriptor, and the migration
happens during vCPU scheduling.

However, we still update the cached irte here, which can be used
when changing back to remapping mode.

Signed-off-by: Feng Wu <feng.wu@intel.com>
Reviewed-by: Jiang Liu <jiang.liu@linux.intel.com>
Acked-by: David Woodhouse <David.Woodhouse@intel.com>
---
 drivers/iommu/intel_irq_remapping.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/intel_irq_remapping.c b/drivers/iommu/intel_irq_remapping.c
index 1955b09..646f4cf 100644
--- a/drivers/iommu/intel_irq_remapping.c
+++ b/drivers/iommu/intel_irq_remapping.c
@@ -994,7 +994,10 @@ intel_ir_set_affinity(struct irq_data *data, const struct cpumask *mask,
 	 */
 	irte->vector = cfg->vector;
 	irte->dest_id = IRTE_DEST(cfg->dest_apicid);
-	modify_irte(&ir_data->irq_2_iommu, irte);
+
+	/* We don't need to modify irte if the interrupt is for posting. */
+	if (irte->pst != 1)
+		modify_irte(&ir_data->irq_2_iommu, irte);
 
 	/*
 	 * After this point, all the interrupts will start arriving
-- 
2.1.0



* [v7 5/8] iommu, x86: Add cap_pi_support() to detect VT-d PI capability
  2015-05-25  5:28 [v7 0/8] Add VT-d Posted-Interrupts support - IOMMU part Feng Wu
                   ` (3 preceding siblings ...)
  2015-05-25  5:28 ` [v7 4/8] iommu, x86: No need to migrating irq for VT-d Posted-Interrupts Feng Wu
@ 2015-05-25  5:28 ` Feng Wu
  2015-05-25  5:28 ` [v7 6/8] iommu, x86: Setup Posted-Interrupts capability for Intel iommu Feng Wu
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 13+ messages in thread
From: Feng Wu @ 2015-05-25  5:28 UTC (permalink / raw)
  To: joro, dwmw2; +Cc: tglx, jiang.liu, iommu, linux-kernel, Feng Wu

Add helper function to detect VT-d Posted-Interrupts capability.

Signed-off-by: Feng Wu <feng.wu@intel.com>
Reviewed-by: Jiang Liu <jiang.liu@linux.intel.com>
Acked-by: David Woodhouse <David.Woodhouse@intel.com>
---
 include/linux/intel-iommu.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/include/linux/intel-iommu.h b/include/linux/intel-iommu.h
index 0af9b03..0c251be 100644
--- a/include/linux/intel-iommu.h
+++ b/include/linux/intel-iommu.h
@@ -87,6 +87,7 @@ static inline void dmar_writeq(void __iomem *addr, u64 val)
 /*
  * Decoding Capability Register
  */
+#define cap_pi_support(c)	(((c) >> 59) & 1)
 #define cap_read_drain(c)	(((c) >> 55) & 1)
 #define cap_write_drain(c)	(((c) >> 54) & 1)
 #define cap_max_amask_val(c)	(((c) >> 48) & 0x3f)
-- 
2.1.0



* [v7 6/8] iommu, x86: Setup Posted-Interrupts capability for Intel iommu
  2015-05-25  5:28 [v7 0/8] Add VT-d Posted-Interrupts support - IOMMU part Feng Wu
                   ` (4 preceding siblings ...)
  2015-05-25  5:28 ` [v7 5/8] iommu, x86: Add cap_pi_support() to detect VT-d PI capability Feng Wu
@ 2015-05-25  5:28 ` Feng Wu
  2015-05-25  5:28 ` [v7 7/8] iommu, x86: define irq_remapping_cap() Feng Wu
  2015-05-25  5:28 ` [v7 8/8] iommu, x86: Properly handler PI for IOMMU hotplug Feng Wu
  7 siblings, 0 replies; 13+ messages in thread
From: Feng Wu @ 2015-05-25  5:28 UTC (permalink / raw)
  To: joro, dwmw2; +Cc: tglx, jiang.liu, iommu, linux-kernel, Feng Wu

Set Posted-Interrupts capability for Intel iommu when IR is enabled,
clear it when IR is disabled.

Signed-off-by: Feng Wu <feng.wu@intel.com>
---
 drivers/iommu/intel_irq_remapping.c | 30 ++++++++++++++++++++++++++++++
 drivers/iommu/irq_remapping.c       |  2 ++
 drivers/iommu/irq_remapping.h       |  3 +++
 3 files changed, 35 insertions(+)

diff --git a/drivers/iommu/intel_irq_remapping.c b/drivers/iommu/intel_irq_remapping.c
index 646f4cf..9f7f378 100644
--- a/drivers/iommu/intel_irq_remapping.c
+++ b/drivers/iommu/intel_irq_remapping.c
@@ -572,6 +572,26 @@ error:
 	return -ENODEV;
 }
 
+/*
+ * Set Posted-Interrupts capability.
+ */
+static inline void set_irq_posting_cap(void)
+{
+	struct dmar_drhd_unit *drhd;
+	struct intel_iommu *iommu;
+
+	if (!disable_irq_post) {
+		intel_irq_remap_ops.capability |= 1 << IRQ_POSTING_CAP;
+
+		for_each_iommu(iommu, drhd)
+			if (!cap_pi_support(iommu->cap)) {
+				intel_irq_remap_ops.capability &=
+						~(1 << IRQ_POSTING_CAP);
+				break;
+			}
+	}
+}
+
 static int __init intel_enable_irq_remapping(void)
 {
 	struct dmar_drhd_unit *drhd;
@@ -647,6 +667,8 @@ static int __init intel_enable_irq_remapping(void)
 
 	irq_remapping_enabled = 1;
 
+	set_irq_posting_cap();
+
 	pr_info("Enabled IRQ remapping in %s mode\n", eim ? "x2apic" : "xapic");
 
 	return eim ? IRQ_REMAP_X2APIC_MODE : IRQ_REMAP_XAPIC_MODE;
@@ -847,6 +869,12 @@ static void disable_irq_remapping(void)
 
 		iommu_disable_irq_remapping(iommu);
 	}
+
+	/*
+	 * Clear Posted-Interrupts capability.
+	 */
+	if (!disable_irq_post)
+		intel_irq_remap_ops.capability &= ~(1 << IRQ_POSTING_CAP);
 }
 
 static int reenable_irq_remapping(int eim)
@@ -874,6 +902,8 @@ static int reenable_irq_remapping(int eim)
 	if (!setup)
 		goto error;
 
+	set_irq_posting_cap();
+
 	return 0;
 
 error:
diff --git a/drivers/iommu/irq_remapping.c b/drivers/iommu/irq_remapping.c
index fc78b0d..ed605a9 100644
--- a/drivers/iommu/irq_remapping.c
+++ b/drivers/iommu/irq_remapping.c
@@ -22,6 +22,8 @@ int irq_remap_broken;
 int disable_sourceid_checking;
 int no_x2apic_optout;
 
+int disable_irq_post = 1;
+
 static int disable_irq_remap;
 static struct irq_remap_ops *remap_ops;
 
diff --git a/drivers/iommu/irq_remapping.h b/drivers/iommu/irq_remapping.h
index b6ca30d..039c7af 100644
--- a/drivers/iommu/irq_remapping.h
+++ b/drivers/iommu/irq_remapping.h
@@ -34,6 +34,8 @@ extern int disable_sourceid_checking;
 extern int no_x2apic_optout;
 extern int irq_remapping_enabled;
 
+extern int disable_irq_post;
+
 struct irq_remap_ops {
 	/* The supported capabilities */
 	int capability;
@@ -69,6 +71,7 @@ extern void ir_ack_apic_edge(struct irq_data *data);
 
 #define irq_remapping_enabled 0
 #define irq_remap_broken      0
+#define disable_irq_post      1
 
 #endif /* CONFIG_IRQ_REMAP */
 
-- 
2.1.0
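
The all-or-nothing rule in set_irq_posting_cap() can be sketched in
userspace as follows: posting is advertised only if every IOMMU reports
PI support in bit 59 of its capability register. The array of raw
capability values stands in for for_each_iommu(), and
compute_capability() is a hypothetical name. Unlike the patch, which
leaves the capability word untouched when disable_irq_post is set, this
sketch simply returns 0 in that case.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Same decoding as the macro added in patch 5/8. */
#define cap_pi_support(c)	(((c) >> 59) & 1)

enum irq_remap_cap {
	IRQ_POSTING_CAP = 0,
};

/*
 * Hypothetical stand-in for set_irq_posting_cap(): returns the
 * capability word instead of writing intel_irq_remap_ops.capability.
 */
static int compute_capability(const uint64_t *caps, size_t n,
			      bool disable_irq_post)
{
	int capability = 0;
	size_t i;

	if (disable_irq_post)
		return 0;

	/* Assume support, then revoke it on the first IOMMU without PI. */
	capability |= 1 << IRQ_POSTING_CAP;
	for (i = 0; i < n; i++) {
		if (!cap_pi_support(caps[i])) {
			capability &= ~(1 << IRQ_POSTING_CAP);
			break;
		}
	}

	return capability;
}
```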



* [v7 7/8] iommu, x86: define irq_remapping_cap()
  2015-05-25  5:28 [v7 0/8] Add VT-d Posted-Interrupts support - IOMMU part Feng Wu
                   ` (5 preceding siblings ...)
  2015-05-25  5:28 ` [v7 6/8] iommu, x86: Setup Posted-Interrupts capability for Intel iommu Feng Wu
@ 2015-05-25  5:28 ` Feng Wu
  2015-05-25  5:28 ` [v7 8/8] iommu, x86: Properly handler PI for IOMMU hotplug Feng Wu
  7 siblings, 0 replies; 13+ messages in thread
From: Feng Wu @ 2015-05-25  5:28 UTC (permalink / raw)
  To: joro, dwmw2; +Cc: tglx, jiang.liu, iommu, linux-kernel, Feng Wu

This patch adds a new interface, irq_remapping_cap(), to detect
whether irq remapping supports new features, such as VT-d
Posted-Interrupts. The function is exported so that KVM
code can check for this and use the mechanism properly.

Signed-off-by: Feng Wu <feng.wu@intel.com>
Reviewed-by: Jiang Liu <jiang.liu@linux.intel.com>
---
 arch/x86/include/asm/irq_remapping.h | 2 ++
 drivers/iommu/irq_remapping.c        | 9 +++++++++
 2 files changed, 11 insertions(+)

diff --git a/arch/x86/include/asm/irq_remapping.h b/arch/x86/include/asm/irq_remapping.h
index 202e040..61aa8ad 100644
--- a/arch/x86/include/asm/irq_remapping.h
+++ b/arch/x86/include/asm/irq_remapping.h
@@ -35,6 +35,7 @@ enum irq_remap_cap {
 	IRQ_POSTING_CAP = 0,
 };
 
+extern bool irq_remapping_cap(enum irq_remap_cap cap);
 extern void set_irq_remapping_broken(void);
 extern int irq_remapping_prepare(void);
 extern int irq_remapping_enable(void);
@@ -64,6 +65,7 @@ struct vcpu_data {
 
 #else  /* CONFIG_IRQ_REMAP */
 
+static inline bool irq_remapping_cap(enum irq_remap_cap cap) { return 0; }
 static inline void set_irq_remapping_broken(void) { }
 static inline int irq_remapping_prepare(void) { return -ENODEV; }
 static inline int irq_remapping_enable(void) { return -ENODEV; }
diff --git a/drivers/iommu/irq_remapping.c b/drivers/iommu/irq_remapping.c
index ed605a9..2d99930 100644
--- a/drivers/iommu/irq_remapping.c
+++ b/drivers/iommu/irq_remapping.c
@@ -81,6 +81,15 @@ void set_irq_remapping_broken(void)
 	irq_remap_broken = 1;
 }
 
+bool irq_remapping_cap(enum irq_remap_cap cap)
+{
+	if (!remap_ops || disable_irq_post)
+		return 0;
+
+	return (remap_ops->capability & (1 << cap));
+}
+EXPORT_SYMBOL_GPL(irq_remapping_cap);
+
 int __init irq_remapping_prepare(void)
 {
 	if (disable_irq_remap)
-- 
2.1.0
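
A minimal sketch of the query logic, assuming the capability word is a
bitmask indexed by enum irq_remap_cap as introduced in patch 1/8.
struct remap_ops_sketch and remapping_cap() are hypothetical userspace
stand-ins; the real function reads the file-local remap_ops pointer and
the global disable_irq_post rather than taking them as parameters.

```c
#include <stdbool.h>
#include <stddef.h>

enum irq_remap_cap {
	IRQ_POSTING_CAP = 0,
};

/* Hypothetical stand-in for struct irq_remap_ops. */
struct remap_ops_sketch {
	int capability;		/* one bit per enum irq_remap_cap value */
};

/*
 * Report a capability only if remapping ops are registered, posting
 * is not disabled, and the matching bit is set.
 */
static bool remapping_cap(const struct remap_ops_sketch *ops,
			  bool disable_irq_post, enum irq_remap_cap cap)
{
	if (!ops || disable_irq_post)
		return false;

	return (ops->capability & (1 << cap)) != 0;
}
```

A caller such as KVM would presumably test IRQ_POSTING_CAP once before
trying to set up posted interrupts and fall back to remapping when the
query returns false.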



* [v7 8/8] iommu, x86: Properly handler PI for IOMMU hotplug
  2015-05-25  5:28 [v7 0/8] Add VT-d Posted-Interrupts support - IOMMU part Feng Wu
                   ` (6 preceding siblings ...)
  2015-05-25  5:28 ` [v7 7/8] iommu, x86: define irq_remapping_cap() Feng Wu
@ 2015-05-25  5:28 ` Feng Wu
  7 siblings, 0 replies; 13+ messages in thread
From: Feng Wu @ 2015-05-25  5:28 UTC (permalink / raw)
  To: joro, dwmw2; +Cc: tglx, jiang.liu, iommu, linux-kernel, Feng Wu

Return error when inserting a new IOMMU which doesn't support PI
if PI is currently in use.

Signed-off-by: Feng Wu <feng.wu@intel.com>
---
 drivers/iommu/intel_irq_remapping.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/iommu/intel_irq_remapping.c b/drivers/iommu/intel_irq_remapping.c
index 9f7f378..79ca56e 100644
--- a/drivers/iommu/intel_irq_remapping.c
+++ b/drivers/iommu/intel_irq_remapping.c
@@ -1354,6 +1354,9 @@ int dmar_ir_hotplug(struct dmar_drhd_unit *dmaru, bool insert)
 		return -EINVAL;
 	if (!ecap_ir_support(iommu->ecap))
 		return 0;
+	if (irq_remapping_cap(IRQ_POSTING_CAP) &&
+	    !cap_pi_support(iommu->cap))
+		return -EBUSY;
 
 	if (insert) {
 		if (!iommu->ir_table)
-- 
2.1.0
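
The hotplug rule reduces to a two-input check, sketched here in
userspace. hotplug_pi_check() is a hypothetical name; posting_in_use
stands in for irq_remapping_cap(IRQ_POSTING_CAP) and new_iommu_cap for
iommu->cap of the unit being hotplugged.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Same decoding as the macro added in patch 5/8. */
#define cap_pi_support(c)	(((c) >> 59) & 1)

/*
 * Refuse an IOMMU without PI support while posting is advertised;
 * otherwise let the hotplug operation proceed.
 */
static int hotplug_pi_check(bool posting_in_use, uint64_t new_iommu_cap)
{
	if (posting_in_use && !cap_pi_support(new_iommu_cap))
		return -EBUSY;

	return 0;
}
```

As in the patch, the check does not look at the insert flag, so it
applies to removal of such a unit as well as to insertion.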



* Re: [v7 4/8] iommu, x86: No need to migrating irq for VT-d Posted-Interrupts
  2015-05-25  5:28 ` [v7 4/8] iommu, x86: No need to migrating irq for VT-d Posted-Interrupts Feng Wu
@ 2015-05-25  8:38   ` Thomas Gleixner
  2015-05-26  2:53     ` Wu, Feng
  0 siblings, 1 reply; 13+ messages in thread
From: Thomas Gleixner @ 2015-05-25  8:38 UTC (permalink / raw)
  To: Feng Wu; +Cc: joro, dwmw2, jiang.liu, iommu, linux-kernel

On Mon, 25 May 2015, Feng Wu wrote:

> We don't need to migrate the irqs for VT-d Posted-Interrupts here.
> When 'pst' is set in IRTE, the associated irq will be posted to
> guests instead of interrupt remapping. The destination of the
> interrupt is set in Posted-Interrupts Descriptor, and the migration
> happens during vCPU scheduling.
> 
> However, we still update the cached irte here, which can be used
> when changing back to remapping mode.
> 
> Signed-off-by: Feng Wu <feng.wu@intel.com>
> Reviewed-by: Jiang Liu <jiang.liu@linux.intel.com>
> Acked-by: David Woodhouse <David.Woodhouse@intel.com>
> ---
>  drivers/iommu/intel_irq_remapping.c | 5 ++++-
>  1 file changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/iommu/intel_irq_remapping.c b/drivers/iommu/intel_irq_remapping.c
> index 1955b09..646f4cf 100644
> --- a/drivers/iommu/intel_irq_remapping.c
> +++ b/drivers/iommu/intel_irq_remapping.c
> @@ -994,7 +994,10 @@ intel_ir_set_affinity(struct irq_data *data, const struct cpumask *mask,
>  	 */
>  	irte->vector = cfg->vector;
>  	irte->dest_id = IRTE_DEST(cfg->dest_apicid);
> -	modify_irte(&ir_data->irq_2_iommu, irte);
> +
> +	/* We don't need to modify irte if the interrupt is for posting. */
> +	if (irte->pst != 1)
> +		modify_irte(&ir_data->irq_2_iommu, irte);

I don't think this is correct. ir_data->irte_entry contains the non
posted version, which has pst == 0.

You need some other way to store whether you are in posted mode or
not.

Thanks,

	tglx


* RE: [v7 4/8] iommu, x86: No need to migrating irq for VT-d Posted-Interrupts
  2015-05-25  8:38   ` Thomas Gleixner
@ 2015-05-26  2:53     ` Wu, Feng
  2015-05-26 10:00       ` Thomas Gleixner
  0 siblings, 1 reply; 13+ messages in thread
From: Wu, Feng @ 2015-05-26  2:53 UTC (permalink / raw)
  To: Thomas Gleixner; +Cc: joro, dwmw2, jiang.liu, iommu, linux-kernel, Wu, Feng



> -----Original Message-----
> From: Thomas Gleixner [mailto:tglx@linutronix.de]
> Sent: Monday, May 25, 2015 4:38 PM
> To: Wu, Feng
> Cc: joro@8bytes.org; dwmw2@infradead.org; jiang.liu@linux.intel.com;
> iommu@lists.linux-foundation.org; linux-kernel@vger.kernel.org
> Subject: Re: [v7 4/8] iommu, x86: No need to migrating irq for VT-d
> Posted-Interrupts
> 
> On Mon, 25 May 2015, Feng Wu wrote:
> 
> > We don't need to migrate the irqs for VT-d Posted-Interrupts here.
> > When 'pst' is set in IRTE, the associated irq will be posted to
> > guests instead of interrupt remapping. The destination of the
> > interrupt is set in Posted-Interrupts Descriptor, and the migration
> > happens during vCPU scheduling.
> >
> > However, we still update the cached irte here, which can be used
> > when changing back to remapping mode.
> >
> > Signed-off-by: Feng Wu <feng.wu@intel.com>
> > Reviewed-by: Jiang Liu <jiang.liu@linux.intel.com>
> > Acked-by: David Woodhouse <David.Woodhouse@intel.com>
> > ---
> >  drivers/iommu/intel_irq_remapping.c | 5 ++++-
> >  1 file changed, 4 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/iommu/intel_irq_remapping.c
> b/drivers/iommu/intel_irq_remapping.c
> > index 1955b09..646f4cf 100644
> > --- a/drivers/iommu/intel_irq_remapping.c
> > +++ b/drivers/iommu/intel_irq_remapping.c
> > @@ -994,7 +994,10 @@ intel_ir_set_affinity(struct irq_data *data, const
> struct cpumask *mask,
> >  	 */
> >  	irte->vector = cfg->vector;
> >  	irte->dest_id = IRTE_DEST(cfg->dest_apicid);
> > -	modify_irte(&ir_data->irq_2_iommu, irte);
> > +
> > +	/* We don't need to modify irte if the interrupt is for posting. */
> > +	if (irte->pst != 1)
> > +		modify_irte(&ir_data->irq_2_iommu, irte);
> 
> I don't think this is correct. ir_data->irte_entry contains the non
> posted version, which has pst == 0.
> 
> You need some other way to store whether you are in posted mode or
> not.

Yes, seems this is incorrect. Thank you for pointing this out. After more
thinking about this, I think I can do it this way:
#1. Check the 'pst' field in hardware
#2. If 'pst' is 1, we don't update the IRTE in hardware.

However, the question is the check and update operations should be protected
by the same spinlock ' irq_2_ir_lock ', otherwise, race condition may happen.

Based on the above idea, I have two solutions. Which one do you think is
better, or do you have a better suggestion? Comments on them would be
highly appreciated!

Solution 1:
Introduce a new function, test_and_modify_irte(), which is called by
intel_ir_set_affinity() in place of the original modify_irte().
Here are the changes:

+static int test_and_modify_irte(struct irq_2_iommu *irq_iommu,
+                               struct irte *irte_modified)
+{
+       struct intel_iommu *iommu;
+       unsigned long flags;
+       struct irte *irte;
+       int rc = 0, index;
+
+       if (!irq_iommu)
+               return -1;
+
+       raw_spin_lock_irqsave(&irq_2_ir_lock, flags);
+
+       iommu = irq_iommu->iommu;
+
+       index = irq_iommu->irte_index + irq_iommu->sub_handle;
+       irte = &iommu->ir_table->base[index];
+
+       if (irte->pst)
+               goto unlock;
+
+       set_64bit(&irte->low, irte_modified->low);
+       set_64bit(&irte->high, irte_modified->high);
+       __iommu_flush_cache(iommu, irte, sizeof(*irte));
+
+       rc = qi_flush_iec(iommu, index, 0);
+unlock:
+       raw_spin_unlock_irqrestore(&irq_2_ir_lock, flags);
+
+       return rc;
+}
+

Solution 2:
Instead of introducing a new function, add a flag to the original modify_irte()
function to indicate whether we need to check 'pst' and return before updating
the real hardware, and pass 1 for return_on_pst in intel_ir_set_affinity().
Here are the changes:
static int modify_irte(struct irq_2_iommu *irq_iommu,
-                      struct irte *irte_modified)
+                      struct irte *irte_modified,
+                      bool return_on_pst)
 {
        struct intel_iommu *iommu;
        unsigned long flags;
@@ -140,11 +173,15 @@ static int modify_irte(struct irq_2_iommu *irq_iommu,
        index = irq_iommu->irte_index + irq_iommu->sub_handle;
        irte = &iommu->ir_table->base[index];

+       if (return_on_pst && irte->pst)
+               goto unlock;
+
        set_64bit(&irte->low, irte_modified->low);
        set_64bit(&irte->high, irte_modified->high);
        __iommu_flush_cache(iommu, irte, sizeof(*irte));

        rc = qi_flush_iec(iommu, index, 0);
+unlock:
        raw_spin_unlock_irqrestore(&irq_2_ir_lock, flags);

        return rc;

Thanks,
Feng

> 
> Thanks,
> 
> 	tglx


* RE: [v7 4/8] iommu, x86: No need to migrating irq for VT-d Posted-Interrupts
  2015-05-26  2:53     ` Wu, Feng
@ 2015-05-26 10:00       ` Thomas Gleixner
  2015-05-26 13:59         ` Wu, Feng
  0 siblings, 1 reply; 13+ messages in thread
From: Thomas Gleixner @ 2015-05-26 10:00 UTC (permalink / raw)
  To: Wu, Feng; +Cc: joro, dwmw2, jiang.liu, iommu, linux-kernel

On Tue, 26 May 2015, Wu, Feng wrote:
> > On Mon, 25 May 2015, Feng Wu wrote:
> > > +
> > > +	/* We don't need to modify irte if the interrupt is for posting. */
> > > +	if (irte->pst != 1)
> > > +		modify_irte(&ir_data->irq_2_iommu, irte);
> > 
> > I don't think this is correct. ir_data->irte_entry contains the non
> > posted version, which has pst == 0.
> > 
> > You need some other way to store whether you are in posted mode or
> > not.
> 
> Yes, seems this is incorrect. Thank you for pointing this out. After more
> thinking about this, I think I can do it this way:
> #1. Check the 'pst' field in hardware
> #2. If 'pst' is 1, we don't update the IRTE in hardware.
> 
> However, the question is the check and update operations should be protected
> by the same spinlock ' irq_2_ir_lock ', otherwise, race condition may happen.

Why? 

set_affinity() and vcpu_set_affinity() are serialized via
irq_desc->lock. And vcpu_set_affinity() is the only way to switch from
and to posted mode.

So all you need is a field in intel_irq_data which captures whether
posted is enabled or not.

Thanks,

	tglx
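
A minimal userspace sketch of the approach suggested above: keep a
flag next to the cached entries that records whether posted mode is
active, and consult it in set_affinity() instead of reading 'pst' from
the cached remapped entry (which always has pst == 0). Everything here
is hypothetical illustration: struct ir_data_sketch stands in for
struct intel_ir_data, the 'hw' member stands in for the hardware table
slot written by modify_irte(), and 'posted' is an assumed field name.
No locking is modelled, on the basis stated above that both paths are
serialized via irq_desc->lock.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct irte_raw {
	uint64_t low;
	uint64_t high;
};

#define IRTE_PST	(UINT64_C(1) << 15)

/* Hypothetical stand-in for struct intel_ir_data. */
struct ir_data_sketch {
	struct irte_raw hw;		/* models the hardware IRTE slot */
	struct irte_raw irte_entry;	/* cached remapped entry, pst == 0 */
	struct irte_raw irte_pi_entry;	/* cached posted entry */
	bool posted;			/* assumed field: posted mode on? */
};

/* Models set_affinity(): always refresh the cache, skip hw if posted. */
static void sketch_set_affinity(struct ir_data_sketch *d,
				uint8_t vector, uint32_t dest_id)
{
	/* Clear vector (bits 16 - 23) and dest_id (bits 32 - 63). */
	d->irte_entry.low &= ~((UINT64_C(0xff) << 16) |
			       (UINT64_C(0xffffffff) << 32));
	d->irte_entry.low |= ((uint64_t)vector << 16) |
			     ((uint64_t)dest_id << 32);

	if (!d->posted)
		d->hw = d->irte_entry;
}

/* Models set_vcpu_affinity(): the only place that flips the mode. */
static void sketch_set_vcpu_affinity(struct ir_data_sketch *d,
				     const struct irte_raw *pi)
{
	if (!pi) {
		/* Back to remapped mode using the up-to-date cache. */
		d->hw = d->irte_entry;
		d->posted = false;
	} else {
		d->irte_pi_entry = *pi;
		d->irte_pi_entry.low |= IRTE_PST;
		d->hw = d->irte_pi_entry;
		d->posted = true;
	}
}
```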


* RE: [v7 4/8] iommu, x86: No need to migrating irq for VT-d Posted-Interrupts
  2015-05-26 10:00       ` Thomas Gleixner
@ 2015-05-26 13:59         ` Wu, Feng
  0 siblings, 0 replies; 13+ messages in thread
From: Wu, Feng @ 2015-05-26 13:59 UTC (permalink / raw)
  To: Thomas Gleixner; +Cc: joro, dwmw2, jiang.liu, iommu, linux-kernel, Wu, Feng



> -----Original Message-----
> From: Thomas Gleixner [mailto:tglx@linutronix.de]
> Sent: Tuesday, May 26, 2015 6:00 PM
> To: Wu, Feng
> Cc: joro@8bytes.org; dwmw2@infradead.org; jiang.liu@linux.intel.com;
> iommu@lists.linux-foundation.org; linux-kernel@vger.kernel.org
> Subject: RE: [v7 4/8] iommu, x86: No need to migrating irq for VT-d
> Posted-Interrupts
> 
> On Tue, 26 May 2015, Wu, Feng wrote:
> > > On Mon, 25 May 2015, Feng Wu wrote:
> > > > +
> > > > +	/* We don't need to modify irte if the interrupt is for posting. */
> > > > +	if (irte->pst != 1)
> > > > +		modify_irte(&ir_data->irq_2_iommu, irte);
> > >
> > > I don't think this is correct. ir_data->irte_entry contains the non
> > > posted version, which has pst == 0.
> > >
> > > You need some other way to store whether you are in posted mode or
> > > not.
> >
> > Yes, seems this is incorrect. Thank you for pointing this out. After more
> > thinking about this, I think I can do it this way:
> > #1. Check the 'pst' field in hardware
> > #2. If 'pst' is 1, we don't update the IRTE in hardware.
> >
> > However, the question is the check and update operations should be
> protected
> > by the same spinlock ' irq_2_ir_lock ', otherwise, race condition may happen.
> 
> Why?
> 
> set_affinity() and vcpu_set_affinity() are serialized via
> irq_desc->lock. And vcpu_set_affinity() is the only way to switch from
> and to posted mode.

Oh, yes, I didn't notice that they are both protected by that lock. In that case,
I can just add a field like you mentioned below. Thanks for the comments!

Thanks,
Feng

> 
> So all you need is a field in intel_irq_data which captures whether
> posted is enabled or not.
> 
> Thanks,
> 
> 	tglx


