* [PART2 RFC v1 0/9] iommu/AMD: Introduce IOMMU AVIC support
@ 2016-04-08 12:49 Suravee Suthikulpanit
  2016-04-08 12:49 ` [PART2 RFC v1 1/9] iommu/amd: Detect and enable guest vAPIC support Suravee Suthikulpanit
                   ` (8 more replies)
  0 siblings, 9 replies; 14+ messages in thread
From: Suravee Suthikulpanit @ 2016-04-08 12:49 UTC
  To: pbonzini, rkrcmar, joro, bp, gleb, alex.williamson
  Cc: kvm, linux-kernel, wei, sherry.hurwitz, Suravee Suthikulpanit

From: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>

OVERVIEW
========
This patch set is the second part of a two-part patch series that introduces
support for the AMD Advanced Virtual Interrupt Controller (AVIC).

In addition to SVM AVIC, the AMD IOMMU extends the AVIC capability to allow
I/O interrupts to be injected directly into the virtualized guest local APIC
without hypervisor intervention.

This patch series introduces a new hardware interrupt remapping (IR) mode
in the AMD IOMMU driver, the Guest Virtual APIC (GA) mode, in contrast to
the existing "legacy" mode. The IR mode can be specified with a new
kernel parameter:

    amd_iommu_guest_ir=[ ga | legacy(default) ]

When GA mode is enabled, the AMD IOMMU driver configures device interrupt
remapping in GA mode when possible (i.e. SVM AVIC must be enabled and the
interrupt type must be supported). Otherwise, the driver falls back to the
legacy IR mode.
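
As a rough illustration of the parameter handling described above, the value
part of the option could be mapped to a mode as sketched below. The helper
name parse_guest_ir() and the enum are hypothetical and are not taken from
the patches; unrecognised strings simply keep the caller's default.

```c
#include <assert.h>
#include <string.h>

/* Modes a user can request (illustrative names, not the driver's). */
enum guest_ir_mode {
	GUEST_IR_LEGACY,	/* legacy interrupt remapping */
	GUEST_IR_GA,		/* guest vAPIC interrupt remapping */
};

/* Zero-initialised state should mean "legacy". */
static_assert(GUEST_IR_LEGACY == 0, "legacy must be the zero value");

/*
 * Hypothetical helper: map the option value ("ga" or "legacy") to a mode.
 * A NULL or unknown string leaves the supplied default untouched.
 */
static enum guest_ir_mode parse_guest_ir(const char *str,
					 enum guest_ir_mode def)
{
	if (str == NULL)
		return def;
	if (strcmp(str, "legacy") == 0)
		return GUEST_IR_LEGACY;
	if (strcmp(str, "ga") == 0)
		return GUEST_IR_GA;
	return def;
}
```

Matching the whole string with strcmp() avoids accepting values that merely
contain "ga" or "legacy" somewhere inside them.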

This patch series also introduces new interfaces between SVM and IOMMU
to allow:
  * The SVM driver to pass updated vcpu scheduling information to the
    IOMMU driver.
  * The IOMMU driver to notify the SVM driver, when handling an IOMMU GALog
    entry, to schedule the vcpu onto a physical core.

DOCUMENTATION
=============
More information about SVM AVIC can be found in the
AMD64 Architecture Programmer’s Manual Volume 2 - System Programming.

    http://support.amd.com/TechDocs/24593.pdf

More information about IOMMU AVIC can be found in the
AMD I/O Virtualization Technology (IOMMU) Specification - Rev 2.62.

    http://support.amd.com/TechDocs/48882_IOMMU.pdf

GITHUB
======
The latest git tree can be found at:
    http://github.com/ssuthiku/linux.git    avic_part2_rfc_v1

Any feedback and comments are very much appreciated.

Thank you,
Suravee

Suravee Suthikulpanit (9):
  iommu/amd: Detect and enable guest vAPIC support
  iommu/amd: Add data structure for guest vAPIC support
  iommu/amd: Detect and initialize guest vAPIC log
  iommu/amd: Adding GALOG interrupt handler
  iommu/amd: Introduce amd_iommu_update_ga()
  iommu/amd: Implements irq_set_vcpu_affinity hook to setup GA mode for
    pass-through devices
  svm: Introduce AMD IOMMU avic_ga_log_notifier
  svm: Implements update_pi_irte hook to setup posted interrupt
  svm: Update AMD IOMMU IRTE with vcpu scheduling information when
    enable AVIC

 arch/x86/include/asm/kvm_host.h |   2 +
 arch/x86/kvm/svm.c              | 212 ++++++++++++++++++-
 drivers/iommu/amd_iommu.c       | 453 +++++++++++++++++++++++++++++++++++-----
 drivers/iommu/amd_iommu_init.c  | 160 +++++++++++++-
 drivers/iommu/amd_iommu_proto.h |   1 +
 drivers/iommu/amd_iommu_types.h | 134 ++++++++++++
 include/linux/amd-iommu.h       |  24 +++
 7 files changed, 923 insertions(+), 63 deletions(-)

-- 
1.9.1

* [PART2 RFC v1 1/9] iommu/amd: Detect and enable guest vAPIC support
  2016-04-08 12:49 [PART2 RFC v1 0/9] iommu/AMD: Introduce IOMMU AVIC support Suravee Suthikulpanit
@ 2016-04-08 12:49 ` Suravee Suthikulpanit
  2016-05-09 11:49   ` Joerg Roedel
  2016-04-08 12:49 ` [PART2 RFC v1 2/9] iommu/amd: Add data structure for " Suravee Suthikulpanit
                   ` (7 subsequent siblings)
  8 siblings, 1 reply; 14+ messages in thread
From: Suravee Suthikulpanit @ 2016-04-08 12:49 UTC
  To: pbonzini, rkrcmar, joro, bp, gleb, alex.williamson
  Cc: kvm, linux-kernel, wei, sherry.hurwitz, Suravee Suthikulpanit

From: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>

This patch introduces a new IOMMU driver parameter, amd_iommu_guest_ir,
which can be used to select the interrupt remapping mode for devices
passed through to a VM guest:
    * legacy: Legacy interrupt remapping mode (w/ 32-bit IRTE)
    * ga    : Guest vAPIC interrupt remapping mode (w/ 128-bit IRTE)

Note that the GA mode also supports legacy interrupt remapping
for non-passthrough devices with the 128-bit IRTE.
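
The detection logic in this patch can be summarised by the sketch below.
It is a simplified model, not the driver code: resolve_guest_ir() and the
two boolean inputs are hypothetical stand-ins for the GASup check against
the IVHD extended features and the GAMSup check via FEATURE_GAM_VAPIC.

```c
#include <assert.h>
#include <stdbool.h>

/* Same ordering as the enum added by this patch, with shorter names. */
enum guest_ir_mode {
	GUEST_IR_LEGACY,	/* 32-bit IRTE */
	GUEST_IR_LEGACY_GA,	/* 128-bit IRTE, remapping only (internal) */
	GUEST_IR_GA,		/* 128-bit IRTE, guest vAPIC */
};

/* The ">=" comparison below relies on this ordering. */
static_assert(GUEST_IR_LEGACY < GUEST_IR_LEGACY_GA &&
	      GUEST_IR_LEGACY_GA < GUEST_IR_GA,
	      "modes must be ordered by capability");

/*
 * Hypothetical helper: pick the effective mode from the requested mode
 * and the two hardware capability bits.
 */
static enum guest_ir_mode resolve_guest_ir(enum guest_ir_mode requested,
					   bool ga_sup, bool gam_sup)
{
	if (!ga_sup)
		return GUEST_IR_LEGACY;		/* no 128-bit IRTE at all */
	if (requested >= GUEST_IR_GA && !gam_sup)
		return GUEST_IR_LEGACY_GA;	/* 128-bit IRTE, no vAPIC */
	return requested;
}
```

The intermediate mode is never requested by the user; it is only reached
when the hardware supports the 128-bit IRTE but not guest vAPIC mode.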

Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
---
 drivers/iommu/amd_iommu_init.c  | 74 +++++++++++++++++++++++++++++++++++++----
 drivers/iommu/amd_iommu_proto.h |  1 +
 drivers/iommu/amd_iommu_types.h | 14 ++++++++
 3 files changed, 83 insertions(+), 6 deletions(-)

diff --git a/drivers/iommu/amd_iommu_init.c b/drivers/iommu/amd_iommu_init.c
index bf4959f..83a5300 100644
--- a/drivers/iommu/amd_iommu_init.c
+++ b/drivers/iommu/amd_iommu_init.c
@@ -131,6 +131,7 @@ struct ivmd_header {
 bool amd_iommu_dump;
 bool amd_iommu_irq_remap __read_mostly;
 
+int amd_iommu_guest_ir = AMD_IOMMU_GUEST_IR_GA;
 static bool amd_iommu_detected;
 static bool __initdata amd_iommu_disabled;
 
@@ -1087,6 +1088,9 @@ static int __init init_iommu_one(struct amd_iommu *iommu, struct ivhd_header *h)
 		iommu->mmio_phys_end = MMIO_CNTR_CONF_OFFSET;
 	}
 
+	if (((h->efr & (0x1 << 6)) == 0))
+		amd_iommu_guest_ir = AMD_IOMMU_GUEST_IR_LEGACY;
+
 	iommu->mmio_base = iommu_map_mmio_space(iommu->mmio_phys,
 						iommu->mmio_phys_end);
 	if (!iommu->mmio_base)
@@ -1283,6 +1287,14 @@ static int iommu_init_pci(struct amd_iommu *iommu)
 	if (iommu_feature(iommu, FEATURE_PPR) && alloc_ppr_log(iommu))
 		return -ENOMEM;
 
+	/* Note: We have already checked GASup from IVRS table.
+	 *       Now, we need to make sure that GAMSup is set.
+	 */
+	if (amd_iommu_guest_ir >= AMD_IOMMU_GUEST_IR_GA &&
+	    !iommu_feature(iommu, FEATURE_GAM_VAPIC))
+		amd_iommu_guest_ir = AMD_IOMMU_GUEST_IR_LEGACY_GA;
+
+
 	if (iommu->cap & (1UL << IOMMU_CAP_NPCACHE))
 		amd_iommu_np_cache = true;
 
@@ -1340,16 +1352,23 @@ static void print_iommu_info(void)
 			dev_name(&iommu->dev->dev), iommu->cap_ptr);
 
 		if (iommu->cap & (1 << IOMMU_CAP_EFR)) {
-			pr_info("AMD-Vi:  Extended features: ");
+			pr_info("AMD-Vi: Extended features (%#llx):\n",
+				iommu->features);
 			for (i = 0; i < ARRAY_SIZE(feat_str); ++i) {
 				if (iommu_feature(iommu, (1ULL << i)))
 					pr_cont(" %s", feat_str[i]);
 			}
+
+			if (iommu->features & FEATURE_GAM_VAPIC)
+				pr_cont(" GA_vAPIC");
+
 			pr_cont("\n");
 		}
 	}
 	if (irq_remapping_enabled)
 		pr_info("AMD-Vi: Interrupt remapping enabled\n");
+	if (amd_iommu_guest_ir)
+		pr_info("AMD-Vi: AVIC enabled (%#x)\n", amd_iommu_guest_ir);
 }
 
 static int __init amd_iommu_init_pci(void)
@@ -1647,6 +1666,20 @@ static void iommu_apply_resume_quirks(struct amd_iommu *iommu)
 			       iommu->stored_addr_lo | 1);
 }
 
+static void iommu_enable_ga(struct amd_iommu *iommu)
+{
+	switch (amd_iommu_guest_ir) {
+	case AMD_IOMMU_GUEST_IR_GA:
+		iommu_feature_enable(iommu, CONTROL_GAM_EN);
+		/* Fall through */
+	case AMD_IOMMU_GUEST_IR_LEGACY_GA:
+		iommu_feature_enable(iommu, CONTROL_GA_EN);
+		break;
+	default:
+		break;
+	}
+}
+
 /*
  * This function finally enables all IOMMUs found in the system after
  * they have been initialized
@@ -1662,9 +1695,13 @@ static void early_enable_iommus(void)
 		iommu_enable_command_buffer(iommu);
 		iommu_enable_event_buffer(iommu);
 		iommu_set_exclusion_range(iommu);
+		iommu_enable_ga(iommu);
 		iommu_enable(iommu);
 		iommu_flush_all_caches(iommu);
 	}
+
+	if (amd_iommu_guest_ir >= AMD_IOMMU_GUEST_IR_GA)
+		amd_iommu_irq_ops.capability |= (1 << IRQ_POSTING_CAP);
 }
 
 static void enable_iommus_v2(void)
@@ -1690,6 +1727,9 @@ static void disable_iommus(void)
 
 	for_each_iommu(iommu)
 		iommu_disable(iommu);
+
+	if (amd_iommu_guest_ir >= AMD_IOMMU_GUEST_IR_GA)
+		amd_iommu_irq_ops.capability &= ~(1 << IRQ_POSTING_CAP);
 }
 
 /*
@@ -1929,10 +1969,16 @@ static int __init early_amd_iommu_init(void)
 		 * remapping tables.
 		 */
 		ret = -ENOMEM;
-		amd_iommu_irq_cache = kmem_cache_create("irq_remap_cache",
-				MAX_IRQS_PER_TABLE * sizeof(u32),
-				IRQ_TABLE_ALIGNMENT,
-				0, NULL);
+		if (amd_iommu_guest_ir == AMD_IOMMU_GUEST_IR_LEGACY)
+			amd_iommu_irq_cache = kmem_cache_create("irq_remap_cache",
+					MAX_IRQS_PER_TABLE * sizeof(u32),
+					IRQ_TABLE_ALIGNMENT,
+					0, NULL);
+		else
+			amd_iommu_irq_cache = kmem_cache_create("irq_remap_cache",
+					MAX_IRQS_PER_TABLE * (sizeof(u64) * 2),
+					IRQ_TABLE_ALIGNMENT,
+					0, NULL);
 		if (!amd_iommu_irq_cache)
 			goto out;
 
@@ -2128,7 +2174,7 @@ static int __init amd_iommu_init(void)
 	ret = iommu_go_to_state(IOMMU_INITIALIZED);
 	if (ret) {
 		free_dma_resources();
-		if (!irq_remapping_enabled) {
+		if (!irq_remapping_enabled && !amd_iommu_guest_ir) {
 			disable_iommus();
 			free_on_init_error();
 		} else {
@@ -2185,6 +2231,21 @@ static int __init parse_amd_iommu_dump(char *str)
 	return 1;
 }
 
+static int __init parse_amd_iommu_intr(char *str)
+{
+	for (; *str; ++str) {
+		if (strncmp(str, "legacy", 6) == 0) {
+			amd_iommu_guest_ir = AMD_IOMMU_GUEST_IR_LEGACY;
+			break;
+		}
+		if (strncmp(str, "ga", 2) == 0) {
+			amd_iommu_guest_ir = AMD_IOMMU_GUEST_IR_GA;
+			break;
+		}
+	}
+	return 1;
+}
+
 static int __init parse_amd_iommu_options(char *str)
 {
 	for (; *str; ++str) {
@@ -2261,6 +2322,7 @@ static int __init parse_ivrs_hpet(char *str)
 
 __setup("amd_iommu_dump",	parse_amd_iommu_dump);
 __setup("amd_iommu=",		parse_amd_iommu_options);
+__setup("amd_iommu_intr=",	parse_amd_iommu_intr);
 __setup("ivrs_ioapic",		parse_ivrs_ioapic);
 __setup("ivrs_hpet",		parse_ivrs_hpet);
 
diff --git a/drivers/iommu/amd_iommu_proto.h b/drivers/iommu/amd_iommu_proto.h
index 0bd9eb3..faa3b48 100644
--- a/drivers/iommu/amd_iommu_proto.h
+++ b/drivers/iommu/amd_iommu_proto.h
@@ -38,6 +38,7 @@ extern int amd_iommu_enable(void);
 extern void amd_iommu_disable(void);
 extern int amd_iommu_reenable(int);
 extern int amd_iommu_enable_faulting(void);
+extern int amd_iommu_guest_ir;
 
 /* IOMMUv2 specific functions */
 struct iommu_domain;
diff --git a/drivers/iommu/amd_iommu_types.h b/drivers/iommu/amd_iommu_types.h
index 9d32b20..95414120 100644
--- a/drivers/iommu/amd_iommu_types.h
+++ b/drivers/iommu/amd_iommu_types.h
@@ -92,6 +92,7 @@
 #define FEATURE_GA		(1ULL<<7)
 #define FEATURE_HE		(1ULL<<8)
 #define FEATURE_PC		(1ULL<<9)
+#define FEATURE_GAM_VAPIC	(1ULL<<21)
 
 #define FEATURE_PASID_SHIFT	32
 #define FEATURE_PASID_MASK	(0x1fULL << FEATURE_PASID_SHIFT)
@@ -146,6 +147,8 @@
 #define CONTROL_PPFINT_EN       0x0eULL
 #define CONTROL_PPR_EN          0x0fULL
 #define CONTROL_GT_EN           0x10ULL
+#define CONTROL_GA_EN           0x11ULL
+#define CONTROL_GAM_EN          0x19ULL
 
 #define CTRL_INV_TO_MASK	(7 << CONTROL_INV_TIMEOUT)
 #define CTRL_INV_TO_NONE	0
@@ -694,4 +697,15 @@ struct __iommu_counter {
 
 #endif /* CONFIG_AMD_IOMMU_STATS */
 
+enum amd_iommu_intr_mode_type {
+	AMD_IOMMU_GUEST_IR_LEGACY,
+
+	/* This mode is not visible to users. It is used when
+	 * we cannot fully enable GA and fallback to only support
+	 * legacy interrupt remapping via 128-bit IRTE.
+	 */
+	AMD_IOMMU_GUEST_IR_LEGACY_GA,
+	AMD_IOMMU_GUEST_IR_GA,
+};
+
 #endif /* _ASM_X86_AMD_IOMMU_TYPES_H */
-- 
1.9.1

* [PART2 RFC v1 2/9] iommu/amd: Add data structure for guest vAPIC support
  2016-04-08 12:49 [PART2 RFC v1 0/9] iommu/AMD: Introduce IOMMU AVIC support Suravee Suthikulpanit
  2016-04-08 12:49 ` [PART2 RFC v1 1/9] iommu/amd: Detect and enable guest vAPIC support Suravee Suthikulpanit
@ 2016-04-08 12:49 ` Suravee Suthikulpanit
  2016-04-08 12:49 ` [PART2 RFC v1 3/9] iommu/amd: Detect and initialize guest vAPIC log Suravee Suthikulpanit
                   ` (6 subsequent siblings)
  8 siblings, 0 replies; 14+ messages in thread
From: Suravee Suthikulpanit @ 2016-04-08 12:49 UTC
  To: pbonzini, rkrcmar, joro, bp, gleb, alex.williamson
  Cc: kvm, linux-kernel, wei, sherry.hurwitz, Suravee Suthikulpanit

From: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>

This patch adds a new data structure for the 128-bit IOMMU IRTE format,
which supports both legacy and GA interrupt remapping modes.
It also provides helper functions for setting up, accessing, and
updating interrupt remapping table entries in either mode.
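
To make the 128-bit layout easier to follow, here is a small sketch that
packs a remapping-mode entry with shifts and masks instead of bitfields.
The bit positions are read off the bitfield order in struct irte_ga below
and assume the usual LSB-first bitfield allocation on x86; the names
irte_ga_sketch and irte_ga_make_remap() are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Two 64-bit words, mirroring the lo/hi split of struct irte_ga. */
struct irte_ga_sketch {
	uint64_t lo;
	uint64_t hi;
};

static_assert(sizeof(struct irte_ga_sketch) == 16,
	      "a GA-format IRTE is 128 bits");

/* Low word, remapping view (assumed LSB-first bit positions). */
#define IRTE_LO_VALID		(1ULL << 0)
#define IRTE_LO_INT_TYPE(x)	(((uint64_t)(x) & 0x7) << 2)
#define IRTE_LO_DM		(1ULL << 6)
#define IRTE_LO_GUEST_MODE	(1ULL << 7)
#define IRTE_LO_DEST(x)		(((uint64_t)(x) & 0xff) << 8)

/* High word: the vector occupies the lowest 8 bits. */
#define IRTE_HI_VECTOR(x)	((uint64_t)(x) & 0xff)

/* Build a valid entry in remapping mode, i.e. guest_mode left at 0. */
static struct irte_ga_sketch irte_ga_make_remap(uint8_t vector, uint8_t dest,
						uint8_t int_type, int dm)
{
	struct irte_ga_sketch e;

	e.lo = IRTE_LO_VALID | IRTE_LO_INT_TYPE(int_type) | IRTE_LO_DEST(dest);
	if (dm)
		e.lo |= IRTE_LO_DM;
	e.hi = IRTE_HI_VECTOR(vector);
	return e;
}
```

Note that, unlike the 32-bit format, the vector lives in the high word
while the destination stays in the low word.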

Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
---
 drivers/iommu/amd_iommu.c       | 188 ++++++++++++++++++++++++++++++++--------
 drivers/iommu/amd_iommu_types.h |  63 ++++++++++++++
 2 files changed, 217 insertions(+), 34 deletions(-)

diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index 374c129..af6079a 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -3601,21 +3601,6 @@ EXPORT_SYMBOL(amd_iommu_device_info);
  *
  *****************************************************************************/
 
-union irte {
-	u32 val;
-	struct {
-		u32 valid	: 1,
-		    no_fault	: 1,
-		    int_type	: 3,
-		    rq_eoi	: 1,
-		    dm		: 1,
-		    rsvd_1	: 1,
-		    destination	: 8,
-		    vector	: 8,
-		    rsvd_2	: 8;
-	} fields;
-};
-
 struct irq_2_irte {
 	u16 devid; /* Device ID for IRTE table */
 	u16 index; /* Index into IRTE table*/
@@ -3624,6 +3609,7 @@ struct irq_2_irte {
 struct amd_ir_data {
 	struct irq_2_irte			irq_2_irte;
 	union irte				irte_entry;
+	struct irte_ga				irte_ga_entry;
 	union {
 		struct msi_msg			msi_entry;
 	};
@@ -3650,7 +3636,60 @@ static void set_dte_irq_entry(u16 devid, struct irq_remap_table *table)
 	amd_iommu_dev_table[devid].data[2] = dte;
 }
 
+void *amd_iommu_get_irte(struct irq_remap_table *table, int index)
+{
+	void *ret = NULL;
+
+	if (!amd_iommu_guest_ir) {
+		union irte *ptr = (union irte *)table->table;
+
+		ret = &ptr[index];
+	} else {
+		struct irte_ga *ptr = (struct irte_ga *)table->table;
+
+		ret = &ptr[index];
+	}
+	return ret;
+}
+
 #define IRTE_ALLOCATED (~1U)
+static void set_irte_allocated(struct irq_remap_table *table, int index)
+{
+	if (!amd_iommu_guest_ir) {
+		table->table[index] = IRTE_ALLOCATED;
+	} else {
+		struct irte_ga *irte = amd_iommu_get_irte(table, index);
+
+		memset(&irte->lo.val, 0, sizeof(u64));
+		memset(&irte->hi.val, 0, sizeof(u64));
+		irte->hi.fields.vector = 0xff;
+	}
+}
+
+static bool is_irte_allocated(struct irq_remap_table *table, int index)
+{
+	if (!amd_iommu_guest_ir) {
+		union irte *irte = amd_iommu_get_irte(table, index);
+
+		return irte->val != 0;
+	} else {
+		struct irte_ga *irte = amd_iommu_get_irte(table, index);
+
+		return irte->hi.fields.vector != 0;
+	}
+}
+
+static void clear_irte(struct irq_remap_table *table, int index)
+{
+	if (!amd_iommu_guest_ir) {
+		table->table[index] = 0;
+	} else {
+		struct irte_ga *irte = amd_iommu_get_irte(table, index);
+
+		memset(&irte->lo.val, 0, sizeof(u64));
+		memset(&irte->hi.val, 0, sizeof(u64));
+	}
+}
 
 static struct irq_remap_table *get_irq_table(u16 devid, bool ioapic)
 {
@@ -3697,13 +3736,18 @@ static struct irq_remap_table *get_irq_table(u16 devid, bool ioapic)
 		goto out;
 	}
 
-	memset(table->table, 0, MAX_IRQS_PER_TABLE * sizeof(u32));
+	if (!amd_iommu_guest_ir)
+		memset(table->table, 0,
+		       MAX_IRQS_PER_TABLE * sizeof(u32));
+	else
+		memset(table->table, 0,
+		       (MAX_IRQS_PER_TABLE * (sizeof(u64) * 2)));
 
 	if (ioapic) {
 		int i;
 
 		for (i = 0; i < 32; ++i)
-			table->table[i] = IRTE_ALLOCATED;
+			set_irte_allocated(table, i);
 	}
 
 	irq_lookup_table[devid] = table;
@@ -3740,14 +3784,14 @@ static int alloc_irq_index(u16 devid, int count)
 	for (c = 0, index = table->min_index;
 	     index < MAX_IRQS_PER_TABLE;
 	     ++index) {
-		if (table->table[index] == 0)
+		if (!is_irte_allocated(table, index))
 			c += 1;
 		else
 			c = 0;
 
 		if (c == count)	{
 			for (; c != 0; --c)
-				table->table[index - c + 1] = IRTE_ALLOCATED;
+				set_irte_allocated(table, index - c + 1);
 
 			index -= count - 1;
 			goto out;
@@ -3762,6 +3806,42 @@ out:
 	return index;
 }
 
+static int modify_irte_ga(u16 devid, int index, struct irte_ga *irte)
+{
+	struct irq_remap_table *table;
+	struct amd_iommu *iommu;
+	unsigned long flags;
+	struct irte_ga *entry;
+	struct irte_ga tmp;
+
+	iommu = amd_iommu_rlookup_table[devid];
+	if (iommu == NULL)
+		return -EINVAL;
+
+	table = get_irq_table(devid, false);
+	if (!table)
+		return -ENOMEM;
+
+	spin_lock_irqsave(&table->lock, flags);
+
+	entry = amd_iommu_get_irte(table, index);
+
+	memcpy(&tmp, entry, sizeof(struct irte_ga));
+
+	entry->lo.fields_remap.valid = 0;
+	entry->hi.val = irte->hi.val;
+	entry->hi.fields.ga_root_ptr = tmp.hi.fields.ga_root_ptr;
+	entry->lo.val = irte->lo.val;
+	entry->lo.fields_remap.valid = 1;
+
+	spin_unlock_irqrestore(&table->lock, flags);
+
+	iommu_flush_irt(iommu, devid);
+	iommu_completion_wait(iommu);
+
+	return 0;
+}
+
 static int modify_irte(u16 devid, int index, union irte irte)
 {
 	struct irq_remap_table *table;
@@ -3801,7 +3881,7 @@ static void free_irte(u16 devid, int index)
 		return;
 
 	spin_lock_irqsave(&table->lock, flags);
-	table->table[index] = 0;
+	clear_irte(table, index);
 	spin_unlock_irqrestore(&table->lock, flags);
 
 	iommu_flush_irt(iommu, devid);
@@ -3889,19 +3969,33 @@ static void irq_remapping_prepare_irte(struct amd_ir_data *data,
 {
 	struct irq_2_irte *irte_info = &data->irq_2_irte;
 	struct msi_msg *msg = &data->msi_entry;
-	union irte *irte = &data->irte_entry;
 	struct IO_APIC_route_entry *entry;
 
 	data->irq_2_irte.devid = devid;
 	data->irq_2_irte.index = index + sub_handle;
 
 	/* Setup IRTE for IOMMU */
-	irte->val = 0;
-	irte->fields.vector      = irq_cfg->vector;
-	irte->fields.int_type    = apic->irq_delivery_mode;
-	irte->fields.destination = irq_cfg->dest_apicid;
-	irte->fields.dm          = apic->irq_dest_mode;
-	irte->fields.valid       = 1;
+	if (!amd_iommu_guest_ir) {
+		union irte *irte = &data->irte_entry;
+
+		irte->val                = 0;
+		irte->fields.vector      = irq_cfg->vector;
+		irte->fields.int_type    = apic->irq_delivery_mode;
+		irte->fields.destination = irq_cfg->dest_apicid;
+		irte->fields.dm          = apic->irq_dest_mode;
+		irte->fields.valid       = 1;
+	} else {
+		struct irte_ga *irte = &data->irte_ga_entry;
+
+		irte->lo.val                      = 0;
+		irte->hi.val                      = 0;
+		irte->lo.fields_remap.guest_mode  = 0;
+		irte->lo.fields_remap.int_type    = apic->irq_delivery_mode;
+		irte->lo.fields_remap.dm          = apic->irq_dest_mode;
+		irte->hi.fields.vector            = irq_cfg->vector;
+		irte->lo.fields_remap.destination = irq_cfg->dest_apicid;
+		irte->lo.fields_remap.valid       = 1;
+	}
 
 	switch (info->type) {
 	case X86_IRQ_ALLOC_TYPE_IOAPIC:
@@ -4037,7 +4131,13 @@ static void irq_remapping_activate(struct irq_domain *domain,
 	struct amd_ir_data *data = irq_data->chip_data;
 	struct irq_2_irte *irte_info = &data->irq_2_irte;
 
-	modify_irte(irte_info->devid, irte_info->index, data->irte_entry);
+	if (!amd_iommu_guest_ir) {
+		data->irte_entry.fields.valid = 1;
+		modify_irte(irte_info->devid, irte_info->index, data->irte_entry);
+	} else if (amd_iommu_guest_ir >= AMD_IOMMU_GUEST_IR_LEGACY_GA) {
+		data->irte_ga_entry.lo.fields_remap.valid = 1;
+		modify_irte_ga(irte_info->devid, irte_info->index, &data->irte_ga_entry);
+	}
 }
 
 static void irq_remapping_deactivate(struct irq_domain *domain,
@@ -4045,10 +4145,14 @@ static void irq_remapping_deactivate(struct irq_domain *domain,
 {
 	struct amd_ir_data *data = irq_data->chip_data;
 	struct irq_2_irte *irte_info = &data->irq_2_irte;
-	union irte entry;
 
-	entry.val = 0;
-	modify_irte(irte_info->devid, irte_info->index, data->irte_entry);
+	if (!amd_iommu_guest_ir) {
+		data->irte_entry.fields.valid = 0;
+		modify_irte(irte_info->devid, irte_info->index, data->irte_entry);
+	} else {
+		data->irte_ga_entry.lo.fields_remap.valid = 0;
+		modify_irte_ga(irte_info->devid, irte_info->index, &data->irte_ga_entry);
+	}
 }
 
 static struct irq_domain_ops amd_ir_domain_ops = {
@@ -4075,9 +4179,19 @@ static int amd_ir_set_affinity(struct irq_data *data,
 	 * Atomically updates the IRTE with the new destination, vector
 	 * and flushes the interrupt entry cache.
 	 */
-	ir_data->irte_entry.fields.vector = cfg->vector;
-	ir_data->irte_entry.fields.destination = cfg->dest_apicid;
-	modify_irte(irte_info->devid, irte_info->index, ir_data->irte_entry);
+	if (!amd_iommu_guest_ir) {
+		ir_data->irte_entry.fields.vector = cfg->vector;
+		ir_data->irte_entry.fields.destination = cfg->dest_apicid;
+		modify_irte(irte_info->devid, irte_info->index,
+			    ir_data->irte_entry);
+	} else {
+		struct irte_ga *entry = &ir_data->irte_ga_entry;
+
+		entry->hi.fields.vector = cfg->vector;
+		entry->lo.fields_remap.destination = cfg->dest_apicid;
+		entry->lo.fields_remap.guest_mode = 0;
+		modify_irte_ga(irte_info->devid, irte_info->index, entry);
+	}
 
 	/*
 	 * After this point, all the interrupts will start arriving
@@ -4097,19 +4211,25 @@ static void ir_compose_msi_msg(struct irq_data *irq_data, struct msi_msg *msg)
 }
 
 static struct irq_chip amd_ir_chip = {
+	.name = "AMD-IR-IRQ-CHIP",
 	.irq_ack = ir_ack_apic_edge,
 	.irq_set_affinity = amd_ir_set_affinity,
 	.irq_compose_msi_msg = ir_compose_msi_msg,
 };
 
+static const char amd_iommu_ir_domain_name[] = "AMD-IOMMU-IR-DOMAIN";
+static const char amd_iommu_msi_domain_name[] = "AMD-IOMMU-MSI-DOMAIN";
+
 int amd_iommu_create_irq_domain(struct amd_iommu *iommu)
 {
 	iommu->ir_domain = irq_domain_add_tree(NULL, &amd_ir_domain_ops, iommu);
 	if (!iommu->ir_domain)
 		return -ENOMEM;
+	iommu->ir_domain->name = amd_iommu_ir_domain_name;
 
 	iommu->ir_domain->parent = arch_get_ir_parent_domain();
 	iommu->msi_domain = arch_create_msi_irq_domain(iommu->ir_domain);
+	iommu->msi_domain->name = amd_iommu_msi_domain_name;
 
 	return 0;
 }
diff --git a/drivers/iommu/amd_iommu_types.h b/drivers/iommu/amd_iommu_types.h
index 95414120..ec546f3 100644
--- a/drivers/iommu/amd_iommu_types.h
+++ b/drivers/iommu/amd_iommu_types.h
@@ -708,4 +708,67 @@ enum amd_iommu_intr_mode_type {
 	AMD_IOMMU_GUEST_IR_GA,
 };
 
+union irte {
+	u32 val;
+	struct {
+		u32 valid	: 1,
+		    no_fault	: 1,
+		    int_type	: 3,
+		    rq_eoi	: 1,
+		    dm		: 1,
+		    rsvd_1	: 1,
+		    destination	: 8,
+		    vector	: 8,
+		    rsvd_2	: 8;
+	} fields;
+};
+
+union irte_ga_lo {
+	u64 val;
+
+	/* For int remapping */
+	struct {
+		u64 valid	: 1,
+		    no_fault	: 1,
+		    /* ------ */
+		    int_type	: 3,
+		    rq_eoi	: 1,
+		    dm		: 1,
+		    /* ------ */
+		    guest_mode	: 1,
+		    destination	: 8,
+		    rsvd	: 48;
+	} fields_remap;
+
+	/* For guest vAPIC */
+	struct {
+		u64 valid	: 1,
+		    no_fault	: 1,
+		    /* ------ */
+		    ga_log_intr	: 1,
+		    rsvd1	: 3,
+		    is_run	: 1,
+		    /* ------ */
+		    guest_mode	: 1,
+		    destination	: 8,
+		    rsvd2	: 16,
+		    ga_tag	: 32;
+	} fields_vapic;
+};
+
+union irte_ga_hi {
+	u64 val;
+	struct {
+		u64 vector	: 8,
+		    rsvd_1	: 4,
+		    ga_root_ptr	: 40,
+		    rsvd_2	: 12;
+	} fields;
+};
+
+struct irte_ga {
+	union irte_ga_lo lo;
+	union irte_ga_hi hi;
+};
+
 #endif /* _ASM_X86_AMD_IOMMU_TYPES_H */
-- 
1.9.1

* [PART2 RFC v1 3/9] iommu/amd: Detect and initialize guest vAPIC log
  2016-04-08 12:49 [PART2 RFC v1 0/9] iommu/AMD: Introduce IOMMU AVIC support Suravee Suthikulpanit
  2016-04-08 12:49 ` [PART2 RFC v1 1/9] iommu/amd: Detect and enable guest vAPIC support Suravee Suthikulpanit
  2016-04-08 12:49 ` [PART2 RFC v1 2/9] iommu/amd: Add data structure for " Suravee Suthikulpanit
@ 2016-04-08 12:49 ` Suravee Suthikulpanit
  2016-04-08 12:49 ` [PART2 RFC v1 4/9] iommu/amd: Adding GALOG interrupt handler Suravee Suthikulpanit
                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 14+ messages in thread
From: Suravee Suthikulpanit @ 2016-04-08 12:49 UTC
  To: pbonzini, rkrcmar, joro, bp, gleb, alex.williamson
  Cc: kvm, linux-kernel, wei, sherry.hurwitz, Suravee Suthikulpanit

From: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>

This patch adds support for detecting and initializing the IOMMU Guest vAPIC
log (GALOG). By default, it also enables the GALog interrupt to notify the
IOMMU driver when a GA Log entry is created.
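
For reference, the sizing of the log and the value written to the GA log
base register can be sketched as follows. The constants are the ones this
patch defines; ga_log_base_reg() and ga_log_next() are hypothetical helpers,
and the wrap-around step is an assumption about how a consumer would walk
the ring rather than code from this patch.

```c
#include <assert.h>
#include <stdint.h>

#define GA_LOG_ENTRIES		512
#define GA_ENTRY_SIZE		8
#define GA_LOG_SIZE		(GA_ENTRY_SIZE * GA_LOG_ENTRIES)
#define GA_LOG_SIZE_SHIFT	56
#define GA_LOG_SIZE_512		(0x8ULL << GA_LOG_SIZE_SHIFT)

static_assert(GA_LOG_SIZE == 4096, "512 entries of 8 bytes are 4096 bytes");

/* Combine the physical base address with the encoded log length. */
static uint64_t ga_log_base_reg(uint64_t log_phys)
{
	return log_phys | GA_LOG_SIZE_512;
}

/* Assumed consumer step: advance a byte offset by one entry, wrapping. */
static uint32_t ga_log_next(uint32_t offset)
{
	return (offset + GA_ENTRY_SIZE) % GA_LOG_SIZE;
}
```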

Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
---
 drivers/iommu/amd_iommu_init.c  | 82 +++++++++++++++++++++++++++++++++++++++++
 drivers/iommu/amd_iommu_types.h | 30 +++++++++++++++
 2 files changed, 112 insertions(+)

diff --git a/drivers/iommu/amd_iommu_init.c b/drivers/iommu/amd_iommu_init.c
index 83a5300..5b783af 100644
--- a/drivers/iommu/amd_iommu_init.c
+++ b/drivers/iommu/amd_iommu_init.c
@@ -79,6 +79,7 @@
 #define ACPI_DEVFLAG_LINT1              0x80
 #define ACPI_DEVFLAG_ATSDIS             0x10000000
 
+#define LOOP_TIMEOUT	100000
 /*
  * ACPI table definitions
  *
@@ -368,6 +369,10 @@ static void iommu_disable(struct amd_iommu *iommu)
 	iommu_feature_disable(iommu, CONTROL_EVT_INT_EN);
 	iommu_feature_disable(iommu, CONTROL_EVT_LOG_EN);
 
+	/* Disable IOMMU GA_LOG */
+	iommu_feature_disable(iommu, CONTROL_GALOG_EN);
+	iommu_feature_disable(iommu, CONTROL_GAINT_EN);
+
 	/* Disable IOMMU hardware itself */
 	iommu_feature_disable(iommu, CONTROL_IOMMU_EN);
 }
@@ -618,6 +623,75 @@ static void __init free_ppr_log(struct amd_iommu *iommu)
 	free_pages((unsigned long)iommu->ppr_log, get_order(PPR_LOG_SIZE));
 }
 
+static void __init free_ga_log(struct amd_iommu *iommu)
+{
+	if (iommu->ga_log)
+		free_pages((unsigned long)iommu->ga_log,
+			    get_order(GA_LOG_SIZE));
+	if (iommu->ga_log_tail)
+		free_pages((unsigned long)iommu->ga_log_tail,
+			    get_order(8));
+}
+
+static int iommu_ga_log_enable(struct amd_iommu *iommu)
+{
+	u32 status, i;
+
+	if (!iommu->ga_log)
+		return -EINVAL;
+
+	status = readl(iommu->mmio_base + MMIO_STATUS_OFFSET);
+
+	/* Check if already running */
+	if (status & (MMIO_STATUS_GALOG_RUN_MASK))
+		return 0;
+
+	iommu_feature_enable(iommu, CONTROL_GAINT_EN);
+	iommu_feature_enable(iommu, CONTROL_GALOG_EN);
+
+	for (i = 0; i < LOOP_TIMEOUT; ++i) {
+		status = readl(iommu->mmio_base + MMIO_STATUS_OFFSET);
+		if (status & (MMIO_STATUS_GALOG_RUN_MASK))
+			break;
+	}
+
+	if (i >= LOOP_TIMEOUT)
+		return -EINVAL;
+	return 0;
+}
+
+static int iommu_init_ga_log(struct amd_iommu *iommu)
+{
+	u64 entry;
+
+	if (amd_iommu_guest_ir < AMD_IOMMU_GUEST_IR_GA)
+		return 0;
+
+	iommu->ga_log = (u8 *)__get_free_pages(GFP_KERNEL | __GFP_ZERO,
+					get_order(GA_LOG_SIZE));
+	if (!iommu->ga_log)
+		goto err_out;
+
+	iommu->ga_log_tail = (u8 *)__get_free_pages(GFP_KERNEL | __GFP_ZERO,
+					get_order(8));
+	if (!iommu->ga_log_tail)
+		goto err_out;
+
+	entry = (u64)virt_to_phys(iommu->ga_log) | GA_LOG_SIZE_512;
+	memcpy_toio(iommu->mmio_base + MMIO_GA_LOG_BASE_OFFSET,
+		    &entry, sizeof(entry));
+	entry = ((u64)virt_to_phys(iommu->ga_log_tail) & 0xFFFFFFFFFFFFFULL) & ~7ULL;
+	memcpy_toio(iommu->mmio_base + MMIO_GA_LOG_TAIL_OFFSET,
+		    &entry, sizeof(entry));
+	writel(0x00, iommu->mmio_base + MMIO_GA_HEAD_OFFSET);
+	writel(0x00, iommu->mmio_base + MMIO_GA_TAIL_OFFSET);
+
+	return 0;
+err_out:
+	free_ga_log(iommu);
+	return -EINVAL;
+}
+
 static void iommu_enable_gt(struct amd_iommu *iommu)
 {
 	if (!iommu_feature(iommu, FEATURE_GT))
@@ -974,6 +1048,7 @@ static void __init free_iommu_one(struct amd_iommu *iommu)
 	free_command_buffer(iommu);
 	free_event_buffer(iommu);
 	free_ppr_log(iommu);
+	free_ga_log(iommu);
 	iommu_unmap_mmio_space(iommu);
 }
 
@@ -1231,6 +1306,7 @@ static int iommu_init_pci(struct amd_iommu *iommu)
 {
 	int cap_ptr = iommu->cap_ptr;
 	u32 range, misc, low, high;
+	int ret;
 
 	iommu->dev = pci_get_bus_and_slot(PCI_BUS_NUM(iommu->devid),
 					  iommu->devid & 0xff);
@@ -1295,6 +1371,10 @@ static int iommu_init_pci(struct amd_iommu *iommu)
 		amd_iommu_guest_ir = AMD_IOMMU_GUEST_IR_LEGACY_GA;
 
 
+	ret = iommu_init_ga_log(iommu);
+	if (ret)
+		return ret;
+
 	if (iommu->cap & (1UL << IOMMU_CAP_NPCACHE))
 		amd_iommu_np_cache = true;
 
@@ -1449,6 +1529,8 @@ enable_faults:
 	if (iommu->ppr_log != NULL)
 		iommu_feature_enable(iommu, CONTROL_PPFINT_EN);
 
+	iommu_ga_log_enable(iommu);
+
 	return 0;
 }
 
diff --git a/drivers/iommu/amd_iommu_types.h b/drivers/iommu/amd_iommu_types.h
index ec546f3..d528a46 100644
--- a/drivers/iommu/amd_iommu_types.h
+++ b/drivers/iommu/amd_iommu_types.h
@@ -69,6 +69,8 @@
 #define MMIO_EXCL_LIMIT_OFFSET  0x0028
 #define MMIO_EXT_FEATURES	0x0030
 #define MMIO_PPR_LOG_OFFSET	0x0038
+#define MMIO_GA_LOG_BASE_OFFSET	0x00e0
+#define MMIO_GA_LOG_TAIL_OFFSET	0x00e8
 #define MMIO_CMD_HEAD_OFFSET	0x2000
 #define MMIO_CMD_TAIL_OFFSET	0x2008
 #define MMIO_EVT_HEAD_OFFSET	0x2010
@@ -76,6 +78,8 @@
 #define MMIO_STATUS_OFFSET	0x2020
 #define MMIO_PPR_HEAD_OFFSET	0x2030
 #define MMIO_PPR_TAIL_OFFSET	0x2038
+#define MMIO_GA_HEAD_OFFSET	0x2040
+#define MMIO_GA_TAIL_OFFSET	0x2048
 #define MMIO_CNTR_CONF_OFFSET	0x4000
 #define MMIO_CNTR_REG_OFFSET	0x40000
 #define MMIO_REG_END_OFFSET	0x80000
@@ -111,6 +115,9 @@
 #define MMIO_STATUS_EVT_INT_MASK	(1 << 1)
 #define MMIO_STATUS_COM_WAIT_INT_MASK	(1 << 2)
 #define MMIO_STATUS_PPR_INT_MASK	(1 << 6)
+#define MMIO_STATUS_GALOG_RUN_MASK	(1 << 8)
+#define MMIO_STATUS_GALOG_OVERFLOW_MASK	(1 << 9)
+#define MMIO_STATUS_GALOG_INT_MASK	(1 << 10)
 
 /* event logging constants */
 #define EVENT_ENTRY_SIZE	0x10
@@ -149,6 +156,8 @@
 #define CONTROL_GT_EN           0x10ULL
 #define CONTROL_GA_EN           0x11ULL
 #define CONTROL_GAM_EN          0x19ULL
+#define CONTROL_GALOG_EN        0x1CULL
+#define CONTROL_GAINT_EN        0x1DULL
 
 #define CTRL_INV_TO_MASK	(7 << CONTROL_INV_TIMEOUT)
 #define CTRL_INV_TO_NONE	0
@@ -227,6 +236,19 @@
 
 #define PPR_REQ_FAULT		0x01
 
+/* Constants for GA Log handling */
+#define GA_LOG_ENTRIES		512
+#define GA_LOG_SIZE_SHIFT	56
+#define GA_LOG_SIZE_512		(0x8ULL << GA_LOG_SIZE_SHIFT)
+#define GA_ENTRY_SIZE		8
+#define GA_LOG_SIZE		(GA_ENTRY_SIZE * GA_LOG_ENTRIES)
+
+#define GA_TAG(x)		(u32)((x) & 0xffffffffULL)
+#define GA_DEVID(x)		(u16)(((x) >> 32) & 0xffffULL)
+#define GA_REQ_TYPE(x)		(((x) >> 60) & 0xfULL)
+
+#define GA_GUEST_NR		0x1
+
 #define PAGE_MODE_NONE    0x00
 #define PAGE_MODE_1_LEVEL 0x01
 #define PAGE_MODE_2_LEVEL 0x02
@@ -494,6 +516,12 @@ struct amd_iommu {
 	/* Base of the PPR log, if present */
 	u8 *ppr_log;
 
+	/* Base of the GA log, if present */
+	u8 *ga_log;
+
+	/* Tail of the GA log, if present */
+	u8 *ga_log_tail;
+
 	/* true if interrupts for this IOMMU are already enabled */
 	bool int_enabled;
 
@@ -687,6 +715,8 @@ struct __iommu_counter {
 #define INC_STATS_COUNTER(name)		name.value += 1
 #define ADD_STATS_COUNTER(name, x)	name.value += (x)
 #define SUB_STATS_COUNTER(name, x)	name.value -= (x)
+#define SET_STATS_COUNTER(name, x)	name.value = (x)
+#define STATS_COUNTER(name)		(name.value)
 
 #else /* CONFIG_AMD_IOMMU_STATS */
 
-- 
1.9.1

* [PART2 RFC v1 4/9] iommu/amd: Adding GALOG interrupt handler
  2016-04-08 12:49 [PART2 RFC v1 0/9] iommu/AMD: Introduce IOMMU AVIC support Suravee Suthikulpanit
                   ` (2 preceding siblings ...)
  2016-04-08 12:49 ` [PART2 RFC v1 3/9] iommu/amd: Detect and initialize guest vAPIC log Suravee Suthikulpanit
@ 2016-04-08 12:49 ` Suravee Suthikulpanit
  2016-04-08 12:49 ` [PART2 RFC v1 5/9] iommu/amd: Introduce amd_iommu_update_ga() Suravee Suthikulpanit
                   ` (4 subsequent siblings)
  8 siblings, 0 replies; 14+ messages in thread
From: Suravee Suthikulpanit @ 2016-04-08 12:49 UTC
  To: pbonzini, rkrcmar, joro, bp, gleb, alex.williamson
  Cc: kvm, linux-kernel, wei, sherry.hurwitz, Suravee Suthikulpanit

From: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>

This patch adds the AMD IOMMU guest virtual APIC log (GALOG) handler.
When the IOMMU hardware receives an interrupt targeting a blocked vcpu,
it creates an entry in the GALOG and generates an interrupt to notify
the AMD IOMMU driver.

The driver then processes the log entry and notifies the SVM driver
via the registered iommu_ga_log_notifier function.

Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
---
 drivers/iommu/amd_iommu.c | 103 ++++++++++++++++++++++++++++++++++++++++++++--
 include/linux/amd-iommu.h |  10 +++++
 2 files changed, 110 insertions(+), 3 deletions(-)

diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index af6079a..8cdde339 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -424,6 +424,8 @@ DECLARE_STATS_COUNTER(complete_ppr);
 DECLARE_STATS_COUNTER(invalidate_iotlb);
 DECLARE_STATS_COUNTER(invalidate_iotlb_all);
 DECLARE_STATS_COUNTER(pri_requests);
+DECLARE_STATS_COUNTER(galog_max);
+DECLARE_STATS_COUNTER(galog_total);
 
 static struct dentry *stats_dir;
 static struct dentry *de_fflush;
@@ -462,6 +464,8 @@ static void amd_iommu_stats_init(void)
 	amd_iommu_stats_add(&invalidate_iotlb);
 	amd_iommu_stats_add(&invalidate_iotlb_all);
 	amd_iommu_stats_add(&pri_requests);
+	amd_iommu_stats_add(&galog_max);
+	amd_iommu_stats_add(&galog_total);
 }
 
 #endif
@@ -655,14 +659,102 @@ static void iommu_poll_ppr_log(struct amd_iommu *iommu)
 	}
 }
 
+static int (*iommu_ga_log_notifier)(int, int, int);
+
+int amd_iommu_register_ga_log_notifier(int (*notifier)(int, int, int))
+{
+	iommu_ga_log_notifier = notifier;
+
+	return 0;
+}
+EXPORT_SYMBOL(amd_iommu_register_ga_log_notifier);
+
+static void iommu_handle_ga_guest_nr_entry(struct amd_iommu *iommu,
+					   u16 devid, u32 ga_tag)
+{
+	struct amd_ir_data *ir_data;
+	unsigned long flags;
+	int vec = 0;
+
+	if (!iommu_ga_log_notifier)
+		return;
+
+	spin_lock_irqsave(&iommu->ga_hash_lock, flags);
+	hash_for_each_possible(iommu->ga_hash, ir_data, hnode, ga_tag) {
+		vec = ir_data->irte_ga_entry.hi.fields.vector;
+		break;
+	}
+	spin_unlock_irqrestore(&iommu->ga_hash_lock, flags);
+
+	if (vec) {
+		pr_debug("AMD-Vi: %s: devid=%#x, ga_tag=%#x\n",
+			 __func__, devid, ga_tag);
+
+		if (iommu_ga_log_notifier(GATAG_TO_AVICTAG(ga_tag),
+					  GATAG_TO_VCPUID(ga_tag), vec) != 0)
+			pr_err("AMD-Vi: GA log notifier failed.\n");
+	}
+}
+
+static void iommu_poll_ga_log(struct amd_iommu *iommu)
+{
+	u32 head, tail, cnt = 0;
+
+	if (iommu->ga_log == NULL)
+		return;
+
+	head = readl(iommu->mmio_base + MMIO_GA_HEAD_OFFSET);
+	tail = readl(iommu->mmio_base + MMIO_GA_TAIL_OFFSET);
+
+	while (head != tail) {
+		volatile u64 *raw;
+		u64 entry;
+
+		raw = (u64 *)(iommu->ga_log + head);
+		cnt++;
+
+		/* Avoid memcpy function-call overhead */
+		entry = *raw;
+
+		/* Update head pointer of hardware ring-buffer */
+		head = (head + GA_ENTRY_SIZE) % GA_LOG_SIZE;
+		writel(head, iommu->mmio_base + MMIO_GA_HEAD_OFFSET);
+
+		/* Handle GA entry */
+		switch (GA_REQ_TYPE(entry)) {
+		case GA_GUEST_NR:
+			iommu_handle_ga_guest_nr_entry(iommu,
+							GA_DEVID(entry),
+							GA_TAG(entry));
+			break;
+		default:
+			break;
+		}
+
+		/* Refresh ring-buffer information */
+		head = readl(iommu->mmio_base + MMIO_GA_HEAD_OFFSET);
+		tail = readl(iommu->mmio_base + MMIO_GA_TAIL_OFFSET);
+	}
+
+	ADD_STATS_COUNTER(galog_total, cnt);
+
+	if (STATS_COUNTER(galog_max) < cnt)
+		SET_STATS_COUNTER(galog_max, cnt);
+}
+
+#define AMD_IOMMU_INT_MASK	\
+	(MMIO_STATUS_EVT_INT_MASK | \
+	 MMIO_STATUS_PPR_INT_MASK | \
+	 MMIO_STATUS_GALOG_INT_MASK)
+
 irqreturn_t amd_iommu_int_thread(int irq, void *data)
 {
 	struct amd_iommu *iommu = (struct amd_iommu *) data;
 	u32 status = readl(iommu->mmio_base + MMIO_STATUS_OFFSET);
 
-	while (status & (MMIO_STATUS_EVT_INT_MASK | MMIO_STATUS_PPR_INT_MASK)) {
-		/* Enable EVT and PPR interrupts again */
-		writel((MMIO_STATUS_EVT_INT_MASK | MMIO_STATUS_PPR_INT_MASK),
+	while (status & AMD_IOMMU_INT_MASK) {
+		/* Enable EVT, PPR and GA interrupts again */
+		writel(AMD_IOMMU_INT_MASK,
 			iommu->mmio_base + MMIO_STATUS_OFFSET);
 
 		if (status & MMIO_STATUS_EVT_INT_MASK) {
@@ -675,6 +767,11 @@ irqreturn_t amd_iommu_int_thread(int irq, void *data)
 			iommu_poll_ppr_log(iommu);
 		}
 
+		if (status & MMIO_STATUS_GALOG_INT_MASK) {
+			pr_devel("AMD-Vi: Processing IOMMU GA Log\n");
+			iommu_poll_ga_log(iommu);
+		}
+
 		/*
 		 * Hardware bug: ERBT1312
 		 * When re-enabling interrupt (by writing 1
diff --git a/include/linux/amd-iommu.h b/include/linux/amd-iommu.h
index 2b08e79..36648fe 100644
--- a/include/linux/amd-iommu.h
+++ b/include/linux/amd-iommu.h
@@ -169,10 +169,20 @@ typedef void (*amd_iommu_invalidate_ctx)(struct pci_dev *pdev, int pasid);
 extern int amd_iommu_set_invalidate_ctx_cb(struct pci_dev *pdev,
 					   amd_iommu_invalidate_ctx cb);
 
+/* IOMMU AVIC Function */
+extern int
+amd_iommu_register_ga_log_notifier(int (*notifier)(int, int, int));
+
 #else
 
 static inline int amd_iommu_detect(void) { return -ENODEV; }
 
+static inline int
+amd_iommu_register_ga_log_notifier(int (*notifier)(int, int, int))
+{
+	return 0;
+}
+
 #endif
 
 #endif /* _ASM_X86_AMD_IOMMU_H */
-- 
1.9.1
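
The GA log handling described in patch 4/9 above can be modelled in
user space. The following is only a sketch, not the driver code: the
field extraction mirrors the GA_TAG/GA_DEVID/GA_REQ_TYPE macros and the
head update mirrors iommu_poll_ga_log(). The names ga_log_entry,
ga_decode and ga_next_head are hypothetical, and GA_LOG_ENTRIES = 512
is an assumed value, since its definition is not part of this excerpt.

```c
#include <assert.h>
#include <stdint.h>

/* Values taken from the patch, except GA_LOG_ENTRIES (assumed here). */
#define GA_ENTRY_SIZE	8
#define GA_LOG_ENTRIES	512	/* assumption for illustration only */
#define GA_LOG_SIZE	(GA_ENTRY_SIZE * GA_LOG_ENTRIES)
#define GA_GUEST_NR	0x1

/* Hypothetical container for the decoded fields of one 64-bit entry. */
struct ga_log_entry {
	uint32_t tag;		/* bits 31:0,  as in GA_TAG() */
	uint16_t devid;		/* bits 47:32, as in GA_DEVID() */
	uint8_t  req_type;	/* bits 63:60, as in GA_REQ_TYPE() */
};

/* Split a raw GA log entry into its three fields. */
static struct ga_log_entry ga_decode(uint64_t raw)
{
	struct ga_log_entry e;

	e.tag = (uint32_t)(raw & 0xffffffffULL);
	e.devid = (uint16_t)((raw >> 32) & 0xffffULL);
	e.req_type = (uint8_t)((raw >> 60) & 0xfULL);
	return e;
}

/* Advance the head byte offset by one entry, wrapping at the log size. */
static uint32_t ga_next_head(uint32_t head)
{
	assert(head % GA_ENTRY_SIZE == 0 && head < GA_LOG_SIZE);
	return (head + GA_ENTRY_SIZE) % GA_LOG_SIZE;
}
```

For instance, ga_decode(0x1000002500000103ULL) yields a GA_GUEST_NR
request for devid 0x25 with tag 0x103, and ga_next_head() wraps from the
last entry offset back to 0.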

* [PART2 RFC v1 5/9] iommu/amd: Introduce amd_iommu_update_ga()
  2016-04-08 12:49 [PART2 RFC v1 0/9] iommu/AMD: Introduce IOMMU AVIC support Suravee Suthikulpanit
                   ` (3 preceding siblings ...)
  2016-04-08 12:49 ` [PART2 RFC v1 4/9] iommu/amd: Adding GALOG interrupt handler Suravee Suthikulpanit
@ 2016-04-08 12:49 ` Suravee Suthikulpanit
  2016-04-13 17:06   ` Radim Krčmář
  2016-04-08 12:49 ` [PART2 RFC v1 6/9] iommu/amd: Implements irq_set_vcpu_affinity hook to setup GA mode for pass-through devices Suravee Suthikulpanit
                   ` (3 subsequent siblings)
  8 siblings, 1 reply; 14+ messages in thread
From: Suravee Suthikulpanit @ 2016-04-08 12:49 UTC (permalink / raw)
  To: pbonzini, rkrcmar, joro, bp, gleb, alex.williamson
  Cc: kvm, linux-kernel, wei, sherry.hurwitz, Suravee Suthikulpanit

From: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>

This patch introduces a new IOMMU interface, amd_iommu_update_ga(),
which allows KVM (SVM) to update an existing posted-interrupt IOMMU IRTE
when loading or unloading a vcpu.

Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
---
 drivers/iommu/amd_iommu.c | 70 +++++++++++++++++++++++++++++++++++++++++++++++
 include/linux/amd-iommu.h |  8 ++++++
 2 files changed, 78 insertions(+)

diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index 8cdde339..1d17597 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -4330,4 +4330,74 @@ int amd_iommu_create_irq_domain(struct amd_iommu *iommu)
 
 	return 0;
 }
+
+static int
+set_irte_ga(struct amd_iommu *iommu, unsigned int devid,
+	    u64 base, int cpu, bool is_run)
+{
+	struct irq_remap_table *irt = get_irq_table(devid, false);
+	unsigned long flags;
+	int index;
+
+	if (!irt)
+		return -ENODEV;
+
+	spin_lock_irqsave(&irt->lock, flags);
+
+	for (index = irt->min_index; index < MAX_IRQS_PER_TABLE; ++index) {
+		struct irte_ga *irte = amd_iommu_get_irte(irt, index);
+
+		if (!irte->lo.fields_vapic.guest_mode)
+			continue;
+
+		irte->hi.fields.ga_root_ptr = (base >> 12);
+		irte->lo.fields_vapic.destination = cpu;
+		irte->lo.fields_vapic.is_run = is_run;
+		barrier();
+	}
+
+	spin_unlock_irqrestore(&irt->lock, flags);
+
+	iommu_flush_irt(iommu, devid);
+	iommu_completion_wait(iommu);
+
+	return 0;
+}
+
+int amd_iommu_update_ga(u32 vcpu_id, u32 cpu, u32 ga_tag,
+			u64 base, bool is_run)
+{
+	unsigned long flags;
+	struct amd_iommu *iommu;
+
+	if (amd_iommu_guest_ir < AMD_IOMMU_GUEST_IR_GA)
+		return 0;
+
+	for_each_iommu(iommu) {
+		struct amd_ir_data *ir_data;
+
+		spin_lock_irqsave(&iommu->ga_hash_lock, flags);
+
+		hash_for_each_possible(iommu->ga_hash, ir_data, hnode,
+				       AMD_IOMMU_GATAG(ga_tag, vcpu_id)) {
+			struct iommu_dev_data *dev_data;
+
+			if (!ir_data)
+				break;
+
+			dev_data = search_dev_data(ir_data->irq_2_irte.devid);
+
+			if (!dev_data || !dev_data->guest_mode)
+				continue;
+
+			set_irte_ga(iommu, ir_data->irq_2_irte.devid,
+				    base, cpu, is_run);
+		}
+
+		spin_unlock_irqrestore(&iommu->ga_hash_lock, flags);
+	}
+
+	return 0;
+}
+EXPORT_SYMBOL(amd_iommu_update_ga);
 #endif
diff --git a/include/linux/amd-iommu.h b/include/linux/amd-iommu.h
index 36648fe..e52cee5 100644
--- a/include/linux/amd-iommu.h
+++ b/include/linux/amd-iommu.h
@@ -173,6 +173,9 @@ extern int amd_iommu_set_invalidate_ctx_cb(struct pci_dev *pdev,
 extern int
 amd_iommu_register_ga_log_notifier(int (*notifier)(int, int, int));
 
+extern int
+amd_iommu_update_ga(u32 vcpu_id, u32 cpu, u32 ga_tag, u64 base, bool is_run);
+
 #else
 
 static inline int amd_iommu_detect(void) { return -ENODEV; }
@@ -183,6 +186,11 @@ amd_iommu_register_ga_log_notifier(int (*notifier)(int, int, int))
 	return 0;
 }
 
+static inline int
+amd_iommu_update_ga(u32 vcpu_id, u32 cpu, u32 ga_tag, u64 base, bool is_run)
+{
+	return 0;
+}
 #endif
 
 #endif /* _ASM_X86_AMD_IOMMU_H */
-- 
1.9.1

* [PART2 RFC v1 6/9] iommu/amd: Implements irq_set_vcpu_affinity hook to setup GA mode for pass-through devices
  2016-04-08 12:49 [PART2 RFC v1 0/9] iommu/AMD: Introduce IOMMU AVIC support Suravee Suthikulpanit
                   ` (4 preceding siblings ...)
  2016-04-08 12:49 ` [PART2 RFC v1 5/9] iommu/amd: Introduce amd_iommu_update_ga() Suravee Suthikulpanit
@ 2016-04-08 12:49 ` Suravee Suthikulpanit
  2016-04-08 12:49 ` [PART2 RFC v1 7/9] svm: Introduce AMD IOMMU avic_ga_log_notifier Suravee Suthikulpanit
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 14+ messages in thread
From: Suravee Suthikulpanit @ 2016-04-08 12:49 UTC (permalink / raw)
  To: pbonzini, rkrcmar, joro, bp, gleb, alex.williamson
  Cc: kvm, linux-kernel, wei, sherry.hurwitz, Suravee Suthikulpanit

From: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>

This patch implements the irq_set_vcpu_affinity function hook to set up
the interrupt remapping table entry in GA mode for pass-through devices.

If the requirements for GA mode are not met, it falls back to setting up
the IRTE in legacy mode.

Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
---
 drivers/iommu/amd_iommu.c       | 98 ++++++++++++++++++++++++++++++++++-------
 drivers/iommu/amd_iommu_init.c  |  4 ++
 drivers/iommu/amd_iommu_types.h | 27 ++++++++++++
 include/linux/amd-iommu.h       |  6 +++
 4 files changed, 119 insertions(+), 16 deletions(-)

diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
index 1d17597..112a937 100644
--- a/drivers/iommu/amd_iommu.c
+++ b/drivers/iommu/amd_iommu.c
@@ -101,6 +101,7 @@ struct iommu_dev_data {
 	bool pri_tlp;			  /* PASID TLB required for
 					     PPR completions */
 	u32 errata;			  /* Bitmap for errata to apply */
+	u32 guest_mode;
 };
 
 /*
@@ -3145,6 +3146,10 @@ static void amd_iommu_detach_device(struct iommu_domain *dom,
 	if (!iommu)
 		return;
 
+	if ((amd_iommu_guest_ir >= AMD_IOMMU_GUEST_IR_GA) &&
+	    (dom->type == IOMMU_DOMAIN_UNMANAGED))
+		dev_data->guest_mode = 0;
+
 	iommu_completion_wait(iommu);
 }
 
@@ -3170,6 +3175,13 @@ static int amd_iommu_attach_device(struct iommu_domain *dom,
 
 	ret = attach_device(dev, domain);
 
+	if (amd_iommu_guest_ir >= AMD_IOMMU_GUEST_IR_GA) {
+		if (dom->type == IOMMU_DOMAIN_UNMANAGED)
+			dev_data->guest_mode = 1;
+		else
+			dev_data->guest_mode = 0;
+	}
+
 	iommu_completion_wait(iommu);
 
 	return ret;
@@ -3698,20 +3710,6 @@ EXPORT_SYMBOL(amd_iommu_device_info);
  *
  *****************************************************************************/
 
-struct irq_2_irte {
-	u16 devid; /* Device ID for IRTE table */
-	u16 index; /* Index into IRTE table*/
-};
-
-struct amd_ir_data {
-	struct irq_2_irte			irq_2_irte;
-	union irte				irte_entry;
-	struct irte_ga				irte_ga_entry;
-	union {
-		struct msi_msg			msi_entry;
-	};
-};
-
 static struct irq_chip amd_ir_chip;
 
 #define DTE_IRQ_PHYS_ADDR_MASK	(((1ULL << 45)-1) << 6)
@@ -4067,6 +4065,7 @@ static void irq_remapping_prepare_irte(struct amd_ir_data *data,
 	struct irq_2_irte *irte_info = &data->irq_2_irte;
 	struct msi_msg *msg = &data->msi_entry;
 	struct IO_APIC_route_entry *entry;
+	struct iommu_dev_data *dev_data = search_dev_data(devid);
 
 	data->irq_2_irte.devid = devid;
 	data->irq_2_irte.index = index + sub_handle;
@@ -4086,7 +4085,8 @@ static void irq_remapping_prepare_irte(struct amd_ir_data *data,
 
 		irte->lo.val                      = 0;
 		irte->hi.val                      = 0;
-		irte->lo.fields_remap.guest_mode  = 0;
+		irte->lo.fields_remap.guest_mode  = dev_data ?
+						    dev_data->guest_mode : 0;
 		irte->lo.fields_remap.int_type    = apic->irq_delivery_mode;
 		irte->lo.fields_remap.dm          = apic->irq_dest_mode;
 		irte->hi.fields.vector            = irq_cfg->vector;
@@ -4259,6 +4259,70 @@ static struct irq_domain_ops amd_ir_domain_ops = {
 	.deactivate = irq_remapping_deactivate,
 };
 
+static int amd_ir_set_vcpu_affinity(struct irq_data *data, void *vcpu_info)
+{
+	unsigned long flags;
+	struct amd_iommu *iommu;
+	struct amd_iommu_pi_data *pi_data = vcpu_info;
+	struct vcpu_data *vcpu_pi_info = pi_data->vcpu_data;
+	struct amd_ir_data *ir_data = data->chip_data;
+	struct irte_ga *irte = &ir_data->irte_ga_entry;
+	struct irq_2_irte *irte_info = &ir_data->irq_2_irte;
+	struct iommu_dev_data *dev_data = search_dev_data(irte_info->devid);
+
+	/* Note:
+	 * This device has never been set up for guest mode,
+	 * so we should not modify the IRTE.
+	 */
+	if (!dev_data || !dev_data->guest_mode)
+		return 0;
+
+	/* Note:
+	 * SVM tries to set up for GA mode, but we are in
+	 * legacy mode. So, we force legacy mode instead.
+	 */
+	if (amd_iommu_guest_ir < AMD_IOMMU_GUEST_IR_GA) {
+		pr_debug("AMD-Vi: %s: Fall back to using intr legacy remap\n",
+			 __func__);
+		vcpu_pi_info = NULL;
+	}
+
+	iommu = amd_iommu_rlookup_table[irte_info->devid];
+	if (iommu == NULL)
+		return -EINVAL;
+
+	spin_lock_irqsave(&iommu->ga_hash_lock, flags);
+
+	if (vcpu_pi_info) {
+		/* Setting */
+		irte->hi.fields.vector = vcpu_pi_info->vector;
+		irte->lo.fields_vapic.guest_mode = 1;
+		irte->lo.fields_vapic.ga_tag =
+			AMD_IOMMU_GATAG(pi_data->avic_tag, pi_data->vcpu_id);
+
+		if (!hash_hashed(&ir_data->hnode))
+			hash_add(iommu->ga_hash, &ir_data->hnode,
+				 (u16)(irte->lo.fields_vapic.ga_tag));
+	} else {
+		/* Un-Setting */
+		struct irq_cfg *cfg = irqd_cfg(data);
+
+		irte->hi.val = 0;
+		irte->lo.val = 0;
+		irte->hi.fields.vector = cfg->vector;
+		irte->lo.fields_remap.guest_mode = 0;
+		irte->lo.fields_remap.destination = cfg->dest_apicid;
+		irte->lo.fields_remap.int_type = apic->irq_delivery_mode;
+		irte->lo.fields_remap.dm = apic->irq_dest_mode;
+
+		hash_del(&ir_data->hnode);
+	}
+
+	spin_unlock_irqrestore(&iommu->ga_hash_lock, flags);
+
+	return modify_irte_ga(irte_info->devid, irte_info->index, irte);
+}
+
 static int amd_ir_set_affinity(struct irq_data *data,
 			       const struct cpumask *mask, bool force)
 {
@@ -4266,6 +4330,7 @@ static int amd_ir_set_affinity(struct irq_data *data,
 	struct irq_2_irte *irte_info = &ir_data->irq_2_irte;
 	struct irq_cfg *cfg = irqd_cfg(data);
 	struct irq_data *parent = data->parent_data;
+	struct iommu_dev_data *dev_data = search_dev_data(irte_info->devid);
 	int ret;
 
 	ret = parent->chip->irq_set_affinity(parent, mask, force);
@@ -4281,7 +4346,7 @@ static int amd_ir_set_affinity(struct irq_data *data,
 		ir_data->irte_entry.fields.destination = cfg->dest_apicid;
 		modify_irte(irte_info->devid, irte_info->index,
 			    ir_data->irte_entry);
-	} else {
+	} else if (!dev_data || !dev_data->guest_mode) {
 		struct irte_ga *entry = &ir_data->irte_ga_entry;
 
 		entry->hi.fields.vector = cfg->vector;
@@ -4311,6 +4376,7 @@ static struct irq_chip amd_ir_chip = {
 	.name = "AMD-IR-IRQ-CHIP",
 	.irq_ack = ir_ack_apic_edge,
 	.irq_set_affinity = amd_ir_set_affinity,
+	.irq_set_vcpu_affinity = amd_ir_set_vcpu_affinity,
 	.irq_compose_msi_msg = ir_compose_msi_msg,
 };
 
diff --git a/drivers/iommu/amd_iommu_init.c b/drivers/iommu/amd_iommu_init.c
index 5b783af..4d933a7 100644
--- a/drivers/iommu/amd_iommu_init.c
+++ b/drivers/iommu/amd_iommu_init.c
@@ -1370,6 +1370,10 @@ static int iommu_init_pci(struct amd_iommu *iommu)
 	    !iommu_feature(iommu, FEATURE_GAM_VAPIC))
 		amd_iommu_guest_ir = AMD_IOMMU_GUEST_IR_LEGACY_GA;
 
+	if (amd_iommu_guest_ir >= AMD_IOMMU_GUEST_IR_GA) {
+		hash_init(iommu->ga_hash);
+		spin_lock_init(&iommu->ga_hash_lock);
+	}
 
 	ret = iommu_init_ga_log(iommu);
 	if (ret)
diff --git a/drivers/iommu/amd_iommu_types.h b/drivers/iommu/amd_iommu_types.h
index d528a46..1b640b2 100644
--- a/drivers/iommu/amd_iommu_types.h
+++ b/drivers/iommu/amd_iommu_types.h
@@ -22,10 +22,12 @@
 
 #include <linux/types.h>
 #include <linux/mutex.h>
+#include <linux/msi.h>
 #include <linux/list.h>
 #include <linux/spinlock.h>
 #include <linux/pci.h>
 #include <linux/irqreturn.h>
+#include <linux/hashtable.h>
 
 /*
  * Maximum number of IOMMUs supported
@@ -119,6 +121,14 @@
 #define MMIO_STATUS_GALOG_OVERFLOW_MASK	(1 << 9)
 #define MMIO_STATUS_GALOG_INT_MASK	(1 << 10)
 
+#define AMD_IOMMU_GA_HASH_BITS	16
+#define AMD_IOMMU_GA_HASH_MASK	((1U << AMD_IOMMU_GA_HASH_BITS) - 1)
+#define AMD_IOMMU_GATAG(x, y)	\
+	((((x & 0xFF) << 8) | (y & 0xFF)) & AMD_IOMMU_GA_HASH_MASK)
+
+#define GATAG_TO_AVICTAG(x)	((x >> 8) & 0xFF)
+#define GATAG_TO_VCPUID(x)	(x & 0xFF)
+
 /* event logging constants */
 #define EVENT_ENTRY_SIZE	0x10
 #define EVENT_TYPE_SHIFT	28
@@ -556,6 +566,8 @@ struct amd_iommu {
 	struct irq_domain *ir_domain;
 	struct irq_domain *msi_domain;
 #endif
+	DECLARE_HASHTABLE(ga_hash, AMD_IOMMU_GA_HASH_BITS);
+	spinlock_t ga_hash_lock;
 };
 
 struct devid_map {
@@ -801,4 +813,19 @@ struct irte_ga {
 	union irte_ga_hi hi;
 };
 
+struct irq_2_irte {
+	u16 devid; /* Device ID for IRTE table */
+	u16 index; /* Index into IRTE table*/
+};
+
+struct amd_ir_data {
+	struct hlist_node			hnode;
+	struct irq_2_irte			irq_2_irte;
+	union irte				irte_entry;
+	struct irte_ga				irte_ga_entry;
+	union {
+		struct msi_msg			msi_entry;
+	};
+};
+
 #endif /* _ASM_X86_AMD_IOMMU_TYPES_H */
diff --git a/include/linux/amd-iommu.h b/include/linux/amd-iommu.h
index e52cee5..f698900 100644
--- a/include/linux/amd-iommu.h
+++ b/include/linux/amd-iommu.h
@@ -22,6 +22,12 @@
 
 #include <linux/types.h>
 
+struct amd_iommu_pi_data {
+	u32 vcpu_id;
+	u32 avic_tag;
+	struct vcpu_data *vcpu_data;
+};
+
 #ifdef CONFIG_AMD_IOMMU
 
 struct task_struct;
-- 
1.9.1
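
The GA tag encoding introduced by patch 6/9 above can be sketched as
plain functions. This is a minimal user-space sketch that follows the
AMD_IOMMU_GATAG, GATAG_TO_AVICTAG and GATAG_TO_VCPUID macros; the
function names gatag_pack, gatag_to_avictag and gatag_to_vcpuid are
hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define GA_HASH_BITS	16
#define GA_HASH_MASK	((1U << GA_HASH_BITS) - 1)

/* Pack the low 8 bits of each input: avic_tag in 15:8, vcpu_id in 7:0. */
static inline uint32_t gatag_pack(uint32_t avic_tag, uint32_t vcpu_id)
{
	return (((avic_tag & 0xFF) << 8) | (vcpu_id & 0xFF)) & GA_HASH_MASK;
}

/* Recover the 8-bit AVIC tag from a packed GA tag. */
static inline uint32_t gatag_to_avictag(uint32_t ga_tag)
{
	return (ga_tag >> 8) & 0xFF;
}

/* Recover the 8-bit vcpu id from a packed GA tag. */
static inline uint32_t gatag_to_vcpuid(uint32_t ga_tag)
{
	return ga_tag & 0xFF;
}

/* Round-trip check for inputs that fit in 8 bits each. */
static inline void gatag_check_roundtrip(uint32_t avic_tag, uint32_t vcpu_id)
{
	uint32_t t = gatag_pack(avic_tag, vcpu_id);

	assert(gatag_to_avictag(t) == (avic_tag & 0xFF));
	assert(gatag_to_vcpuid(t) == (vcpu_id & 0xFF));
}
```

Because only the low 8 bits of each input survive, two AVIC tags that
differ only above bit 7 (for example 0x12 and 0x112) pack to the same
GA tag. Whether that matters depends on how tags are allocated on the
SVM side.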

* [PART2 RFC v1 7/9] svm: Introduce AMD IOMMU avic_ga_log_notifier
  2016-04-08 12:49 [PART2 RFC v1 0/9] iommu/AMD: Introduce IOMMU AVIC support Suravee Suthikulpanit
                   ` (5 preceding siblings ...)
  2016-04-08 12:49 ` [PART2 RFC v1 6/9] iommu/amd: Implements irq_set_vcpu_affinity hook to setup GA mode for pass-through devices Suravee Suthikulpanit
@ 2016-04-08 12:49 ` Suravee Suthikulpanit
  2016-04-08 12:49 ` [PART2 RFC v1 8/9] svm: Implements update_pi_irte hook to setup posted interrupt Suravee Suthikulpanit
  2016-04-08 12:49 ` [PART2 RFC v1 9/9] svm: Update AMD IOMMU IRTE with vcpu scheduling information when enable AVIC Suravee Suthikulpanit
  8 siblings, 0 replies; 14+ messages in thread
From: Suravee Suthikulpanit @ 2016-04-08 12:49 UTC (permalink / raw)
  To: pbonzini, rkrcmar, joro, bp, gleb, alex.williamson
  Cc: kvm, linux-kernel, wei, sherry.hurwitz, Suravee Suthikulpanit

From: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>

This patch introduces avic_ga_log_notifier, which is called by the
IOMMU driver whenever it handles a Guest vAPIC (GA) log entry.

Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
---
 arch/x86/include/asm/kvm_host.h |  2 ++
 arch/x86/kvm/svm.c              | 55 +++++++++++++++++++++++++++++++++++++++++
 2 files changed, 57 insertions(+)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 5ee3eb7..b56ba9f 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -776,9 +776,11 @@ struct kvm_arch {
 	bool disabled_lapic_found;
 
 	/* Struct members for AVIC */
+	u32 avic_tag;
 	u32 ldr_mode;
 	struct page *avic_logical_id_table_page;
 	struct page *avic_physical_id_table_page;
+	struct hlist_node hnode;
 };
 
 struct kvm_vm_stat {
diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index a037ceb..6b5ce27 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -34,6 +34,8 @@
 #include <linux/sched.h>
 #include <linux/trace_events.h>
 #include <linux/slab.h>
+#include <linux/amd-iommu.h>
+#include <linux/hashtable.h>
 
 #include <asm/apic.h>
 #include <asm/perf_event.h>
@@ -921,6 +923,45 @@ static void svm_disable_lbrv(struct vcpu_svm *svm)
 	set_msr_interception(msrpm, MSR_IA32_LASTINTTOIP, 0, 0);
 }
 
+#define SVM_VM_DATA_HASH_BITS	8
+DECLARE_HASHTABLE(svm_vm_data_hash, SVM_VM_DATA_HASH_BITS);
+static spinlock_t svm_vm_data_hash_lock;
+
+static int avic_ga_log_notifier(int avic_tag, int vcpu_id, int vec)
+{
+	unsigned long flags;
+	struct kvm_arch *ka = NULL;
+	struct kvm_vcpu *vcpu = NULL;
+	struct vcpu_svm *svm = NULL;
+
+	pr_debug("SVM: %s: avic_tag=%#x, vcpu_id=%#x, vec=%#x\n",
+		 __func__, avic_tag, vcpu_id, vec);
+
+	spin_lock_irqsave(&svm_vm_data_hash_lock, flags);
+	hash_for_each_possible(svm_vm_data_hash, ka, hnode, avic_tag) {
+		struct kvm *kvm = container_of(ka, struct kvm, arch);
+
+		vcpu = kvm_get_vcpu_by_id(kvm, vcpu_id);
+		break;
+	}
+	spin_unlock_irqrestore(&svm_vm_data_hash_lock, flags);
+
+	if (!vcpu)
+		return 0;
+
+	svm = to_svm(vcpu);
+
+	/* Note:
+	 * At this point, the IOMMU should have already set the pending
+	 * bit in the vAPIC backing page. So, we just need to schedule
+	 * in the vcpu.
+	 */
+	if (vcpu->mode == OUTSIDE_GUEST_MODE)
+		kvm_vcpu_wake_up(vcpu);
+
+	return 0;
+}
+
 static __init int svm_hardware_setup(void)
 {
 	int cpu;
@@ -981,6 +1022,10 @@ static __init int svm_hardware_setup(void)
 
 	if (avic) {
 		printk(KERN_INFO "kvm: AVIC enabled\n");
+
+		hash_init(svm_vm_data_hash);
+		spin_lock_init(&svm_vm_data_hash_lock);
+		amd_iommu_register_ga_log_notifier(&avic_ga_log_notifier);
 	} else {
 		svm_x86_ops.deliver_posted_interrupt = NULL;
 	}
@@ -1280,12 +1325,17 @@ static int avic_init_backing_page(struct kvm_vcpu *vcpu)
 
 static void avic_vm_uninit(struct kvm *kvm)
 {
+	unsigned long flags;
 	struct kvm_arch *vm_data = &kvm->arch;
 
 	if (vm_data->avic_logical_id_table_page)
 		__free_page(vm_data->avic_logical_id_table_page);
 	if (vm_data->avic_physical_id_table_page)
 		__free_page(vm_data->avic_physical_id_table_page);
+
+	spin_lock_irqsave(&svm_vm_data_hash_lock, flags);
+	hash_del(&vm_data->hnode);
+	spin_unlock_irqrestore(&svm_vm_data_hash_lock, flags);
 }
 
 static void avic_vcpu_uninit(struct kvm_vcpu *vcpu)
@@ -1301,6 +1351,7 @@ static void avic_vcpu_uninit(struct kvm_vcpu *vcpu)
 
 static int avic_vm_init(struct kvm *kvm)
 {
+	unsigned long flags;
 	int err = -ENOMEM;
 	struct kvm_arch *vm_data = &kvm->arch;
 	struct page *p_page;
@@ -1325,6 +1376,10 @@ static int avic_vm_init(struct kvm *kvm)
 	vm_data->avic_logical_id_table_page = l_page;
 	clear_page(page_address(l_page));
 
+	spin_lock_irqsave(&svm_vm_data_hash_lock, flags);
+	hash_add(svm_vm_data_hash, &vm_data->hnode, vm_data->avic_tag);
+	spin_unlock_irqrestore(&svm_vm_data_hash_lock, flags);
+
 	return 0;
 
 free_avic:
-- 
1.9.1

* [PART2 RFC v1 8/9] svm: Implements update_pi_irte hook to setup posted interrupt
  2016-04-08 12:49 [PART2 RFC v1 0/9] iommu/AMD: Introduce IOMMU AVIC support Suravee Suthikulpanit
                   ` (6 preceding siblings ...)
  2016-04-08 12:49 ` [PART2 RFC v1 7/9] svm: Introduce AMD IOMMU avic_ga_log_notifier Suravee Suthikulpanit
@ 2016-04-08 12:49 ` Suravee Suthikulpanit
  2016-04-08 12:49 ` [PART2 RFC v1 9/9] svm: Update AMD IOMMU IRTE with vcpu scheduling information when enable AVIC Suravee Suthikulpanit
  8 siblings, 0 replies; 14+ messages in thread
From: Suravee Suthikulpanit @ 2016-04-08 12:49 UTC (permalink / raw)
  To: pbonzini, rkrcmar, joro, bp, gleb, alex.williamson
  Cc: kvm, linux-kernel, wei, sherry.hurwitz, Suravee Suthikulpanit

From: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>

This patch implements the update_pi_irte function hook to allow SVM to
communicate to the IOMMU driver how to set up the IRTE for handling
posted interrupts.

Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
---
 arch/x86/kvm/svm.c | 107 +++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 107 insertions(+)

diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index 6b5ce27..38fd7a3 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -43,6 +43,7 @@
 #include <asm/desc.h>
 #include <asm/debugreg.h>
 #include <asm/kvm_para.h>
+#include <asm/irq_remapping.h>
 
 #include <asm/virtext.h>
 #include "trace.h"
@@ -1349,6 +1350,13 @@ static void avic_vcpu_uninit(struct kvm_vcpu *vcpu)
 		svm->avic_physical_id_cache = NULL;
 }
 
+static atomic_t avic_tag_gen = ATOMIC_INIT(0);
+
+static inline u32 avic_get_next_tag(void)
+{
+	return atomic_inc_return(&avic_tag_gen);
+}
+
 static int avic_vm_init(struct kvm *kvm)
 {
 	unsigned long flags;
@@ -1373,6 +1381,8 @@ static int avic_vm_init(struct kvm *kvm)
 	if (!l_page)
 		goto free_avic;
 
+	vm_data->avic_tag = avic_get_next_tag();
+
 	vm_data->avic_logical_id_table_page = l_page;
 	clear_page(page_address(l_page));
 
@@ -4278,6 +4288,102 @@ static void svm_deliver_avic_intr(struct kvm_vcpu *vcpu, int vec)
 		kvm_vcpu_wake_up(vcpu);
 }
 
+/*
+ * svm_update_pi_irte - set IRTE for Posted-Interrupts
+ *
+ * @kvm: kvm
+ * @host_irq: host irq of the interrupt
+ * @guest_irq: gsi of the interrupt
+ * @set: set or unset PI
+ * returns 0 on success, < 0 on failure
+ */
+static int svm_update_pi_irte(struct kvm *kvm, unsigned int host_irq,
+			      uint32_t guest_irq, bool set)
+{
+	struct kvm_kernel_irq_routing_entry *e;
+	struct kvm_irq_routing_table *irq_rt;
+	struct kvm_lapic_irq irq;
+	struct kvm_vcpu *vcpu = NULL;
+	struct vcpu_data vcpu_info;
+	int idx, ret = -EINVAL;
+	struct vcpu_svm *svm;
+	struct amd_iommu_pi_data pi_data;
+
+	if (!kvm_arch_has_assigned_device(kvm) ||
+	    !irq_remapping_cap(IRQ_POSTING_CAP))
+		return 0;
+
+	pr_debug("SVM: %s: host_irq=%#x, guest_irq=%#x, set=%#x\n",
+		 __func__, host_irq, guest_irq, set);
+
+	idx = srcu_read_lock(&kvm->irq_srcu);
+	irq_rt = srcu_dereference(kvm->irq_routing, &kvm->irq_srcu);
+	WARN_ON(guest_irq >= irq_rt->nr_rt_entries);
+
+	hlist_for_each_entry(e, &irq_rt->map[guest_irq], link) {
+		if (e->type != KVM_IRQ_ROUTING_MSI)
+			continue;
+
+		/**
+		 * Note:
+		 * The HW cannot support posting multicast/broadcast
+		 * interrupts to a vCPU. So, we still use interrupt
+		 * remapping for these kinds of interrupts.
+		 *
+		 * For lowest-priority interrupts, we only support
+		 * those with single CPU as the destination, e.g. user
+		 * configures the interrupts via /proc/irq or uses
+		 * irqbalance to make the interrupts single-CPU.
+		 */
+		kvm_set_msi_irq(e, &irq);
+		if (kvm_intr_is_single_vcpu(kvm, &irq, &vcpu)) {
+			svm = to_svm(vcpu);
+			vcpu_info.pi_desc_addr = page_to_phys(svm->avic_backing_page);
+			vcpu_info.vector = irq.vector;
+
+			trace_kvm_pi_irte_update(vcpu->vcpu_id, host_irq, e->gsi,
+						 vcpu_info.vector,
+						 vcpu_info.pi_desc_addr, set);
+
+			pi_data.vcpu_id = vcpu->vcpu_id;
+
+			pr_debug("SVM: %s: use GA mode for irq %u\n", __func__,
+				 irq.vector);
+		} else {
+			set = false;
+
+			pr_debug("SVM: %s: use legacy intr remap mode for irq %u\n",
+				 __func__, irq.vector);
+		}
+
+		/**
+		 * Note:
+		 * When AVIC is disabled, we fall back to setting up
+		 * the IRTE in legacy mode.
+		 */
+		if (set && svm_vcpu_avic_enabled(svm)) {
+			/* Enable GA mode in IRTE */
+			pi_data.avic_tag = kvm->arch.avic_tag;
+			pi_data.vcpu_data = &vcpu_info;
+			ret = irq_set_vcpu_affinity(host_irq, &pi_data);
+		} else {
+			/* Use legacy mode in IRTE */
+			pi_data.vcpu_data = NULL;
+			ret = irq_set_vcpu_affinity(host_irq, &pi_data);
+		}
+
+		if (ret < 0) {
+			pr_err("%s: failed to update PI IRTE\n", __func__);
+			goto out;
+		}
+	}
+
+	ret = 0;
+out:
+	srcu_read_unlock(&kvm->irq_srcu, idx);
+	return ret;
+}
+
 static int svm_nmi_allowed(struct kvm_vcpu *vcpu)
 {
 	struct vcpu_svm *svm = to_svm(vcpu);
@@ -5094,6 +5200,7 @@ static struct kvm_x86_ops svm_x86_ops = {
 
 	.pmu_ops = &amd_pmu_ops,
 	.deliver_posted_interrupt = svm_deliver_avic_intr,
+	.update_pi_irte = svm_update_pi_irte,
 };
 
 static int __init svm_init(void)
-- 
1.9.1

* [PART2 RFC v1 9/9] svm: Update AMD IOMMU IRTE with vcpu scheduling information when enable AVIC
  2016-04-08 12:49 [PART2 RFC v1 0/9] iommu/AMD: Introduce IOMMU AVIC support Suravee Suthikulpanit
                   ` (7 preceding siblings ...)
  2016-04-08 12:49 ` [PART2 RFC v1 8/9] svm: Implements update_pi_irte hook to setup posted interrupt Suravee Suthikulpanit
@ 2016-04-08 12:49 ` Suravee Suthikulpanit
  8 siblings, 0 replies; 14+ messages in thread
From: Suravee Suthikulpanit @ 2016-04-08 12:49 UTC (permalink / raw)
  To: pbonzini, rkrcmar, joro, bp, gleb, alex.williamson
  Cc: kvm, linux-kernel, wei, sherry.hurwitz, Suravee Suthikulpanit

From: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>

When AVIC is enabled, SVM needs to update the IOMMU IRTE with the
appropriate host physical APIC ID during vcpu_load/unload. Also, during
vcpu_blocking/unblocking, SVM needs to update the is-running bit in
the IOMMU IRTE. Both are achieved by calling amd_iommu_update_ga().

However, if GA mode is not enabled for the pass-through device, the
IOMMU driver simply returns from amd_iommu_update_ga().

Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
---
 arch/x86/kvm/svm.c | 50 +++++++++++++++++++++++++++++++++++++++++++-------
 1 file changed, 43 insertions(+), 7 deletions(-)

diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index 38fd7a3..3b9a0b2 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -1397,11 +1397,24 @@ free_avic:
 	return err;
 }
 
+static inline int
+avic_update_iommu(struct kvm_vcpu *vcpu, int cpu, phys_addr_t pa, bool r)
+{
+	struct kvm_arch *vm_data = &vcpu->kvm->arch;
+
+	if (!kvm_arch_has_assigned_device(vcpu->kvm))
+		return 0;
+
+	return amd_iommu_update_ga(vcpu->vcpu_id, cpu, vm_data->avic_tag,
+				   (pa & AVIC_HPA_MASK), r);
+}
+
 /**
  * This function is called during VCPU halt/unhalt.
  */
 static int avic_set_running(struct kvm_vcpu *vcpu, bool is_run)
 {
+	int ret = 0;
 	u64 entry;
 	int h_physical_id = __default_cpu_present_to_apicid(vcpu->cpu);
 	struct vcpu_svm *svm = to_svm(vcpu);
@@ -1420,17 +1433,27 @@ static int avic_set_running(struct kvm_vcpu *vcpu, bool is_run)
 		WARN_ON((entry & AVIC_PHYSICAL_ID_ENTRY_IS_RUNNING_MASK) == 0);
 
 	entry &= ~AVIC_PHYSICAL_ID_ENTRY_IS_RUNNING_MASK;
-	if (is_run)
+	if (is_run) {
 		entry |= AVIC_PHYSICAL_ID_ENTRY_IS_RUNNING_MASK;
-	WRITE_ONCE(*(svm->avic_physical_id_cache), entry);
+		WRITE_ONCE(*(svm->avic_physical_id_cache), entry);
 
-	return 0;
+		ret = avic_update_iommu(vcpu, h_physical_id,
+					page_to_phys(svm->avic_backing_page), 1);
+	} else {
+		ret = avic_update_iommu(vcpu, h_physical_id,
+					page_to_phys(svm->avic_backing_page), 0);
+
+		WRITE_ONCE(*(svm->avic_physical_id_cache), entry);
+	}
+
+	return ret;
 }
 
 static int avic_vcpu_load(struct kvm_vcpu *vcpu, int cpu, bool is_load)
 {
-	u64 entry;
+	int ret = 0;
 	int h_physical_id = __default_cpu_present_to_apicid(cpu);
+	u64 entry;
 	struct vcpu_svm *svm = to_svm(vcpu);
 
 	if (!svm_vcpu_avic_enabled(svm))
@@ -1443,16 +1466,29 @@ static int avic_vcpu_load(struct kvm_vcpu *vcpu, int cpu, bool is_load)
 	entry = READ_ONCE(*(svm->avic_physical_id_cache));
 	WARN_ON(is_load && (entry & AVIC_PHYSICAL_ID_ENTRY_IS_RUNNING_MASK));
 
-	entry &= ~AVIC_PHYSICAL_ID_ENTRY_IS_RUNNING_MASK;
 	if (is_load) {
 		entry &= ~AVIC_PHYSICAL_ID_ENTRY_HOST_PHYSICAL_ID_MASK;
 		entry |= (h_physical_id & AVIC_PHYSICAL_ID_ENTRY_HOST_PHYSICAL_ID_MASK);
+
+		entry &= ~AVIC_PHYSICAL_ID_ENTRY_IS_RUNNING_MASK;
 		if (!svm->avic_is_blocking)
 			entry |= AVIC_PHYSICAL_ID_ENTRY_IS_RUNNING_MASK;
+		WRITE_ONCE(*(svm->avic_physical_id_cache), entry);
+
+		ret = avic_update_iommu(vcpu, h_physical_id,
+					page_to_phys(svm->avic_backing_page),
+					!svm->avic_is_blocking);
+	} else {
+		if (entry & AVIC_PHYSICAL_ID_ENTRY_IS_RUNNING_MASK) {
+			ret = avic_update_iommu(vcpu, h_physical_id,
+						page_to_phys(svm->avic_backing_page), 0);
+
+			entry &= ~AVIC_PHYSICAL_ID_ENTRY_IS_RUNNING_MASK;
+			WRITE_ONCE(*(svm->avic_physical_id_cache), entry);
+		}
 	}
-	WRITE_ONCE(*(svm->avic_physical_id_cache), entry);
 
-	return 0;
+	return ret;
 }
 
 static void svm_vcpu_reset(struct kvm_vcpu *vcpu, bool init_event)
-- 
1.9.1


* Re: [PART2 RFC v1 5/9] iommu/amd: Introduce amd_iommu_update_ga()
  2016-04-08 12:49 ` [PART2 RFC v1 5/9] iommu/amd: Introduce amd_iommu_update_ga() Suravee Suthikulpanit
@ 2016-04-13 17:06   ` Radim Krčmář
  2016-06-09 23:59     ` Suravee Suthikulpanit
  0 siblings, 1 reply; 14+ messages in thread
From: Radim Krčmář @ 2016-04-13 17:06 UTC (permalink / raw)
  To: Suravee Suthikulpanit
  Cc: pbonzini, joro, bp, gleb, alex.williamson, kvm, linux-kernel,
	wei, sherry.hurwitz

2016-04-08 07:49-0500, Suravee Suthikulpanit:
> From: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
> 
> This patch introduces a new IOMMU interface, amd_iommu_update_ga(),
> which allows KVM (SVM) to update existing posted interrupt IOMMU IRTE when
> load/unload vcpu.
> 
> Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
> ---
> diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
> @@ -4330,4 +4330,74 @@ int amd_iommu_create_irq_domain(struct amd_iommu *iommu)
> +int amd_iommu_update_ga(u32 vcpu_id, u32 cpu, u32 ga_tag,

It'd be nicer to generate the tag on SVM side and pass it whole -- IOMMU
doesn't have to care how hypervisors use the tag.

> +			u64 base, bool is_run)
> +{
> +	unsigned long flags;
> +	struct amd_iommu *iommu;
> +
> +	if (amd_iommu_guest_ir < AMD_IOMMU_GUEST_IR_GA)
> +		return 0;
> +
> +	for_each_iommu(iommu) {
> +		struct amd_ir_data *ir_data;
> +
> +		spin_lock_irqsave(&iommu->ga_hash_lock, flags);
> +
> +		hash_for_each_possible(iommu->ga_hash, ir_data, hnode,
> +				       AMD_IOMMU_GATAG(ga_tag, vcpu_id)) {

All tags can map into the same bucket.  Code below doesn't check that
the ir_data belongs to the tag and will modify unrelated IRTEs.

Have you considered a per-VCPU list of IRTEs on the SVM side?

> +			struct iommu_dev_data *dev_data;
> +			if (!ir_data)

(ir_data can't be NULL.)

> +				break;
> +
> +			dev_data = search_dev_data(ir_data->irq_2_irte.devid);
> +
> +			if (!dev_data || !dev_data->guest_mode)
> +				continue;

(guest_mode can also be read from the IRTE.)

> +			set_irte_ga(iommu, ir_data->irq_2_irte.devid,
> +				    base, cpu, is_run);

set_irte_ga() is pretty expensive -- do we need to invalidate the irt
when changing cpu and is_run?

2.2.5.2 Interrupt Virtualization Tables with Guest Virtual APIC Enabled,
point 9, bullet 5 says that IRTE is read from memory before considering
IsRun, GATag and Destination, which makes me think that avoiding races
can be faster in the common case.
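
The bucket-collision concern raised above can be illustrated with a minimal
userspace sketch. It does not use the kernel hashtable API; struct ir_entry,
bucket_of(), ga_hash_add() and ga_update() are hypothetical names, and the
table is deliberately tiny so that two different tags collide. The point is
only that walking a bucket visits every entry hashed there, so the walker has
to compare the stored key itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define GA_HASH_BITS 2				/* tiny on purpose: forces collisions */
#define GA_HASH_SIZE (1u << GA_HASH_BITS)

/* Hypothetical stand-in for struct amd_ir_data; only what the sketch needs. */
struct ir_entry {
	uint32_t ga_tag;		/* full key, stored with the entry */
	int updated;			/* counts how often this entry was touched */
	struct ir_entry *next;
};

static struct ir_entry *buckets[GA_HASH_SIZE];

/* Assumed hash function: the low bits of the key select the bucket. */
static unsigned int bucket_of(uint32_t key)
{
	return key & (GA_HASH_SIZE - 1);
}

static void ga_hash_add(struct ir_entry *e)
{
	unsigned int b;

	assert(e != NULL);
	b = bucket_of(e->ga_tag);
	e->next = buckets[b];
	buckets[b] = e;
}

/*
 * Walk the bucket that 'key' maps to and "update" entries in it.
 * With check_key == 0 every entry in the bucket is touched, which is the
 * behaviour the review warns about. With check_key != 0 only entries whose
 * stored tag equals 'key' are touched. Returns the number of entries updated.
 */
static int ga_update(uint32_t key, int check_key)
{
	struct ir_entry *e;
	int n = 0;

	for (e = buckets[bucket_of(key)]; e; e = e->next) {
		if (check_key && e->ga_tag != key)
			continue;
		e->updated++;
		n++;
	}
	return n;
}
```

With entries tagged 0x1 and 0x5 in the table, both land in bucket 1, so an
unchecked walk for 0x1 also modifies the 0x5 entry, while the checked walk
leaves it alone.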


* Re: [PART2 RFC v1 1/9] iommu/amd: Detect and enable guest vAPIC support
  2016-04-08 12:49 ` [PART2 RFC v1 1/9] iommu/amd: Detect and enable guest vAPIC support Suravee Suthikulpanit
@ 2016-05-09 11:49   ` Joerg Roedel
  2016-06-02 20:38     ` Suravee Suthikulanit
  0 siblings, 1 reply; 14+ messages in thread
From: Joerg Roedel @ 2016-05-09 11:49 UTC (permalink / raw)
  To: Suravee Suthikulpanit
  Cc: pbonzini, rkrcmar, bp, gleb, alex.williamson, kvm, linux-kernel,
	wei, sherry.hurwitz

On Fri, Apr 08, 2016 at 07:49:22AM -0500, Suthikulpanit, Suravee wrote:
> From: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
> 
> This patch introduces a new IOMMU driver parameter, amd_iommu_guest_ir,
> which can be used to specify different interrupt remapping mode for
> passthrough devices to VM guest:
>     * legacy: Legacy interrupt remapping mode (w/ 32-bit IRTE)
>     * ga    : Guest vAPIC interrupt remapping mode (w/ 128-bit IRTE)
> 
> Note that the GA mode also supports legacy interrupt remapping
> for non-passthrough devices with the 128-bit IRTE.

Does this need to be under user control? The code can just check what
the hardware supports and use the 128bit IRTEs if supported, no?



	Joerg


* Re: [PART2 RFC v1 1/9] iommu/amd: Detect and enable guest vAPIC support
  2016-05-09 11:49   ` Joerg Roedel
@ 2016-06-02 20:38     ` Suravee Suthikulanit
  0 siblings, 0 replies; 14+ messages in thread
From: Suravee Suthikulanit @ 2016-06-02 20:38 UTC (permalink / raw)
  To: Joerg Roedel
  Cc: pbonzini, rkrcmar, bp, gleb, alex.williamson, kvm, linux-kernel,
	wei, sherry.hurwitz

On 5/9/2016 6:49 AM, Joerg Roedel wrote:
> On Fri, Apr 08, 2016 at 07:49:22AM -0500, Suthikulpanit, Suravee wrote:
>> From: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
>>
>> This patch introduces a new IOMMU driver parameter, amd_iommu_guest_ir,
>> which can be used to specify different interrupt remapping mode for
>> passthrough devices to VM guest:
>>     * legacy: Legacy interrupt remapping mode (w/ 32-bit IRTE)
>>     * ga    : Guest vAPIC interrupt remapping mode (w/ 128-bit IRTE)
>>
>> Note that the GA mode also supports legacy interrupt remapping
>> for non-passthrough devices with the 128-bit IRTE.
>
> Does this need to be under user control? The code can just check what
> the hardware supports and use the 128bit IRTEs if supported, no?
>
> 	Joerg

It does not need to be specified by the user.

Currently, if the MMIO Offset 0030h[GASup] bit (of the IOMMU Extended
Feature Register) is set, the driver defaults to GA mode and uses the
128-bit IRTE, by setting the MMIO Offset 0018h[GAEn] bit (of the IOMMU
Control Register).

However, if SVM AVIC is not enabled, or if the AVIC HW cannot support
the type of interrupt (e.g. multicast/broadcast), the driver falls back
to the legacy interrupt remapping mode w/ 128-bit IRTE.

This option is intended for the case where we want to force the IOMMU to
use legacy interrupt remapping (hence no need for the 128-bit IRTE).
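
The selection described above can be sketched as a small decision function.
This is only an illustration under stated assumptions, not the driver code:
choose_ir_config(), struct ir_config, enum irte_format and all parameter
names are hypothetical, and the inputs stand in for GASup, the
amd_iommu_guest_ir=legacy override, SVM AVIC being enabled, and the
interrupt type being supported by AVIC.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names; they only mirror the cases described in the text. */
enum irte_format { IRTE_32BIT, IRTE_128BIT };

struct ir_config {
	enum irte_format fmt;	/* IRTE size in use */
	bool guest_mode;	/* true: interrupt is posted to the guest vAPIC */
};

static struct ir_config choose_ir_config(bool ga_sup, bool force_legacy,
					 bool avic_enabled,
					 bool irq_type_supported)
{
	struct ir_config c = { IRTE_32BIT, false };

	/* No hardware support, or the user forced legacy: 32-bit IRTE. */
	if (!ga_sup || force_legacy)
		return c;

	/* GASup is set: GAEn would be set and the 128-bit IRTE is used. */
	c.fmt = IRTE_128BIT;

	/*
	 * Guest mode only when AVIC is enabled and the interrupt type is
	 * supported; otherwise legacy remapping with the 128-bit IRTE.
	 */
	c.guest_mode = avic_enabled && irq_type_supported;
	return c;
}

/* Usage: one check per case from the description above. */
static void ir_config_examples(void)
{
	struct ir_config c;

	c = choose_ir_config(true, false, true, true);
	assert(c.fmt == IRTE_128BIT && c.guest_mode);

	c = choose_ir_config(true, false, false, true);
	assert(c.fmt == IRTE_128BIT && !c.guest_mode);

	c = choose_ir_config(true, true, true, true);
	assert(c.fmt == IRTE_32BIT && !c.guest_mode);
}
```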

I will improve the documentation in the next patch series.

Thanks,
Suravee


* Re: [PART2 RFC v1 5/9] iommu/amd: Introduce amd_iommu_update_ga()
  2016-04-13 17:06   ` Radim Krčmář
@ 2016-06-09 23:59     ` Suravee Suthikulpanit
  0 siblings, 0 replies; 14+ messages in thread
From: Suravee Suthikulpanit @ 2016-06-09 23:59 UTC (permalink / raw)
  To: Radim Krčmář
  Cc: pbonzini, joro, bp, gleb, alex.williamson, kvm, linux-kernel,
	wei, sherry.hurwitz

Hi Radim,

On 4/13/16 12:06, Radim Krčmář wrote:
> 2016-04-08 07:49-0500, Suravee Suthikulpanit:
>> From: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
>>
>> This patch introduces a new IOMMU interface, amd_iommu_update_ga(),
>> which allows KVM (SVM) to update existing posted interrupt IOMMU IRTE when
>> load/unload vcpu.
>>
>> Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
>> ---
>> diff --git a/drivers/iommu/amd_iommu.c b/drivers/iommu/amd_iommu.c
>> @@ -4330,4 +4330,74 @@ int amd_iommu_create_irq_domain(struct amd_iommu *iommu)
>> +int amd_iommu_update_ga(u32 vcpu_id, u32 cpu, u32 ga_tag,
>
> It'd be nicer to generate the tag on SVM side and pass it whole -- IOMMU
> doesn't have to care how hypervisors use the tag.

Actually, we are currently generating the tag on the SVM side (please
see avic_get_next_tag() in patch 8). amd_iommu_update_ga() is meant
to be called from the SVM side, and we are passing in the tag here.

>
>> +			u64 base, bool is_run)
>> +{
>> +	unsigned long flags;
>> +	struct amd_iommu *iommu;
>> +
>> +	if (amd_iommu_guest_ir < AMD_IOMMU_GUEST_IR_GA)
>> +		return 0;
>> +
>> +	for_each_iommu(iommu) {
>> +		struct amd_ir_data *ir_data;
>> +
>> +		spin_lock_irqsave(&iommu->ga_hash_lock, flags);
>> +
>> +		hash_for_each_possible(iommu->ga_hash, ir_data, hnode,
>> +				       AMD_IOMMU_GATAG(ga_tag, vcpu_id)) {
>
> All tags can map into the same bucket.  Code below doesn't check that
> the ir_data belongs to the tag and will modify unrelated IRTEs.
>
> Have you considered a per-VCPU list of IRTEs on the SVM side?

Actually, the hash key is basically vm-id and vcpu-id. So, this should 
get us all the ir_data for a specific vcpu in a particular VM.

>> +			set_irte_ga(iommu, ir_data->irq_2_irte.devid,
>> +				    base, cpu, is_run);
>
> set_irte_ga() is pretty expensive -- do we need to invalidate the irt
> when changing cpu and is_run?

You are right -- I think we can actually keep a pointer to the IRTE in
the amd_ir_data, and use that to get to the IRTE directly when we need
to update the GA-mode-related fields. That way, we don't need to go
through the whole interrupt remapping table.
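
A rough userspace sketch of that idea follows. Everything in it is an
assumption made for illustration: struct irte_ga uses plain fields rather
than the hardware bit layout, struct ir_data is only loosely modelled on
amd_ir_data, and ir_data_bind() / ir_update_ga() are hypothetical helpers.
It also ignores how the update must be ordered against concurrent reads by
the IOMMU.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified, hypothetical IRTE: plain fields, not the hardware layout. */
struct irte_ga {
	bool guest_mode;
	bool is_run;
	uint32_t destination;
};

/* Hypothetical per-interrupt bookkeeping with a cached IRTE pointer. */
struct ir_data {
	struct irte_ga *entry;	/* set once, when the IRTE is programmed */
};

static struct irte_ga table[8];	/* stand-in for the remapping table */

/* Cache the entry pointer at setup time so later updates skip the lookup. */
static void ir_data_bind(struct ir_data *d, size_t index)
{
	assert(index < sizeof(table) / sizeof(table[0]));
	d->entry = &table[index];
}

/*
 * Update only the scheduling-related fields through the cached pointer.
 * guest_mode is read from the entry itself. Returns 0 on success and -1
 * if there is no cached entry or the entry is not in guest mode.
 */
static int ir_update_ga(struct ir_data *d, uint32_t cpu, bool is_run)
{
	struct irte_ga *e = d->entry;

	if (!e || !e->guest_mode)
		return -1;

	e->destination = cpu;
	e->is_run = is_run;
	return 0;
}
```

The design point is that the per-vcpu update touches two fields of one
already-known entry instead of re-deriving the entry from the device id on
every vcpu load/unload.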

> 2.2.5.2 Interrupt Virtualization Tables with Guest Virtual APIC Enabled,
> point 9, bullet 5 says that IRTE is read from memory before considering
> IsRun, GATag and Destination, which makes me think that avoiding races
> can be faster in the common case.

Right.

I'm working on sending out V2 soon.

Thanks,
Suravee


Thread overview: 14+ messages
2016-04-08 12:49 [PART2 RFC v1 0/9] iommu/AMD: Introduce IOMMU AVIC support Suravee Suthikulpanit
2016-04-08 12:49 ` [PART2 RFC v1 1/9] iommu/amd: Detect and enable guest vAPIC support Suravee Suthikulpanit
2016-05-09 11:49   ` Joerg Roedel
2016-06-02 20:38     ` Suravee Suthikulanit
2016-04-08 12:49 ` [PART2 RFC v1 2/9] iommu/amd: Add data structure for " Suravee Suthikulpanit
2016-04-08 12:49 ` [PART2 RFC v1 3/9] iommu/amd: Detect and initialize guest vAPIC log Suravee Suthikulpanit
2016-04-08 12:49 ` [PART2 RFC v1 4/9] iommu/amd: Adding GALOG interrupt handler Suravee Suthikulpanit
2016-04-08 12:49 ` [PART2 RFC v1 5/9] iommu/amd: Introduce amd_iommu_update_ga() Suravee Suthikulpanit
2016-04-13 17:06   ` Radim Krčmář
2016-06-09 23:59     ` Suravee Suthikulpanit
2016-04-08 12:49 ` [PART2 RFC v1 6/9] iommu/amd: Implements irq_set_vcpu_affinity hook to setup GA mode for pass-through devices Suravee Suthikulpanit
2016-04-08 12:49 ` [PART2 RFC v1 7/9] svm: Introduce AMD IOMMU avic_ga_log_notifier Suravee Suthikulpanit
2016-04-08 12:49 ` [PART2 RFC v1 8/9] svm: Implements update_pi_irte hook to setup posted interrupt Suravee Suthikulpanit
2016-04-08 12:49 ` [PART2 RFC v1 9/9] svm: Update AMD IOMMU IRTE with vcpu scheduling information when enable AVIC Suravee Suthikulpanit
