linux-kernel.vger.kernel.org archive mirror
* [PATCH 0/3] Updates for SGI UV1/UV2 systems...
@ 2012-02-02 23:56 Mike Travis
  2012-02-02 23:56 ` [PATCH 1/3] x86 PCI: Fix identity mapping for sandy bridge Mike Travis
                   ` (3 more replies)
  0 siblings, 4 replies; 8+ messages in thread
From: Mike Travis @ 2012-02-02 23:56 UTC (permalink / raw)
  To: Ingo Molnar, Andrew Morton, Jack Steiner; +Cc: x86, linux-kernel


- Fix the arbitrary limit on the number of IOMMUs (raising it from 64 to
  MAX_IO_APICS [128]).

- Fix IOMMU identity mapping to not map a PCI device that does not do DMA.

- Update uv_handle_nmi to the latest NMI processing scheme.

-- 

* [PATCH 1/3] x86 PCI: Fix identity mapping for sandy bridge
  2012-02-02 23:56 [PATCH 0/3] Updates for SGI UV1/UV2 systems Mike Travis
@ 2012-02-02 23:56 ` Mike Travis
  2012-02-22 20:20   ` Andrew Morton
  2012-02-02 23:56 ` [PATCH 2/3] x86, pci: Increase the number of iommus supported to be MAX_IO_APICS Mike Travis
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 8+ messages in thread
From: Mike Travis @ 2012-02-02 23:56 UTC (permalink / raw)
  To: Ingo Molnar, Andrew Morton, Jack Steiner; +Cc: x86, linux-kernel, Mike Habeck

[-- Attachment #1: fix-iommu-identity-mapping-for-sb --]
[-- Type: text/plain, Size: 1874 bytes --]

With Sandy Bridge, Intel changed these socket PCI devices to have a class
type of "System Peripheral" and "Performance counters" rather than
"Host bridge".  So instead of special-casing which devices will not be
doing DMA, use the fact that a device that is not associated with an
IOMMU will not need an identity map.
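
For reference, the class codes involved (values as in
include/linux/pci_ids.h; the "Performance counters" class 0x1101 has no
pci_ids.h symbol as far as I know, so it is shown raw):

	#define PCI_CLASS_BRIDGE_HOST	0x0600	/* what the old test matched */
	#define PCI_CLASS_SYSTEM_OTHER	0x0880	/* SNB "System Peripheral" */
	/* SNB "Performance counters" devices report class 0x1101 */

so ((pdev->class >> 8) == PCI_CLASS_BRIDGE_HOST) is false for these
devices, and they previously fell through to the identity-mapping path
even though they never do DMA.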

Signed-off-by: Mike Travis <travis@sgi.com>
Signed-off-by: Mike Habeck <habeck@sgi.com>
---
 drivers/iommu/intel-iommu.c |   20 +++++++++-----------
 1 file changed, 9 insertions(+), 11 deletions(-)

--- linux.orig/drivers/iommu/intel-iommu.c
+++ linux/drivers/iommu/intel-iommu.c
@@ -48,8 +48,6 @@
 #define ROOT_SIZE		VTD_PAGE_SIZE
 #define CONTEXT_SIZE		VTD_PAGE_SIZE
 
-#define IS_BRIDGE_HOST_DEVICE(pdev) \
-			    ((pdev->class >> 8) == PCI_CLASS_BRIDGE_HOST)
 #define IS_GFX_DEVICE(pdev) ((pdev->class >> 16) == PCI_BASE_CLASS_DISPLAY)
 #define IS_ISA_DEVICE(pdev) ((pdev->class >> 8) == PCI_CLASS_BRIDGE_ISA)
 #define IS_AZALIA(pdev) ((pdev)->vendor == 0x8086 && (pdev)->device == 0x3a3e)
@@ -2369,18 +2367,18 @@ static int __init iommu_prepare_static_i
 		return -EFAULT;
 
 	for_each_pci_dev(pdev) {
-		/* Skip Host/PCI Bridge devices */
-		if (IS_BRIDGE_HOST_DEVICE(pdev))
-			continue;
 		if (iommu_should_identity_map(pdev, 1)) {
-			printk(KERN_INFO "IOMMU: %s identity mapping for device %s\n",
-			       hw ? "hardware" : "software", pci_name(pdev));
-
 			ret = domain_add_dev_info(si_domain, pdev,
-						     hw ? CONTEXT_TT_PASS_THROUGH :
-						     CONTEXT_TT_MULTI_LEVEL);
-			if (ret)
+					     hw ? CONTEXT_TT_PASS_THROUGH :
+						  CONTEXT_TT_MULTI_LEVEL);
+			if (ret) {
+				/* device not associated with an iommu */
+				if (ret == -ENODEV)
+					continue;
 				return ret;
+			}
+			pr_info("IOMMU: %s identity mapping for device %s\n",
+				hw ? "hardware" : "software", pci_name(pdev));
 		}
 	}
 

-- 

* [PATCH 2/3] x86, pci: Increase the number of iommus supported to be MAX_IO_APICS
  2012-02-02 23:56 [PATCH 0/3] Updates for SGI UV1/UV2 systems Mike Travis
  2012-02-02 23:56 ` [PATCH 1/3] x86 PCI: Fix identity mapping for sandy bridge Mike Travis
@ 2012-02-02 23:56 ` Mike Travis
  2012-02-02 23:56 ` [PATCH 3/3] UV Update NMI handlers Mike Travis
  2012-02-03 21:00 ` [PATCH 1/3] x86 PCI: Fix identity mapping for sandy bridge Mike Travis
  3 siblings, 0 replies; 8+ messages in thread
From: Mike Travis @ 2012-02-02 23:56 UTC (permalink / raw)
  To: Ingo Molnar, Andrew Morton, Jack Steiner; +Cc: x86, linux-kernel

[-- Attachment #1: uv2-up-iommu-limit --]
[-- Type: text/plain, Size: 4460 bytes --]

The number of IOMMUs supported should be the same as the number of IO APICs.
This limit comes into play when the IOMMUs are identity mapped, so the
number of possible IOMMUs in the "static identity" (si) domain should be
that same number.
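
The core of the change, in sketch form (DECLARE_BITMAP is the standard
helper from include/linux/types.h):

	/* before: one long, so at most BITS_PER_LONG (64) iommus */
	unsigned long iommu_bmp;
	set_bit(iommu->seq_id, &iommu_bmp);	/* note the '&' */

	/* after: an array of longs sized for MAX_IO_APICS (128) bits */
	DECLARE_BITMAP(iommu_bmp, MAX_IO_APICS);
	set_bit(iommu->seq_id, iommu_bmp);	/* array decays to pointer */

DECLARE_BITMAP(name, bits) expands to
"unsigned long name[BITS_TO_LONGS(bits)]", which is why every
&domain->iommu_bmp in the diff below loses its '&'.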

Signed-off-by: Mike Travis <travis@sgi.com>
Signed-off-by: Jack Steiner <steiner@sgi.com>
---
 drivers/iommu/intel-iommu.c |   25 +++++++++++++------------
 1 file changed, 13 insertions(+), 12 deletions(-)

--- linux.orig/drivers/iommu/intel-iommu.c
+++ linux/drivers/iommu/intel-iommu.c
@@ -357,7 +357,8 @@ static int hw_pass_through = 1;
 struct dmar_domain {
 	int	id;			/* domain id */
 	int	nid;			/* node id */
-	unsigned long iommu_bmp;	/* bitmap of iommus this domain uses*/
+	DECLARE_BITMAP(iommu_bmp, MAX_IO_APICS);
+					/* bitmap of iommus this domain uses*/
 
 	struct list_head devices; 	/* all devices' list */
 	struct iova_domain iovad;	/* iova's that belong to this domain */
@@ -569,7 +570,7 @@ static struct intel_iommu *domain_get_io
 	BUG_ON(domain->flags & DOMAIN_FLAG_VIRTUAL_MACHINE);
 	BUG_ON(domain->flags & DOMAIN_FLAG_STATIC_IDENTITY);
 
-	iommu_id = find_first_bit(&domain->iommu_bmp, g_num_of_iommus);
+	iommu_id = find_first_bit(domain->iommu_bmp, g_num_of_iommus);
 	if (iommu_id < 0 || iommu_id >= g_num_of_iommus)
 		return NULL;
 
@@ -582,7 +583,7 @@ static void domain_update_iommu_coherenc
 
 	domain->iommu_coherency = 1;
 
-	for_each_set_bit(i, &domain->iommu_bmp, g_num_of_iommus) {
+	for_each_set_bit(i, domain->iommu_bmp, g_num_of_iommus) {
 		if (!ecap_coherent(g_iommus[i]->ecap)) {
 			domain->iommu_coherency = 0;
 			break;
@@ -596,7 +597,7 @@ static void domain_update_iommu_snooping
 
 	domain->iommu_snooping = 1;
 
-	for_each_set_bit(i, &domain->iommu_bmp, g_num_of_iommus) {
+	for_each_set_bit(i, domain->iommu_bmp, g_num_of_iommus) {
 		if (!ecap_sc_support(g_iommus[i]->ecap)) {
 			domain->iommu_snooping = 0;
 			break;
@@ -1332,7 +1333,7 @@ static struct dmar_domain *alloc_domain(
 		return NULL;
 
 	domain->nid = -1;
-	memset(&domain->iommu_bmp, 0, sizeof(unsigned long));
+	memset(domain->iommu_bmp, 0, sizeof(domain->iommu_bmp));
 	domain->flags = 0;
 
 	return domain;
@@ -1358,7 +1359,7 @@ static int iommu_attach_domain(struct dm
 
 	domain->id = num;
 	set_bit(num, iommu->domain_ids);
-	set_bit(iommu->seq_id, &domain->iommu_bmp);
+	set_bit(iommu->seq_id, domain->iommu_bmp);
 	iommu->domains[num] = domain;
 	spin_unlock_irqrestore(&iommu->lock, flags);
 
@@ -1383,7 +1384,7 @@ static void iommu_detach_domain(struct d
 
 	if (found) {
 		clear_bit(num, iommu->domain_ids);
-		clear_bit(iommu->seq_id, &domain->iommu_bmp);
+		clear_bit(iommu->seq_id, domain->iommu_bmp);
 		iommu->domains[num] = NULL;
 	}
 	spin_unlock_irqrestore(&iommu->lock, flags);
@@ -1525,7 +1526,7 @@ static void domain_exit(struct dmar_doma
 	dma_pte_free_pagetable(domain, 0, DOMAIN_MAX_PFN(domain->gaw));
 
 	for_each_active_iommu(iommu, drhd)
-		if (test_bit(iommu->seq_id, &domain->iommu_bmp))
+		if (test_bit(iommu->seq_id, domain->iommu_bmp))
 			iommu_detach_domain(domain, iommu);
 
 	free_domain_mem(domain);
@@ -1651,7 +1652,7 @@ static int domain_context_mapping_one(st
 	spin_unlock_irqrestore(&iommu->lock, flags);
 
 	spin_lock_irqsave(&domain->iommu_lock, flags);
-	if (!test_and_set_bit(iommu->seq_id, &domain->iommu_bmp)) {
+	if (!test_and_set_bit(iommu->seq_id, domain->iommu_bmp)) {
 		domain->iommu_count++;
 		if (domain->iommu_count == 1)
 			domain->nid = iommu->node;
@@ -3746,7 +3747,7 @@ static void domain_remove_one_dev_info(s
 	if (found == 0) {
 		unsigned long tmp_flags;
 		spin_lock_irqsave(&domain->iommu_lock, tmp_flags);
-		clear_bit(iommu->seq_id, &domain->iommu_bmp);
+		clear_bit(iommu->seq_id, domain->iommu_bmp);
 		domain->iommu_count--;
 		domain_update_iommu_cap(domain);
 		spin_unlock_irqrestore(&domain->iommu_lock, tmp_flags);
@@ -3788,7 +3789,7 @@ static void vm_domain_remove_all_dev_inf
 		 */
 		spin_lock_irqsave(&domain->iommu_lock, flags2);
 		if (test_and_clear_bit(iommu->seq_id,
-				       &domain->iommu_bmp)) {
+				       domain->iommu_bmp)) {
 			domain->iommu_count--;
 			domain_update_iommu_cap(domain);
 		}
@@ -3813,7 +3814,7 @@ static struct dmar_domain *iommu_alloc_v
 
 	domain->id = vm_domid++;
 	domain->nid = -1;
-	memset(&domain->iommu_bmp, 0, sizeof(unsigned long));
+	memset(domain->iommu_bmp, 0, sizeof(domain->iommu_bmp));
 	domain->flags = DOMAIN_FLAG_VIRTUAL_MACHINE;
 
 	return domain;

-- 

* [PATCH 3/3] UV Update NMI handlers
  2012-02-02 23:56 [PATCH 0/3] Updates for SGI UV1/UV2 systems Mike Travis
  2012-02-02 23:56 ` [PATCH 1/3] x86 PCI: Fix identity mapping for sandy bridge Mike Travis
  2012-02-02 23:56 ` [PATCH 2/3] x86, pci: Increase the number of iommus supported to be MAX_IO_APICS Mike Travis
@ 2012-02-02 23:56 ` Mike Travis
  2012-02-03  7:31   ` Ingo Molnar
  2012-02-10 21:43   ` Don Zickus
  2012-02-03 21:00 ` [PATCH 1/3] x86 PCI: Fix identity mapping for sandy bridge Mike Travis
  3 siblings, 2 replies; 8+ messages in thread
From: Mike Travis @ 2012-02-02 23:56 UTC (permalink / raw)
  To: Ingo Molnar, Andrew Morton, Jack Steiner; +Cc: x86, linux-kernel

[-- Attachment #1: uv_update_nmi_handler --]
[-- Type: text/plain, Size: 6536 bytes --]

These changes update the UV NMI handler to be compatible with the new
NMI processing.

In order to determine whether an incoming NMI was generated externally
by the BMC, the UV BIOS sets a flag in an MMR.  Reading this register
is expensive, though, especially when all CPUs on a blade share the
same MMR.  This patch reduces the readers to only the first CPU to
enter the NMI handler.  It also fixes a second problem: all CPUs on a
blade were clearing the flag, when only one CPU on each blade needs to
clear the MMR bit.

This greatly reduces the overhead of entering KDB or printing the
trace, and also removes the possibility of CPUs failing to enter KDB
and being left in a somewhat undefined state, as some were not entering
after KDB generated an IPI NMI.

Note that this patch does not have the capability of entering KDB
from the BMC NMI interrupt.  This will be in a separate patch.
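
In outline, the synchronization added below is (a distillation of the
diff that follows, not separate code):

	static int in_uv_nmi = -1;	/* -1 = MMR not yet sampled */
	static atomic_t nr_nmi_cpus;

	/* first CPU in samples the MMR once, under a lock */
	if (in_uv_nmi < 0) {
		spin_lock(&uv_nmi_reason_lock);
		if (in_uv_nmi < 0) {
			atomic_set(&nr_nmi_cpus, num_online_cpus());
			in_uv_nmi = check_nmi_mmr();
		}
		spin_unlock(&uv_nmi_reason_lock);
	}
	...
	/* last CPU out re-arms the sampling for the next NMI */
	if (atomic_sub_and_test(1, &nr_nmi_cpus))
		in_uv_nmi = -1;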

Signed-off-by: Mike Travis <travis@sgi.com>
---
 arch/x86/kernel/apic/x2apic_uv_x.c |  132 +++++++++++++++++++++++++++++--------
 1 file changed, 105 insertions(+), 27 deletions(-)

--- linux.orig/arch/x86/kernel/apic/x2apic_uv_x.c
+++ linux/arch/x86/kernel/apic/x2apic_uv_x.c
@@ -43,7 +43,17 @@
 #define UVH_NMI_MMR				UVH_SCRATCH5
 #define UVH_NMI_MMR_CLEAR			(UVH_NMI_MMR + 8)
 #define UV_NMI_PENDING_MASK			(1UL << 63)
-DEFINE_PER_CPU(unsigned long, cpu_last_nmi_count);
+static DEFINE_PER_CPU(unsigned long, cpu_nmi_count);
+
+static inline int check_nmi_mmr(void)
+{
+	return ((uv_read_local_mmr(UVH_NMI_MMR) & UV_NMI_PENDING_MASK) != 0);
+}
+
+static inline void clear_nmi_mmr(void)
+{
+	uv_write_local_mmr(UVH_NMI_MMR_CLEAR, UV_NMI_PENDING_MASK);
+}
 
 DEFINE_PER_CPU(int, x2apic_extra_bits);
 
@@ -57,6 +67,7 @@ EXPORT_SYMBOL_GPL(uv_min_hub_revision_id
 unsigned int uv_apicid_hibits;
 EXPORT_SYMBOL_GPL(uv_apicid_hibits);
 static DEFINE_SPINLOCK(uv_nmi_lock);
+static DEFINE_SPINLOCK(uv_nmi_reason_lock);
 
 static struct apic apic_x2apic_uv_x;
 
@@ -672,52 +683,119 @@ void __cpuinit uv_cpu_init(void)
 }
 
 /*
- * When NMI is received, print a stack trace.
+ * When an NMI from the BMC is received:
+ *	- call KDB if active (not yet implemented)
+ *	- print a stack trace if kdb is not active.
  */
-int uv_handle_nmi(unsigned int reason, struct pt_regs *regs)
+int uv_handle_nmi(unsigned int cmd, struct pt_regs *regs)
 {
-	unsigned long real_uv_nmi;
-	int bid;
+	static int in_uv_nmi = -1;
+	static atomic_t nr_nmi_cpus;
+	int bid, handled = 0;
+
+	if (cmd != NMI_LOCAL && cmd != NMI_UNKNOWN)
+		return NMI_DONE;
+
+	if (in_crash_kexec)
+		/* do nothing if entering the crash kernel */
+		return NMI_DONE;
+
+	/* note which NMI this one is */
+	__this_cpu_inc(cpu_nmi_count);
 
 	/*
 	 * Each blade has an MMR that indicates when an NMI has been sent
-	 * to cpus on the blade. If an NMI is detected, atomically
-	 * clear the MMR and update a per-blade NMI count used to
-	 * cause each cpu on the blade to notice a new NMI.
+	 * to cpus on the blade.  We optimize accesses to the MMR as the
+	 * read operation is expensive and only the first CPU to enter this
+	 * function needs to read the register and set a flag indicating
+	 * we are indeed servicing an external BMC NMI.  Once it's determined
+	 * whether or not this is a real NMI, the last responding CPU clears
+	 * the flag for the next NMI.
 	 */
-	bid = uv_numa_blade_id();
-	real_uv_nmi = (uv_read_local_mmr(UVH_NMI_MMR) & UV_NMI_PENDING_MASK);
+	if (in_uv_nmi < 0) {
+		spin_lock(&uv_nmi_reason_lock);
+		if (in_uv_nmi < 0) {
+			int nc = num_online_cpus();
+			atomic_set(&nr_nmi_cpus, nc);
+			in_uv_nmi = check_nmi_mmr();
+		}
+		spin_unlock(&uv_nmi_reason_lock);
+	}
+
+	if (likely(in_uv_nmi == 0)) {
+		if (atomic_sub_and_test(1, &nr_nmi_cpus))
+			in_uv_nmi = -1;
+		return NMI_DONE;
+	}
+
+	/* If we are here, we are processing a real BMC NMI */
+
+#ifdef CONFIG_KGDB_KDB_NOT_YET
+#include <linux/kdb.h>
+
+	/* Here we want to call KDB with reason == NMI */
+	if (kdb_on) {
+		static int controlling_cpu = -1;
+
+		spin_lock(&uv_nmi_lock);
+		if (controlling_cpu == -1) {
+			controlling_cpu = smp_processor_id();
+			spin_unlock(&uv_nmi_lock);
+			(void)kdb(LKDB_REASON_NMI, cmd, regs);
+			controlling_cpu = -1;
+		} else {
+			spin_unlock(&uv_nmi_lock);
+			(void)kdb(LKDB_REASON_ENTER_SLAVE, cmd, regs);
+			while (controlling_cpu != -1)
+				cpu_relax();
+		}
+		handled = 1;	/* handled by KDB */
+	}
+#endif
 
-	if (unlikely(real_uv_nmi)) {
+	/* Only one cpu per blade needs to clear the MMR BMC NMI flag */
+	bid = uv_numa_blade_id();
+	if (uv_blade_info[bid].nmi_count < __get_cpu_var(cpu_nmi_count)) {
 		spin_lock(&uv_blade_info[bid].nmi_lock);
-		real_uv_nmi = (uv_read_local_mmr(UVH_NMI_MMR) & UV_NMI_PENDING_MASK);
-		if (real_uv_nmi) {
-			uv_blade_info[bid].nmi_count++;
-			uv_write_local_mmr(UVH_NMI_MMR_CLEAR, UV_NMI_PENDING_MASK);
+		if (uv_blade_info[bid].nmi_count <
+						__get_cpu_var(cpu_nmi_count)) {
+			uv_blade_info[bid].nmi_count =
+						__get_cpu_var(cpu_nmi_count);
+			clear_nmi_mmr();
 		}
 		spin_unlock(&uv_blade_info[bid].nmi_lock);
 	}
 
-	if (likely(__get_cpu_var(cpu_last_nmi_count) == uv_blade_info[bid].nmi_count))
-		return NMI_DONE;
+	/* If not handled by KDB, then print a process trace for each cpu */
+	if (!handled) {
+		int saved_console_loglevel = console_loglevel;
 
-	__get_cpu_var(cpu_last_nmi_count) = uv_blade_info[bid].nmi_count;
+		/*
+		 * Use a lock so only one cpu prints at a time.
+		 * This prevents intermixed output.  We can reuse the
+		 * uv_nmi_lock since if KDB was called, then all the
+		 * CPUs have exited KDB, and if it was not called,
+		 * then the lock was not used.
+		 */
+		spin_lock(&uv_nmi_lock);
+		pr_err("== UV NMI process trace NMI %lu: ==\n",
+						__get_cpu_var(cpu_nmi_count));
+		console_loglevel = 15;
+		show_regs(regs);
+		console_loglevel = saved_console_loglevel;
+		spin_unlock(&uv_nmi_lock);
+	}
 
-	/*
-	 * Use a lock so only one cpu prints at a time.
-	 * This prevents intermixed output.
-	 */
-	spin_lock(&uv_nmi_lock);
-	pr_info("UV NMI stack dump cpu %u:\n", smp_processor_id());
-	dump_stack();
-	spin_unlock(&uv_nmi_lock);
+	/* last cpu resets the "in nmi" flag */
+	if (atomic_sub_and_test(1, &nr_nmi_cpus))
+		in_uv_nmi = -1;
 
 	return NMI_HANDLED;
 }
 
 void uv_register_nmi_notifier(void)
 {
-	if (register_nmi_handler(NMI_UNKNOWN, uv_handle_nmi, 0, "uv"))
+	if (register_nmi_handler(NMI_LOCAL, uv_handle_nmi, 0, "uv"))
 		printk(KERN_WARNING "UV NMI handler failed to register\n");
 }
 

-- 

* Re: [PATCH 3/3] UV Update NMI handlers
  2012-02-02 23:56 ` [PATCH 3/3] UV Update NMI handlers Mike Travis
@ 2012-02-03  7:31   ` Ingo Molnar
  2012-02-10 21:43   ` Don Zickus
  1 sibling, 0 replies; 8+ messages in thread
From: Ingo Molnar @ 2012-02-03  7:31 UTC (permalink / raw)
  To: Mike Travis; +Cc: Andrew Morton, Jack Steiner, x86, linux-kernel


* Mike Travis <travis@sgi.com> wrote:

> ---
>  arch/x86/kernel/apic/x2apic_uv_x.c |  132 +++++++++++++++++++++++++++++--------
>  1 file changed, 105 insertions(+), 27 deletions(-)

This patch is chock-full of basic code cleanliness violations - 
it's a highly educational collection of various uglies, so I'm 
going to ignore it.

Thanks,

	Ingo

* [PATCH 1/3] x86 PCI: Fix identity mapping for sandy bridge
  2012-02-02 23:56 [PATCH 0/3] Updates for SGI UV1/UV2 systems Mike Travis
                   ` (2 preceding siblings ...)
  2012-02-02 23:56 ` [PATCH 3/3] UV Update NMI handlers Mike Travis
@ 2012-02-03 21:00 ` Mike Travis
  3 siblings, 0 replies; 8+ messages in thread
From: Mike Travis @ 2012-02-03 21:00 UTC (permalink / raw)
  To: Ingo Molnar, Andrew Morton, Jack Steiner
  Cc: x86, linux-kernel, Mike Habeck, David Woodhouse, Chris Wright,
	Daniel Rahn

[adding some more folks to the CC list for their comments.]

With Sandy Bridge, Intel changed these socket PCI devices to have a class
type of "System Peripheral" and "Performance counters" rather than
"Host bridge".  So instead of special-casing which devices will not be
doing DMA, use the fact that a device that is not associated with an
IOMMU will not need an identity map.

Signed-off-by: Mike Travis <travis@sgi.com>
Signed-off-by: Mike Habeck <habeck@sgi.com>
---
 drivers/iommu/intel-iommu.c |   20 +++++++++-----------
 1 file changed, 9 insertions(+), 11 deletions(-)

--- linux.orig/drivers/iommu/intel-iommu.c
+++ linux/drivers/iommu/intel-iommu.c
@@ -48,8 +48,6 @@
 #define ROOT_SIZE		VTD_PAGE_SIZE
 #define CONTEXT_SIZE		VTD_PAGE_SIZE
 
-#define IS_BRIDGE_HOST_DEVICE(pdev) \
-			    ((pdev->class >> 8) == PCI_CLASS_BRIDGE_HOST)
 #define IS_GFX_DEVICE(pdev) ((pdev->class >> 16) == PCI_BASE_CLASS_DISPLAY)
 #define IS_ISA_DEVICE(pdev) ((pdev->class >> 8) == PCI_CLASS_BRIDGE_ISA)
 #define IS_AZALIA(pdev) ((pdev)->vendor == 0x8086 && (pdev)->device == 0x3a3e)
@@ -2369,18 +2367,18 @@ static int __init iommu_prepare_static_i
 		return -EFAULT;
 
 	for_each_pci_dev(pdev) {
-		/* Skip Host/PCI Bridge devices */
-		if (IS_BRIDGE_HOST_DEVICE(pdev))
-			continue;
 		if (iommu_should_identity_map(pdev, 1)) {
-			printk(KERN_INFO "IOMMU: %s identity mapping for device %s\n",
-			       hw ? "hardware" : "software", pci_name(pdev));
-
 			ret = domain_add_dev_info(si_domain, pdev,
-						     hw ? CONTEXT_TT_PASS_THROUGH :
-						     CONTEXT_TT_MULTI_LEVEL);
-			if (ret)
+					     hw ? CONTEXT_TT_PASS_THROUGH :
+						  CONTEXT_TT_MULTI_LEVEL);
+			if (ret) {
+				/* device not associated with an iommu */
+				if (ret == -ENODEV)
+					continue;
 				return ret;
+			}
+			pr_info("IOMMU: %s identity mapping for device %s\n",
+				hw ? "hardware" : "software", pci_name(pdev));
 		}
 	}
 
-- 

* Re: [PATCH 3/3] UV Update NMI handlers
  2012-02-02 23:56 ` [PATCH 3/3] UV Update NMI handlers Mike Travis
  2012-02-03  7:31   ` Ingo Molnar
@ 2012-02-10 21:43   ` Don Zickus
  1 sibling, 0 replies; 8+ messages in thread
From: Don Zickus @ 2012-02-10 21:43 UTC (permalink / raw)
  To: Mike Travis; +Cc: Ingo Molnar, Andrew Morton, Jack Steiner, x86, linux-kernel

On Thu, Feb 02, 2012 at 05:56:53PM -0600, Mike Travis wrote:
> These changes update the UV NMI handler to be compatible with the new
> NMI processing.

Sorry for just noticing this now...

> -int uv_handle_nmi(unsigned int reason, struct pt_regs *regs)
> +int uv_handle_nmi(unsigned int cmd, struct pt_regs *regs)
>  {
> -	unsigned long real_uv_nmi;
> -	int bid;
> +	static int in_uv_nmi = -1;
> +	static atomic_t nr_nmi_cpus;
> +	int bid, handled = 0;
> +
> +	if (cmd != NMI_LOCAL && cmd != NMI_UNKNOWN)
> +		return NMI_DONE;

This shouldn't be necessary: you register on a list, either NMI_LOCAL or
NMI_UNKNOWN.  The reason for passing the command is in case you register
on both and need to separate the code paths.

If your handler is entered and the above if-statement is true, then the
code did something wrong with registering and should be fixed.

For the most part, that if-statement should never be needed now.
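
If you did register on both lists, the split would look something like
this (sketch only; the two helpers are made-up names):

	static int uv_handle_nmi(unsigned int cmd, struct pt_regs *regs)
	{
		if (cmd == NMI_LOCAL)
			return uv_handle_bmc_nmi(regs);	/* hypothetical */

		/* cmd == NMI_UNKNOWN */
		return uv_handle_kdb_nmi(regs);		/* hypothetical */
	}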

> +
> +	if (in_crash_kexec)
> +		/* do nothing if entering the crash kernel */
> +		return NMI_DONE;

This should also not be needed.  It was only necessary because the
priorities were screwed up and the kdump NMI handler was being called
after this handler.  Now the kdump handler is registered at the highest
priority and this one is never called during a crash, making this check
unnecessary.

And if you do see this happen in the future, it is a bug; please let
me know so I can fix it.
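
For reference, the crash path now claims the head of the list (see
nmi_shootdown_cpus()), roughly:

	register_nmi_handler(NMI_LOCAL, crash_nmi_callback,
			     NMI_FLAG_FIRST, "crash");

so a handler registered with flags 0 is never reached once a crash dump
is underway.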

> +
> +	/* note which NMI this one is */
> +	__this_cpu_inc(cpu_nmi_count);
>  
>  	/*
>  	 * Each blade has an MMR that indicates when an NMI has been sent
> -	 * to cpus on the blade. If an NMI is detected, atomically
> -	 * clear the MMR and update a per-blade NMI count used to
> -	 * cause each cpu on the blade to notice a new NMI.
> +	 * to cpus on the blade.  We optimize accesses to the MMR as the
> +	 * read operation is expensive and only the first CPU to enter this
> +	 * function needs to read the register and set a flag indicating
> +	 * we are indeed servicing an external BMC NMI.  Once it's determined
> +	 * whether or not this is a real NMI, the last responding CPU clears
> +	 * the flag for the next NMI.
>  	 */
> -	bid = uv_numa_blade_id();
> -	real_uv_nmi = (uv_read_local_mmr(UVH_NMI_MMR) & UV_NMI_PENDING_MASK);
> +	if (in_uv_nmi < 0) {
> +		spin_lock(&uv_nmi_reason_lock);
> +		if (in_uv_nmi < 0) {
> +			int nc = num_online_cpus();
> +			atomic_set(&nr_nmi_cpus, nc);
> +			in_uv_nmi = check_nmi_mmr();
> +		}
> +		spin_unlock(&uv_nmi_reason_lock);
> +	}
> +
> +	if (likely(in_uv_nmi == 0)) {
> +		if (atomic_sub_and_test(1, &nr_nmi_cpus))
> +			in_uv_nmi = -1;
> +		return NMI_DONE;
> +	}

I'm a little confused on this part.  Originally you had one NMI sent to
the system and it was an issue.  You guys changed the code so on UV
systems an external NMI went to _all_ cpus.  Now you are kinda converting
it back to one NMI.  

How about changing the x86_ops to only send one NMI to a cpu and then use
arch_trigger_all_cpu_backtrace() to generate a backtrace (through an NMI
IPI) to all cpus?  This would eliminate all the spinlock magic you are
doing here and still generate the backtrace you wanted (if that is still
the main reason for this NMI, IIRC).
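
Something along these lines (untested sketch, assuming the BMC NMI is
routed to a single cpu as suggested; check_nmi_mmr()/clear_nmi_mmr()
are the helpers from your patch):

	static int uv_handle_nmi(unsigned int cmd, struct pt_regs *regs)
	{
		if (!check_nmi_mmr())
			return NMI_DONE;	/* not a BMC NMI */

		clear_nmi_mmr();
		/* NMI-IPIs every cpu; each prints its own backtrace */
		arch_trigger_all_cpu_backtrace();
		return NMI_HANDLED;
	}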

> +
> +	/* If we are here, we are processing a real BMC NMI */
> +
> +#ifdef CONFIG_KGDB_KDB_NOT_YET
> +#include <linux/kdb.h>
> +
> +	/* Here we want to call KDB with reason == NMI */
> +	if (kdb_on) {
> +		static int controlling_cpu = -1;
> +
> +		spin_lock(&uv_nmi_lock);
> +		if (controlling_cpu == -1) {
> +			controlling_cpu = smp_processor_id();
> +			spin_unlock(&uv_nmi_lock);
> +			(void)kdb(LKDB_REASON_NMI, cmd, regs);
> +			controlling_cpu = -1;
> +		} else {
> +			spin_unlock(&uv_nmi_lock);
> +			(void)kdb(LKDB_REASON_ENTER_SLAVE, cmd, regs);
> +			while (controlling_cpu != -1)
> +				cpu_relax();
> +		}
> +		handled = 1;	/* handled by KDB */
> +	}
> +#endif

Would it just be easier to register another NMI handler for this case?
I don't think you need to access the BMC for this NMI; I believe it is
an artificial one, so you can just register on NMI_UNKNOWN.  I think
arch/x86/kernel/kgdb.c does this currently.
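
Something like this, with the kdb entry split out into its own handler
(the second handler name is made up):

	void uv_register_nmi_notifier(void)
	{
		if (register_nmi_handler(NMI_LOCAL, uv_handle_nmi, 0, "uv"))
			pr_warn("UV NMI handler failed to register\n");

		/* hypothetical separate kdb entry point */
		if (register_nmi_handler(NMI_UNKNOWN, uv_handle_kdb, 0,
					 "uv_kdb"))
			pr_warn("UV kdb NMI handler failed to register\n");
	}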

>  
> -	if (unlikely(real_uv_nmi)) {
> +	/* Only one cpu per blade needs to clear the MMR BMC NMI flag */
> +	bid = uv_numa_blade_id();
> +	if (uv_blade_info[bid].nmi_count < __get_cpu_var(cpu_nmi_count)) {
>  		spin_lock(&uv_blade_info[bid].nmi_lock);
> -		real_uv_nmi = (uv_read_local_mmr(UVH_NMI_MMR) & UV_NMI_PENDING_MASK);
> -		if (real_uv_nmi) {
> -			uv_blade_info[bid].nmi_count++;
> -			uv_write_local_mmr(UVH_NMI_MMR_CLEAR, UV_NMI_PENDING_MASK);
> +		if (uv_blade_info[bid].nmi_count <
> +						__get_cpu_var(cpu_nmi_count)) {
> +			uv_blade_info[bid].nmi_count =
> +						__get_cpu_var(cpu_nmi_count);
> +			clear_nmi_mmr();
>  		}
>  		spin_unlock(&uv_blade_info[bid].nmi_lock);
>  	}

Using one NMI I think eliminates all this stuff too.

>  
> -	if (likely(__get_cpu_var(cpu_last_nmi_count) == uv_blade_info[bid].nmi_count))
> -		return NMI_DONE;
> +	/* If not handled by KDB, then print a process trace for each cpu */
> +	if (!handled) {
> +		int saved_console_loglevel = console_loglevel;
>  
> -	__get_cpu_var(cpu_last_nmi_count) = uv_blade_info[bid].nmi_count;
> +		/*
> +		 * Use a lock so only one cpu prints at a time.
> +		 * This prevents intermixed output.  We can reuse the
> +		 * uv_nmi_lock since if KDB was called, then all the
> +		 * CPUs have exited KDB, and if it was not called,
> +		 * then the lock was not used.
> +		 */
> +		spin_lock(&uv_nmi_lock);
> +		pr_err("== UV NMI process trace NMI %lu: ==\n",
> +						__get_cpu_var(cpu_nmi_count));
> +		console_loglevel = 15;
> +		show_regs(regs);
> +		console_loglevel = saved_console_loglevel;
> +		spin_unlock(&uv_nmi_lock);
> +	}

This sort of mimics what arch_trigger_all_cpu_backtrace_handler is doing.

>  
> -	/*
> -	 * Use a lock so only one cpu prints at a time.
> -	 * This prevents intermixed output.
> -	 */
> -	spin_lock(&uv_nmi_lock);
> -	pr_info("UV NMI stack dump cpu %u:\n", smp_processor_id());
> -	dump_stack();
> -	spin_unlock(&uv_nmi_lock);
> +	/* last cpu resets the "in nmi" flag */
> +	if (atomic_sub_and_test(1, &nr_nmi_cpus))
> +		in_uv_nmi = -1;
>  
>  	return NMI_HANDLED;
>  }
>  
>  void uv_register_nmi_notifier(void)
>  {
> -	if (register_nmi_handler(NMI_UNKNOWN, uv_handle_nmi, 0, "uv"))
> +	if (register_nmi_handler(NMI_LOCAL, uv_handle_nmi, 0, "uv"))
>  		printk(KERN_WARNING "UV NMI handler failed to register\n");

Right.  And then perhaps a second one for NMI_UNKNOWN for kgdb?

Cheers,
Don

* Re: [PATCH 1/3] x86 PCI: Fix identity mapping for sandy bridge
  2012-02-02 23:56 ` [PATCH 1/3] x86 PCI: Fix identity mapping for sandy bridge Mike Travis
@ 2012-02-22 20:20   ` Andrew Morton
  0 siblings, 0 replies; 8+ messages in thread
From: Andrew Morton @ 2012-02-22 20:20 UTC (permalink / raw)
  To: Mike Travis; +Cc: Ingo Molnar, Jack Steiner, x86, linux-kernel, Mike Habeck

On Thu, 02 Feb 2012 17:56:51 -0600
Mike Travis <travis@sgi.com> wrote:

> --- linux.orig/drivers/iommu/intel-iommu.c
> +++ linux/drivers/iommu/intel-iommu.c
> @@ -48,8 +48,6 @@
>  #define ROOT_SIZE		VTD_PAGE_SIZE
>  #define CONTEXT_SIZE		VTD_PAGE_SIZE
>  
> -#define IS_BRIDGE_HOST_DEVICE(pdev) \
> -			    ((pdev->class >> 8) == PCI_CLASS_BRIDGE_HOST)
>  #define IS_GFX_DEVICE(pdev) ((pdev->class >> 16) == PCI_BASE_CLASS_DISPLAY)
>  #define IS_ISA_DEVICE(pdev) ((pdev->class >> 8) == PCI_CLASS_BRIDGE_ISA)
>  #define IS_AZALIA(pdev) ((pdev)->vendor == 0x8086 && (pdev)->device == 0x3a3e)

I suppose it would be tiresome to point out what happens when someone
does IS_GFX_DEVICE(pdev + 1).

Macros: Just Don't Do It.
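
For the curious: the argument isn't parenthesized, so it expands to

	((pdev + 1->class >> 16) == PCI_BASE_CLASS_DISPLAY)

where "->" binds tighter than "+", i.e. pdev + (1->class), a compile
error here, and silently wrong grouping for other argument expressions.
A static inline avoids the whole class of bugs:

	static inline bool is_gfx_device(struct pci_dev *pdev)
	{
		return (pdev->class >> 16) == PCI_BASE_CLASS_DISPLAY;
	}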

Thread overview: 8+ messages
2012-02-02 23:56 [PATCH 0/3] Updates for SGI UV1/UV2 systems Mike Travis
2012-02-02 23:56 ` [PATCH 1/3] x86 PCI: Fix identity mapping for sandy bridge Mike Travis
2012-02-22 20:20   ` Andrew Morton
2012-02-02 23:56 ` [PATCH 2/3] x86, pci: Increase the number of iommus supported to be MAX_IO_APICS Mike Travis
2012-02-02 23:56 ` [PATCH 3/3] UV Update NMI handlers Mike Travis
2012-02-03  7:31   ` Ingo Molnar
2012-02-10 21:43   ` Don Zickus
2012-02-03 21:00 ` [PATCH 1/3] x86 PCI: Fix identity mapping for sandy bridge Mike Travis
