linux-kernel.vger.kernel.org archive mirror
* [patch 0/4] x86, intr-remapping patches for addressing kexec/kdump issues
@ 2010-12-01  6:22 Suresh Siddha
  2010-12-01  6:22 ` [patch 1/4] vt-d: quirk for masking vtd spec errors to platform error handling logic Suresh Siddha
                   ` (3 more replies)
  0 siblings, 4 replies; 21+ messages in thread
From: Suresh Siddha @ 2010-12-01  6:22 UTC (permalink / raw)
  To: tglx, mingo, hpa, linux-kernel
  Cc: Kenji Kaneshige, Chris Wright, Max Asbock, indou.takao,
	Jesse Barnes, Bjorn Helgaas, David Woodhouse, stable

The following patches address or work around issues we identified in the
interrupt-remapping code flow while debugging hangs and spurious NMIs
seen on different OEM platforms during kexec/kdump with
interrupt-remapping (and, in some cases, x2apic) enabled.

All the patches are small and self-contained, and are marked for stable
because they make kexec/kdump functional on these platforms. Although some of
these patches touch pci files, they are self-contained, and I would appreciate
it if all of them get routed to Linus' tree (for v2.6.37) through the -tip tree.

thanks,
suresh


^ permalink raw reply	[flat|nested] 21+ messages in thread

* [patch 1/4] vt-d: quirk for masking vtd spec errors to platform error handling logic
  2010-12-01  6:22 [patch 0/4] x86, intr-remapping patches for addressing kexec/kdump issues Suresh Siddha
@ 2010-12-01  6:22 ` Suresh Siddha
  2010-12-01  7:26   ` Chris Wright
  2010-12-06 17:27   ` Jesse Barnes
  2010-12-01  6:22 ` [patch 2/4] x86, vtd: fix the vt-d fault handling irq migration in the x2apic mode Suresh Siddha
                   ` (2 subsequent siblings)
  3 siblings, 2 replies; 21+ messages in thread
From: Suresh Siddha @ 2010-12-01  6:22 UTC (permalink / raw)
  To: tglx, mingo, hpa, linux-kernel
  Cc: Kenji Kaneshige, Chris Wright, Max Asbock, indou.takao,
	Jesse Barnes, Bjorn Helgaas, David Woodhouse, Suresh Siddha,
	stable

[-- Attachment #1: vtd_quirk_mask_spec_errors.patch --]
[-- Type: text/plain, Size: 2433 bytes --]

On platforms with the Intel 7500 chipset, there were some reports of system
hangs/NMIs during kexec/kdump when interrupt-remapping is enabled.

During kdump, there is a window where the devices might still be using the old
kernel's interrupt information while the kdump kernel is coming up. This can
cause vt-d faults, as the interrupt configuration from the old kernel maps to
null IRTE entries in the new kernel etc. (Without interrupt-remapping enabled,
we still have the same issue, but in that case we will only see benign spurious
interrupts hit the new kernel.)

Based on platform config settings, these platforms seem to generate an NMI/SMI
when a vt-d fault happens, and there were reports that the resulting SMI causes
the system to hang.

Fix it by masking vt-d spec defined errors to the platform error reporting logic.
VT-d spec related errors are already handled by the VT-d OS code, so there is no
need to report the same error through other channels.

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: stable@kernel.org [v2.6.32+]
---
 drivers/pci/quirks.c |   20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)

Index: tip/drivers/pci/quirks.c
===================================================================
--- tip.orig/drivers/pci/quirks.c
+++ tip/drivers/pci/quirks.c
@@ -2764,6 +2764,26 @@ DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_RI
 DECLARE_PCI_FIXUP_RESUME_EARLY(PCI_VENDOR_ID_RICOH, PCI_DEVICE_ID_RICOH_R5C832, ricoh_mmc_fixup_r5c832);
 #endif /*CONFIG_MMC_RICOH_MMC*/
 
+#if defined(CONFIG_DMAR) || defined(CONFIG_INTR_REMAP)
+/*
+ * This is a quirk for masking vt-d spec defined errors to platform error
+ * handling logic. Without this, platforms seem to generate NMI/SMI (based
+ * on the RAS config settings of the platform) when a vt-d fault happens and
+ * there were reports that the resulting SMI causes the system to hang.
+ *
+ * VT-d spec related errors are already handled by the VT-d OS code, so no
+ * need to report the same error through other channels.
+ */
+static void vtd_mask_spec_errors(struct pci_dev *dev)
+{
+	u32 word;
+
+	pci_read_config_dword(dev, 0x1AC, &word);
+	pci_write_config_dword(dev, 0x1AC, word | (1 << 31));
+}
+DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_INTEL, 0x342e, vtd_mask_spec_errors);
+DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_INTEL, 0x3c28, vtd_mask_spec_errors);
+#endif
 
 static void pci_do_fixups(struct pci_dev *dev, struct pci_fixup *f,
 			  struct pci_fixup *end)
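
To illustrate the read-modify-write that the quirk above performs, here is a
minimal user-space sketch. It is not the kernel code: struct fake_pci_dev and
the fake_*_config_dword() helpers are hypothetical stand-ins for a device's
config space, and the macro names are illustrative. Only the offset (0x1AC)
and the bit position (31) come from the patch. The sketch uses an unsigned
constant for bit 31 so the shift is well defined in portable C.

```c
#include <assert.h>
#include <stdint.h>

/* Values taken from the patch; the names are illustrative only. */
#define MASK_REG_OFFSET   0x1ACu                 /* config-space offset */
#define MASK_SPEC_ERRORS  (UINT32_C(1) << 31)    /* bit set by the quirk */

/* Hypothetical stand-in for one device's 4 KiB of PCI config space. */
struct fake_pci_dev {
	uint32_t cfg[1024];
};

static uint32_t fake_read_config_dword(const struct fake_pci_dev *dev,
				       unsigned int off)
{
	return dev->cfg[off / 4];
}

static void fake_write_config_dword(struct fake_pci_dev *dev,
				    unsigned int off, uint32_t val)
{
	dev->cfg[off / 4] = val;
}

/* Read-modify-write: set bit 31 and leave the other 31 bits untouched. */
void mask_spec_errors(struct fake_pci_dev *dev)
{
	uint32_t word = fake_read_config_dword(dev, MASK_REG_OFFSET);

	fake_write_config_dword(dev, MASK_REG_OFFSET, word | MASK_SPEC_ERRORS);
}

void example_usage_quirk(void)
{
	struct fake_pci_dev dev = { .cfg = { 0 } };

	dev.cfg[MASK_REG_OFFSET / 4] = 0x00000105u; /* arbitrary existing bits */
	mask_spec_errors(&dev);
	assert(dev.cfg[MASK_REG_OFFSET / 4] == 0x80000105u);
}
```

Reading the register first matters because the other mask bits may already
have been programmed by firmware; the quirk only adds one bit to them.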




* [patch 2/4] x86, vtd: fix the vt-d fault handling irq migration in the x2apic mode
  2010-12-01  6:22 [patch 0/4] x86, intr-remapping patches for addressing kexec/kdump issues Suresh Siddha
  2010-12-01  6:22 ` [patch 1/4] vt-d: quirk for masking vtd spec errors to platform error handling logic Suresh Siddha
@ 2010-12-01  6:22 ` Suresh Siddha
  2010-12-01  8:52   ` Chris Wright
  2010-12-01 15:14   ` Bjorn Helgaas
  2010-12-01  6:22 ` [patch 3/4] x86: enable the intr-remap fault handling after local apic setup Suresh Siddha
  2010-12-01  6:22 ` [patch 4/4] vt-d: handle previous faults after enabling fault handling Suresh Siddha
  3 siblings, 2 replies; 21+ messages in thread
From: Suresh Siddha @ 2010-12-01  6:22 UTC (permalink / raw)
  To: tglx, mingo, hpa, linux-kernel
  Cc: Kenji Kaneshige, Chris Wright, Max Asbock, indou.takao,
	Jesse Barnes, Bjorn Helgaas, David Woodhouse, Suresh Siddha,
	stable

[-- Attachment #1: fix_dmar_set_affinity.patch --]
[-- Type: text/plain, Size: 1076 bytes --]

From: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Subject: x86, vtd: fix the vt-d fault handling irq migration in the x2apic mode

In x2apic mode, we need to set the upper address register of the fault
handling interrupt register of the vt-d hardware. Without this,
irq migration of the vt-d fault handling interrupt is broken.

Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: stable@kernel.org [v2.6.32+]
---
 arch/x86/kernel/apic/io_apic.c |    2 ++
 1 file changed, 2 insertions(+)

Index: tip/arch/x86/kernel/apic/io_apic.c
===================================================================
--- tip.orig/arch/x86/kernel/apic/io_apic.c
+++ tip/arch/x86/kernel/apic/io_apic.c
@@ -3367,6 +3367,8 @@ dmar_msi_set_affinity(struct irq_data *d
 	msg.data |= MSI_DATA_VECTOR(cfg->vector);
 	msg.address_lo &= ~MSI_ADDR_DEST_ID_MASK;
 	msg.address_lo |= MSI_ADDR_DEST_ID(dest);
+	if (x2apic_mode)
+		msg.address_hi = MSI_ADDR_BASE_HI | MSI_ADDR_EXT_DEST_ID(dest);
 
 	dmar_msi_write(irq, &msg);
 




* [patch 3/4] x86: enable the intr-remap fault handling after local apic setup
  2010-12-01  6:22 [patch 0/4] x86, intr-remapping patches for addressing kexec/kdump issues Suresh Siddha
  2010-12-01  6:22 ` [patch 1/4] vt-d: quirk for masking vtd spec errors to platform error handling logic Suresh Siddha
  2010-12-01  6:22 ` [patch 2/4] x86, vtd: fix the vt-d fault handling irq migration in the x2apic mode Suresh Siddha
@ 2010-12-01  6:22 ` Suresh Siddha
  2010-12-01  8:51   ` Chris Wright
  2010-12-14  1:16   ` [tip:x86/urgent] x86: Enable the intr-remap fault handling after local APIC setup tip-bot for Kenji Kaneshige
  2010-12-01  6:22 ` [patch 4/4] vt-d: handle previous faults after enabling fault handling Suresh Siddha
  3 siblings, 2 replies; 21+ messages in thread
From: Suresh Siddha @ 2010-12-01  6:22 UTC (permalink / raw)
  To: tglx, mingo, hpa, linux-kernel
  Cc: Kenji Kaneshige, Chris Wright, Max Asbock, indou.takao,
	Jesse Barnes, Bjorn Helgaas, David Woodhouse, Suresh Siddha,
	stable

[-- Attachment #1: vtd_fault_handling_after_local_apic_setup.patch --]
[-- Type: text/plain, Size: 2188 bytes --]

From: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Subject: x86: enable the intr-remap fault handling after local apic setup

Interrupt-remapping gets enabled very early in the boot, as it determines the
apic mode that the processor can use. The current code enables the vt-d
fault handling before setup_local_APIC(), so the APIC LDR registers and the
data structures in memory may not be initialized yet. As a result, the vt-d
fault handling in logical xapic/x2apic modes was broken.

Fix this by enabling the vt-d fault handling in end_local_APIC_setup().

A cleaner fix, enabling fault handling while enabling intr-remapping,
will be addressed for v2.6.38. [ Enabling intr-remapping determines the
usage of x2apic mode, and the apic mode determines the fault-handling
configuration. ]

Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: stable@kernel.org [v2.6.32+]
---
 arch/x86/kernel/apic/apic.c     |    8 ++++++++
 arch/x86/kernel/apic/probe_64.c |    7 -------
 2 files changed, 8 insertions(+), 7 deletions(-)

Index: tip/arch/x86/kernel/apic/apic.c
===================================================================
--- tip.orig/arch/x86/kernel/apic/apic.c
+++ tip/arch/x86/kernel/apic/apic.c
@@ -1384,6 +1384,14 @@ void __cpuinit end_local_APIC_setup(void
 #endif
 
 	apic_pm_activate();
+
+	/*
+	 * Now that local APIC setup is completed for BP, configure the fault
+	 * handling for interrupt remapping.
+	 */
+	if (!smp_processor_id() && intr_remapping_enabled)
+		enable_drhd_fault_handling();
+
 }
 
 #ifdef CONFIG_X86_X2APIC
Index: tip/arch/x86/kernel/apic/probe_64.c
===================================================================
--- tip.orig/arch/x86/kernel/apic/probe_64.c
+++ tip/arch/x86/kernel/apic/probe_64.c
@@ -79,13 +79,6 @@ void __init default_setup_apic_routing(v
 		/* need to update phys_pkg_id */
 		apic->phys_pkg_id = apicid_phys_pkg_id;
 	}
-
-	/*
-	 * Now that apic routing model is selected, configure the
-	 * fault handling for intr remapping.
-	 */
-	if (intr_remapping_enabled)
-		enable_drhd_fault_handling();
 }
 
 /* Same for both flat and physical. */
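
The ordering problem this patch fixes can be modelled in a few lines of
user-space C. This is a simplified sketch under stated assumptions, not the
kernel's boot code: struct boot_state and every function name below are
hypothetical, and "misrouted" simply records that fault handling was enabled
before the local APIC was set up.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the boot-time state that matters here. */
struct boot_state {
	bool intr_remapping_enabled;
	bool local_apic_ready;     /* local APIC (LDR etc.) has been set up */
	bool fault_irq_misrouted;  /* enabled while the APIC was not ready */
	int  enable_calls;         /* how often fault handling was enabled */
};

static void sketch_enable_fault_handling(struct boot_state *s)
{
	if (!s->local_apic_ready)
		s->fault_irq_misrouted = true;
	s->enable_calls++;
}

/* Old order: enabled when the routing model is chosen, before APIC setup. */
void old_setup_apic_routing(struct boot_state *s)
{
	if (s->intr_remapping_enabled)
		sketch_enable_fault_handling(s);
}

/* New order: enabled at the end of local APIC setup, on CPU 0 only. */
void new_end_local_apic_setup(struct boot_state *s, int cpu)
{
	s->local_apic_ready = true;
	if (cpu == 0 && s->intr_remapping_enabled)
		sketch_enable_fault_handling(s);
}

void example_usage_ordering(void)
{
	struct boot_state old_order = { .intr_remapping_enabled = true };
	struct boot_state new_order = { .intr_remapping_enabled = true };

	old_setup_apic_routing(&old_order);
	assert(old_order.fault_irq_misrouted);

	new_end_local_apic_setup(&new_order, 0);  /* boot processor */
	new_end_local_apic_setup(&new_order, 1);  /* secondary CPU: no-op */
	assert(!new_order.fault_irq_misrouted);
	assert(new_order.enable_calls == 1);
}
```

The cpu == 0 test mirrors the !smp_processor_id() check in the patch, which
keeps secondary CPUs from enabling fault handling a second time.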




* [patch 4/4] vt-d: handle previous faults after enabling fault handling
  2010-12-01  6:22 [patch 0/4] x86, intr-remapping patches for addressing kexec/kdump issues Suresh Siddha
                   ` (2 preceding siblings ...)
  2010-12-01  6:22 ` [patch 3/4] x86: enable the intr-remap fault handling after local apic setup Suresh Siddha
@ 2010-12-01  6:22 ` Suresh Siddha
  2010-12-01  8:52   ` Chris Wright
  2010-12-14  1:17   ` [tip:x86/urgent] x86, vt-d: Handle " tip-bot for Suresh Siddha
  3 siblings, 2 replies; 21+ messages in thread
From: Suresh Siddha @ 2010-12-01  6:22 UTC (permalink / raw)
  To: tglx, mingo, hpa, linux-kernel
  Cc: Kenji Kaneshige, Chris Wright, Max Asbock, indou.takao,
	Jesse Barnes, Bjorn Helgaas, David Woodhouse, Suresh Siddha,
	stable

[-- Attachment #1: vtd_handle_oldfaults_in_enable_drhd_fault_handling.patch --]
[-- Type: text/plain, Size: 1132 bytes --]

Fault handling is enabled after interrupt-remapping is enabled (as
the success of interrupt-remapping can affect the apic mode and hence the
fault handling mode).

Hence faults can potentially occur in the window between enabling
interrupt-remapping in the vt-d and enabling the fault handling of the vt-d
units.

Handle any previous faults after enabling the vt-d fault handling.

For the v2.6.38 cleanup, we need to check whether we can remove the dmar_fault()
call in enable_intr_remapping() and enable fault handling along with
enabling intr-remapping.

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: stable@kernel.org [v2.6.32+]
---
 drivers/pci/dmar.c |    5 +++++
 1 file changed, 5 insertions(+)

Index: tip/drivers/pci/dmar.c
===================================================================
--- tip.orig/drivers/pci/dmar.c
+++ tip/drivers/pci/dmar.c
@@ -1417,6 +1417,11 @@ int __init enable_drhd_fault_handling(vo
 			       (unsigned long long)drhd->reg_base_addr, ret);
 			return -1;
 		}
+
+		/*
+		 * Clear any previous faults.
+		 */
+		dmar_fault(iommu->irq, iommu);
 	}
 
 	return 0;
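
The window described in this patch can be sketched with a small user-space
model. It is a simplification under stated assumptions, not the VT-d register
interface: struct fake_iommu and all function names are hypothetical, and the
model assumes that a fault recorded while the interrupt is disabled does not
raise an interrupt later on its own.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_FAULTS 8

/* Hypothetical model of one remapping unit's fault recording state. */
struct fake_iommu {
	int  pending[MAX_FAULTS]; /* recorded fault reasons, oldest first */
	int  n_pending;
	bool irq_enabled;         /* fault interrupt is wired to the handler */
	int  handled;             /* faults processed by the handler so far */
};

/* Handler: process and clear everything that has been recorded. */
static void sketch_fault_handler(struct fake_iommu *iommu)
{
	iommu->handled += iommu->n_pending;
	iommu->n_pending = 0;
}

/* Hardware side: record a fault, interrupt only if the handler is wired. */
void hw_raise_fault(struct fake_iommu *iommu, int reason)
{
	if (iommu->n_pending < MAX_FAULTS)
		iommu->pending[iommu->n_pending++] = reason;
	if (iommu->irq_enabled)
		sketch_fault_handler(iommu);
}

/* Enable the interrupt, then drain faults recorded before this point. */
void sketch_enable_fault_handling(struct fake_iommu *iommu)
{
	iommu->irq_enabled = true;
	sketch_fault_handler(iommu);
}

void example_usage_drain(void)
{
	struct fake_iommu iommu = { .n_pending = 0 };

	hw_raise_fault(&iommu, 0x25);   /* arrives before the handler exists */
	assert(iommu.n_pending == 1 && iommu.handled == 0);

	sketch_enable_fault_handling(&iommu);
	assert(iommu.n_pending == 0 && iommu.handled == 1);
}
```

Without the explicit handler call after enabling, the early fault in this
model would stay recorded until some later fault happened to trigger the
interrupt.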




* Re: [patch 1/4] vt-d: quirk for masking vtd spec errors to platform error handling logic
  2010-12-01  6:22 ` [patch 1/4] vt-d: quirk for masking vtd spec errors to platform error handling logic Suresh Siddha
@ 2010-12-01  7:26   ` Chris Wright
  2010-12-06 17:27   ` Jesse Barnes
  1 sibling, 0 replies; 21+ messages in thread
From: Chris Wright @ 2010-12-01  7:26 UTC (permalink / raw)
  To: Suresh Siddha
  Cc: tglx, mingo, hpa, linux-kernel, Kenji Kaneshige, Chris Wright,
	Max Asbock, indou.takao, Jesse Barnes, Bjorn Helgaas,
	David Woodhouse, stable

* Suresh Siddha (suresh.b.siddha@intel.com) wrote:
> On platforms with Intel 7500 chipset, there were some reports of system
> hang/NMI's during kexec/kdump in the presence of interrupt-remapping enabled.
> 
> During kdump, there is a window where the devices might be still using old
> kernel's interrupt information, while the kdump kernel is coming up. This can
> cause vt-d faults as the interrupt configuration from the old kernel map to
> null IRTE entries in the new kernel etc. (with out interrupt-remapping enabled,
> we still have the same issue but in this case we will see benign spurious
> interrupt hit the new kernel).
> 
> Based on platform config settings, these platforms seem to generate NMI/SMI
> when a vt-d fault happens and there were reports that the resulting SMI causes
> the  system to hang.
> 
> Fix it by masking vt-d spec defined errors to platform error reporting logic.
> VT-d spec related errors are already handled by the VT-d OS code, so need to
> report the same erorr through other channels.
> 
> Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>

Acked-by: Chris Wright <chrisw@sous-sol.org>


* Re: [patch 3/4] x86: enable the intr-remap fault handling after local apic setup
  2010-12-01  6:22 ` [patch 3/4] x86: enable the intr-remap fault handling after local apic setup Suresh Siddha
@ 2010-12-01  8:51   ` Chris Wright
  2010-12-14  1:16   ` [tip:x86/urgent] x86: Enable the intr-remap fault handling after local APIC setup tip-bot for Kenji Kaneshige
  1 sibling, 0 replies; 21+ messages in thread
From: Chris Wright @ 2010-12-01  8:51 UTC (permalink / raw)
  To: Suresh Siddha
  Cc: tglx, mingo, hpa, linux-kernel, Kenji Kaneshige, Chris Wright,
	Max Asbock, indou.takao, Jesse Barnes, Bjorn Helgaas,
	David Woodhouse, stable

* Suresh Siddha (suresh.b.siddha@intel.com) wrote:
> From: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
> Subject: x86: enable the intr-remap fault handling after local apic setup
> 
> Interrupt-remapping gets enabled very early in the boot, as it determines the
> apic mode that the processor can use. And the current code enables the vt-d
> fault handling before the setup_local_APIC(). And hence the APIC LDR registers
> and data structure in the memory may not be initialized. So the vt-d fault
> handling in logical xapic/x2apic modes were broken.
> 
> Fix this by enabling the vt-d fault handling in the end_local_APIC_setup()
> 
> A cleaner fix of enabling fault handling while enabling intr-remapping
> will be addressed for v2.6.38. [ Enabling intr-remapping determines the
> usage of x2apic mode and the apic mode determines the fault-handling
> configuration. ]
> 
> Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
> Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>

Acked-by: Chris Wright <chrisw@sous-sol.org>


* Re: [patch 2/4] x86, vtd: fix the vt-d fault handling irq migration in the x2apic mode
  2010-12-01  6:22 ` [patch 2/4] x86, vtd: fix the vt-d fault handling irq migration in the x2apic mode Suresh Siddha
@ 2010-12-01  8:52   ` Chris Wright
  2010-12-01 15:14   ` Bjorn Helgaas
  1 sibling, 0 replies; 21+ messages in thread
From: Chris Wright @ 2010-12-01  8:52 UTC (permalink / raw)
  To: Suresh Siddha
  Cc: tglx, mingo, hpa, linux-kernel, Kenji Kaneshige, Chris Wright,
	Max Asbock, indou.takao, Jesse Barnes, Bjorn Helgaas,
	David Woodhouse, stable

* Suresh Siddha (suresh.b.siddha@intel.com) wrote:
> From: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
> Subject: x86, vtd: fix the vt-d fault handling irq migration in the x2apic mode
> 
> In x2apic mode, we need to set the upper address register of the fault
> handling interrupt register of the vt-d hardware. Without this
> irq migration of the vt-d fault handling interrupt is broken.
> 
> Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
> Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>

Looks correct, I didn't have a chance to test this patch.

Acked-by: Chris Wright <chrisw@sous-sol.org>


* Re: [patch 4/4] vt-d: handle previous faults after enabling fault handling
  2010-12-01  6:22 ` [patch 4/4] vt-d: handle previous faults after enabling fault handling Suresh Siddha
@ 2010-12-01  8:52   ` Chris Wright
  2010-12-14  1:17   ` [tip:x86/urgent] x86, vt-d: Handle " tip-bot for Suresh Siddha
  1 sibling, 0 replies; 21+ messages in thread
From: Chris Wright @ 2010-12-01  8:52 UTC (permalink / raw)
  To: Suresh Siddha
  Cc: tglx, mingo, hpa, linux-kernel, Kenji Kaneshige, Chris Wright,
	Max Asbock, indou.takao, Jesse Barnes, Bjorn Helgaas,
	David Woodhouse, stable

* Suresh Siddha (suresh.b.siddha@intel.com) wrote:
> Fault handling is getting enabled after enabling the interrupt-remapping (as
> the success of interrupt-remapping can affect the apic mode and hence the
> fault handling mode).
> 
> Hence there can potentially be some faults between the window of enabling
> interrupt-remapping in the vt-d and the fault-handling of the vt-d units.
> 
> Handle any previous faults after enabling the vt-d fault handling.
> 
> For v2.6.38 cleanup, need to check if we can remove the dmar_fault() in the
> enable_intr_remapping() and see if we can enable fault handling along with
> enabling intr-remapping.
> 
> Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>

Acked-by: Chris Wright <chrisw@sous-sol.org>


* Re: [patch 2/4] x86, vtd: fix the vt-d fault handling irq migration in the x2apic mode
  2010-12-01  6:22 ` [patch 2/4] x86, vtd: fix the vt-d fault handling irq migration in the x2apic mode Suresh Siddha
  2010-12-01  8:52   ` Chris Wright
@ 2010-12-01 15:14   ` Bjorn Helgaas
  2010-12-01 17:40     ` Suresh Siddha
  1 sibling, 1 reply; 21+ messages in thread
From: Bjorn Helgaas @ 2010-12-01 15:14 UTC (permalink / raw)
  To: Suresh Siddha
  Cc: tglx, mingo, hpa, linux-kernel, Kenji Kaneshige, Chris Wright,
	Max Asbock, indou.takao, Jesse Barnes, David Woodhouse, stable,
	Tony Luck

On Tue, Nov 30, 2010 at 10:22:27PM -0800, Suresh Siddha wrote:
> From: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
> Subject: x86, vtd: fix the vt-d fault handling irq migration in the x2apic mode
> 
> In x2apic mode, we need to set the upper address register of the fault
> handling interrupt register of the vt-d hardware. Without this
> irq migration of the vt-d fault handling interrupt is broken.
> 
> Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
> Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
> Cc: stable@kernel.org [v2.6.32+]
> ---
>  arch/x86/kernel/apic/io_apic.c |    2 ++
>  1 file changed, 2 insertions(+)
> 
> Index: tip/arch/x86/kernel/apic/io_apic.c
> ===================================================================
> --- tip.orig/arch/x86/kernel/apic/io_apic.c
> +++ tip/arch/x86/kernel/apic/io_apic.c
> @@ -3367,6 +3367,8 @@ dmar_msi_set_affinity(struct irq_data *d
>  	msg.data |= MSI_DATA_VECTOR(cfg->vector);
>  	msg.address_lo &= ~MSI_ADDR_DEST_ID_MASK;
>  	msg.address_lo |= MSI_ADDR_DEST_ID(dest);
> +	if (x2apic_mode)
> +		msg.address_hi = MSI_ADDR_BASE_HI | MSI_ADDR_EXT_DEST_ID(dest);

Is it necessary to test x2apic_mode here?  It looks like
MSI_ADDR_EXT_DEST_ID() gives you everything above the low 8
bits of the APIC ID.  If those bits are always zero except in
x2apic_mode, we might not need the test.

Does the ia64 dmar_msi_set_affinity() need the same fix?

Why do we have both x2apic_enabled() and x2apic_mode?  They
seem sort of redundant.  (Not related to this patch, of course.)

Bjorn


* Re: [patch 2/4] x86, vtd: fix the vt-d fault handling irq migration in the x2apic mode
  2010-12-01 15:14   ` Bjorn Helgaas
@ 2010-12-01 17:40     ` Suresh Siddha
  2010-12-07 17:38       ` Takao Indoh
  2010-12-14  1:16       ` [tip:x86/urgent] x86, vt-d: Fix " tip-bot for Kenji Kaneshige
  0 siblings, 2 replies; 21+ messages in thread
From: Suresh Siddha @ 2010-12-01 17:40 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: tglx, mingo, hpa, linux-kernel, Kenji Kaneshige, Chris Wright,
	Max Asbock, indou.takao, Jesse Barnes, David Woodhouse, stable,
	Luck, Tony

On Wed, 2010-12-01 at 07:14 -0800, Bjorn Helgaas wrote:
> On Tue, Nov 30, 2010 at 10:22:27PM -0800, Suresh Siddha wrote:
> > +	if (x2apic_mode)
> > +		msg.address_hi = MSI_ADDR_BASE_HI | MSI_ADDR_EXT_DEST_ID(dest);
> 
> Is it necessary to test x2apic_mode here?  It looks like
> MSI_ADDR_EXT_DEST_ID() gives you everything above the low 8
> bits of the APIC ID.  If those bits are always zero except in
> x2apic_mode, we might not need the test.

True. Appended the updated patch.

> Does the ia64 dmar_msi_set_affinity() need the same fix?

No.

> 
> Why do we have both x2apic_enabled() and x2apic_mode?  They
> seem sort of redundant.  (Not related to this patch, of course.)

The BIOS can hand over to the OS in x2apic mode in some cases. x2apic_enabled()
is used to check for that; it reads the MSR to check the status. Some
early portions of the kernel boot use it.

For everything else, we should be using x2apic_mode.

thanks,
suresh
---

From: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Subject: x86, vtd: fix the vt-d fault handling irq migration in the x2apic mode

In x2apic mode, we need to set the upper address register of the fault
handling interrupt register of the vt-d hardware. Without this,
irq migration of the vt-d fault handling interrupt is broken.

Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: stable@kernel.org [v2.6.32+]
---
 arch/x86/kernel/apic/io_apic.c |    1 +
 1 file changed, 1 insertion(+)

Index: tip/arch/x86/kernel/apic/io_apic.c
===================================================================
--- tip.orig/arch/x86/kernel/apic/io_apic.c
+++ tip/arch/x86/kernel/apic/io_apic.c
@@ -3367,6 +3367,7 @@ dmar_msi_set_affinity(struct irq_data *d
 	msg.data |= MSI_DATA_VECTOR(cfg->vector);
 	msg.address_lo &= ~MSI_ADDR_DEST_ID_MASK;
 	msg.address_lo |= MSI_ADDR_DEST_ID(dest);
+	msg.address_hi = MSI_ADDR_BASE_HI | MSI_ADDR_EXT_DEST_ID(dest);
 
 	dmar_msi_write(irq, &msg);
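
The reason the x2apic_mode test can be dropped is visible in a small sketch
of how a destination APIC ID is split across the two address words. This is
an illustration under assumptions, not the kernel's msidef.h: the SKETCH_*
names are hypothetical, the low word follows the x86 MSI address layout
(base 0xfee00000, destination ID in bits 19:12), and the high word carries
everything above the low 8 bits of the ID, as discussed above.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative names; not the kernel's macros. */
#define SKETCH_ADDR_BASE_LO   UINT32_C(0xfee00000)
#define SKETCH_ADDR_BASE_HI   UINT32_C(0)
#define SKETCH_DEST_ID_SHIFT  12
#define SKETCH_DEST_ID_MASK   UINT32_C(0x000ff000)  /* address bits 19:12 */
#define SKETCH_EXT_DEST_MASK  UINT32_C(0xffffff00)  /* ID bits above low 8 */

struct sketch_msi_addr {
	uint32_t lo;
	uint32_t hi;
};

/* Split a (possibly 32-bit) destination ID across the two address words. */
struct sketch_msi_addr sketch_compose(uint32_t dest)
{
	struct sketch_msi_addr a;

	a.lo = SKETCH_ADDR_BASE_LO |
	       ((dest << SKETCH_DEST_ID_SHIFT) & SKETCH_DEST_ID_MASK);
	a.hi = SKETCH_ADDR_BASE_HI | (dest & SKETCH_EXT_DEST_MASK);
	return a;
}

void example_usage_msi(void)
{
	struct sketch_msi_addr a;

	a = sketch_compose(0x12u);          /* 8-bit ID, as in xapic mode */
	assert(a.lo == 0xfee12000u);
	assert(a.hi == 0u);

	a = sketch_compose(0x00010203u);    /* wider ID, as in x2apic mode */
	assert(a.lo == 0xfee03000u);
	assert(a.hi == 0x00010200u);
}
```

For any ID that fits in 8 bits the high word comes out as zero, so writing
it unconditionally gives the same result as writing it only in x2apic mode.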
 




* Re: [patch 1/4] vt-d: quirk for masking vtd spec errors to platform error handling logic
  2010-12-01  6:22 ` [patch 1/4] vt-d: quirk for masking vtd spec errors to platform error handling logic Suresh Siddha
  2010-12-01  7:26   ` Chris Wright
@ 2010-12-06 17:27   ` Jesse Barnes
  2010-12-06 20:26     ` Suresh Siddha
  1 sibling, 1 reply; 21+ messages in thread
From: Jesse Barnes @ 2010-12-06 17:27 UTC (permalink / raw)
  To: Suresh Siddha
  Cc: tglx, mingo, hpa, linux-kernel, Kenji Kaneshige, Chris Wright,
	Max Asbock, indou.takao, Bjorn Helgaas, David Woodhouse, stable

On Tue, 30 Nov 2010 22:22:26 -0800
Suresh Siddha <suresh.b.siddha@intel.com> wrote:

> On platforms with Intel 7500 chipset, there were some reports of system
> hang/NMI's during kexec/kdump in the presence of interrupt-remapping enabled.
> 
> During kdump, there is a window where the devices might be still using old
> kernel's interrupt information, while the kdump kernel is coming up. This can
> cause vt-d faults as the interrupt configuration from the old kernel map to
> null IRTE entries in the new kernel etc. (with out interrupt-remapping enabled,
> we still have the same issue but in this case we will see benign spurious
> interrupt hit the new kernel).
> 
> Based on platform config settings, these platforms seem to generate NMI/SMI
> when a vt-d fault happens and there were reports that the resulting SMI causes
> the  system to hang.
> 
> Fix it by masking vt-d spec defined errors to platform error reporting logic.
> VT-d spec related errors are already handled by the VT-d OS code, so need to
> report the same erorr through other channels.
> 
> Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
> Cc: stable@kernel.org [v2.6.32+]
> ---
>  drivers/pci/quirks.c |   20 ++++++++++++++++++++
>  1 file changed, 20 insertions(+)
> 
> Index: tip/drivers/pci/quirks.c
> ===================================================================
> --- tip.orig/drivers/pci/quirks.c
> +++ tip/drivers/pci/quirks.c
> @@ -2764,6 +2764,26 @@ DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_RI
>  DECLARE_PCI_FIXUP_RESUME_EARLY(PCI_VENDOR_ID_RICOH, PCI_DEVICE_ID_RICOH_R5C832, ricoh_mmc_fixup_r5c832);
>  #endif /*CONFIG_MMC_RICOH_MMC*/
>  
> +#if defined(CONFIG_DMAR) || defined(CONFIG_INTR_REMAP)
> +/*
> + * This is a quirk for masking vt-d spec defined errors to platform error
> + * handling logic. With out this, platforms seem to generate NMI/SMI (based
> + * on the RAS config settings of the platform) when a vt-d fault happens and
> + * there were reports that the resulting SMI causes system to hang.
> + *
> + * VT-d spec related errors are already handled by the VT-d OS code, so no
> + * need to report the same erorr through other channels.
> + */
> +static void vtd_mask_spec_errors(struct pci_dev *dev)
> +{
> +	u32 word;
> +
> +	pci_read_config_dword(dev, 0x1AC, &word);
> +	pci_write_config_dword(dev, 0x1AC, word | (1 << 31));
> +}
> +DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_INTEL, 0x342e, vtd_mask_spec_errors);
> +DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_INTEL, 0x3c28, vtd_mask_spec_errors);
> +#endif
>  
>  static void pci_do_fixups(struct pci_dev *dev, struct pci_fixup *f,
>  			  struct pci_fixup *end)

Can we make these registers and bits a bit more self-documenting (i.e.
#defines for both, maybe along with other useful bit definitions for
this reg)? Also, "error" is misspelled as "erorr" above. :)

-- 
Jesse Barnes, Intel Open Source Technology Center


* Re: [patch 1/4] vt-d: quirk for masking vtd spec errors to platform error handling logic
  2010-12-06 17:27   ` Jesse Barnes
@ 2010-12-06 20:26     ` Suresh Siddha
  2010-12-06 20:44       ` Jesse Barnes
  2010-12-14  1:15       ` [tip:x86/urgent] x86, vt-d: Quirk " tip-bot for Suresh Siddha
  0 siblings, 2 replies; 21+ messages in thread
From: Suresh Siddha @ 2010-12-06 20:26 UTC (permalink / raw)
  To: Jesse Barnes
  Cc: tglx, mingo, hpa, linux-kernel, Kenji Kaneshige, Chris Wright,
	Max Asbock, indou.takao, Bjorn Helgaas, David Woodhouse, stable

On Mon, 2010-12-06 at 09:27 -0800, Jesse Barnes wrote:
> Can we make these registers and bits a bit more self-documenting (i.e.
> #defines for both, maybe along with other useful bit definitions for
> this reg)? Also, "error" is misspelled as "erorr" above. :)

Thanks for the review. Appended the updated patch. I haven't used
#defines for the pci-id's, as the first one (IOH) is used by several
chipsets and the second one is not named yet.

---

From: Suresh Siddha <suresh.b.siddha@intel.com>
Subject: vt-d: quirk for masking vtd spec errors to platform error handling logic

On platforms with the Intel 7500 chipset, there were some reports of system
hangs/NMIs during kexec/kdump when interrupt-remapping is enabled.

During kdump, there is a window where the devices might still be using the old
kernel's interrupt information while the kdump kernel is coming up. This can
cause vt-d faults, as the interrupt configuration from the old kernel maps to
null IRTE entries in the new kernel etc. (Without interrupt-remapping enabled,
we still have the same issue, but in that case we will only see benign spurious
interrupts hit the new kernel.)

Based on platform config settings, these platforms seem to generate an NMI/SMI
when a vt-d fault happens, and there were reports that the resulting SMI causes
the system to hang.

Fix it by masking vt-d spec defined errors to the platform error reporting logic.
VT-d spec related errors are already handled by the VT-d OS code, so there is no
need to report the same error through other channels.

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: stable@kernel.org	[v2.6.32+]
---
 drivers/pci/quirks.c |   23 +++++++++++++++++++++++
 1 file changed, 23 insertions(+)

Index: tip/drivers/pci/quirks.c
===================================================================
--- tip.orig/drivers/pci/quirks.c
+++ tip/drivers/pci/quirks.c
@@ -2764,6 +2764,29 @@ DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_RI
 DECLARE_PCI_FIXUP_RESUME_EARLY(PCI_VENDOR_ID_RICOH, PCI_DEVICE_ID_RICOH_R5C832, ricoh_mmc_fixup_r5c832);
 #endif /*CONFIG_MMC_RICOH_MMC*/
 
+#if defined(CONFIG_DMAR) || defined(CONFIG_INTR_REMAP)
+#define VTUNCERRMSK_REG	0x1ac
+#define VTD_MSK_SPEC_ERRORS	(1 << 31)
+/*
+ * This is a quirk for masking vt-d spec defined errors to platform error
+ * handling logic. Without this, platforms using Intel 7500, 5500 chipsets
+ * (and the derivative chipsets like X58 etc) seem to generate NMI/SMI (based
+ * on the RAS config settings of the platform) when a vt-d fault happens.
+ * The resulting SMI caused the system to hang.
+ *
+ * VT-d spec related errors are already handled by the VT-d OS code, so no
+ * need to report the same error through other channels.
+ */
+static void vtd_mask_spec_errors(struct pci_dev *dev)
+{
+	u32 word;
+
+	pci_read_config_dword(dev, VTUNCERRMSK_REG, &word);
+	pci_write_config_dword(dev, VTUNCERRMSK_REG, word | VTD_MSK_SPEC_ERRORS);
+}
+DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_INTEL, 0x342e, vtd_mask_spec_errors);
+DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_INTEL, 0x3c28, vtd_mask_spec_errors);
+#endif
 
 static void pci_do_fixups(struct pci_dev *dev, struct pci_fixup *f,
 			  struct pci_fixup *end)




* Re: [patch 1/4] vt-d: quirk for masking vtd spec errors to platform error handling logic
  2010-12-06 20:26     ` Suresh Siddha
@ 2010-12-06 20:44       ` Jesse Barnes
  2010-12-06 21:02         ` Suresh Siddha
  2010-12-14  1:15       ` [tip:x86/urgent] x86, vt-d: Quirk " tip-bot for Suresh Siddha
  1 sibling, 1 reply; 21+ messages in thread
From: Jesse Barnes @ 2010-12-06 20:44 UTC (permalink / raw)
  To: Suresh Siddha
  Cc: tglx, mingo, hpa, linux-kernel, Kenji Kaneshige, Chris Wright,
	Max Asbock, indou.takao, Bjorn Helgaas, David Woodhouse, stable

On Mon, 06 Dec 2010 12:26:30 -0800
Suresh Siddha <suresh.b.siddha@intel.com> wrote:

> On Mon, 2010-12-06 at 09:27 -0800, Jesse Barnes wrote:
> > Can we make these registers and bits a bit more self-documenting (i.e.
> > #defines for both, maybe along with other useful bit definitions for
> > this reg)? Also, "error" is misspelled as "erorr" above. :)
> 
> Thanks for the review. Appended the updated patch. I haven't used
> #defines for the pci-id's, as the first one (IOH) is used by several
> chipsets and the second one is not named yet.

Is there a bug # that should be referenced in the commit log?  Any
tested-bys to add?

Thanks,
-- 
Jesse Barnes, Intel Open Source Technology Center


* Re: [patch 1/4] vt-d: quirk for masking vtd spec errors to platform error handling logic
  2010-12-06 20:44       ` Jesse Barnes
@ 2010-12-06 21:02         ` Suresh Siddha
  2010-12-06 23:01           ` Max Asbock
  0 siblings, 1 reply; 21+ messages in thread
From: Suresh Siddha @ 2010-12-06 21:02 UTC (permalink / raw)
  To: Jesse Barnes
  Cc: tglx, mingo, hpa, linux-kernel, Kenji Kaneshige, Chris Wright,
	Max Asbock, indou.takao, Bjorn Helgaas, David Woodhouse, stable

On Mon, 2010-12-06 at 12:44 -0800, Jesse Barnes wrote:
> On Mon, 06 Dec 2010 12:26:30 -0800
> Suresh Siddha <suresh.b.siddha@intel.com> wrote:
> 
> > On Mon, 2010-12-06 at 09:27 -0800, Jesse Barnes wrote:
> > > Can we make these registers and bits a bit more self-documenting (i.e.
> > > #defines for both, maybe along with other useful bit definitions for
> > > this reg)? Also, "error" is misspelled as "erorr" above. :)
> > 
> > Thanks for the review. Appended the updated patch. I haven't used
> > #defines for the pci-id's, as the first one (IOH) is used by several
> > chipsets and the second one is not named yet.
> 
> Is there a bug # that should be referenced in the commit log?  Any
> tested-bys to add?

There is no kernel.org bug #, but there are multiple bugs filed with different
OSVs, hence I didn't bother to mention a bug #.

Please add:

Reported-by: Max Asbock <masbock@linux.vnet.ibm.com>
Reported-and-tested-by: Takao Indoh <indou.takao@jp.fujitsu.com>
Acked-by: Chris Wright <chrisw@sous-sol.org>
Acked-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>

thanks,
suresh



* Re: [patch 1/4] vt-d: quirk for masking vtd spec errors to platform error handling logic
  2010-12-06 21:02         ` Suresh Siddha
@ 2010-12-06 23:01           ` Max Asbock
  0 siblings, 0 replies; 21+ messages in thread
From: Max Asbock @ 2010-12-06 23:01 UTC (permalink / raw)
  To: Suresh Siddha
  Cc: Jesse Barnes, tglx, mingo, hpa, linux-kernel, Kenji Kaneshige,
	Chris Wright, indou.takao, Bjorn Helgaas, David Woodhouse,
	stable

On Mon, 2010-12-06 at 13:02 -0800, Suresh Siddha wrote:
> On Mon, 2010-12-06 at 12:44 -0800, Jesse Barnes wrote:
> > On Mon, 06 Dec 2010 12:26:30 -0800
> > Suresh Siddha <suresh.b.siddha@intel.com> wrote:
> > 
> > > On Mon, 2010-12-06 at 09:27 -0800, Jesse Barnes wrote:
> > > > Can we make these registers and bits a bit more self-documenting (i.e.
> > > > #defines for both, maybe along with other useful bit definitions for
> > > > this reg)? Also, "error" is misspelled as "erorr" above. :)
> > > 
> > > Thanks for the review. Appended the updated patch. I haven't used
> > > #defines for the pci-id's, as the first one (IOH) is used by several
> > > chipsets and the second one is not named yet.
> > 
> > Is there a bug # that should be referenced in the commit log?  Any
> > tested-bys to add?
> 
> There is no kernel.org bug #, but there are multiple bugs filed with different
> OSVs, hence I didn't bother to mention a bug #.
> 
> Please add:
> 
> Reported-by: Max Asbock <masbock@linux.vnet.ibm.com>
> Reported-and-tested-by: Takao Indoh <indou.takao@jp.fujitsu.com>
> Acked-by: Chris Wright <chrisw@sous-sol.org>
> Acked-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
> 

I tested the patches on a system with a Tylersburg chipset. I used the
patches against the 2.6.37-rc4 kernel and tested kdump. I still see the
Vt-d errors but they no longer cause NMIs. It works as expected. 

- Max



* Re: [patch 2/4] x86, vtd: fix the vt-d fault handling irq migration in the x2apic mode
  2010-12-01 17:40     ` Suresh Siddha
@ 2010-12-07 17:38       ` Takao Indoh
  2010-12-14  1:16       ` [tip:x86/urgent] x86, vt-d: Fix " tip-bot for Kenji Kaneshige
  1 sibling, 0 replies; 21+ messages in thread
From: Takao Indoh @ 2010-12-07 17:38 UTC (permalink / raw)
  To: Suresh Siddha, Bjorn Helgaas
  Cc: tglx, mingo, hpa, linux-kernel, Kenji Kaneshige, Chris Wright,
	Max Asbock, Jesse Barnes, David Woodhouse, stable, Luck, Tony

On Wed, 01 Dec 2010 09:40:32 -0800, Suresh Siddha wrote:

>On Wed, 2010-12-01 at 07:14 -0800, Bjorn Helgaas wrote:
>> On Tue, Nov 30, 2010 at 10:22:27PM -0800, Suresh Siddha wrote:
>> > +	if (x2apic_mode)
>> > +		msg.address_hi = MSI_ADDR_BASE_HI | MSI_ADDR_EXT_DEST_ID(dest);
>> 
>> Is it necessary to test x2apic_mode here?  It looks like
>> MSI_ADDR_EXT_DEST_ID() gives you everything above the low 8
>> bits of the APIC ID.  If those bits are always zero except in
>> x2apic_mode, we might not need the test.
>
>True. Appended the updated patch.

I applied this patch against 2.6.36 and confirmed that irq migration of
the vt-d fault interrupt worked.

Tested-by: Takao Indoh <indou.takao@jp.fujitsu.com>

Thanks,
Takao Indoh

>
>> Does the ia64 dmar_msi_set_affinity() need the same fix?
>
>No.
>
>> 
>> Why do we have both x2apic_enabled() and x2apic_mode?  They
>> seem sort of redundant.  (Not related to this patch, of course.)
>
>BIOS can handover to OS in x2apic mode in some cases. x2apic_enabled()
>is used to check for that and it reads the MSR to check the status. Some
>early portions of the kernel boot will use it.
>
>For all others, we should be using x2apic_mode.
>
>thanks,
>suresh
>---
>
>From: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
>Subject: x86, vtd: fix the vt-d fault handling irq migration in the x2apic mode
>
>In x2apic mode, we need to set the upper address register of the fault
>handling interrupt register of the vt-d hardware. Without this,
>irq migration of the vt-d fault handling interrupt is broken.
>
>Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
>Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
>Cc: stable@kernel.org [v2.6.32+]
>---
> arch/x86/kernel/apic/io_apic.c |    1 +
> 1 file changed, 1 insertion(+)
>
>Index: tip/arch/x86/kernel/apic/io_apic.c
>===================================================================
>--- tip.orig/arch/x86/kernel/apic/io_apic.c
>+++ tip/arch/x86/kernel/apic/io_apic.c
>@@ -3367,6 +3367,7 @@ dmar_msi_set_affinity(struct irq_data *d
> 	msg.data |= MSI_DATA_VECTOR(cfg->vector);
> 	msg.address_lo &= ~MSI_ADDR_DEST_ID_MASK;
> 	msg.address_lo |= MSI_ADDR_DEST_ID(dest);
>+	msg.address_hi = MSI_ADDR_BASE_HI | MSI_ADDR_EXT_DEST_ID(dest);
> 
> 	dmar_msi_write(irq, &msg);
> 


---
印藤隆夫(INDOH Takao)
 E-Mail : indou.takao@jp.fujitsu.com
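
The distinction drawn in the quoted text between x2apic_enabled() (reads
the MSR, usable in early boot) and x2apic_mode (what everything else should
check) can be sketched in user-space C as follows. This is only an
illustration: the sketch_* names are hypothetical, the MSR value is passed in
as a plain integer instead of being read with rdmsr, and treating x2apic_mode
as a flag latched once during early boot is an assumption made for the
example. The bit positions used are the EN (bit 11) and EXTD (bit 10) bits of
IA32_APIC_BASE.

```c
#include <stdbool.h>
#include <stdint.h>

/* IA32_APIC_BASE bits used by this sketch. */
#define SKETCH_APIC_BASE_EXTD	(1ull << 10)	/* x2APIC mode enable */
#define SKETCH_APIC_BASE_EN	(1ull << 11)	/* APIC global enable */

/* Hypothetical stand-in for the kernel's x2apic_mode variable. */
static bool sketch_x2apic_mode;

/*
 * Hypothetical stand-in for x2apic_enabled(): decide from the raw MSR
 * value whether the BIOS handed over with x2apic already switched on.
 */
static bool sketch_x2apic_enabled(uint64_t apic_base_msr)
{
	return (apic_base_msr & SKETCH_APIC_BASE_EN) &&
	       (apic_base_msr & SKETCH_APIC_BASE_EXTD);
}

/* Early boot: look at the hardware state once and latch the result. */
static void sketch_early_init(uint64_t apic_base_msr)
{
	sketch_x2apic_mode = sketch_x2apic_enabled(apic_base_msr);
}
```

After sketch_early_init() has run, later code in this model would test
sketch_x2apic_mode instead of looking at the MSR value again.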


* [tip:x86/urgent] x86, vt-d: Quirk for masking vtd spec errors to platform error handling logic
  2010-12-06 20:26     ` Suresh Siddha
  2010-12-06 20:44       ` Jesse Barnes
@ 2010-12-14  1:15       ` tip-bot for Suresh Siddha
  1 sibling, 0 replies; 21+ messages in thread
From: tip-bot for Suresh Siddha @ 2010-12-14  1:15 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: linux-kernel, hpa, mingo, kaneshige.kenji, indou.takao, masbock,
	chrisw, suresh.b.siddha, tglx, hpa

Commit-ID:  254e42006c893f45bca48f313536fcba12206418
Gitweb:     http://git.kernel.org/tip/254e42006c893f45bca48f313536fcba12206418
Author:     Suresh Siddha <suresh.b.siddha@intel.com>
AuthorDate: Mon, 6 Dec 2010 12:26:30 -0800
Committer:  H. Peter Anvin <hpa@linux.intel.com>
CommitDate: Mon, 13 Dec 2010 16:51:51 -0800

x86, vt-d: Quirk for masking vtd spec errors to platform error handling logic

On platforms with the Intel 7500 chipset, there were some reports of system
hangs/NMIs during kexec/kdump with interrupt-remapping enabled.

During kdump, there is a window where the devices might still be using the old
kernel's interrupt information while the kdump kernel is coming up. This can
cause vt-d faults, as the interrupt configuration from the old kernel maps to
null IRTE entries in the new kernel etc. (Without interrupt-remapping enabled,
we still have the same issue, but in this case we will see benign spurious
interrupts hit the new kernel.)

Based on platform config settings, these platforms seem to generate NMI/SMI
when a vt-d fault happens, and there were reports that the resulting SMI causes
the system to hang.

Fix it by masking vt-d spec defined errors to platform error reporting logic.
VT-d spec related errors are already handled by the VT-d OS code, so there is
no need to report the same error through other channels.

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
LKML-Reference: <1291667190.2675.8.camel@sbsiddha-MOBL3.sc.intel.com>
Cc: stable@kernel.org	[v2.6.32+]
Reported-by: Max Asbock <masbock@linux.vnet.ibm.com>
Reported-and-tested-by: Takao Indoh <indou.takao@jp.fujitsu.com>
Acked-by: Chris Wright <chrisw@sous-sol.org>
Acked-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
---
 drivers/pci/quirks.c |   23 +++++++++++++++++++++++
 1 files changed, 23 insertions(+), 0 deletions(-)

diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index 6f9350c..36191ed 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -2764,6 +2764,29 @@ DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_RICOH, PCI_DEVICE_ID_RICOH_R5C832, ricoh_m
 DECLARE_PCI_FIXUP_RESUME_EARLY(PCI_VENDOR_ID_RICOH, PCI_DEVICE_ID_RICOH_R5C832, ricoh_mmc_fixup_r5c832);
 #endif /*CONFIG_MMC_RICOH_MMC*/
 
+#if defined(CONFIG_DMAR) || defined(CONFIG_INTR_REMAP)
+#define VTUNCERRMSK_REG	0x1ac
+#define VTD_MSK_SPEC_ERRORS	(1 << 31)
+/*
+ * This is a quirk for masking vt-d spec defined errors to platform error
+ * handling logic. Without this, platforms using Intel 7500, 5500 chipsets
+ * (and the derivative chipsets like X58 etc) seem to generate NMI/SMI (based
+ * on the RAS config settings of the platform) when a vt-d fault happens.
+ * The resulting SMI caused the system to hang.
+ *
+ * VT-d spec related errors are already handled by the VT-d OS code, so no
+ * need to report the same error through other channels.
+ */
+static void vtd_mask_spec_errors(struct pci_dev *dev)
+{
+	u32 word;
+
+	pci_read_config_dword(dev, VTUNCERRMSK_REG, &word);
+	pci_write_config_dword(dev, VTUNCERRMSK_REG, word | VTD_MSK_SPEC_ERRORS);
+}
+DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_INTEL, 0x342e, vtd_mask_spec_errors);
+DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_INTEL, 0x3c28, vtd_mask_spec_errors);
+#endif
 
 static void pci_do_fixups(struct pci_dev *dev, struct pci_fixup *f,
 			  struct pci_fixup *end)
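
The quirk above is a read-modify-write of one config-space register, applied
only to devices whose vendor/device ID matches. A minimal user-space model of
that flow might look like the sketch below. Everything prefixed sketch_ is
hypothetical: the struct, the byte-array config space, and the fixup loop
stand in for the PCI core and are not the kernel's API. The register offset,
the mask bit, and the two device IDs are taken from the patch. The sketch
writes the mask as 1u << 31 so the shift stays in unsigned arithmetic.

```c
#include <stdint.h>
#include <string.h>

#define SKETCH_VENDOR_INTEL	0x8086
#define SKETCH_VTUNCERRMSK_REG	0x1ac
#define SKETCH_MSK_SPEC_ERRORS	(1u << 31)

/* Hypothetical device model: IDs plus 4 KiB of extended config space. */
struct sketch_pci_dev {
	uint16_t vendor;
	uint16_t device;
	uint8_t config[4096];
};

static uint32_t sketch_read_dword(const struct sketch_pci_dev *dev, int off)
{
	uint32_t val;

	memcpy(&val, &dev->config[off], sizeof(val));
	return val;
}

static void sketch_write_dword(struct sketch_pci_dev *dev, int off,
			       uint32_t val)
{
	memcpy(&dev->config[off], &val, sizeof(val));
}

/* Same shape as vtd_mask_spec_errors(): set the mask bit, keep the rest. */
static void sketch_mask_spec_errors(struct sketch_pci_dev *dev)
{
	uint32_t word = sketch_read_dword(dev, SKETCH_VTUNCERRMSK_REG);

	sketch_write_dword(dev, SKETCH_VTUNCERRMSK_REG,
			   word | SKETCH_MSK_SPEC_ERRORS);
}

/* Stand-in for the early fixup pass: run the quirk on matching IDs only. */
static void sketch_run_early_fixups(struct sketch_pci_dev *dev)
{
	static const uint16_t ids[] = { 0x342e, 0x3c28 };
	size_t i;

	for (i = 0; i < sizeof(ids) / sizeof(ids[0]); i++)
		if (dev->vendor == SKETCH_VENDOR_INTEL && dev->device == ids[i])
			sketch_mask_spec_errors(dev);
}
```

In this model a device with ID 0x342e ends up with bit 31 set and its other
mask bits preserved, while a device with an unrelated ID is left untouched.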


* [tip:x86/urgent] x86, vt-d: Fix the vt-d fault handling irq migration in the x2apic mode
  2010-12-01 17:40     ` Suresh Siddha
  2010-12-07 17:38       ` Takao Indoh
@ 2010-12-14  1:16       ` tip-bot for Kenji Kaneshige
  1 sibling, 0 replies; 21+ messages in thread
From: tip-bot for Kenji Kaneshige @ 2010-12-14  1:16 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: linux-kernel, hpa, mingo, kaneshige.kenji, indou.takao, chrisw,
	suresh.b.siddha, tglx, hpa

Commit-ID:  086e8ced65d9bcc4a8e8f1cd39b09640f2883f90
Gitweb:     http://git.kernel.org/tip/086e8ced65d9bcc4a8e8f1cd39b09640f2883f90
Author:     Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
AuthorDate: Wed, 1 Dec 2010 09:40:32 -0800
Committer:  H. Peter Anvin <hpa@linux.intel.com>
CommitDate: Mon, 13 Dec 2010 16:52:52 -0800

x86, vt-d: Fix the vt-d fault handling irq migration in the x2apic mode

In x2apic mode, we need to set the upper address register of the fault
handling interrupt register of the vt-d hardware. Without this,
irq migration of the vt-d fault handling interrupt is broken.

Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
LKML-Reference: <1291225233.2648.39.camel@sbsiddha-MOBL3>
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: stable@kernel.org [v2.6.32+]
Acked-by: Chris Wright <chrisw@sous-sol.org>
Tested-by: Takao Indoh <indou.takao@jp.fujitsu.com>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
---
 arch/x86/kernel/apic/io_apic.c |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/arch/x86/kernel/apic/io_apic.c b/arch/x86/kernel/apic/io_apic.c
index 226060e..fadcd74 100644
--- a/arch/x86/kernel/apic/io_apic.c
+++ b/arch/x86/kernel/apic/io_apic.c
@@ -3412,6 +3412,7 @@ dmar_msi_set_affinity(struct irq_data *data, const struct cpumask *mask,
 	msg.data |= MSI_DATA_VECTOR(cfg->vector);
 	msg.address_lo &= ~MSI_ADDR_DEST_ID_MASK;
 	msg.address_lo |= MSI_ADDR_DEST_ID(dest);
+	msg.address_hi = MSI_ADDR_BASE_HI | MSI_ADDR_EXT_DEST_ID(dest);
 
 	dmar_msi_write(irq, &msg);
 

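Why the one-line change matters can be seen by splitting a destination APIC ID
by hand: the low 8 bits go into the lower address register and, as noted in
the review, MSI_ADDR_EXT_DEST_ID() supplies everything above them for the
upper address register. The following user-space sketch models that split. The
sketch_* names, the struct, and the constant values are assumptions chosen to
match the layout discussed in the thread; they are not copied from the
kernel's headers.

```c
#include <stdint.h>

/* Hypothetical constants for this sketch. */
#define SKETCH_MSI_ADDR_BASE_LO	0xfee00000u
#define SKETCH_MSI_ADDR_BASE_HI	0x0u
#define SKETCH_DEST_ID_SHIFT	12
#define SKETCH_DEST_ID_MASK	0x000ff000u	/* low 8 bits of the APIC ID */

struct sketch_msi_msg {
	uint32_t address_lo;
	uint32_t address_hi;
};

/* Low 8 bits of the destination, placed in the lower address word. */
static uint32_t sketch_dest_id(uint32_t dest)
{
	return (dest << SKETCH_DEST_ID_SHIFT) & SKETCH_DEST_ID_MASK;
}

/* Everything above the low 8 bits, kept in place for the upper word. */
static uint32_t sketch_ext_dest_id(uint32_t dest)
{
	return dest & 0xffffff00u;
}

/* Mirrors the shape of the fixed dmar_msi_set_affinity() update. */
static void sketch_set_dest(struct sketch_msi_msg *msg, uint32_t dest)
{
	msg->address_lo &= ~SKETCH_DEST_ID_MASK;
	msg->address_lo |= sketch_dest_id(dest);
	msg->address_hi = SKETCH_MSI_ADDR_BASE_HI | sketch_ext_dest_id(dest);
}
```

For an APIC ID that fits in 8 bits the upper word stays zero, which is why the
explicit x2apic_mode test could be dropped; for a wider ID, omitting the
address_hi update would silently lose the upper bits of the destination.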

* [tip:x86/urgent] x86: Enable the intr-remap fault handling after local APIC setup
  2010-12-01  6:22 ` [patch 3/4] x86: enable the intr-remap fault handling after local apic setup Suresh Siddha
  2010-12-01  8:51   ` Chris Wright
@ 2010-12-14  1:16   ` tip-bot for Kenji Kaneshige
  1 sibling, 0 replies; 21+ messages in thread
From: tip-bot for Kenji Kaneshige @ 2010-12-14  1:16 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: linux-kernel, hpa, mingo, chrisw, suresh.b.siddha,
	kaneshige.kenji, tglx, hpa

Commit-ID:  7f7fbf45c6b748074546f7f16b9488ca71de99c1
Gitweb:     http://git.kernel.org/tip/7f7fbf45c6b748074546f7f16b9488ca71de99c1
Author:     Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
AuthorDate: Tue, 30 Nov 2010 22:22:28 -0800
Committer:  H. Peter Anvin <hpa@linux.intel.com>
CommitDate: Mon, 13 Dec 2010 16:53:32 -0800

x86: Enable the intr-remap fault handling after local APIC setup

Interrupt-remapping gets enabled very early in the boot, as it determines the
apic mode that the processor can use. The current code enables the vt-d
fault handling before setup_local_APIC(), so the APIC LDR registers
and data structures in memory may not be initialized yet. As a result, the
vt-d fault handling in logical xapic/x2apic modes was broken.

Fix this by enabling the vt-d fault handling in end_local_APIC_setup().

A cleaner fix of enabling fault handling while enabling intr-remapping
will be addressed for v2.6.38. [ Enabling intr-remapping determines the
usage of x2apic mode and the apic mode determines the fault-handling
configuration. ]

Signed-off-by: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
LKML-Reference: <20101201062244.541996375@intel.com>
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
Cc: stable@kernel.org [v2.6.32+]
Acked-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
---
 arch/x86/kernel/apic/apic.c     |    8 ++++++++
 arch/x86/kernel/apic/probe_64.c |    7 -------
 2 files changed, 8 insertions(+), 7 deletions(-)

diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c
index 3f838d5..7821813 100644
--- a/arch/x86/kernel/apic/apic.c
+++ b/arch/x86/kernel/apic/apic.c
@@ -1389,6 +1389,14 @@ void __cpuinit end_local_APIC_setup(void)
 
 	setup_apic_nmi_watchdog(NULL);
 	apic_pm_activate();
+
+	/*
+	 * Now that local APIC setup is completed for BP, configure the fault
+	 * handling for interrupt remapping.
+	 */
+	if (!smp_processor_id() && intr_remapping_enabled)
+		enable_drhd_fault_handling();
+
 }
 
 #ifdef CONFIG_X86_X2APIC
diff --git a/arch/x86/kernel/apic/probe_64.c b/arch/x86/kernel/apic/probe_64.c
index f9e4e6a..d8c4a6f 100644
--- a/arch/x86/kernel/apic/probe_64.c
+++ b/arch/x86/kernel/apic/probe_64.c
@@ -79,13 +79,6 @@ void __init default_setup_apic_routing(void)
 		/* need to update phys_pkg_id */
 		apic->phys_pkg_id = apicid_phys_pkg_id;
 	}
-
-	/*
-	 * Now that apic routing model is selected, configure the
-	 * fault handling for intr remapping.
-	 */
-	if (intr_remapping_enabled)
-		enable_drhd_fault_handling();
 }
 
 /* Same for both flat and physical. */
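
The ordering this patch establishes (fault handling is enabled once, from the
boot processor, only after its local APIC setup has finished) can be modelled
with a few lines of user-space C. This is a sketch under stated assumptions:
the sketch_* names and the per-CPU array are hypothetical, and CPU 0 stands in
for the boot processor as in the !smp_processor_id() test above.

```c
#include <stdbool.h>

#define SKETCH_NR_CPUS 8

static bool sketch_intr_remapping_enabled = true;
static bool sketch_local_apic_ready[SKETCH_NR_CPUS];
static int sketch_fault_handling_enables;

/* Stand-in for enable_drhd_fault_handling(); just counts invocations. */
static void sketch_enable_fault_handling(void)
{
	sketch_fault_handling_enables++;
}

/*
 * Stand-in for end_local_APIC_setup(): runs on every CPU, but only the
 * boot processor (CPU 0 here) enables fault handling, and only after its
 * own local APIC state has been marked ready.
 */
static void sketch_end_local_apic_setup(int cpu)
{
	sketch_local_apic_ready[cpu] = true;

	if (cpu == 0 && sketch_intr_remapping_enabled)
		sketch_enable_fault_handling();
}
```

Calling sketch_end_local_apic_setup() for CPUs 0, 1 and 2 in this model
enables fault handling exactly once, at a point where CPU 0's APIC state is
already set up.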


* [tip:x86/urgent] x86, vt-d: Handle previous faults after enabling fault handling
  2010-12-01  6:22 ` [patch 4/4] vt-d: handle previous faults after enabling fault handling Suresh Siddha
  2010-12-01  8:52   ` Chris Wright
@ 2010-12-14  1:17   ` tip-bot for Suresh Siddha
  1 sibling, 0 replies; 21+ messages in thread
From: tip-bot for Suresh Siddha @ 2010-12-14  1:17 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: linux-kernel, hpa, mingo, chrisw, suresh.b.siddha, tglx, hpa

Commit-ID:  7f99d946e71e71d484b7543b49e990508e70d0c0
Gitweb:     http://git.kernel.org/tip/7f99d946e71e71d484b7543b49e990508e70d0c0
Author:     Suresh Siddha <suresh.b.siddha@intel.com>
AuthorDate: Tue, 30 Nov 2010 22:22:29 -0800
Committer:  H. Peter Anvin <hpa@linux.intel.com>
CommitDate: Mon, 13 Dec 2010 16:53:57 -0800

x86, vt-d: Handle previous faults after enabling fault handling

Fault handling gets enabled after enabling interrupt-remapping (as
the success of interrupt-remapping can affect the apic mode and hence the
fault handling mode).

Hence there can potentially be some faults in the window between enabling
interrupt-remapping in the vt-d and enabling the fault handling of the vt-d
units.

Handle any previous faults after enabling the vt-d fault handling.

For the v2.6.38 cleanup, we need to check if we can remove the dmar_fault() in
enable_intr_remapping() and see if we can enable fault handling along with
enabling intr-remapping.

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
LKML-Reference: <20101201062244.630417138@intel.com>
Cc: stable@kernel.org [v2.6.32+]
Acked-by: Chris Wright <chrisw@sous-sol.org>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
---
 drivers/pci/dmar.c |    5 +++++
 1 files changed, 5 insertions(+), 0 deletions(-)

diff --git a/drivers/pci/dmar.c b/drivers/pci/dmar.c
index 0157708..09933eb 100644
--- a/drivers/pci/dmar.c
+++ b/drivers/pci/dmar.c
@@ -1417,6 +1417,11 @@ int __init enable_drhd_fault_handling(void)
 			       (unsigned long long)drhd->reg_base_addr, ret);
 			return -1;
 		}
+
+		/*
+		 * Clear any previous faults.
+		 */
+		dmar_fault(iommu->irq, iommu);
 	}
 
 	return 0;
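
The pattern in this patch is "enable the handler, then run it once by hand to
drain whatever was latched while nobody was listening". A small user-space
model of that pattern is sketched below. It is not a model of the real VT-d
fault recording registers: the struct, the counters, and every sketch_* name
are hypothetical and exist only to show the ordering.

```c
#include <stdbool.h>

/* Hypothetical model of one remapping unit. */
struct sketch_iommu {
	int pending_faults;	/* faults latched but not yet processed */
	int handled_faults;	/* faults the handler has consumed */
	bool irq_enabled;	/* is the fault interrupt wired up? */
};

/* Stand-in for dmar_fault(): consume everything that is pending. */
static void sketch_fault_handler(struct sketch_iommu *iommu)
{
	iommu->handled_faults += iommu->pending_faults;
	iommu->pending_faults = 0;
}

/* Hardware side: latch a fault; interrupt the OS only if enabled. */
static void sketch_record_fault(struct sketch_iommu *iommu)
{
	iommu->pending_faults++;
	if (iommu->irq_enabled)
		sketch_fault_handler(iommu);
}

/* Enable the interrupt, then clear faults from the earlier window. */
static void sketch_enable_fault_handling(struct sketch_iommu *iommu)
{
	iommu->irq_enabled = true;
	sketch_fault_handler(iommu);
}
```

In this model, faults recorded before sketch_enable_fault_handling() stay
pending until the explicit handler call drains them; without that call they
would sit there until the next fault happened to raise an interrupt.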

