From: Mathias Nyman <mathias.nyman@linux.intel.com>
To: x86@kernel.org
Cc: linux-pci <linux-pci@vger.kernel.org>,
Thomas Gleixner <tglx@linutronix.de>,
LKML <linux-kernel@vger.kernel.org>,
Bjorn Helgaas <bhelgaas@google.com>,
Evan Green <evgreen@chromium.org>,
"Ghorai, Sukumar" <sukumar.ghorai@intel.com>,
"Amara, Madhusudanarao" <madhusudanarao.amara@intel.com>,
"Nandamuri, Srikanth" <srikanth.nandamuri@intel.com>
Subject: MSI interrupt for xhci still lost on 5.6-rc6 after cpu hotplug
Date: Wed, 18 Mar 2020 21:25:39 +0200
Message-ID: <806c51fa-992b-33ac-61a9-00a606f82edb@linux.intel.com>
Hi,

I can still reproduce the lost MSI interrupt issue on 5.6-rc6, which includes
the "Plug non-maskable MSI affinity race" patch.

I see this on a couple of platforms. I'm running a script that first generates
a lot of USB traffic and then, in a busy loop, sets IRQ affinity and turns CPUs
off and on:
for i in 1 3 5 7; do
	echo "1" > /sys/devices/system/cpu/cpu$i/online
done
echo "A" > "/proc/irq/*/smp_affinity"
echo "A" > "/proc/irq/*/smp_affinity"
echo "F" > "/proc/irq/*/smp_affinity"
for i in 1 3 5 7; do
	echo "0" > /sys/devices/system/cpu/cpu$i/online
done
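As written, the quoted glob in "/proc/irq/*/smp_affinity" is not expanded by the shell, so the affinity writes need an explicit loop over the per-IRQ files. A minimal runnable sketch of one iteration of the sequence above follows. The function names and the SYSFS_CPU and PROC_IRQ override variables are hypothetical additions for illustration (they allow a dry run against a scratch directory); the CPU list and the masks are taken from the script above.

```shell
#!/bin/sh
# Sketch only: SYSFS_CPU and PROC_IRQ are hypothetical overrides that default
# to the real kernel paths, so the functions can be tried on a scratch tree.
SYSFS_CPU="${SYSFS_CPU:-/sys/devices/system/cpu}"
PROC_IRQ="${PROC_IRQ:-/proc/irq}"

# Write one hex CPU mask to every IRQ's smp_affinity file.
set_all_affinity() {
	mask="$1"
	for f in "$PROC_IRQ"/*/smp_affinity; do
		[ -e "$f" ] || continue
		# Some IRQs reject affinity changes; ignore those failures.
		{ echo "$mask" > "$f"; } 2>/dev/null || true
	done
}

# Bring CPUs 1, 3, 5 and 7 online (1) or offline (0).
set_cpus_online() {
	state="$1"
	for i in 1 3 5 7; do
		echo "$state" > "$SYSFS_CPU/cpu$i/online"
	done
}

# One pass of the sequence from the mail: online, A, A, F, offline.
stress_once() {
	set_cpus_online 1
	set_all_affinity A
	set_all_affinity A
	set_all_affinity F
	set_cpus_online 0
}
```

Calling stress_once in an endless loop as root (for example `while :; do stress_once; done`) while USB traffic is running approximates the busy loop described above.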
I added some very simple debugging, but I don't really know what to look for.
xhci interrupts (IRQ 122) just stop after one msi_set_affinity() call, even
though the IRQ survived many similar msi_set_affinity() calls before that.
I'm not that familiar with the inner workings of this, but I'll be happy to
help out with adding debugging and testing patches.
Details:
cat /proc/interrupts
         CPU0    CPU1    CPU2    CPU3    CPU4    CPU5    CPU6    CPU7
   0:      26       0       0       0       0       0       0       0  IO-APIC  2-edge       timer
   1:       0       0       0       0       0       7       0       0  IO-APIC  1-edge       i8042
   4:       0       4   59941       0       0       0       0       0  IO-APIC  4-edge       ttyS0
   8:       0       0       0       0       0       0       1       0  IO-APIC  8-edge       rtc0
   9:       0      40       8       0       0       0       0       0  IO-APIC  9-fasteoi    acpi
  16:       0       0       0       0       0       0       0       0  IO-APIC  16-fasteoi   i801_smbus
 120:       0       0     293       0       0       0       0       0  PCI-MSI  32768-edge   i915
 121:     728       0       0      58       0       0       0       0  PCI-MSI  520192-edge  enp0s31f6
 122:   63575    2271       0    1957    7262       0       0       0  PCI-MSI  327680-edge  xhci_hcd
 123:       0       0       0       0       0       0       0       0  PCI-MSI  514048-edge  snd_hda_intel:card0
 NMI:       0       0       0       0       0       0       0       0  Non-maskable interrupts
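To confirm that an interrupt has stopped, rather than merely slowed down, the per-CPU counters on its line can be summed and compared across two samples. A small sketch, assuming the /proc/interrupts layout shown above; irq_total is a hypothetical helper name, not an existing tool.

```shell
#!/bin/sh
# irq_total FILE IRQ
# Sum the per-CPU counters of one IRQ line in /proc/interrupts format.
# The counters are the consecutive all-digit fields that follow the "IRQ:" label.
irq_total() {
	awk -v irq="$2:" '
		$1 == irq {
			sum = 0
			for (i = 2; i <= NF && $i ~ /^[0-9]+$/; i++)
				sum += $i
			print sum
		}
	' "$1"
}
```

Running `irq_total /proc/interrupts 122` twice, a few seconds apart, while USB traffic is active should show a growing total; an unchanged total suggests the interrupt is no longer being delivered. For the dump above the xhci_hcd line sums to 75065.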
trace snippet:
<idle>-0 [001] d.h. 129.676900: xhci_irq: xhci irq
<idle>-0 [001] d.h. 129.677507: xhci_irq: xhci irq
<idle>-0 [001] d.h. 129.677556: xhci_irq: xhci irq
<idle>-0 [001] d.h. 129.677647: xhci_irq: xhci irq
<...>-14 [001] d..1 129.679802: msi_set_affinity: direct update msi 122, vector 33 -> 33, apicid: 2 -> 6
<idle>-0 [003] d.h. 129.682639: xhci_irq: xhci irq
<idle>-0 [003] d.h. 129.682769: xhci_irq: xhci irq
<idle>-0 [003] d.h. 129.682908: xhci_irq: xhci irq
<idle>-0 [003] d.h. 129.683552: xhci_irq: xhci irq
<idle>-0 [003] d.h. 129.683677: xhci_irq: xhci irq
<idle>-0 [003] d.h. 129.683819: xhci_irq: xhci irq
<idle>-0 [003] d.h. 129.689017: xhci_irq: xhci irq
<idle>-0 [003] d.h. 129.689140: xhci_irq: xhci irq
<idle>-0 [003] d.h. 129.689307: xhci_irq: xhci irq
<idle>-0 [003] d.h. 129.689984: xhci_irq: xhci irq
<idle>-0 [003] d.h. 129.690107: xhci_irq: xhci irq
<idle>-0 [003] d.h. 129.690278: xhci_irq: xhci irq
<idle>-0 [003] d.h. 129.695541: xhci_irq: xhci irq
<idle>-0 [003] d.h. 129.695674: xhci_irq: xhci irq
<idle>-0 [003] d.h. 129.695839: xhci_irq: xhci irq
<idle>-0 [003] d.H. 129.696667: xhci_irq: xhci irq
<idle>-0 [003] d.h. 129.696797: xhci_irq: xhci irq
<idle>-0 [003] d.h. 129.696973: xhci_irq: xhci irq
<idle>-0 [003] d.h. 129.702288: xhci_irq: xhci irq
<idle>-0 [003] d.h. 129.702380: xhci_irq: xhci irq
<idle>-0 [003] d.h. 129.702493: xhci_irq: xhci irq
migration/3-24 [003] d..1 129.703150: msi_set_affinity: direct update msi 122, vector 33 -> 33, apicid: 6 -> 0
kworker/0:0-5 [000] d.h. 131.328790: msi_set_affinity: direct update msi 121, vector 34 -> 34, apicid: 0 -> 0
kworker/0:0-5 [000] d.h. 133.312704: msi_set_affinity: direct update msi 121, vector 34 -> 34, apicid: 0 -> 0
kworker/0:0-5 [000] d.h. 135.360786: msi_set_affinity: direct update msi 121, vector 34 -> 34, apicid: 0 -> 0
<idle>-0 [000] d.h. 137.344694: msi_set_affinity: direct update msi 121, vector 34 -> 34, apicid: 0 -> 0
kworker/0:0-5 [000] d.h. 139.128679: msi_set_affinity: direct update msi 121, vector 34 -> 34, apicid: 0 -> 0
kworker/0:0-5 [000] d.h. 141.312686: msi_set_affinity: direct update msi 121, vector 34 -> 34, apicid: 0 -> 0
kworker/0:0-5 [000] d.h. 143.360703: msi_set_affinity: direct update msi 121, vector 34 -> 34, apicid: 0 -> 0
kworker/0:0-5 [000] d.h. 145.344791: msi_set_affinity: direct update msi 121, vector 34 -> 34, apicid: 0 -> 0
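One way to locate the affinity change after which the interrupts stop is to scan a saved trace for the last msi_set_affinity event of the IRQ and count the xhci_irq events that follow it. A sketch, assuming the "direct update" trace_printk format shown in this mail and a trace saved to a file; the helper name is hypothetical.

```shell
#!/bin/sh
# events_after_last_affinity FILE IRQ
# Print the last "direct update" msi_set_affinity line for IRQ, then the
# number of xhci_irq events that appear after it in the trace.
events_after_last_affinity() {
	awk -v pat="msi_set_affinity: direct update msi $2," '
		index($0, pat) { last = $0; n = 0; next }
		/xhci_irq:/    { n++ }
		END {
			if (last != "") {
				print last
				print n
			}
		}
	' "$1"
}
```

For the snippet above and IRQ 122 this reports the 129.703150 update (apicid 6 -> 0) followed by a count of 0.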
--- a/arch/x86/kernel/apic/msi.c
+++ b/arch/x86/kernel/apic/msi.c
@@ -92,6 +92,10 @@ msi_set_affinity(struct irq_data *irqd, const struct cpumask *mask, bool force)
 	    cfg->vector == old_cfg.vector ||
 	    old_cfg.vector == MANAGED_IRQ_SHUTDOWN_VECTOR ||
 	    cfg->dest_apicid == old_cfg.dest_apicid) {
+		trace_printk("direct update msi %u, vector %u -> %u, apicid: %u -> %u\n",
+			     irqd->irq,
+			     old_cfg.vector, cfg->vector,
+			     old_cfg.dest_apicid, cfg->dest_apicid);
 		irq_msi_update_msg(irqd, cfg);
 		return ret;
 	}
@@ -134,7 +138,10 @@ msi_set_affinity(struct irq_data *irqd, const struct cpumask *mask, bool force)
 	 */
 	if (IS_ERR_OR_NULL(this_cpu_read(vector_irq[cfg->vector])))
 		this_cpu_write(vector_irq[cfg->vector], VECTOR_RETRIGGERED);
-
+	trace_printk("twostep update msi, irq %u, vector %u -> %u, apicid: %u -> %u\n",
+		     irqd->irq,
+		     old_cfg.vector, cfg->vector,
+		     old_cfg.dest_apicid, cfg->dest_apicid);
 	/* Redirect it to the new vector on the local CPU temporarily */
 	old_cfg.vector = cfg->vector;
 	irq_msi_update_msg(irqd, &old_cfg);
Thanks
-Mathias