linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Chris J Arges <chris.j.arges@canonical.com>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Rafael David Tinoco <inaddy@ubuntu.com>,
	Ingo Molnar <mingo@kernel.org>, Peter Anvin <hpa@zytor.com>,
	Jiang Liu <jiang.liu@linux.intel.com>,
	Peter Zijlstra <peterz@infradead.org>,
	LKML <linux-kernel@vger.kernel.org>, Jens Axboe <axboe@kernel.dk>,
	Frederic Weisbecker <fweisbec@gmail.com>,
	Gema Gomez <gema.gomez-solano@canonical.com>,
	the arch/x86 maintainers <x86@kernel.org>
Subject: Re: smp_call_function_single lockups
Date: Mon, 30 Mar 2015 22:15:36 -0500	[thread overview]
Message-ID: <20150331031536.GA9303@canonical.com> (raw)
In-Reply-To: <CA+55aFxFkw7cKu6R8-v9z=c+yG+jsPHyQKW5-yyn3+M0BuyvxA@mail.gmail.com>

On Thu, Feb 19, 2015 at 02:45:54PM -0800, Linus Torvalds wrote:
> On Thu, Feb 19, 2015 at 1:59 PM, Linus Torvalds
> <torvalds@linux-foundation.org> wrote:
> >
> > Is this worth looking at? Or is it something spurious? I might have
> > gotten the vectors wrong, and maybe the warning is not because the ISR
> > bit isn't set, but because I test the wrong bit.
> 
> I edited the patch to do ratelimiting (one per 10s max) rather than
> "once". And tested it some more. It seems to work correctly. The irq
> case during 8042 probing is not repeatable, and I suspect it happens
> because the interrupt source goes away (some probe-time thing that
> first triggers an interrupt, but then clears it itself), so it doesn't
> happen every boot, and I've gotten it with slightly different
> backtraces.
> 
> But it's the only warning that happens for me, so I think my code is
> right (at least for the cases that trigger on this machine). It's
> definitely not a "every interrupt causes the warning because the code
> was buggy, and the WARN_ONCE() just printed the first one".
> 
> It would be interesting to hear if others see spurious APIC EOI cases
> too. In particular, the people seeing the IPI lockup. Because a lot of
> the lockups we've seen have *looked* like the IPI interrupt just never
> happened, and so we're waiting forever for the target CPU to react to
> it. And just maybe the spurious EOI could cause the wrong bit to be
> cleared in the ISR, and then the interrupt never shows up. Something
> like that would certainly explain why it only happens on some machines
> and under certain timing circumstances.
> 
>                     Linus
> 

Linus,

I'm able to reproduce this IPI lockup easily now when using specific hardware
and nested KVM VMs. However, it seems to only occur when using certain host
hardware. For example:
- Xeon E5620 (Westmere-EP) (fam: 06, model: 2c)
- Xeon E312xx (Sandy Bridge) (fam: 06, model: 2a)
Now I'm not sure if this indicates hardware LAPIC issues or if the timing in
these processors make it more likely to hit this issue. So far I haven't seen
the issue on other CPUs.

To set this up, I've done the following (L0 being the Xeon box):
1) Create an L1 KVM VM with 2 vCPUs (single vCPU case doesn't reproduce)
2) Create an L2 KVM VM inside the L1 VM with 1 vCPU
3) Run something like 'stress -c 1 -m 1 -d 1 -t 1200' inside the L2 VM

Sometimes this is sufficient to reproduce the issue, I've observed that running
KSM in the L1 VM can agitate this issue (it calls native_flush_tlb_others).
If this doesn't reproduce then you can do the following:
4) Migrate the L2 vCPU randomly (via virsh vcpupin --live  OR tasksel) between
L1 vCPUs until the hang occurs.

Pinning L1 vCPUs to L0 initially (i.e. 0->0 1->1) causes the hangs to not occur
in my testing. (Which makes sense because we're unlikely to do real IPIs.)

I've been able to repro with your patch and observed the WARN_ON when booting a
VM on affected hardware and non affected hardware:

[   13.613531] ------------[ cut here ]------------
[   13.613531] WARNING: CPU: 0 PID: 0 at ./arch/x86/include/asm/apic.h:444 apic_ack_edge+0x84/0x90()
[   13.613531] Modules linked in: ipt_MASQUERADE(E) nf_nat_masquerade_ipv4(E) iptable_nat(E) nf_nat_ipv4(E) nf_nat(E) nf_conntrack_ipv4(E) nf_defrag_ipv4(E) xt_conntrack(E) nf_conntrack(E) ipt_REJECT(E) nf_reject_ipv4(E) xt_CHECKSUM(E) iptable_mangle(E) xt_tcpudp(E) bridge(E) stp(E) llc(E) ip6table_filter(E) ip6_tables(E) iptable_filter(E) ip_tables(E) ebtable_nat(E) ebtables(E) x_tables(E) dm_crypt(E) ppdev(E) kvm_intel(E) kvm(E) serio_raw(E) i2c_piix4(E) parport_pc(E) pvpanic(E) parport(E) mac_hid(E) nls_utf8(E) isofs(E) psmouse(E) floppy(E) cirrus(E) syscopyarea(E) sysfillrect(E) sysimgblt(E) ttm(E) drm_kms_helper(E) drm(E) pata_acpi(E)
[   13.613531] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G            E   4.0.0-rc6-460f8calinus1 #4
[   13.613531] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
[   13.613531]  ffffffff81a911a6 ffff88013fc03eb8 ffffffff817999a7 0000000000000007
[   13.613531]  0000000000000000 ffff88013fc03ef8 ffffffff810734ca 0000000000000092
[   13.613531]  0000000000000011 ffff8800b8bbac00 000000000000001f 00000000000000b1
[   13.613531] Call Trace:
[   13.613531]  <IRQ>  [<ffffffff817999a7>] dump_stack+0x45/0x57
[   13.613531]  [<ffffffff810734ca>] warn_slowpath_common+0x8a/0xc0
[   13.613531]  [<ffffffff810735ba>] warn_slowpath_null+0x1a/0x20
[   13.613531]  [<ffffffff8104d3f4>] apic_ack_edge+0x84/0x90
[   13.613531]  [<ffffffff810cf8e7>] handle_edge_irq+0x57/0x120
[   13.613531]  [<ffffffff81016aa2>] handle_irq+0x22/0x40
[   13.613531]  [<ffffffff817a3b9f>] do_IRQ+0x4f/0x140
[   13.613531]  [<ffffffff817a196d>] common_interrupt+0x6d/0x6d
[   13.613531]  <EOI>  [<ffffffff810def08>] ? hrtimer_start+0x18/0x20
[   13.613531]  [<ffffffff8105a356>] ? native_safe_halt+0x6/0x10
[   13.613531]  [<ffffffff810d5623>] ? rcu_eqs_enter+0xa3/0xb0
[   13.613531]  [<ffffffff8101ecde>] default_idle+0x1e/0xc0
[   13.613531]  [<ffffffff8101f6cf>] arch_cpu_idle+0xf/0x20
[   13.613531]  [<ffffffff810b5d6f>] cpu_startup_entry+0x2ff/0x420
[   13.613531]  [<ffffffff8178e9b7>] rest_init+0x77/0x80
[   13.613531]  [<ffffffff81d50f97>] start_kernel+0x43c/0x449
[   13.613531]  [<ffffffff81d50120>] ? early_idt_handlers+0x120/0x120
[   13.613531]  [<ffffffff81d504d7>] x86_64_start_reservations+0x2a/0x2c
[   13.613531]  [<ffffffff81d5062b>] x86_64_start_kernel+0x152/0x161
[   13.613531] ---[ end trace 4512c19aad733ce8 ]---

I modified the posted patch with the following:
diff --git a/arch/x86/include/asm/apic.h b/arch/x86/include/asm/apic.h
index bf32309..dc3e192 100644
--- a/arch/x86/include/asm/apic.h
+++ b/arch/x86/include/asm/apic.h
@@ -441,7 +441,7 @@ static inline void ack_APIC_irq(int vector)
        if (vector >= 16) {
                unsigned v = apic_read(APIC_ISR + ((vector & ~0x1f) >> 1));
                v >>= vector & 0x1f;
-               WARN_ON_ONCE(!(v & 1));
+               WARN(!(v & 1), "ack_APIC_irq: vector = %0x\n", vector);
        }
        /*
         * ack_APIC_irq() actually gets compiled as a single instruction

And it showed vector = 1b when booting. However, when I run the reproducer on
an affected machine I get the following WARNs before the hang:

[   36.301282] ------------[ cut here ]------------
[   36.301299] WARNING: CPU: 0 PID: 0 at ./arch/x86/include/asm/apic.h:444 apic_ack_edge+0x93/0xa0()
[   36.301301] ack_APIC_irq: vector = e1
[   36.301303] Modules linked in: ipt_MASQUERADE(E) nf_nat_masquerade_ipv4(E) iptable_nat(E) nf_nat_ipv4(E) nf_nat(E) nf_conntrack_ipv4(E) nf_defrag_ipv4(E) xt_conntrack(E) nf_conntrack(E) ipt_REJECT(E) nf_reject_ipv4(E) xt_CHECKS
UM(E) iptable_mangle(E) xt_tcpudp(E) bridge(E) stp(E) llc(E) ip6table_filter(E) ip6_tables(E) iptable_filter(E) ip_tables(E) ebtable_nat(E) ebtables(E) x_tables(E) dm_crypt(E) ppdev(E) kvm_intel(E) kvm(E) parport_pc(E) pvpanic(E) 
parport(E) serio_raw(E) mac_hid(E) i2c_piix4(E) nls_utf8(E) isofs(E) cirrus(E) psmouse(E) syscopyarea(E) sysfillrect(E) sysimgblt(E) ttm(E) floppy(E) drm_kms_helper(E) drm(E) pata_acpi(E)
[   36.301344] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G        W   E   4.0.0-rc6-655d7f2linus2 #5
[   36.301346] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
[   36.301348]  ffffffff81a910f2 ffff88013fc03e68 ffffffff817998e7 0000000000000007
[   36.301352]  ffff88013fc03eb8 ffff88013fc03ea8 ffffffff810734ca ffffffff81c0c0f8
[   36.301355]  0000000000000001 00000000000000e1 0000000000000019 0000000000000051
[   36.301359] Call Trace:
[   36.301361]  <IRQ>  [<ffffffff817998e7>] dump_stack+0x45/0x57
[   36.301373]  [<ffffffff810734ca>] warn_slowpath_common+0x8a/0xc0
[   36.301377]  [<ffffffff81073546>] warn_slowpath_fmt+0x46/0x50
[   36.301383]  [<ffffffff81405420>] ? pci_msi_unmask_irq+0x10/0x20
[   36.301387]  [<ffffffff8104d3b3>] apic_ack_edge+0x93/0xa0
[   36.301392]  [<ffffffff810cf8e7>] handle_edge_irq+0x57/0x120
[   36.301399]  [<ffffffff81016aa2>] handle_irq+0x22/0x40
[   36.301403]  [<ffffffff817a3adf>] do_IRQ+0x4f/0x140
[   36.301407]  [<ffffffff817a18ad>] common_interrupt+0x6d/0x6d
[   36.301409]  <EOI>  [<ffffffff810def08>] ? hrtimer_start+0x18/0x20
[   36.301417]  [<ffffffff8105a356>] ? native_safe_halt+0x6/0x10
[   36.301422]  [<ffffffff810d5623>] ? rcu_eqs_enter+0xa3/0xb0
[   36.301426]  [<ffffffff8101ecce>] default_idle+0x1e/0xc0
[   36.301430]  [<ffffffff8101f6bf>] arch_cpu_idle+0xf/0x20
[   36.301433]  [<ffffffff810b5d6f>] cpu_startup_entry+0x2ff/0x420
[   36.301438]  [<ffffffff8178e8f7>] rest_init+0x77/0x80
[   36.301443]  [<ffffffff81d50f97>] start_kernel+0x43c/0x449
[   36.301446]  [<ffffffff81d50120>] ? early_idt_handlers+0x120/0x120
[   36.301450]  [<ffffffff81d504d7>] x86_64_start_reservations+0x2a/0x2c
[   36.301453]  [<ffffffff81d5062b>] x86_64_start_kernel+0x152/0x161
[   36.301455] ---[ end trace 8f10e066e6d6eddf ]---
[   40.430515] ------------[ cut here ]------------
[   40.430531] WARNING: CPU: 0 PID: 0 at ./arch/x86/include/asm/apic.h:444 apic_ack_edge+0x93/0xa0()
[   40.430533] ack_APIC_irq: vector = 22
[   40.430535] Modules linked in: ipt_MASQUERADE(E) nf_nat_masquerade_ipv4(E) iptable_nat(E) nf_nat_ipv4(E) nf_nat(E) nf_conntrack_ipv4(E) nf_defrag_ipv4(E) xt_conntrack(E) nf_conntrack(E) ipt_REJECT(E) nf_reject_ipv4(E) xt_CHECKS
UM(E) iptable_mangle(E) xt_tcpudp(E) bridge(E) stp(E) llc(E) ip6table_filter(E) ip6_tables(E) iptable_filter(E) ip_tables(E) ebtable_nat(E) ebtables(E) x_tables(E) dm_crypt(E) ppdev(E) kvm_intel(E) kvm(E) parport_pc(E) pvpanic(E) 
parport(E) serio_raw(E) mac_hid(E) i2c_piix4(E) nls_utf8(E) isofs(E) cirrus(E) psmouse(E) syscopyarea(E) sysfillrect(E) sysimgblt(E) ttm(E) floppy(E) drm_kms_helper(E) drm(E) pata_acpi(E)
[   40.430576] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G        W   E   4.0.0-rc6-655d7f2linus2 #5
[   40.430578] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
[   40.430581]  ffffffff81a910f2 ffff88013fc03e68 ffffffff817998e7 0000000000000007
[   40.430584]  ffff88013fc03eb8 ffff88013fc03ea8 ffffffff810734ca 0000000000000202
[   40.430587]  0000000000000002 0000000000000022 000000000000001d 0000000000000091
[   40.430591] Call Trace:
[   40.430593]  <IRQ>  [<ffffffff817998e7>] dump_stack+0x45/0x57
[   40.430605]  [<ffffffff810734ca>] warn_slowpath_common+0x8a/0xc0
[   40.430609]  [<ffffffff81073546>] warn_slowpath_fmt+0x46/0x50
[   40.430615]  [<ffffffff81405420>] ? pci_msi_unmask_irq+0x10/0x20
[   40.430631]  [<ffffffff8104d3b3>] apic_ack_edge+0x93/0xa0
[   40.430637]  [<ffffffff810cf8e7>] handle_edge_irq+0x57/0x120
[   40.430643]  [<ffffffff81016aa2>] handle_irq+0x22/0x40
[   40.430647]  [<ffffffff817a3adf>] do_IRQ+0x4f/0x140
[   40.430651]  [<ffffffff817a18ad>] common_interrupt+0x6d/0x6d
[   40.430653]  <EOI>  [<ffffffff810def08>] ? hrtimer_start+0x18/0x20
[   40.430661]  [<ffffffff8105a356>] ? native_safe_halt+0x6/0x10
[   40.430666]  [<ffffffff810d5623>] ? rcu_eqs_enter+0xa3/0xb0
[   40.430670]  [<ffffffff8101ecce>] default_idle+0x1e/0xc0
[   40.430674]  [<ffffffff8101f6bf>] arch_cpu_idle+0xf/0x20
[   40.430678]  [<ffffffff810b5d6f>] cpu_startup_entry+0x2ff/0x420
[   40.430684]  [<ffffffff8178e8f7>] rest_init+0x77/0x80
[   40.430689]  [<ffffffff81d50f97>] start_kernel+0x43c/0x449
[   40.430693]  [<ffffffff81d50120>] ? early_idt_handlers+0x120/0x120
[   40.430696]  [<ffffffff81d504d7>] x86_64_start_reservations+0x2a/0x2c
[   40.430699]  [<ffffffff81d5062b>] x86_64_start_kernel+0x152/0x161
[   40.430702] ---[ end trace 8f10e066e6d6ede0 ]---

So vector = e1 then 22 before the hang.

[   45.728212] kvm [1602]: vcpu0 disabled perfctr wrmsr: 0xc1 data 0xffff
[   53.391974] random: nonblocking pool is initialized
[   80.076005] NMI watchdog: BUG: soft lockup - CPU#0 stuck for 22s! [qemu-system-x86:1602]

Some things I've already done are:
- run with nox2apic on L0,L1 (so apic->name == "flat"), this still reproduces
- run with apic=off on L0,L1, this still reproduces

Anyway, maybe this sheds some more light on this issue. I can reproduce this at
will, so let me know of other experiments to do.

Thanks,
--chris j arges


  reply	other threads:[~2015-03-31  3:16 UTC|newest]

Thread overview: 77+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-02-11 13:19 smp_call_function_single lockups Rafael David Tinoco
2015-02-11 18:18 ` Linus Torvalds
2015-02-11 19:59   ` Linus Torvalds
2015-02-11 20:42     ` Linus Torvalds
2015-02-12 16:38       ` Rafael David Tinoco
2015-02-18 22:25       ` Peter Zijlstra
2015-02-19 15:42         ` Rafael David Tinoco
2015-02-19 16:14           ` Linus Torvalds
2015-02-23 14:01             ` Rafael David Tinoco
2015-02-23 19:32               ` Linus Torvalds
2015-02-23 20:50                 ` Peter Zijlstra
2015-02-23 21:02                   ` Rafael David Tinoco
2015-02-19 16:16           ` Peter Zijlstra
2015-02-19 16:26           ` Linus Torvalds
2015-02-19 16:32             ` Rafael David Tinoco
2015-02-19 16:59               ` Linus Torvalds
2015-02-19 17:30                 ` Rafael David Tinoco
2015-02-19 17:39                 ` Linus Torvalds
2015-02-19 20:29                   ` Linus Torvalds
2015-02-19 21:59                     ` Linus Torvalds
2015-02-19 22:45                       ` Linus Torvalds
2015-03-31  3:15                         ` Chris J Arges [this message]
2015-03-31  4:28                           ` Linus Torvalds
2015-03-31 10:56                             ` [debug PATCHes] " Ingo Molnar
2015-03-31 22:38                               ` Chris J Arges
2015-04-01 12:39                                 ` Ingo Molnar
2015-04-01 14:10                                   ` Chris J Arges
2015-04-01 14:55                                     ` Ingo Molnar
2015-03-31  4:46                           ` Linus Torvalds
2015-03-31 15:08                           ` Linus Torvalds
2015-03-31 22:23                             ` Chris J Arges
2015-03-31 23:07                               ` Linus Torvalds
2015-04-01 14:32                                 ` Chris J Arges
2015-04-01 15:36                                   ` Linus Torvalds
2015-04-02  9:55                                     ` Ingo Molnar
2015-04-02 17:35                                       ` Linus Torvalds
2015-04-01 12:43                               ` Ingo Molnar
2015-04-01 16:10                                 ` Chris J Arges
2015-04-01 16:14                                   ` Linus Torvalds
2015-04-01 21:59                                     ` Chris J Arges
2015-04-02 17:31                                       ` Linus Torvalds
2015-04-02 18:26                                         ` Ingo Molnar
2015-04-02 18:51                                           ` Chris J Arges
2015-04-02 19:07                                             ` Ingo Molnar
2015-04-02 20:57                                               ` Linus Torvalds
2015-04-02 21:13                                               ` Chris J Arges
2015-04-03  5:43                                                 ` [PATCH] smp/call: Detect stuck CSD locks Ingo Molnar
2015-04-03  5:47                                                   ` Ingo Molnar
2015-04-06 16:58                                                   ` Chris J Arges
2015-04-06 17:32                                                     ` Linus Torvalds
2015-04-07  9:21                                                       ` Ingo Molnar
2015-04-07 20:59                                                         ` Chris J Arges
2015-04-07 21:15                                                           ` Linus Torvalds
2015-04-08  6:47                                                           ` Ingo Molnar
2015-04-13  3:56                                                             ` Chris J Arges
2015-04-13  6:14                                                               ` Ingo Molnar
2015-04-15 19:54                                                                 ` Chris J Arges
2015-04-16 11:04                                                                   ` Ingo Molnar
2015-04-16 15:58                                                                     ` Chris J Arges
2015-04-16 16:31                                                                       ` Ingo Molnar
2015-04-29 21:08                                                                         ` Chris J Arges
2015-05-11 14:00                                                                           ` Ingo Molnar
2015-05-20 18:19                                                                             ` Chris J Arges
2015-04-03  5:45                                                 ` smp_call_function_single lockups Ingo Molnar
2015-04-06 17:23                                         ` Chris J Arges
2015-02-20  9:30                     ` Ingo Molnar
2015-02-20 16:49                       ` Linus Torvalds
2015-02-20 19:41                         ` Ingo Molnar
2015-02-20 20:03                           ` Linus Torvalds
2015-02-20 20:11                             ` Ingo Molnar
2015-03-20 10:15       ` Peter Zijlstra
2015-03-20 16:26         ` Linus Torvalds
2015-03-20 17:14           ` Mike Galbraith
2015-04-01 14:22       ` Frederic Weisbecker
2015-04-18 10:13       ` [tip:locking/urgent] smp: Fix smp_call_function_single_async() locking tip-bot for Linus Torvalds
2015-02-22  8:59 smp_call_function_single lockups Daniel J Blueman
2015-02-22 10:37 ` Ingo Molnar

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20150331031536.GA9303@canonical.com \
    --to=chris.j.arges@canonical.com \
    --cc=axboe@kernel.dk \
    --cc=fweisbec@gmail.com \
    --cc=gema.gomez-solano@canonical.com \
    --cc=hpa@zytor.com \
    --cc=inaddy@ubuntu.com \
    --cc=jiang.liu@linux.intel.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@kernel.org \
    --cc=peterz@infradead.org \
    --cc=torvalds@linux-foundation.org \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).