linux-rt-users.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Jiri Kosina <jikos@kernel.org>
To: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Thomas Gleixner <tglx@linutronix.de>,
	LKML <linux-kernel@vger.kernel.org>,
	linux-rt-users <linux-rt-users@vger.kernel.org>,
	Steven Rostedt <rostedt@goodmis.org>,
	Frederic Weisbecker <fweisbec@gmail.com>,
	Matt Fleming <matt@codeblueprint.co.uk>,
	Daniel Wagner <dwagner@suse.de>
Subject: [PREEMPT_RT] 8250 IRQ lockup when flooding serial console (was Re: [ANNOUNCE] v5.4.28-rt19)
Date: Thu, 23 Apr 2020 10:51:45 +0200 (CEST)	[thread overview]
Message-ID: <nycvar.YEU.7.76.2004231017470.4730@gjva.wvxbf.pm> (raw)
In-Reply-To: <20200330144712.cwcz5ejal4ankeoi@linutronix.de>

On Mon, 30 Mar 2020, Sebastian Andrzej Siewior wrote:

> I'm pleased to announce the v5.4.28-rt19 patch set. 

First, I don't believe this is necessarily a regression coming with this 
particular version, but this is the first kernel where I tried this and it 
crashed.

When running this kernel in KVM guest, and flooding its 8250-based ttyS0 
serial console quickly leads to

[   52.674358] 000: irq 4: nobody cared (try booting with the "irqpoll" option)
[   52.674381] 000: CPU: 0 PID: 155 Comm: irq/4-ttyS0 Not tainted 5.4.28-rt19+ #7
[   52.674389] 000: Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.0-0-ga698c89-rebuilt.suse.com 04/01/2014
[   52.674398] 000: Call Trace:
[   52.674417] 000:  <IRQ>
[   52.674426] 000:  dump_stack+0x50/0x6b
[   52.674446] 000:  __report_bad_irq+0x2b/0xb0
[   52.674454] 000:  note_interrupt+0x22e/0x280
[   52.674457] 000:  ? io_serial_out+0x11/0x20
[   52.674463] 000:  handle_irq_event_percpu+0x6e/0x90
[   52.674471] 000:  handle_irq_event+0x48/0x90
[   52.674473] 000:  handle_edge_irq+0x95/0x1f0
[   52.674476] 000:  do_IRQ+0x6c/0x120
[   52.674487] 000:  common_interrupt+0xf/0xf
[   52.674496] 000:  </IRQ>
[   52.674497] 000: RIP: 0010:io_serial_out+0x11/0x20
[   52.674503] 000: Code: 8b 57 30 d3 e6 01 f2 ec 0f b6 c0 c3 66 66 2e 0f 1f 84 00 00 00 00 00 66 90 0f b6 8f e1 00 00 00 89 d0 8b 57 30 d3 e6 01 f2 ee <c3> 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 00 0f b6 87 e2 00 00 00
[   52.674507] 000: RSP: 0018:ffffa860c03a7de8 EFLAGS: 00000202 ORIG_RAX: ffffffffffffffdb
[   52.674508] 000: RAX: 0000000000000034 RBX: ffffffffb05a3bc0 RCX: 0000000000000000
[   52.674510] 000: RDX: 00000000000003f8 RSI: 0000000000000000 RDI: ffffffffb05a3bc0
[   52.674514] 000: RBP: ffff98c1b9718000 R08: ffffffffafa0b8e0 R09: ffffa860c03a7d80
[   52.674515] 000: R10: ffffa860c03a7c40 R11: 0000000000000000 R12: 0000000000000009
[   52.674516] 000: R13: 0000000000000020 R14: 0000000000000001 R15: ffff98c13714a8b8
[   52.674524] 000:  ? io_serial_in+0x20/0x20
[   52.674529] 000:  serial8250_tx_chars+0x121/0x310
[   52.674535] 000:  ? migrate_disable+0x33/0x90
1339913399133991[   52.674540] 000:  
serial8250_handle_irq.part.27+0xc2/0xe0
[   52.674543] 000:  serial8250_default_handle_irq+0x45/0x50
[   52.674545] 000:  serial8250_interrupt+0x51/0xa0
[   52.674548] 000:  ? irq_finalize_oneshot.part.46+0xd0/0xd0
[   52.674550] 000:  irq_forced_thread_fn+0x2b/0x70
[   52.674552] 000:  irq_thread+0xd8/0x170
[   52.674554] 000:  ? wake_threads_waitq+0x30/0x30
[   52.674556] 000:  ? irq_thread_dtor+0x90/0x90
[   52.674558] 000:  kthread+0x10e/0x130
[   52.674563] 000:  ? kthread_park+0x80/0x80
[   52.674565] 000:  ret_from_fork+0x35/0x40
[   52.674571] 000: handlers:
[   52.674579] 000: [<00000000d13c0583>] irq_default_primary_handler threaded [<00000000f172db97>] serial8250_interrupt
[   52.674580] 000: Disabling IRQ #4

(and things go downhill afterwards with RCU stall on the affected CPU, 
etc).

Reproducing it is as easy as firing up qemu with '-serial stdio', booting 
with console set to ttyS0, and then doing e.g.

	i=0; while test $i -lt 100000; do /usr/bin/echo $i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i$i; i=$((i+1)); done

in the serial console terminal. It explodes within a few tens of seconds.

Now, whenever this happens, we always seem to be interrupted in 
0010:io_serial_out+0x11/0x20, which is this in Code

	... d3 e6 01 f2 ee <c3> 66 66 2e 0f ...

where '0xee' is the actual outb. So it looks like outb would take so long 
that 99k IRQ4s trigger in the meantime, which would seem to point to KVM / 
qemu 
issue. 

*However* after either

- turning CONFIG_PREEMPT_RT off 

	or

- starting qemu-kvm without any '-smp' args (i.e. starting the guest as 
  UP)

this issue never reproduces any more; the reason for it almost certainly 
never triggering with !CONFIG_PREEMPT_RT is that the above pattern 
obviously can't happen, because we never interrupt that io_serial_out, 
because the spin_lock_irqsave() in serial8250_handle_irq() disables IRQ in 
this region. Why it doesn't reproduce on UP I don't understand.

I haven't done much debugging on this yet, but reporting this right away 
in case anyone has any idea what's happening here.

Thanks,

-- 
Jiri Kosina
SUSE Labs

  reply	other threads:[~2020-04-23  8:52 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-03-30 14:47 [ANNOUNCE] v5.4.28-rt19 Sebastian Andrzej Siewior
2020-04-23  8:51 ` Jiri Kosina [this message]
2020-04-23  9:12   ` [PREEMPT_RT] 8250 IRQ lockup when flooding serial console (was Re: [ANNOUNCE] v5.4.28-rt19) Jiri Kosina
2020-04-23 10:45     ` Sebastian Andrzej Siewior
2020-04-23 13:06       ` Steven Rostedt
2020-04-23 18:34         ` Steven Rostedt
2020-04-23 18:47           ` Sebastian Andrzej Siewior
2020-04-23 16:07       ` Sebastian Andrzej Siewior
2020-04-23 16:20         ` [PATCH RT] Revert "rt: Improve the serial console PASS_LIMIT" Sebastian Andrzej Siewior
2020-04-23 16:21           ` Sebastian Andrzej Siewior
2020-04-23 19:48         ` [PREEMPT_RT] 8250 IRQ lockup when flooding serial console (was Re: [ANNOUNCE] v5.4.28-rt19) Jiri Kosina
2020-04-23 20:15           ` Steven Rostedt
2020-04-24 19:19         ` Sebastian Andrzej Siewior
2020-04-24 20:54           ` Jiri Kosina
2020-04-27  9:17             ` Sebastian Andrzej Siewior

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=nycvar.YEU.7.76.2004231017470.4730@gjva.wvxbf.pm \
    --to=jikos@kernel.org \
    --cc=bigeasy@linutronix.de \
    --cc=dwagner@suse.de \
    --cc=fweisbec@gmail.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-rt-users@vger.kernel.org \
    --cc=matt@codeblueprint.co.uk \
    --cc=rostedt@goodmis.org \
    --cc=tglx@linutronix.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).