From: Steven Rostedt <rostedt@goodmis.org>
To: Jiri Kosina <jikos@kernel.org>
Cc: Sebastian Andrzej Siewior <bigeasy@linutronix.de>,
Thomas Gleixner <tglx@linutronix.de>,
LKML <linux-kernel@vger.kernel.org>,
linux-rt-users <linux-rt-users@vger.kernel.org>,
Frederic Weisbecker <fweisbec@gmail.com>,
Matt Fleming <matt@codeblueprint.co.uk>,
Daniel Wagner <dwagner@suse.de>
Subject: Re: [PREEMPT_RT] 8250 IRQ lockup when flooding serial console (was Re: [ANNOUNCE] v5.4.28-rt19)
Date: Thu, 23 Apr 2020 16:15:41 -0400
Message-ID: <20200423161541.49ca0ab3@gandalf.local.home>
In-Reply-To: <nycvar.YFH.7.76.2004232141590.19713@cbobk.fhfr.pm>
On Thu, 23 Apr 2020 21:48:20 +0200 (CEST)
Jiri Kosina <jikos@kernel.org> wrote:
> On Thu, 23 Apr 2020, Sebastian Andrzej Siewior wrote:
>
> > The IRQ4 is edge and in charge of ttyS0. It is handled by
> > handle_edge_irq() and after ->irq_ack(), the thread is woken up and then
> > we get another ->handle_edge_irq() for IRQ4. With larger PASS_LIMIT the
> > thread runs longer so note_interrupt() will make less IRQ_HANDLED based
> > on ->threads_handled_last. If it observes 100 handled within 100000
> > interrupts then the counters are reset again. On !RT it usually manages
> > to get >100 per 100000 interrupts so it appears good. On RT it gets less
> > and the interrupt gets disabled.
> >
> > So it is not RT related, but RT triggers it more reliably (also the
> > PASS_LIMIT change can vanish). I can't tell if this is a qemu bug in
> > emulating the HW or not. I can't reproduce it on real HW. I see a second
> > edge interrupt only after the thread completed. I can't tell if this is
> > because it is a real UART and the data is flowing slower or because the
> > edge-IRQ is not triggered repeatedly.
>
> Yeah, it's all strange. In the hope of understanding the issue a little
> bit better, I tried to disable IRQs in serial8250_handle_irq() to mimic
> what !PREEMPT_RT spinlock would do; the idea was that this is some kind of
> strange race / memory ordering (missed ack?) between the threaded irq4
> handler and the do_IRQ() -> handle_edge_irq() -> ... path.
>
> So I did this:
>
> ---
> drivers/tty/serial/8250/8250_port.c | 4 +++-
> 1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/tty/serial/8250/8250_port.c b/drivers/tty/serial/8250/8250_port.c
> index e31217e8dce6..1a577305e174 100644
> --- a/drivers/tty/serial/8250/8250_port.c
> +++ b/drivers/tty/serial/8250/8250_port.c
> @@ -1813,12 +1813,13 @@ static bool handle_rx_dma(struct uart_8250_port *up, unsigned int iir)
> int serial8250_handle_irq(struct uart_port *port, unsigned int iir)
> {
> unsigned char status;
> - unsigned long flags;
> + unsigned long flags, f;
> struct uart_8250_port *up = up_to_u8250p(port);
>
> if (iir & UART_IIR_NO_INT)
> return 0;
>
> + local_irq_save(f);
> spin_lock_irqsave(&port->lock, flags);
Note, this would break on PREEMPT_RT if there ever was contention, as the
spin lock there is a sleeping lock (an rt_mutex) and the task would
schedule out. And we don't want to do that with interrupts disabled!
>
> status = serial_port_in(port, UART_LSR);
> @@ -1833,6 +1834,7 @@ int serial8250_handle_irq(struct uart_port *port, unsigned int iir)
> serial8250_tx_chars(up);
>
> uart_unlock_and_check_sysrq(port, flags);
> + local_irq_restore(f);
> return 1;
> }
>
> But curiously enough, that exploded in the opposite order (so first there
> was CPU stall, and only later the disabling of IRQ4 due to spurious
> storm):
Now what may be interesting to try is to enable tracing and set
ftrace_dump_on_oops, panic_on_warn, as well as traceoff_on_warning:
# echo 1 > /proc/sys/kernel/ftrace_dump_on_oops
# echo 1 > /proc/sys/kernel/panic_on_warn
# echo 1 > /proc/sys/kernel/traceoff_on_warning
# echo 1 > /sys/kernel/tracing/events/enable
Enabling all events will include interrupt events and wake ups, and perhaps
give you an idea what interrupts are happening after the uart thread is
woken.
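
If you end up doing this across several reboots, the four writes above can
be wrapped in a small helper. This is only a sketch: the function name
enable_debug_knobs and its optional root-directory argument are made up
here for illustration (the argument just makes it possible to dry-run
against a scratch directory), and any knob that is missing or not writable
on the running kernel is skipped instead of being treated as an error.

```shell
#!/bin/sh
# Hypothetical helper (not from any kernel tree): write "1" to each of the
# debug knobs listed in this mail. The optional first argument is a root
# directory to prepend, so the function can be tried on a scratch tree;
# with no argument it targets the real /proc and /sys.
enable_debug_knobs() {
    root="${1:-}"
    for knob in \
        proc/sys/kernel/ftrace_dump_on_oops \
        proc/sys/kernel/panic_on_warn \
        proc/sys/kernel/traceoff_on_warning \
        sys/kernel/tracing/events/enable
    do
        path="$root/$knob"
        if [ -w "$path" ]; then
            # Same effect as the "echo 1 > ..." lines above.
            echo 1 > "$path"
            echo "set $path"
        else
            # Knob not present (config dependent) or we are not root.
            echo "skipped $path (missing or not writable)" >&2
        fi
    done
}
```

Run it as root with no argument (enable_debug_knobs) before flooding the
console. Keep in mind that panic_on_warn will panic the box on the first
WARN, so only do this on a test machine.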
-- Steve