From: Mikulas Patocka <mpatocka@redhat.com>
To: Petr Mladek <pmladek@suse.com>
Cc: Nigel Croxon <ncroxon@redhat.com>,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	Steven Rostedt <rostedt@goodmis.org>,
	Sergey Senozhatsky <sergey.senozhatsky@gmail.com>,
	dm-devel@redhat.com, linux-serial@vger.kernel.org
Subject: Re: Serial console is causing system lock-up
Date: Wed, 6 Mar 2019 11:07:55 -0500 (EST)
Message-ID: <alpine.LRH.2.02.1903061031420.16905@file01.intranet.prod.int.rdu2.redhat.com>
In-Reply-To: <20190306152218.eocv4zulf7tv2mkc@pathway.suse.cz>



On Wed, 6 Mar 2019, Petr Mladek wrote:

> On Wed 2019-03-06 09:27:13, Mikulas Patocka wrote:
> > Hi
> > 
> > I was debugging a kernel lockup with storage drivers and it turned out 
> > that the lockup is caused by the serial console subsystem. If we use a 
> > serial console and write to it excessively, the kernel sometimes 
> > locks up and sometimes reports RCU stalls and NMI backtraces. Sometimes it 
> > just prints the console messages without doing anything else.
> 
> This is a very old problem that we have been trying to solve for
> years. There are two conflicting requirements on printk():
> be fast and reliable.
> 
> The historical solution is that printk() callers store the messages
> into the log buffer and then just _try_ to take the console lock.
> The winner who succeeds is responsible for flushing all
> pending messages to the console. As a result a random victim
> might get blocked by the console handling for a long time.
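
The scheme described above can be sketched as a toy user-space model. This
is only an illustration, not the kernel's printk code: the names printk(),
log_buffer, console_lock, flushed_by and the during_write hook are made up
for the example, and the model ignores both the race handling around
releasing the lock and the owner/waiter handover added by dbdda842fe96f8932.

```python
import threading
from collections import deque

log_buffer = deque()             # pending messages (stand-in for the log buffer)
console_lock = threading.Lock()  # stand-in for the console lock
flushed_by = {}                  # caller -> number of messages it wrote out


def printk(caller, msg, during_write=None):
    """Store msg, then only *try* to take the console lock."""
    log_buffer.append(msg)
    if not console_lock.acquire(blocking=False):
        return  # the current console owner will flush our message for us
    try:
        # The winner flushes everything that is pending, including
        # messages that arrive while it is still writing.
        while log_buffer:
            log_buffer.popleft()  # pretend this is the slow console write
            flushed_by[caller] = flushed_by.get(caller, 0) + 1
            if during_write:
                during_write()    # hypothetical hook: other CPUs keep logging
    finally:
        console_lock.release()


backlog = [f"msg {i}" for i in range(5)]


def other_cpu_logs():
    # Each slow write gives "cpu1" time to queue one more message.
    if backlog:
        printk("cpu1", backlog.pop(0))


printk("cpu0", "first", during_write=other_cpu_logs)
print(flushed_by)  # {'cpu0': 6}
```

In this model "cpu0" logs one message but ends up writing all six, while
"cpu1" returns immediately every time. That is the "random victim" effect.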

This bug only happens if we select a large log buffer (millions of 
characters). With a smaller log buffer, there are messages "** X printk 
messages dropped", but there's no lockup.

The kernel apparently puts 2 million characters into the console log buffer, 
then takes some lock and then tries to write all of them to a slow serial 
line.
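
For a rough sense of scale, here is a back-of-the-envelope sketch. The baud
rates and the 8N1 framing (10 bits on the wire per character) are
assumptions; neither is stated in this thread, and the helper name
flush_seconds() is made up for the example.

```python
def flush_seconds(pending_chars, baud=115200, bits_per_char=10):
    """Seconds needed to push pending_chars through a serial line.

    bits_per_char=10 assumes 8N1 framing: 1 start + 8 data + 1 stop bit.
    """
    return pending_chars * bits_per_char / baud


print(round(flush_seconds(2_000_000), 1))             # 173.6
print(round(flush_seconds(2_000_000, baud=9600), 1))  # 2083.3
```

Under these assumptions, whoever holds the lock would spend roughly three
minutes writing at 115200 baud, and about 35 minutes at 9600 baud.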

> An obvious solution is offloading the console handling. But
> that works against reliability. There is no guarantee that
> the offload mechanism (kthread, irq) would run when the
> system is on its knees.
> 
> Anyway, which kernel version are you using, please?

RHEL8-4.18, Debian-4.19, Upstream 5.0. I didn't try older versions.

> I wonder if you already have commit dbdda842fe96f8932 ("printk: Add
> console owner and waiter logic to load balance console writes").
> It improves the situation a lot. There was a hope that it would
> be enough in real life.

Yes - this patch is present in the kernels that I tried.

> > This program tests the issue - on framebuffer console, the system is 
> > sluggish, but it is possible to unload the module with rmmod. On serial 
> > console, it locks up to the point that unloading the module is not 
> > possible.
> 
> Is there any chance to send us logs from the original (real life)
> problem, please?
> 
> Best regards,
> Petr

I uploaded the logs here: 
http://people.redhat.com/~mpatocka/testcases/console-lockup/

Mikulas
