From: Mikulas Patocka <mpatocka@redhat.com>
To: "Theodore Y. Ts'o" <tytso@mit.edu>
Cc: Petr Mladek <pmladek@suse.com>, Nigel Croxon <ncroxon@redhat.com>,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	Steven Rostedt <rostedt@goodmis.org>,
	Sergey Senozhatsky <sergey.senozhatsky@gmail.com>,
	dm-devel@redhat.com, linux-serial@vger.kernel.org
Subject: Re: Serial console is causing system lock-up
Date: Wed, 6 Mar 2019 12:11:10 -0500 (EST)
Message-ID: <alpine.LRH.2.02.1903061157530.3129@file01.intranet.prod.int.rdu2.redhat.com>
In-Reply-To: <20190306163003.GA31858@mit.edu>



On Wed, 6 Mar 2019, Theodore Y. Ts'o wrote:

> On Wed, Mar 06, 2019 at 11:07:55AM -0500, Mikulas Patocka wrote:
> > This bug only happens if we select a large log buffer (millions of
> > characters). With a smaller log buffer, there are messages "** X printk
> > messages dropped", but there's no lockup.
> >
> > The kernel apparently puts 2 million characters into a console log buffer,
> > then takes some lock and then tries to write all of them to a slow serial
> > line.
> 
> What are the messages; from what kernel subsystem?  Why are you seeing
> so many log messages?
> 
> 					- Ted

The dm-integrity subsystem (drivers/md/dm-integrity.c) can be attached to a
block device to provide checksum protection. It returns -EILSEQ and
prints a message to the log for every corrupted block.

Nigel Croxon was testing MD-RAID recovery capabilities by activating a
RAID-5 array with one leg replaced by a dm-integrity block device that had
all checksums invalid.

MD-RAID is supposed to recalculate the data for the corrupted device and
bring it back to life. However, scrubbing the MD-RAID device resulted in a
lot of reads from the device with bad checksums. Each failed read was
reported to the log, and the flood of messages killed the machine.


I made a patch to dm-integrity to rate-limit the error messages. Still,
killing the machine when there are too many log messages seems bad.
If log messages are produced faster than the kernel can write them,
the kernel should discard some of them, not kill itself.

Mikulas
